Sample records for building statistical models

  1. Statistical considerations on prognostic models for glioma

    PubMed Central

    Molinaro, Annette M.; Wrensch, Margaret R.; Jenkins, Robert B.; Eckel-Passow, Jeanette E.

    2016-01-01

    Given the lack of beneficial treatments in glioma, there is a need for prognostic models for therapeutic decision making and life planning. Recently several studies defining subtypes of glioma have been published. Here, we review the statistical considerations of how to build and validate prognostic models, explain the models presented in the current glioma literature, and discuss advantages and disadvantages of each model. The 3 statistical considerations to establishing clinically useful prognostic models are: study design, model building, and validation. Careful study design helps to ensure that the model is unbiased and generalizable to the population of interest. During model building, a discovery cohort of patients can be used to choose variables, construct models, and estimate prediction performance via internal validation. Via external validation, an independent dataset can assess how well the model performs. It is imperative that published models properly detail the study design and methods for both model building and validation. This provides readers the information necessary to assess the bias in a study, compare other published models, and determine the model's clinical usefulness. As editors, reviewers, and readers of the relevant literature, we should be cognizant of the needed statistical considerations and insist on their use. PMID:26657835
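
    As a hedged illustration of the workflow sketched above (a discovery cohort for model building, internal validation by cross-validation, then external validation on an independent dataset), the snippet below uses scikit-learn on synthetic data; the logistic model, the AUC metric and all variable names are assumptions for illustration, not the authors' method.

```python
# Sketch: build a prognostic model on a discovery cohort, estimate performance
# by internal cross-validation, then check it on an independent external cohort.
# All data arrays and the choice of a logistic model are hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
X_discovery = rng.normal(size=(200, 5))        # e.g. age, grade, molecular markers
y_discovery = rng.integers(0, 2, size=200)     # e.g. 2-year survival indicator
X_external = rng.normal(size=(80, 5))          # independent validation cohort
y_external = rng.integers(0, 2, size=80)

model = LogisticRegression(max_iter=1000)

# Internal validation: cross-validated discrimination on the discovery cohort.
internal_auc = cross_val_score(model, X_discovery, y_discovery,
                               cv=5, scoring="roc_auc").mean()

# External validation: fit on the full discovery cohort, score the external one.
model.fit(X_discovery, y_discovery)
external_auc = roc_auc_score(y_external, model.predict_proba(X_external)[:, 1])

print(f"internal CV AUC={internal_auc:.2f}, external AUC={external_auc:.2f}")
```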

  2. Building damage assessment from PolSAR data using texture parameters of statistical model

    NASA Astrophysics Data System (ADS)

    Li, Linlin; Liu, Xiuguo; Chen, Qihao; Yang, Shuai

    2018-04-01

    Accurate building damage assessment is essential in providing decision support for disaster relief and reconstruction. Polarimetric synthetic aperture radar (PolSAR) has become one of the most effective means of building damage assessment, due to its all-day/all-weather ability and richer backscatter information of targets. However, intact buildings that are not parallel to the SAR flight pass (termed oriented buildings) and collapsed buildings share similar scattering mechanisms, both of which are dominated by volume scattering. This characteristic always leads to misjudgments between assessments of collapsed buildings and oriented buildings from PolSAR data. Because the collapsed buildings and the intact buildings (whether oriented or parallel buildings) have different textures, a novel building damage assessment method is proposed in this study to address this problem by introducing texture parameters of statistical models. First, the logarithms of the estimated texture parameters of different statistical models are taken as a new texture feature to describe the collapse of the buildings. Second, the collapsed buildings and intact buildings are distinguished using an appropriate threshold. Then, the building blocks are classified into three levels based on the building block collapse rate. Moreover, this paper also discusses the capability for performing damage assessment using texture parameters from different statistical models or using different estimators. The RADARSAT-2 and ALOS-1 PolSAR images are used to present and analyze the performance of the proposed method. The results show that using the texture parameters avoids the problem of confusing collapsed and oriented buildings and improves the assessment accuracy. The results assessed by using the K/G0 distribution texture parameters estimated based on the second moment obtain the highest extraction accuracies. For the RADARSAT-2 and ALOS-1 data, the overall accuracy (OA) for these three types of buildings is 73.39% and 68.45%, respectively.
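
    A minimal sketch of the thresholding idea described above, assuming the texture parameters have already been estimated per building; the threshold, the block grouping and the three damage levels are placeholders rather than the paper's calibrated settings.

```python
# Sketch: classify buildings as collapsed/intact from a log texture parameter,
# then grade building blocks by collapse rate. Threshold values, the block
# grouping and the input array are hypothetical, not the paper's settings.
import numpy as np

log_texture = np.array([0.2, 1.4, 0.3, 1.8, 0.1, 2.1])   # one value per building
block_id    = np.array([0,   0,   0,   1,   1,   1])      # block membership

collapsed = log_texture > 1.0          # assumed threshold separating classes

for b in np.unique(block_id):
    rate = collapsed[block_id == b].mean()       # block collapse rate
    if rate < 0.3:
        level = "slight damage"
    elif rate < 0.7:
        level = "moderate damage"
    else:
        level = "severe damage"
    print(f"block {b}: collapse rate {rate:.2f} -> {level}")
```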

  3. Illuminating Tradespace Decisions Using Efficient Experimental Space-Filling Designs for the Engineered Resilient System Architecture

    DTIC Science & Technology

    2015-06-30

    7. Building Statistical Metamodels using Simulation Experimental Designs ... 7.1. Statistical Design ... system design drivers across several different domain models, our methodology uses statistical metamodeling to approximate the simulations’ behavior. A ... output. We build metamodels using a number of statistical methods that include stepwise regression, boosted trees, neural nets, and bootstrap forest

  4. Illuminating Tradespace Decisions Using Efficient Experimental Space-Filling Designs for the Engineered Resilient System Architecture

    DTIC Science & Technology

    2015-06-01

    7. Building Statistical Metamodels using Simulation Experimental Designs ... 7.1. Statistical Design ... system design drivers across several different domain models, our methodology uses statistical metamodeling to approximate the simulations’ behavior. A ... output. We build metamodels using a number of statistical methods that include stepwise regression, boosted trees, neural nets, and bootstrap forest

  5. 10 CFR 436.31 - Definitions.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... systems, building load simulation models, statistical regression analysis, or some combination of these..., excluding any cogeneration process for other than a federally owned building or buildings or other federally...

  6. Statistical distribution of building lot frontage: application for Tokyo downtown districts

    NASA Astrophysics Data System (ADS)

    Usui, Hiroyuki

    2018-03-01

    The frontage of a building lot is the determinant factor of the residential environment. The statistical distribution of building lot frontages shows how the perimeters of urban blocks are shared by building lots for a given density of buildings and roads. For practitioners in urban planning, this is indispensable to identify potential districts which comprise a high percentage of building lots with narrow frontage after subdivision and to reconsider the appropriate criteria for the density of buildings and roads as residential environment indices. In the literature, however, the statistical distribution of building lot frontages and the density of buildings and roads has not been fully researched. In this paper, based on the empirical study in the downtown districts of Tokyo, it is found that (1) a log-normal distribution fits the observed distribution of building lot frontages better than a gamma distribution, which is the model of the size distribution of Poisson Voronoi cells on closed curves; (2) the statistical distribution of building lot frontages statistically follows a log-normal distribution, whose parameters are the gross building density, road density, average road width, the coefficient of variation of building lot frontage, and the ratio of the number of building lot frontages to the number of buildings; and (3) the values of the coefficient of variation of building lot frontages, and that of the ratio of the number of building lot frontages to that of buildings are approximately equal to 0.60 and 1.19, respectively.
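
    A hedged sketch of the distribution comparison described above: fit a log-normal and a gamma model to a frontage sample and compare log-likelihoods. The sample is synthetic, and scipy's generic fitting routine stands in for whatever estimation the author used.

```python
# Sketch: compare a log-normal and a gamma fit to building lot frontages by
# log-likelihood (equivalently AIC, both models having two free parameters
# with the location fixed at zero). The frontage sample below is synthetic.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
frontage = rng.lognormal(mean=1.5, sigma=0.6, size=500)   # metres, synthetic

ln_params = stats.lognorm.fit(frontage, floc=0)
ga_params = stats.gamma.fit(frontage, floc=0)

ll_lognorm = stats.lognorm.logpdf(frontage, *ln_params).sum()
ll_gamma = stats.gamma.logpdf(frontage, *ga_params).sum()
print(f"log-likelihood  log-normal: {ll_lognorm:.1f}  gamma: {ll_gamma:.1f}")
```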

  7. The application of the statistical classifying models for signal evaluation of the gas sensors analyzing mold contamination of the building materials

    NASA Astrophysics Data System (ADS)

    Majerek, Dariusz; Guz, Łukasz; Suchorab, Zbigniew; Łagód, Grzegorz; Sobczuk, Henryk

    2017-07-01

    Mold that develops on moistened building barriers is a major cause of Sick Building Syndrome (SBS). Fungal contamination is normally evaluated using standard biological methods, which are time-consuming and require a lot of manual labor. Fungi emit Volatile Organic Compounds (VOCs) that can be detected in indoor air using several detection techniques, e.g. chromatography. VOCs can also be detected using gas sensor arrays. Each sensor in the array generates a voltage signal that must be analyzed using properly selected statistical methods of interpretation. This work focuses on applying statistical classifying models to the evaluation of signals from a gas sensor matrix, in order to analyze air sampled from the headspace of various types of building materials at different levels of contamination, as well as from clean reference materials.
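
    Purely as an illustration of the kind of statistical classifying model the record refers to, the sketch below trains a generic classifier on synthetic sensor-array voltages; the random-forest choice, the two-class setup and all numbers are assumptions, not the authors' method.

```python
# Sketch: a statistical classifier on gas-sensor-array voltages, scored by
# cross-validation. The 6-sensor voltage matrix and the two classes
# ("clean" vs "mould-contaminated") are synthetic placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
clean = rng.normal(0.40, 0.05, size=(30, 6))      # voltages, clean materials
mouldy = rng.normal(0.55, 0.05, size=(30, 6))     # voltages, contaminated materials
X = np.vstack([clean, mouldy])
y = np.array([0] * 30 + [1] * 30)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
accuracy = cross_val_score(clf, X, y, cv=5).mean()
print(f"cross-validated accuracy: {accuracy:.2f}")
```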

  8. The epistemology of mathematical and statistical modeling: a quiet methodological revolution.

    PubMed

    Rodgers, Joseph Lee

    2010-01-01

    A quiet methodological revolution, a modeling revolution, has occurred over the past several decades, almost without discussion. In contrast, the 20th century ended with contentious argument over the utility of null hypothesis significance testing (NHST). The NHST controversy may have been at least partially irrelevant, because in certain ways the modeling revolution obviated the NHST argument. I begin with a history of NHST and modeling and their relation to one another. Next, I define and illustrate principles involved in developing and evaluating mathematical models. Following, I discuss the difference between using statistical procedures within a rule-based framework and building mathematical models from a scientific epistemology. Only the former is treated carefully in most psychology graduate training. The pedagogical implications of this imbalance and the revised pedagogy required to account for the modeling revolution are described. To conclude, I discuss how attention to modeling implies shifting statistical practice in certain progressive ways. The epistemological basis of statistics has moved away from being a set of procedures, applied mechanistically, and moved toward building and evaluating statistical and scientific models. Copyright 2009 APA, all rights reserved.

  9. Iterative model building, structure refinement and density modification with the PHENIX AutoBuild wizard.

    PubMed

    Terwilliger, Thomas C; Grosse-Kunstleve, Ralf W; Afonine, Pavel V; Moriarty, Nigel W; Zwart, Peter H; Hung, Li Wei; Read, Randy J; Adams, Paul D

    2008-01-01

    The PHENIX AutoBuild wizard is a highly automated tool for iterative model building, structure refinement and density modification using RESOLVE model building, RESOLVE statistical density modification and phenix.refine structure refinement. Recent advances in the AutoBuild wizard and phenix.refine include automated detection and application of NCS from models as they are built, extensive model-completion algorithms and automated solvent-molecule picking. Model-completion algorithms in the AutoBuild wizard include loop building, crossovers between chains in different models of a structure and side-chain optimization. The AutoBuild wizard has been applied to a set of 48 structures at resolutions ranging from 1.1 to 3.2 Å, resulting in a mean R factor of 0.24 and a mean free R factor of 0.29. The R factor of the final model is dependent on the quality of the starting electron density and is relatively independent of resolution.

  10. Incorporating GIS building data and census housing statistics for sub-block-level population estimation

    USGS Publications Warehouse

    Wu, S.-S.; Wang, L.; Qiu, X.

    2008-01-01

    This article presents a deterministic model for sub-block-level population estimation based on the total building volumes derived from geographic information system (GIS) building data and three census block-level housing statistics. To assess the model, we generated artificial blocks by aggregating census block areas and calculating the respective housing statistics. We then applied the model to estimate populations for sub-artificial-block areas and assessed the estimates with census populations of the areas. Our analyses indicate that the average percent error of population estimation for sub-artificial-block areas is comparable to those for sub-census-block areas of the same size relative to associated blocks. The smaller the sub-block-level areas, the higher the population estimation errors. For example, the average percent error for residential areas is approximately 0.11 percent for 100 percent block areas and 35 percent for 5 percent block areas.
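
    A hedged sketch of a deterministic volume-to-population calculation of the kind described above, using the three housing statistics named in the author's related work (average space per housing unit, occupancy rate, average household size); the formula and figures are illustrative, not the article's calibrated model.

```python
# Sketch of the deterministic idea: building volume -> housing units -> occupied
# units -> people, using three block-level housing statistics. The formula and
# the numbers are illustrative, not the article's calibrated model.
def estimate_population(building_volume_m3: float,
                        space_per_unit_m3: float,
                        occupancy_rate: float,
                        household_size: float) -> float:
    housing_units = building_volume_m3 / space_per_unit_m3
    occupied_units = housing_units * occupancy_rate
    return occupied_units * household_size

# e.g. 12,000 m3 of residential volume, 350 m3 per unit, 92% occupied, 2.4 people/unit
print(round(estimate_population(12_000, 350, 0.92, 2.4), 1))
```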

  11. 10 CFR 420.2 - Definitions.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... Planning Organization means that organization required by the Department of Transportation, and designated... planning provisions in a Standard Metropolitan Statistical Area. Model Energy Code, 1993, including Errata, means the model building code published by the Council of American Building Officials, which is...

  12. Development and evaluation of statistical shape modeling for principal inner organs on torso CT images.

    PubMed

    Zhou, Xiangrong; Xu, Rui; Hara, Takeshi; Hirano, Yasushi; Yokoyama, Ryujiro; Kanematsu, Masayuki; Hoshi, Hiroaki; Kido, Shoji; Fujita, Hiroshi

    2014-07-01

    The shapes of the inner organs provide important information for medical image analysis. Statistical shape modeling provides a way of quantifying and measuring shape variations of the inner organs in different patients. In this study, we developed a universal scheme that can be used for building the statistical shape models for different inner organs efficiently. This scheme combines the traditional point distribution modeling with a group-wise optimization method based on a measure called minimum description length to provide a practical means for 3D organ shape modeling. In experiments, the proposed scheme was applied to the building of five statistical shape models for hearts, livers, spleens, and right and left kidneys by use of 50 cases of 3D torso CT images. The performance of these models was evaluated by three measures: model compactness, model generalization, and model specificity. The experimental results showed that the constructed shape models have good "compactness" and satisfactory "generalization" performance for different organ shape representations; however, the "specificity" of these models should be improved in the future.
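
    A minimal sketch, under stated assumptions, of the point-distribution-modeling step underlying such shape models: PCA over aligned landmark vectors, with compactness reported as cumulative explained variance. The landmark data are random stand-ins, and the group-wise MDL optimization used in the paper is not reproduced here.

```python
# Sketch: a point-distribution shape model built by PCA on aligned landmark
# sets, with "compactness" reported as cumulative explained variance.
# The landmark data are random stand-ins for aligned organ surfaces.
import numpy as np

rng = np.random.default_rng(3)
n_shapes, n_landmarks = 50, 200
shapes = rng.normal(size=(n_shapes, n_landmarks * 3))   # x,y,z per landmark, aligned

mean_shape = shapes.mean(axis=0)
centered = shapes - mean_shape
# PCA via SVD of the centered shape matrix
_, singular_values, components = np.linalg.svd(centered, full_matrices=False)
variances = singular_values ** 2 / (n_shapes - 1)

compactness = np.cumsum(variances) / variances.sum()
print("variance captured by first 5 modes:", np.round(compactness[:5], 3))

# A new shape instance: mean shape plus a weighted sum of the leading modes.
b = np.zeros(len(variances))
b[0] = 2.0 * np.sqrt(variances[0])          # +2 standard deviations of mode 1
new_shape = mean_shape + b @ components
print("synthesized shape vector length:", new_shape.size)
```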

  13. The Use of Modelling for Theory Building in Qualitative Analysis

    ERIC Educational Resources Information Center

    Briggs, Ann R. J.

    2007-01-01

    The purpose of this article is to exemplify and enhance the place of modelling as a qualitative process in educational research. Modelling is widely used in quantitative research as a tool for analysis, theory building and prediction. Statistical data lend themselves to graphical representation of values, interrelationships and operational…

  14. Iterative model building, structure refinement and density modification with the PHENIX AutoBuild wizard

    PubMed Central

    Terwilliger, Thomas C.; Grosse-Kunstleve, Ralf W.; Afonine, Pavel V.; Moriarty, Nigel W.; Zwart, Peter H.; Hung, Li-Wei; Read, Randy J.; Adams, Paul D.

    2008-01-01

    The PHENIX AutoBuild wizard is a highly automated tool for iterative model building, structure refinement and density modification using RESOLVE model building, RESOLVE statistical density modification and phenix.refine structure refinement. Recent advances in the AutoBuild wizard and phenix.refine include automated detection and application of NCS from models as they are built, extensive model-completion algorithms and automated solvent-molecule picking. Model-completion algorithms in the AutoBuild wizard include loop building, crossovers between chains in different models of a structure and side-chain optimization. The AutoBuild wizard has been applied to a set of 48 structures at resolutions ranging from 1.1 to 3.2 Å, resulting in a mean R factor of 0.24 and a mean free R factor of 0.29. The R factor of the final model is dependent on the quality of the starting electron density and is relatively independent of resolution. PMID:18094468

  15. End-use energy consumption estimates for US commercial buildings, 1989

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Belzer, D.B.; Wrench, L.E.; Marsh, T.L.

    An accurate picture of how energy is used in the nation's stock of commercial buildings can serve a variety of program planning and policy needs within the Department of Energy, by utilities, and other groups seeking to improve the efficiency of energy use in the building sector. This report describes an estimation of energy consumption by end use based upon data from the 1989 Commercial Building Energy Consumption Survey (CBECS). The methodology used in the study combines elements of engineering simulations and statistical analysis to estimate end-use intensities for heating, cooling, ventilation, lighting, refrigeration, hot water, cooking, and miscellaneous equipment. Billing data for electricity and natural gas were first decomposed into weather and nonweather dependent loads. Subsequently, Statistical Adjusted Engineering (SAE) models were estimated by building type with annual data. The SAE models used variables such as building size, vintage, climate region, weekly operating hours, and employee density to adjust the engineering model predicted loads to the observed consumption. End-use consumption by fuel was estimated for each of the 5,876 buildings in the 1989 CBECS. The report displays the summary results for eleven separate building types as well as for the total US commercial building stock.
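
    A hedged sketch of the first step described above, decomposing billing data into weather-dependent and base loads; a simple heating-degree-day regression stands in for the report's SAE methodology, and the monthly figures are made up.

```python
# Sketch: split monthly billing data into weather-dependent and non-weather (base)
# loads with a degree-day regression. The monthly kWh and heating-degree-day
# figures are invented for illustration.
import numpy as np

hdd = np.array([620, 540, 410, 240, 90, 10, 0, 5, 60, 230, 430, 590])   # heating degree days
kwh = np.array([2400, 2200, 1900, 1400, 1000, 880, 860, 870, 980, 1350, 1950, 2300])

slope, base_load = np.polyfit(hdd, kwh, 1)    # kWh per HDD, and base (non-weather) kWh
weather_load = slope * hdd                    # weather-dependent share per month

print(f"base load ~{base_load:.0f} kWh/month, {slope:.2f} kWh per heating degree day")
print(f"weather-dependent share in January: {weather_load[0] / kwh[0]:.0%}")
```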

  16. Estimating regional plant biodiversity with GIS modelling

    Treesearch

    Louis R. Iverson; Anantha M. Prasad

    1998-01-01

    We analyzed a statewide species database together with a county-level geographic information system to build a model based on well-surveyed areas to estimate species richness in less surveyed counties. The model involved GIS (Arc/Info) and statistics (S-PLUS), including spatial statistics (S+SpatialStats).

  17. Iterative model building, structure refinement and density modification with the PHENIX AutoBuild wizard

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Los Alamos National Laboratory, Mailstop M888, Los Alamos, NM 87545, USA; Lawrence Berkeley National Laboratory, One Cyclotron Road, Building 64R0121, Berkeley, CA 94720, USA; Department of Haematology, University of Cambridge, Cambridge CB2 0XY, England

    The PHENIX AutoBuild Wizard is a highly automated tool for iterative model-building, structure refinement and density modification using RESOLVE or TEXTAL model-building, RESOLVE statistical density modification, and phenix.refine structure refinement. Recent advances in the AutoBuild Wizard and phenix.refine include automated detection and application of NCS from models as they are built, extensive model completion algorithms, and automated solvent molecule picking. Model completion algorithms in the AutoBuild Wizard include loop-building, crossovers between chains in different models of a structure, and side-chain optimization. The AutoBuild Wizard has been applied to a set of 48 structures at resolutions ranging from 1.1 Å to 3.2 Å, resulting in a mean R-factor of 0.24 and a mean free R factor of 0.29. The R-factor of the final model is dependent on the quality of the starting electron density, and relatively independent of resolution.

  18. Constructing and Modifying Sequence Statistics for relevent Using informR in 𝖱

    PubMed Central

    Marcum, Christopher Steven; Butts, Carter T.

    2015-01-01

    The informR package greatly simplifies the analysis of complex event histories in 𝖱 by providing user friendly tools to build sufficient statistics for the relevent package. Historically, building sufficient statistics to model event sequences (of the form a→b) using the egocentric generalization of Butts’ (2008) relational event framework for modeling social action has been cumbersome. The informR package simplifies the construction of the complex list of arrays needed by the rem() model fitting for a variety of cases involving egocentric event data, multiple event types, and/or support constraints. This paper introduces these tools using examples from real data extracted from the American Time Use Survey. PMID:26185488

  19. A Statistical Texture Feature for Building Collapse Information Extraction of SAR Image

    NASA Astrophysics Data System (ADS)

    Li, L.; Yang, H.; Chen, Q.; Liu, X.

    2018-04-01

    Synthetic Aperture Radar (SAR) has become one of the most important ways to extract post-disaster collapsed building information, due to its extreme versatility and almost all-weather, day-and-night working capability. Because the inherent statistical distribution of speckle in SAR images has not previously been used to extract collapsed building information, this paper proposes a novel texture feature based on statistical models of SAR images to extract collapsed buildings. In the proposed feature, the texture parameter of the G0 distribution estimated from SAR images is used to reflect the uniformity of the target and thereby extract collapsed buildings. This feature not only considers the statistical distribution of SAR images, providing a more accurate description of the object texture, but can also be applied to extract collapsed building information from single-, dual- or full-polarization SAR data. RADARSAT-2 data of the Yushu earthquake, acquired on April 21, 2010, are used to present and analyze the performance of the proposed method. In addition, the applicability of this feature to SAR data with different polarizations is also analysed, which provides decision support for data selection in collapsed building information extraction.

  20. Incorporating GIS and remote sensing for census population disaggregation

    NASA Astrophysics Data System (ADS)

    Wu, Shuo-Sheng 'Derek'

    Census data are the primary source of demographic data for a variety of research and applications. For confidentiality reasons and administrative purposes, census data are usually released to the public by aggregated areal units. In the United States, the smallest census unit is the census block. Due to data aggregation, users of census data may have problems in visualizing population distribution within census blocks and estimating population counts for areas not coinciding with census block boundaries. The main purpose of this study is to develop methodology for estimating sub-block areal populations and assessing the estimation errors. The City of Austin, Texas, was used as a case study area. Based on tax parcel boundaries and parcel attributes derived from ancillary GIS and remote sensing data, detailed urban land use classes were first classified using a per-field approach. After that, statistical models by land use classes were built to infer population density from other predictor variables, including four census demographic statistics (the Hispanic percentage, the married percentage, the unemployment rate, and per capita income) and three physical variables derived from remote sensing images and building footprint vector data (a landscape heterogeneity statistic, a building pattern statistic, and a building volume statistic). In addition to statistical models, deterministic models were proposed to directly infer populations from building volumes and three housing statistics, including the average space per housing unit, the housing unit occupancy rate, and the average household size. After population models were derived or proposed, how well the models predict populations for another set of sample blocks was assessed. The results show that deterministic models were more accurate than statistical models. Further, by simulating the base unit for modeling by aggregating blocks, I assessed how well the deterministic models estimate sub-unit-level populations. I also assessed the aggregation effects and the rescaling effects on sub-unit estimates. Lastly, from another set of mixed-land-use sample blocks, a mixed-land-use model was derived and compared with a residential-land-use model. The results of per-field land use classification are satisfactory with a Kappa accuracy statistic of 0.747. Model assessments by land use show that population estimates for multi-family land use areas have higher errors than those for single-family land use areas, and population estimates for mixed land use areas have higher errors than those for residential land use areas. The assessments of sub-unit estimates using a simulation approach indicate that smaller areas show higher estimation errors, estimation errors do not relate to the base unit size, and rescaling improves all levels of sub-unit estimates.

  1. A regression-based approach to estimating retrofit savings using the Building Performance Database

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Walter, Travis; Sohn, Michael D.

    Retrofitting building systems is known to provide cost-effective energy savings. This article addresses how the Building Performance Database is used to help identify potential savings. Currently, prioritizing retrofits and computing their expected energy savings and cost/benefits can be a complicated, costly, and uncertain effort. Prioritizing retrofits for a portfolio of buildings can be even more difficult if the owner must determine different investment strategies for each of the buildings. Meanwhile, we are seeing greater availability of data on building energy use, characteristics, and equipment. These data provide opportunities for the development of algorithms that link building characteristics and retrofits empirically. In this paper we explore the potential of using such data for predicting the expected energy savings from equipment retrofits for a large number of buildings. We show that building data with statistical algorithms can provide savings estimates when detailed energy audits and physics-based simulations are not cost- or time-feasible. We develop a multivariate linear regression model with numerical predictors (e.g., operating hours, occupant density) and categorical indicator variables (e.g., climate zone, heating system type) to predict energy use intensity. The model quantifies the contribution of building characteristics and systems to energy use, and we use it to infer the expected savings when modifying particular equipment. We verify the model using residual analysis and cross-validation. We demonstrate the retrofit analysis by providing a probabilistic estimate of energy savings for several hypothetical building retrofits. We discuss the ways understanding the risk associated with retrofit investments can inform decision making. The contributions of this work are the development of a statistical model for estimating energy savings, its application to a large empirical building dataset, and a discussion of its use in informing building retrofit decisions.
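
    A hedged sketch of the regression idea described above: predict energy use intensity from numeric predictors and categorical indicators, then estimate a retrofit's savings by toggling the relevant indicator. The data, column names and the plain least-squares model are illustrative assumptions, not the paper's fitted model.

```python
# Sketch: EUI regression on numeric predictors and one-hot categorical indicators,
# then a retrofit savings estimate by toggling an indicator. Data are invented.
import pandas as pd
from sklearn.linear_model import LinearRegression

buildings = pd.DataFrame({
    "operating_hours": [50, 80, 60, 100, 45, 75, 90, 55],
    "occupant_density": [3.0, 5.5, 4.0, 6.0, 2.5, 5.0, 5.8, 3.2],
    "climate_zone": ["3C", "5A", "3C", "5A", "4B", "5A", "4B", "3C"],
    "heating_type": ["boiler", "furnace", "heat_pump", "boiler",
                     "furnace", "heat_pump", "boiler", "furnace"],
    "eui": [55, 92, 61, 105, 48, 70, 98, 58],   # kBtu/ft2-yr
})

X = pd.get_dummies(buildings.drop(columns="eui"), drop_first=True).astype(float)
model = LinearRegression().fit(X, buildings["eui"])

# Hypothetical retrofit: switch one building's heating system to a heat pump
# and compare predicted EUI before and after.
before = X.iloc[[0]].copy()
after = before.copy()
after[[c for c in X.columns if c.startswith("heating_type_")]] = 0.0
after["heating_type_heat_pump"] = 1.0
savings = model.predict(before)[0] - model.predict(after)[0]
print(f"predicted EUI reduction: {savings:.1f} kBtu/ft2-yr")
```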

  2. Predicting Energy Performance of a Net-Zero Energy Building: A Statistical Approach

    PubMed Central

    Kneifel, Joshua; Webb, David

    2016-01-01

    Performance-based building requirements have become more prevalent because they give freedom in building design while still maintaining or exceeding the energy performance required by prescriptive-based requirements. In order to determine if building designs reach target energy efficiency improvements, it is necessary to estimate the energy performance of a building using predictive models and different weather conditions. Physics-based whole building energy simulation modeling is the most common approach. However, these physics-based models include underlying assumptions and require significant amounts of information in order to specify the input parameter values. An alternative approach to test the performance of a building is to develop a statistically derived predictive regression model using post-occupancy data that can accurately predict energy consumption and production based on a few common weather-based factors, thus requiring less information than simulation models. A regression model based on measured data should be able to predict energy performance of a building for a given day as long as the weather conditions are similar to those during the data collection time frame. This article uses data from the National Institute of Standards and Technology (NIST) Net-Zero Energy Residential Test Facility (NZERTF) to develop and validate a regression model to predict the energy performance of the NZERTF using two weather variables aggregated to the daily level, applies the model to estimate the energy performance of hypothetical NZERTFs located in different cities in the Mixed-Humid climate zone, and compares these estimates to the results from already existing EnergyPlus whole building energy simulations. This regression model exhibits agreement with EnergyPlus predictive trends in energy production and net consumption, but differs greatly in energy consumption. The model can be used as a framework for alternative and more complex models based on the experimental data collected from the NZERTF. PMID:27956756

  3. Predicting Energy Performance of a Net-Zero Energy Building: A Statistical Approach.

    PubMed

    Kneifel, Joshua; Webb, David

    2016-09-01

    Performance-based building requirements have become more prevalent because they give freedom in building design while still maintaining or exceeding the energy performance required by prescriptive-based requirements. In order to determine if building designs reach target energy efficiency improvements, it is necessary to estimate the energy performance of a building using predictive models and different weather conditions. Physics-based whole building energy simulation modeling is the most common approach. However, these physics-based models include underlying assumptions and require significant amounts of information in order to specify the input parameter values. An alternative approach to test the performance of a building is to develop a statistically derived predictive regression model using post-occupancy data that can accurately predict energy consumption and production based on a few common weather-based factors, thus requiring less information than simulation models. A regression model based on measured data should be able to predict energy performance of a building for a given day as long as the weather conditions are similar to those during the data collection time frame. This article uses data from the National Institute of Standards and Technology (NIST) Net-Zero Energy Residential Test Facility (NZERTF) to develop and validate a regression model to predict the energy performance of the NZERTF using two weather variables aggregated to the daily level, applies the model to estimate the energy performance of hypothetical NZERTFs located in different cities in the Mixed-Humid climate zone, and compares these estimates to the results from already existing EnergyPlus whole building energy simulations. This regression model exhibits agreement with EnergyPlus predictive trends in energy production and net consumption, but differs greatly in energy consumption. The model can be used as a framework for alternative and more complex models based on the experimental data collected from the NZERTF.

  4. An Assessment of Actual and Potential Building Climate Zone Change and Variability From the Last 30 Years Through 2100 Using NASA's MERRA and CMIP5 Simulations

    NASA Technical Reports Server (NTRS)

    Stackhouse, Paul W., Jr.; Chandler, William S.; Hoell, James M.; Westberg, David; Zhang, Taiping

    2015-01-01

    Background: In the US, residential and commercial building infrastructure combined consumes about 40% of total energy usage and emits about 39% of total CO2 emission (DOE/EIA "Annual Energy Outlook 2013"). Building codes, as used by local and state enforcement entities are typically tied to the dominant climate within an enforcement jurisdiction classified according to various climate zones. These climate zones are based upon a 30-year average of local surface observations and are developed by DOE and ASHRAE. Establishing the current variability and potential changes to future building climate zones is very important for increasing the energy efficiency of buildings and reducing energy costs and emissions in the future. Objectives: This paper demonstrates the usefulness of using NASA's Modern Era Retrospective-analysis for Research and Applications (MERRA) atmospheric data assimilation to derive the DOE/ASHRAE building climate zone maps and then using MERRA to define the last 30 years of variability in climate zones for the Continental US. An atmospheric assimilation is a global atmospheric model optimized to satellite, atmospheric and surface in situ measurements. Using MERRA as a baseline, we then evaluate the latest Climate Model Inter-comparison Project (CMIP) climate model Version 5 runs to assess potential variability in future climate zones under various assumptions. Methods: We derive DOE/ASHRAE building climate zones using surface and temperature data products from MERRA. We assess these zones using the uncertainties derived by comparison to surface measurements. Using statistical tests, we evaluate variability of the climate zones in time and assess areas in the continental US for statistically significant trends by region. CMIP 5 produced a data base of over two dozen detailed climate model runs under various greenhouse gas forcing assumptions. We evaluate the variation in building climate zones for 3 different decades using an ensemble and quartile statistics to provide an assessment of potential building climate zone changes relative to the uncertainties demonstrated using MERRA. Findings and Conclusions: These results show that there is a statistically significant increase in the area covered by warmer climate zones and a tendency for a reduction of area in colder climate zones in some limited regions. The CMIP analysis shows that models vary from relatively little building climate zone change for the least sensitive and conservation assumptions to a warming of at most 3 zones for certain areas, particularly the north central US by the end of the 21st century.

  5. Air-flow distortion and turbulence statistics near an animal facility

    NASA Astrophysics Data System (ADS)

    Prueger, J. H.; Eichinger, W. E.; Hipps, L. E.; Hatfield, J. L.; Cooper, D. I.

    The emission and dispersion of particulates and gases from concentrated animal feeding operations (CAFO) at local to regional scales is a current issue in science and society. The transport of particulates, odors and toxic chemical species from the source into the local and eventually regional atmosphere is largely determined by turbulence. Any models that attempt to simulate the dispersion of particles must either specify or assume various statistical properties of the turbulence field. Statistical properties of turbulence are well documented for idealized boundary layers above uniform surfaces. However, an animal production facility is a complex surface with structures that act as bluff bodies that distort the turbulence intensity near the buildings. As a result, the initial release and subsequent dispersion of effluents in the region near a facility will be affected by the complex nature of the surface. Previous Lidar studies of plume dispersion over the facility used in this study indicated that plumes move in complex yet organized patterns that would not be explained by the properties of turbulence generally assumed in models. The objective of this study was to characterize the near-surface turbulence statistics in the flow field around an array of animal confinement buildings. Eddy covariance towers were erected upwind of, within and downwind of the building array. Substantial changes in turbulence intensity statistics and turbulent kinetic energy (TKE) were observed as the mean wind flow encountered the building structures. Spectral analysis demonstrated a unique distribution of spectral energy in the vertical profile above the buildings.

  6. Applying Regression Analysis to Problems in Institutional Research.

    ERIC Educational Resources Information Center

    Bohannon, Tom R.

    1988-01-01

    Regression analysis is one of the most frequently used statistical techniques in institutional research. Principles of least squares, model building, residual analysis, influence statistics, and multi-collinearity are described and illustrated. (Author/MSE)

  7. Pre-Service Mathematics Teachers' Use of Probability Models in Making Informal Inferences about a Chance Game

    ERIC Educational Resources Information Center

    Kazak, Sibel; Pratt, Dave

    2017-01-01

    This study considers probability models as tools for both making informal statistical inferences and building stronger conceptual connections between data and chance topics in teaching statistics. In this paper, we aim to explore pre-service mathematics teachers' use of probability models for a chance game, where the sum of two dice matters in…

  8. A new multiple regression model to identify multi-family houses with a high prevalence of sick building symptoms "SBS", within the healthy sustainable house study in Stockholm (3H).

    PubMed

    Engvall, Karin; Hult, M; Corner, R; Lampa, E; Norbäck, D; Emenius, G

    2010-01-01

    The aim was to develop a new model to identify residential buildings with higher frequencies of "SBS" than expected, "risk buildings". In 2005, 481 multi-family buildings with 10,506 dwellings in Stockholm were studied using a new stratified random sampling design. A standardised self-administered questionnaire was used to assess "SBS", atopy and personal factors. The response rate was 73%. Statistical analysis was performed by multiple logistic regression. Dwellers owning their building reported less "SBS" than those renting. There was a strong relationship between socio-economic factors and ownership. The regression model ended up with high explanatory values for age, gender, atopy and ownership. Applying our model, 9% of all residential buildings in Stockholm were classified as "risk buildings", with the highest proportion in houses built 1961-1975 (26%) and the lowest in houses built 1985-1990 (4%). To identify "risk buildings", it is necessary to adjust for ownership and population characteristics.
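
    As a hedged illustration of the modelling strategy described above, the sketch below fits a logistic regression of individual "SBS" reports on personal factors and ownership, then flags buildings whose observed prevalence exceeds the model's expectation; all data and the 1.5x flagging rule are simulated assumptions, not the study's fitted model.

```python
# Sketch: logistic regression of individual SBS reports on personal factors and
# ownership, then per-building comparison of observed vs expected prevalence.
# All data are simulated.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(4)
n = 2000
dwellers = pd.DataFrame({
    "age": rng.integers(20, 85, n),
    "female": rng.integers(0, 2, n),
    "atopy": rng.integers(0, 2, n),
    "owns_dwelling": rng.integers(0, 2, n),
    "building_id": rng.integers(0, 50, n),
})
logit = -2.0 + 0.01 * dwellers["age"] + 0.4 * dwellers["atopy"] - 0.3 * dwellers["owns_dwelling"]
dwellers["sbs"] = rng.random(n) < 1 / (1 + np.exp(-logit))

X = dwellers[["age", "female", "atopy", "owns_dwelling"]]
model = LogisticRegression(max_iter=1000).fit(X, dwellers["sbs"])
dwellers["expected"] = model.predict_proba(X)[:, 1]

per_building = dwellers.groupby("building_id").agg(observed=("sbs", "mean"),
                                                   expected=("expected", "mean"))
risk_buildings = per_building[per_building["observed"] > 1.5 * per_building["expected"]]
print(f"{len(risk_buildings)} of {len(per_building)} buildings flagged as 'risk buildings'")
```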

  9. Organism-level models: When mechanisms and statistics fail us

    NASA Astrophysics Data System (ADS)

    Phillips, M. H.; Meyer, J.; Smith, W. P.; Rockhill, J. K.

    2014-03-01

    Purpose: To describe the unique characteristics of models that represent the entire course of radiation therapy at the organism level and to highlight the uses to which such models can be put. Methods: At the level of an organism, traditional model-building runs into severe difficulties. We do not have sufficient knowledge to devise a complete biochemistry-based model. Statistical model-building fails due to the vast number of variables and the inability to control many of them in any meaningful way. Finally, building surrogate models, such as animal-based models, can result in excluding some of the most critical variables. Bayesian probabilistic models (Bayesian networks) provide a useful alternative that has the advantages of being mathematically rigorous, incorporating the knowledge that we do have, and being practical. Results: Bayesian networks representing radiation therapy pathways for prostate cancer and head & neck cancer were used to highlight the important aspects of such models and some techniques of model-building. A more specific model representing the treatment of occult lymph nodes in head & neck cancer was provided as an example of how such a model can inform clinical decisions. A model of the possible role of PET imaging in brain cancer was used to illustrate the means by which clinical trials can be modelled in order to come up with a trial design that will have meaningful outcomes. Conclusions: Probabilistic models are currently the most useful approach to representing the entire therapy outcome process.

  10. Physics-based statistical model and simulation method of RF propagation in urban environments

    DOEpatents

    Pao, Hsueh-Yuan; Dvorak, Steven L.

    2010-09-14

    A physics-based statistical model and simulation/modeling method and system of electromagnetic wave propagation (wireless communication) in urban environments. In particular, the model is a computationally efficient closed-form parametric model of RF propagation in an urban environment which is extracted from a physics-based statistical wireless channel simulation method and system. The simulation divides the complex urban environment into a network of interconnected urban canyon waveguides which can be analyzed individually; calculates spectral coefficients of modal fields in the waveguides excited by the propagation using a database of statistical impedance boundary conditions which incorporates the complexity of building walls in the propagation model; determines statistical parameters of the calculated modal fields; and determines a parametric propagation model based on the statistical parameters of the calculated modal fields from which predictions of communications capability may be made.

  11. A Statistical Analysis of the Economic Drivers of Battery Energy Storage in Commercial Buildings: Preprint

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Long, Matthew; Simpkins, Travis; Cutler, Dylan

    There is significant interest in using battery energy storage systems (BESS) to reduce peak demand charges, and therefore the life cycle cost of electricity, in commercial buildings. This paper explores the drivers of economic viability of BESS in commercial buildings through statistical analysis. A sample population of buildings was generated, a techno-economic optimization model was used to size and dispatch the BESS, and the resulting optimal BESS sizes were analyzed for relevant predictor variables. Explanatory regression analyses were used to demonstrate that peak demand charges are the most significant predictor of an economically viable battery, and that the shape of the load profile is the most significant predictor of the size of the battery.

  12. A Statistical Analysis of the Economic Drivers of Battery Energy Storage in Commercial Buildings

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Long, Matthew; Simpkins, Travis; Cutler, Dylan

    There is significant interest in using battery energy storage systems (BESS) to reduce peak demand charges, and therefore the life cycle cost of electricity, in commercial buildings. This paper explores the drivers of economic viability of BESS in commercial buildings through statistical analysis. A sample population of buildings was generated, a techno-economic optimization model was used to size and dispatch the BESS, and the resulting optimal BESS sizes were analyzed for relevant predictor variables. Explanatory regression analyses were used to demonstrate that peak demand charges are the most significant predictor of an economically viable battery, and that the shape of the load profile is the most significant predictor of the size of the battery.

  13. Population activity statistics dissect subthreshold and spiking variability in V1.

    PubMed

    Bányai, Mihály; Koman, Zsombor; Orbán, Gergő

    2017-07-01

    Response variability, as measured by fluctuating responses upon repeated performance of trials, is a major component of neural responses, and its characterization is key to interpret high dimensional population recordings. Response variability and covariability display predictable changes upon changes in stimulus and cognitive or behavioral state, providing an opportunity to test the predictive power of models of neural variability. Still, there is little agreement on which model to use as a building block for population-level analyses, and models of variability are often treated as a subject of choice. We investigate two competing models, the doubly stochastic Poisson (DSP) model assuming stochasticity at spike generation, and the rectified Gaussian (RG) model tracing variability back to membrane potential variance, to analyze stimulus-dependent modulation of both single-neuron and pairwise response statistics. Using a pair of model neurons, we demonstrate that the two models predict similar single-cell statistics. However, DSP and RG models have contradicting predictions on the joint statistics of spiking responses. To test the models against data, we build a population model to simulate stimulus change-related modulations in pairwise response statistics. We use single-unit data from the primary visual cortex (V1) of monkeys to show that while model predictions for variance are qualitatively similar to experimental data, only the RG model's predictions are compatible with joint statistics. These results suggest that models using Poisson-like variability might fail to capture important properties of response statistics. We argue that membrane potential-level modeling of stochasticity provides an efficient strategy to model correlations. NEW & NOTEWORTHY Neural variability and covariability are puzzling aspects of cortical computations. For efficient decoding and prediction, models of information encoding in neural populations hinge on an appropriate model of variability. Our work shows that stimulus-dependent changes in pairwise but not in single-cell statistics can differentiate between two widely used models of neuronal variability. Contrasting model predictions with neuronal data provides hints on the noise sources in spiking and provides constraints on statistical models of population activity. Copyright © 2017 the American Physiological Society.
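
    A minimal simulation sketch contrasting the two generative schemes named above for a pair of neurons; the parameter values, the gamma-distributed gain and the rectified-Gaussian construction are illustrative assumptions, not the authors' fitted models.

```python
# Sketch: a doubly stochastic Poisson (DSP) scheme with a shared stochastic rate
# gain, versus a rectified-Gaussian (RG) scheme where correlated "membrane
# potentials" are rectified into rates. Parameter values are arbitrary.
import numpy as np

rng = np.random.default_rng(5)
trials, base_rate = 5000, 10.0

# DSP: rates fluctuate via a shared positive gain, spikes are Poisson given the rate.
gain = rng.gamma(shape=4.0, scale=0.25, size=trials)            # mean-1 gain per trial
dsp_counts = rng.poisson(base_rate * gain[:, None], size=(trials, 2))

# RG: correlated Gaussian potentials, rectified to rates, then Poisson spiking.
cov = np.array([[1.0, 0.6], [0.6, 1.0]])
potentials = rng.multivariate_normal(mean=[1.0, 1.0], cov=cov, size=trials)
rg_counts = rng.poisson(base_rate * np.clip(potentials, 0.0, None))

for name, counts in [("DSP", dsp_counts), ("RG", rg_counts)]:
    corr = np.corrcoef(counts[:, 0], counts[:, 1])[0, 1]
    print(f"{name}: mean count {counts.mean():.1f}, "
          f"Fano {counts.var() / counts.mean():.2f}, pairwise corr {corr:.2f}")
```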

  14. Spatial-temporal analysis of building surface temperatures in Hung Hom

    NASA Astrophysics Data System (ADS)

    Zeng, Ying; Shen, Yueqian

    2015-12-01

    This thesis presents a study on the spatial-temporal analysis of building surface temperatures in Hung Hom. Observations were collected from Aug 2013 to Oct 2013 at a 30-min interval, using iButton sensors (N=20) covering twelve locations in Hung Hom, and thermal images were captured at PolyU from 05 Aug 2013 to 06 Aug 2013. A linear regression model of iButton and thermal records is established to calibrate the temperature data. A 3D modeling system is developed on the Visual Studio 2010 development platform, using the ArcEngine 10.0 component, a Microsoft Access 2010 database and the C# programming language. The system supports data processing, spatial analysis, compound queries, 3D face temperature rendering, and so on. Statistical analyses show that building face azimuths have a statistically significant relationship with sun azimuths at peak time, and that seasonal building temperature changes also correspond to variations in sun angle and sun azimuth. Building materials are found to have a significant effect on building surface temperatures: buildings with lower-albedo materials tend to have higher temperatures, and materials with larger thermal conductivity show significant diurnal variations. Regarding geographical location, the peripheral faces of the campus have higher temperatures than the inner faces during the daytime, and buildings located in the southeast are cooler than those in the west. Furthermore, human activity is found to have a strong relationship with building surface temperatures through a weekday and weekend comparison.
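
    A hedged sketch of the calibration step mentioned above: a linear regression mapping thermal-image readings onto co-located iButton temperatures. The paired readings are invented.

```python
# Sketch: linear calibration of thermal-camera readings against iButton sensors.
# The paired temperature readings below are invented.
import numpy as np

thermal = np.array([29.1, 31.4, 33.0, 35.2, 36.8, 38.5])   # degC from thermal camera
ibutton = np.array([28.4, 30.9, 32.1, 34.6, 36.0, 37.9])   # degC from iButton sensor

slope, intercept = np.polyfit(thermal, ibutton, 1)
calibrated = slope * thermal + intercept
rmse = np.sqrt(np.mean((calibrated - ibutton) ** 2))
print(f"T_calibrated = {slope:.3f} * T_thermal + {intercept:.2f}  (RMSE {rmse:.2f} degC)")
```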

  15. On entropy, financial markets and minority games

    NASA Astrophysics Data System (ADS)

    Zapart, Christopher A.

    2009-04-01

    The paper builds upon an earlier statistical analysis of financial time series with Shannon information entropy, published in [L. Molgedey, W. Ebeling, Local order, entropy and predictability of financial time series, European Physical Journal B-Condensed Matter and Complex Systems 15/4 (2000) 733-737]. A novel generic procedure is proposed for making multistep-ahead predictions of time series by building a statistical model of entropy. The approach is first demonstrated on the chaotic Mackey-Glass time series and later applied to Japanese Yen/US dollar intraday currency data. The paper also reinterprets Minority Games [E. Moro, The minority game: An introductory guide, Advances in Condensed Matter and Statistical Physics (2004)] within the context of physical entropy, and uses models derived from minority game theory as a tool for measuring the entropy of a model in response to time series. This entropy conditional upon a model is subsequently used in place of information-theoretic entropy in the proposed multistep prediction algorithm.

  16. Development of algorithms for building inventory compilation through remote sensing and statistical inferencing

    NASA Astrophysics Data System (ADS)

    Sarabandi, Pooya

    Building inventories are one of the core components of disaster vulnerability and loss estimation models, and as such, play a key role in providing decision support for risk assessment, disaster management and emergency response efforts. In many parts of the world, inclusive building inventories suitable for use in catastrophe models cannot be found. Furthermore, there are serious shortcomings in the existing building inventories, including incomplete or out-dated information on critical attributes as well as missing or erroneous attribute values. In this dissertation, a set of methodologies for updating the spatial and geometric information of buildings from single and multiple high-resolution optical satellite images is presented. Basic concepts, terminologies and fundamentals of 3-D terrain modeling from satellite images are first introduced. Different sensor projection models are then presented and sources of optical noise such as lens distortions are discussed. An algorithm for extracting height and creating 3-D building models from a single high-resolution satellite image is formulated. The proposed algorithm is a semi-automated supervised method capable of extracting attributes such as longitude, latitude, height, square footage, perimeter, irregularity index, etc. The errors associated with the interactive nature of the algorithm are quantified, and solutions for minimizing the human-induced errors are proposed. The height extraction algorithm is validated against independent survey data and results are presented. The validation results show that an average height modeling accuracy of 1.5% can be achieved using this algorithm. Furthermore, the concept of cross-sensor data fusion for 3-D scene reconstruction using quasi-stereo images is developed in this dissertation. The developed algorithm utilizes two or more single satellite images acquired from different sensors and provides the means to construct 3-D building models more economically. A terrain-dependent search algorithm is formulated to facilitate the search for correspondences in a quasi-stereo pair of images. The calculated heights for sample buildings using the cross-sensor data fusion algorithm show an average coefficient of variation of 1.03%. In order to infer structural type and occupancy type, i.e. the engineering attributes, of buildings from the spatial and geometric attributes of 3-D models, a statistical data analysis framework is formulated. Applications of "Classification Trees" and "Multinomial Logistic Models" in modeling the marginal probabilities of class membership of the engineering attributes are investigated. Adaptive statistical models that incorporate different spatial and geometric attributes of buildings while inferring the engineering attributes are developed in this dissertation. The inferred engineering attributes, in conjunction with the spatial and geometric attributes derived from the imagery, can be used to augment regional building inventories and therefore enhance the results of catastrophe models. In the last part of the dissertation, a set of empirically derived motion-damage relationships based on the correlation of observed building performance with measured ground-motion parameters from the 1994 Northridge and 1999 Chi-Chi, Taiwan earthquakes is developed. Fragility functions in the form of cumulative lognormal distributions and damage probability matrices for several classes of buildings (wood, steel and concrete), as well as a number of ground-motion intensity measures, are developed and compared to currently used motion-damage relationships.
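
    A hedged sketch of the fragility-function form named above, a cumulative lognormal in the ground-motion intensity measure; the median and dispersion values are placeholders, not the dissertation's fitted parameters.

```python
# Sketch: a fragility curve as a cumulative lognormal,
# P(damage >= state | IM = x) = Phi((ln x - ln theta) / beta).
# The median theta and dispersion beta below are placeholders, not fitted values.
import numpy as np
from scipy.stats import norm

def fragility(im, theta, beta):
    """Probability of reaching or exceeding a damage state at intensity measure im."""
    return norm.cdf((np.log(im) - np.log(theta)) / beta)

pga = np.array([0.1, 0.2, 0.4, 0.8])          # peak ground acceleration, g
print(np.round(fragility(pga, theta=0.35, beta=0.5), 3))
```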

  17. Advanced building energy management system demonstration for Department of Defense buildings.

    PubMed

    O'Neill, Zheng; Bailey, Trevor; Dong, Bing; Shashanka, Madhusudana; Luo, Dong

    2013-08-01

    This paper presents an advanced building energy management system (aBEMS) that employs advanced methods of whole-building performance monitoring combined with statistical methods of learning and data analysis to enable identification of both gradual and discrete performance erosion and faults. This system assimilated data collected from multiple sources, including blueprints, reduced-order models (ROM) and measurements, and employed advanced statistical learning algorithms to identify patterns of anomalies. The results were presented graphically in a manner understandable to facilities managers. A demonstration of aBEMS was conducted in buildings at Naval Station Great Lakes. The facility building management systems were extended to incorporate the energy diagnostics and analysis algorithms, producing systematic identification of more efficient operation strategies. At Naval Station Great Lakes, greater than 20% savings were demonstrated for building energy consumption by improving facility manager decision support to diagnose energy faults and prioritize alternative, energy-efficient operation strategies. The paper concludes with recommendations for widespread aBEMS success. © 2013 New York Academy of Sciences.

  18. Analysis of 3d Building Models Accuracy Based on the Airborne Laser Scanning Point Clouds

    NASA Astrophysics Data System (ADS)

    Ostrowski, W.; Pilarska, M.; Charyton, J.; Bakuła, K.

    2018-05-01

    Creating 3D building models at a large scale is becoming more popular and finds many applications. Nowadays, the broad term "3D building models" can be applied to several types of products: the well-known CityGML solid models (available at a few Levels of Detail), which are mainly generated from Airborne Laser Scanning (ALS) data, as well as 3D mesh models that can be created from both nadir and oblique aerial images. City authorities and national mapping agencies are interested in obtaining 3D building models. Apart from the completeness of the models, the accuracy aspect is also important. The final accuracy of a building model depends on various factors (accuracy of the source data, complexity of the roof shapes, etc.). In this paper, a methodology for inspecting a dataset containing 3D models is presented. The proposed approach checks every building in the dataset against ALS point clouds, testing both accuracy and level of detail. Analysis of statistical parameters of the normal heights of the reference point cloud with respect to the tested planes, combined with segmentation of the point cloud, provides a tool that can indicate which buildings and which roof planes do not fulfil the requirements of model accuracy and detail correctness. The proposed method was tested on two datasets: a solid model and a mesh model.
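
    A minimal sketch of the per-plane accuracy check described above: signed distances of ALS points to a modelled roof plane, with the plane flagged when the residual statistics exceed a tolerance. The plane, the points and the 0.10 m tolerance are invented for illustration.

```python
# Sketch: residuals of ALS returns against one modelled roof plane, flagged when
# the RMSE exceeds an assumed accuracy requirement. All values are invented.
import numpy as np

# Roof plane a*x + b*y + c*z + d = 0 from the 3D building model (unit normal assumed).
normal = np.array([0.0, 0.0, 1.0])
d = -10.0                                     # plane z = 10 m

rng = np.random.default_rng(6)
points = rng.normal([5.0, 5.0, 10.05], [2.0, 2.0, 0.08], size=(500, 3))  # ALS returns

residuals = points @ normal + d               # signed point-to-plane distances
rmse = np.sqrt(np.mean(residuals ** 2))
print(f"mean offset {residuals.mean():+.3f} m, RMSE {rmse:.3f} m")
if rmse > 0.10:                               # assumed accuracy requirement
    print("plane flagged: model does not meet the accuracy requirement")
```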

  19. A Comparison of Two Balance Calibration Model Building Methods

    NASA Technical Reports Server (NTRS)

    DeLoach, Richard; Ulbrich, Norbert

    2007-01-01

    Simulated strain-gage balance calibration data is used to compare the accuracy of two balance calibration model building methods for different noise environments and calibration experiment designs. The first building method obtains a math model for the analysis of balance calibration data after applying a candidate math model search algorithm to the calibration data set. The second building method uses stepwise regression analysis in order to construct a model for the analysis. Four balance calibration data sets were simulated in order to compare the accuracy of the two math model building methods. The simulated data sets were prepared using the traditional One Factor At a Time (OFAT) technique and the Modern Design of Experiments (MDOE) approach. Random and systematic errors were introduced in the simulated calibration data sets in order to study their influence on the math model building methods. Residuals of the fitted calibration responses and other statistical metrics were compared in order to evaluate the calibration models developed with different combinations of noise environment, experiment design, and model building method. Overall, predicted math models and residuals of both math model building methods show very good agreement. Significant differences in model quality were attributable to noise environment, experiment design, and their interaction. Generally, the addition of systematic error significantly degraded the quality of calibration models developed from OFAT data by either method, but MDOE experiment designs were more robust with respect to the introduction of a systematic component of the unexplained variance.

  20. Hybrid Model-Based and Data-Driven Fault Detection and Diagnostics for Commercial Buildings

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Frank, Stephen; Heaney, Michael; Jin, Xin

    Commercial buildings often experience faults that produce undesirable behavior in building systems. Building faults waste energy, decrease occupants' comfort, and increase operating costs. Automated fault detection and diagnosis (FDD) tools for buildings help building owners discover and identify the root causes of faults in building systems, equipment, and controls. Proper implementation of FDD has the potential to simultaneously improve comfort, reduce energy use, and narrow the gap between actual and optimal building performance. However, conventional rule-based FDD requires expensive instrumentation and valuable engineering labor, which limit deployment opportunities. This paper presents a hybrid, automated FDD approach that combines building energy models and statistical learning tools to detect and diagnose faults noninvasively, using minimal sensors, with little customization. We compare and contrast the performance of several hybrid FDD algorithms for a small security building. Our results indicate that the algorithms can detect and diagnose several common faults, but more work is required to reduce false positive rates and improve diagnosis accuracy.
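
    The general idea of model-based residual fault detection, one ingredient of such hybrid FDD schemes, can be sketched as follows: compare measured energy use against a model prediction and flag sustained deviations. This is not the paper's algorithm; the window, threshold, and toy fault are assumptions for illustration.

```python
import numpy as np

def detect_faults(measured, predicted, z_thresh=3.0, window=24):
    """Flag hours where the rolling mean of the residual (measured - predicted)
    exceeds z_thresh standard deviations of the baseline residuals."""
    resid = measured - predicted
    sigma = resid[:window * 7].std(ddof=1)     # baseline residual spread (first week)
    kernel = np.ones(window) / window
    rolling = np.convolve(resid, kernel, mode="same")
    return np.where(np.abs(rolling) > z_thresh * sigma)[0]

# toy data: hourly energy with a simulated stuck-damper fault after hour 500
rng = np.random.default_rng(2)
hours = np.arange(1000)
predicted = 50 + 10 * np.sin(2 * np.pi * hours / 24)
measured = predicted + rng.normal(0, 2, hours.size)
measured[500:] += 8                            # fault adds a constant 8 kWh/h
fault_hours = detect_faults(measured, predicted)
print(fault_hours[:5], "...", fault_hours[-1])
```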

  1. Hybrid Model-Based and Data-Driven Fault Detection and Diagnostics for Commercial Buildings: Preprint

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Frank, Stephen; Heaney, Michael; Jin, Xin

    Commercial buildings often experience faults that produce undesirable behavior in building systems. Building faults waste energy, decrease occupants' comfort, and increase operating costs. Automated fault detection and diagnosis (FDD) tools for buildings help building owners discover and identify the root causes of faults in building systems, equipment, and controls. Proper implementation of FDD has the potential to simultaneously improve comfort, reduce energy use, and narrow the gap between actual and optimal building performance. However, conventional rule-based FDD requires expensive instrumentation and valuable engineering labor, which limit deployment opportunities. This paper presents a hybrid, automated FDD approach that combines building energy models and statistical learning tools to detect and diagnose faults noninvasively, using minimal sensors, with little customization. We compare and contrast the performance of several hybrid FDD algorithms for a small security building. Our results indicate that the algorithms can detect and diagnose several common faults, but more work is required to reduce false positive rates and improve diagnosis accuracy.

  2. Review of Methods for Buildings Energy Performance Modelling

    NASA Astrophysics Data System (ADS)

    Krstić, Hrvoje; Teni, Mihaela

    2017-10-01

    Research presented in this paper gives a brief review of methods used for modelling buildings' energy performance. The paper also provides a comprehensive review of the advantages and disadvantages of the available methods, as well as of the input parameters used for modelling building energy performance. The European EPBD Directive obliges the implementation of an energy certification procedure, which gives insight into building energy performance via the existing energy certificate databases. Some of the methods for building energy performance modelling mentioned in this paper were developed using data sets of buildings that have already undergone an energy certification procedure. Such a database is used in this paper; the majority of buildings in it have already undergone some form of partial retrofitting, such as replacement of windows or installation of thermal insulation, but still have poor energy performance. The case study presented here utilizes an energy certificate database of residential units in Croatia (over 400 buildings) in order to determine the dependence between building energy performance and the database variables using statistical dependence tests. Building energy performance in the database is expressed as an energy efficiency rating (from A+ to G), based on specific annual energy needs for heating for reference climatic data [kWh/(m2a)]. The independent variables in the database are the surfaces and volume of the conditioned part of the building, building shape factor, energy used for heating, CO2 emission, building age, and year of reconstruction. The results give an insight into the possibilities of the methods used for building energy performance modelling, together with an analysis of the dependencies between building energy performance as the dependent variable and the independent variables from the database. The presented results could be used for the development of a new building energy performance predictive model.
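
    A small illustration, on synthetic data, of the kind of statistical dependence tests mentioned: a Spearman rank correlation between a continuous building variable and the ordinal efficiency rating, and a chi-square test of independence on a binned contingency table. The chosen variable, the encoding of ratings, and the data are assumptions; the paper does not specify which tests it applied.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# synthetic stand-in for the certificate database: shape factor vs. rating A+..G
n = 400
shape_factor = rng.uniform(0.3, 1.2, n)
# ordinal rating encoded 0 (A+) .. 7 (G); worse ratings loosely tied to shape factor
rating = np.clip((shape_factor * 5 + rng.normal(0, 1, n)).astype(int), 0, 7)

# rank correlation between a continuous predictor and the ordinal rating
rho, p_rho = stats.spearmanr(shape_factor, rating)
print(f"Spearman rho = {rho:.2f}, p = {p_rho:.3g}")

# chi-square test of independence on a binned contingency table
sf_bins = np.digitize(shape_factor, np.quantile(shape_factor, [0.25, 0.5, 0.75]))
table = np.zeros((4, 8))
for i, j in zip(sf_bins, rating):
    table[i, j] += 1
# small constant keeps expected counts positive in sparse cells
chi2, p_chi2, dof, _ = stats.chi2_contingency(table + 1e-9)
print(f"chi2 = {chi2:.1f}, dof = {dof}, p = {p_chi2:.3g}")
```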

  3. Modelling of Rail Vehicles and Track for Calculation of Ground-Vibration Transmission Into Buildings

    NASA Astrophysics Data System (ADS)

    Hunt, H. E. M.

    1996-05-01

    A methodology for the calculation of vibration transmission from railways into buildings is presented. The method permits existing models of railway vehicles and track to be incorporated, and it has application to any model of vibration transmission through the ground. Special attention is paid to the relative phasing between adjacent axle-force inputs to the rail, so that vibration transmission may be calculated as a random process. The vehicle-track model is used in conjunction with a building model of infinite length. The track and building are infinite and parallel to each other, and the applied forces are statistically stationary in space, so that vibration levels at any two points along the building are the same. The methodology is two-dimensional for the purpose of application of random process theory, but fully three-dimensional for calculation of vibration transmission from the track and through the ground into the foundations of the building. The computational efficiency of the method will interest engineers faced with the task of reducing vibration levels in buildings. It is possible to assess the relative merits of using rail pads, under-sleeper pads, ballast mats, floating-slab track or base isolation for particular applications.

  4. Development of hazard-compatible building fragility and vulnerability models

    USGS Publications Warehouse

    Karaca, E.; Luco, N.

    2008-01-01

    We present a methodology for transforming the structural and non-structural fragility functions in HAZUS into a format that is compatible with conventional seismic hazard analysis information. The methodology makes use of the building capacity (or pushover) curves and related building parameters provided in HAZUS. Instead of the capacity spectrum method applied in HAZUS, building response is estimated by inelastic response history analysis of corresponding single-degree-of-freedom systems under a large number of earthquake records. Statistics of the building response are used with the damage state definitions from HAZUS to derive fragility models conditioned on spectral acceleration values. Using the developed fragility models for structural and nonstructural building components, with corresponding damage state loss ratios from HAZUS, we also derive building vulnerability models relating spectral acceleration to repair costs. Whereas in HAZUS the structural and nonstructural damage states are treated as if they are independent, our vulnerability models are derived assuming "complete" nonstructural damage whenever the structural damage state is complete. We show the effects of considering this dependence on the final vulnerability models. The use of spectral acceleration (at selected vibration periods) as the ground motion intensity parameter, coupled with the careful treatment of uncertainty, makes the new fragility and vulnerability models compatible with conventional seismic hazard curves and hence useful for extensions to probabilistic damage and loss assessment.
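
    A common way to turn binary exceedance outcomes from response-history analyses into a fragility curve conditioned on spectral acceleration is a lognormal model fitted by maximum likelihood; the sketch below shows that generic step on synthetic data. The true median and dispersion used to simulate the data are arbitrary, and the study's exact fitting procedure may differ.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def fit_lognormal_fragility(sa, exceeded):
    """MLE fit of P(exceed | Sa) = Phi((ln Sa - ln theta) / beta).

    sa       : spectral accelerations used in the response-history analyses (g)
    exceeded : 1 if the SDOF response exceeded the damage-state threshold, else 0
    """
    def neg_log_like(params):
        ln_theta, ln_beta = params
        beta = np.exp(ln_beta)
        p = norm.cdf((np.log(sa) - ln_theta) / beta)
        p = np.clip(p, 1e-12, 1 - 1e-12)
        return -np.sum(exceeded * np.log(p) + (1 - exceeded) * np.log(1 - p))

    res = minimize(neg_log_like, x0=[np.log(np.median(sa)), np.log(0.4)])
    return np.exp(res.x[0]), np.exp(res.x[1])   # median theta (g) and dispersion beta

# synthetic analysis results: true median 0.6 g, dispersion 0.45
rng = np.random.default_rng(4)
sa = rng.uniform(0.05, 2.0, 300)
p_true = norm.cdf((np.log(sa) - np.log(0.6)) / 0.45)
exceeded = (rng.uniform(size=300) < p_true).astype(float)
theta, beta = fit_lognormal_fragility(sa, exceeded)
print(f"median = {theta:.2f} g, dispersion = {beta:.2f}")
```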

  5. Localized Smart-Interpretation

    NASA Astrophysics Data System (ADS)

    Lundh Gulbrandsen, Mats; Mejer Hansen, Thomas; Bach, Torben; Pallesen, Tom

    2014-05-01

    The complex task of setting up a geological model consists not only of combining available geological information into a conceptually plausible model, but also requires consistency with available data, e.g. geophysical data. However, in many cases the direct geological information, e.g. borehole samples, is very sparse, so in order to create a geological model the geologist needs to rely on the geophysical data. The problem, however, is that the amount of geophysical data is in many cases so vast that it is practically impossible to integrate all of it in the manual interpretation process. This means that a lot of the information available from the geophysical surveys is unexploited, which is a problem because the resulting geological model does not fulfill its full potential and hence is less trustworthy. We suggest an approach to geological modeling that 1. allows all geophysical data to be considered when building the geological model, 2. is fast, and 3. allows quantification of the geological modeling. The method is constructed to build a statistical model, f(d,m), describing the relation between what the geologist interprets, d, and what the geologist knows, m. The parameter m reflects any available information that can be quantified, such as geophysical data, the result of a geophysical inversion, elevation maps, etc. The parameter d reflects an actual interpretation, such as the depth to the base of a groundwater reservoir. First we infer a statistical model f(d,m) by examining sets of actual interpretations made by a geological expert, [d1, d2, ...], and the information used to perform the interpretation, [m1, m2, ...]. This makes it possible to quantify, through f(d,m), how the geological expert performs interpolation. As the geological expert proceeds with interpreting, the number of interpreted data points from which the statistical model is inferred increases, and therefore the accuracy of the statistical model increases. When a model f(d,m) has successfully been inferred, we are able to simulate how the geological expert would perform an interpretation given some external information m, through f(d|m). We demonstrate this method applied to geological interpretation and densely sampled airborne electromagnetic data. In short, our goal is to build a statistical model describing how a geological expert performs geological interpretation given some geophysical data. We then wish to use this statistical model to perform semi-automatic interpretation, everywhere such geophysical data exist, in a manner consistent with the choices made by a geological expert. The benefits of such a statistical model are that 1. it provides a quantification of how a geological expert performs interpretation based on available diverse data, 2. all available geophysical information can be used, and 3. it allows much faster interpretation of large data sets.

  6. Statistical validation of normal tissue complication probability models.

    PubMed

    Xu, Cheng-Jian; van der Schaaf, Arjen; Van't Veld, Aart A; Langendijk, Johannes A; Schilstra, Cornelis

    2012-09-01

    To investigate the applicability and value of double cross-validation and permutation tests as established statistical approaches in the validation of normal tissue complication probability (NTCP) models. A penalized regression method, LASSO (least absolute shrinkage and selection operator), was used to build NTCP models for xerostomia after radiation therapy treatment of head-and-neck cancer. Model assessment was based on the likelihood function and the area under the receiver operating characteristic curve. Repeated double cross-validation showed the uncertainty and instability of the NTCP models and indicated that the statistical significance of model performance can be obtained by permutation testing. Repeated double cross-validation and permutation tests are recommended to validate NTCP models before clinical use. Copyright © 2012 Elsevier Inc. All rights reserved.
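
    A simplified sketch of the validation strategy on synthetic data: an L1-penalized (LASSO-type) logistic model, repeated cross-validated AUC, and a permutation test of that AUC. For brevity this uses single-level repeated cross-validation rather than full double (nested) cross-validation, and the data below are not an NTCP dataset.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, RepeatedStratifiedKFold

rng = np.random.default_rng(5)
X = rng.normal(size=(150, 20))      # candidate dosimetric/clinical predictors
y = (X[:, 0] - 0.8 * X[:, 3] + rng.normal(0, 1, 150) > 0).astype(int)

lasso = LogisticRegression(penalty="l1", solver="liblinear", C=0.5)
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=10, random_state=0)
auc = cross_val_score(lasso, X, y, cv=cv, scoring="roc_auc").mean()

# permutation test: refit on label-shuffled data to get the null AUC distribution
null_auc = [cross_val_score(lasso, X, rng.permutation(y), cv=5, scoring="roc_auc").mean()
            for _ in range(100)]
p_value = (np.sum(np.array(null_auc) >= auc) + 1) / (len(null_auc) + 1)
print(f"cross-validated AUC = {auc:.2f}, permutation p = {p_value:.3f}")
```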

  7. Encoding Dissimilarity Data for Statistical Model Building.

    PubMed

    Wahba, Grace

    2010-12-01

    We summarize, review and comment upon three papers which discuss the use of discrete, noisy, incomplete, scattered pairwise dissimilarity data in statistical model building. Convex cone optimization codes are used to embed the objects into a Euclidean space which respects the dissimilarity information while controlling the dimension of the space. A "newbie" algorithm is provided for embedding new objects into this space. This allows the dissimilarity information to be incorporated into a Smoothing Spline ANOVA penalized likelihood model, a Support Vector Machine, or any model that will admit Reproducing Kernel Hilbert Space components, for nonparametric regression, supervised learning, or semi-supervised learning. Future work and open questions are discussed. The papers are: F. Lu, S. Keles, S. Wright and G. Wahba 2005. A framework for kernel regularization with application to protein clustering. Proceedings of the National Academy of Sciences 102, 12332-1233. G. Corrada Bravo, G. Wahba, K. Lee, B. Klein, R. Klein and S. Iyengar 2009. Examining the relative influence of familial, genetic and environmental covariate information in flexible risk models. Proceedings of the National Academy of Sciences 106, 8128-8133. F. Lu, Y. Lin and G. Wahba. Robust manifold unfolding with kernel regularization. TR 1008, Department of Statistics, University of Wisconsin-Madison.

  8. Set-free Markov state model building

    NASA Astrophysics Data System (ADS)

    Weber, Marcus; Fackeldey, Konstantin; Schütte, Christof

    2017-03-01

    Molecular dynamics (MD) simulations face challenging problems since the time scales of interest often are much longer than what is possible to simulate; and even if sufficiently long simulations are possible the complex nature of the resulting simulation data makes interpretation difficult. Markov State Models (MSMs) help to overcome these problems by making experimentally relevant time scales accessible via coarse grained representations that also allow for convenient interpretation. However, standard set-based MSMs exhibit some caveats limiting their approximation quality and statistical significance. One of the main caveats results from the fact that typical MD trajectories repeatedly re-cross the boundary between the sets used to build the MSM which causes statistical bias in estimating the transition probabilities between these sets. In this article, we present a set-free approach to MSM building utilizing smooth overlapping ansatz functions instead of sets and an adaptive refinement approach. This kind of meshless discretization helps to overcome the recrossing problem and yields an adaptive refinement procedure that allows us to improve the quality of the model while exploring state space and inserting new ansatz functions into the MSM.

  9. Cognitive-Developmental Hierarchies: A Search for Structure Using Item-Level Data.

    ERIC Educational Resources Information Center

    Martinez, Michael E.; Simpson, R. Scott

    Item-level statistics from ability and achievement tests have been underutilized as sources of data for building models of cognitive development. How item data can be used to build a cognitive-developmental map of proportional reasoning is demonstrated. The product of the analysis is a cognitive hierarchy with levels corresponding to categories of…

  10. Performance evaluation of an agent-based occupancy simulation model

    DOE PAGES

    Luo, Xuan; Lam, Khee Poh; Chen, Yixing; ...

    2017-01-17

    Occupancy is an important factor driving building performance. Static and homogeneous occupant schedules, commonly used in building performance simulation, contribute to issues such as performance gaps between simulated and measured energy use in buildings. Stochastic occupancy models have been recently developed and applied to better represent spatial and temporal diversity of occupants in buildings. However, there is very limited evaluation of the usability and accuracy of these models. This study used measured occupancy data from a real office building to evaluate the performance of an agent-based occupancy simulation model: the Occupancy Simulator. The occupancy patterns of various occupant types were first derived from the measured occupant schedule data using statistical analysis. Then the performance of the simulation model was evaluated and verified based on (1) whether the distribution of observed occupancy behavior patterns follows the theoretical ones included in the Occupancy Simulator, and (2) whether the simulator can reproduce a variety of occupancy patterns accurately. Results demonstrated the feasibility of applying the Occupancy Simulator to simulate a range of occupancy presence and movement behaviors for regular types of occupants in office buildings, and to generate stochastic occupant schedules at the room and individual occupant levels for building performance simulation. For future work, model validation is recommended, which includes collecting and using detailed interval occupancy data of all spaces in an office building to validate the simulated occupant schedules from the Occupancy Simulator.

  11. Performance evaluation of an agent-based occupancy simulation model

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Luo, Xuan; Lam, Khee Poh; Chen, Yixing

    Occupancy is an important factor driving building performance. Static and homogeneous occupant schedules, commonly used in building performance simulation, contribute to issues such as performance gaps between simulated and measured energy use in buildings. Stochastic occupancy models have been recently developed and applied to better represent spatial and temporal diversity of occupants in buildings. However, there is very limited evaluation of the usability and accuracy of these models. This study used measured occupancy data from a real office building to evaluate the performance of an agent-based occupancy simulation model: the Occupancy Simulator. The occupancy patterns of various occupant types were first derived from the measured occupant schedule data using statistical analysis. Then the performance of the simulation model was evaluated and verified based on (1) whether the distribution of observed occupancy behavior patterns follows the theoretical ones included in the Occupancy Simulator, and (2) whether the simulator can reproduce a variety of occupancy patterns accurately. Results demonstrated the feasibility of applying the Occupancy Simulator to simulate a range of occupancy presence and movement behaviors for regular types of occupants in office buildings, and to generate stochastic occupant schedules at the room and individual occupant levels for building performance simulation. For future work, model validation is recommended, which includes collecting and using detailed interval occupancy data of all spaces in an office building to validate the simulated occupant schedules from the Occupancy Simulator.

  12. Use of statistical and neural net approaches in predicting toxicity of chemicals.

    PubMed

    Basak, S C; Grunwald, G D; Gute, B D; Balasubramanian, K; Opitz, D

    2000-01-01

    Hierarchical quantitative structure-activity relationships (H-QSAR) have been developed as a new approach in constructing models for estimating physicochemical, biomedicinal, and toxicological properties of interest. This approach uses increasingly more complex molecular descriptors in a graduated approach to model building. In this study, statistical and neural network methods have been applied to the development of H-QSAR models for estimating the acute aquatic toxicity (LC50) of 69 benzene derivatives to Pimephales promelas (fathead minnow). Topostructural, topochemical, geometrical, and quantum chemical indices were used as the four levels of the hierarchical method. It is clear from both the statistical and neural network models that topostructural indices alone cannot adequately model this set of congeneric chemicals. Not surprisingly, topochemical indices greatly increase the predictive power of both statistical and neural network models. Quantum chemical indices also add significantly to the modeling of this set of acute aquatic toxicity data.

  13. Development of an automated energy audit protocol for office buildings

    NASA Astrophysics Data System (ADS)

    Deb, Chirag

    This study aims to enhance the building energy audit process and bring about a reduction in the time and cost required to conduct a full physical audit. For this, a total of 5 Energy Service Companies in Singapore collaborated and provided energy audit reports for 62 office buildings. Several statistical techniques are adopted to analyse these reports, comprising cluster analysis and the development of models to predict energy savings for buildings. The cluster analysis shows that there are 3 clusters of buildings experiencing different levels of energy savings. To understand the effect of building variables on the change in EUI, a robust iterative process for selecting the appropriate variables is developed. The results show that the 4 variables of GFA, non-air-conditioning energy consumption, average chiller plant efficiency and installed capacity of chillers should be taken for clustering. This analysis is extended to the development of prediction models using linear regression and artificial neural networks (ANN). An exhaustive variable selection algorithm is developed to select the input variables for the two energy saving prediction models. The results show that the ANN prediction model can predict the energy saving potential of a given building with an accuracy of +/-14.8%.

  14. Building mental models by dissecting physical models.

    PubMed

    Srivastava, Anveshna

    2016-01-01

    When students build physical models from prefabricated components to learn about model systems, there is an implicit trade-off between the physical degrees of freedom in building the model and the intensity of instructor supervision needed. Models that are too flexible, permitting multiple possible constructions, require greater supervision to ensure focused learning; models that are too constrained require less supervision, but can be constructed mechanically, with little to no conceptual engagement. We propose "model-dissection" as an alternative to "model-building," whereby instructors could make efficient use of supervisory resources while simultaneously promoting focused learning. We report empirical results from a study conducted with biology undergraduate students, where we demonstrate that asking them to "dissect" out specific conceptual structures from an already built 3D physical model leads to a greater improvement in performance than asking them to build the 3D model from simpler components. Using questionnaires to measure understanding both before and after model-based interventions for two cohorts of students, we find that both the "builders" and the "dissectors" improve in the post-test, but it is the latter group who show statistically significant improvement. These results, in addition to the intrinsic time-efficiency of "model dissection," suggest that it could be a valuable pedagogical tool. © 2015 The International Union of Biochemistry and Molecular Biology.

  15. Statistical appearance models based on probabilistic correspondences.

    PubMed

    Krüger, Julia; Ehrhardt, Jan; Handels, Heinz

    2017-04-01

    Model-based image analysis is indispensable in medical image processing. One key aspect of building statistical shape and appearance models is the determination of one-to-one correspondences in the training data set. At the same time, the identification of these correspondences is the most challenging part of such methods. In our earlier work, we developed an alternative method using correspondence probabilities instead of exact one-to-one correspondences for a statistical shape model (Hufnagel et al., 2008). In this work, a new approach for statistical appearance models without one-to-one correspondences is proposed. A sparse image representation is used to build a model that combines point position and appearance information at the same time. Probabilistic correspondences between the derived multi-dimensional feature vectors are used to omit the need for extensive preprocessing of finding landmarks and correspondences as well as to reduce the dependence of the generated model on the landmark positions. Model generation and model fitting can now be expressed by optimizing a single global criterion derived from a maximum a-posteriori (MAP) approach with respect to model parameters that directly affect both shape and appearance of the considered objects inside the images. The proposed approach describes statistical appearance modeling in a concise and flexible mathematical framework. Besides eliminating the demand for costly correspondence determination, the method allows for additional constraints as topological regularity in the modeling process. In the evaluation the model was applied for segmentation and landmark identification in hand X-ray images. The results demonstrate the feasibility of the model to detect hand contours as well as the positions of the joints between finger bones for unseen test images. Further, we evaluated the model on brain data of stroke patients to show the ability of the proposed model to handle partially corrupted data and to demonstrate a possible employment of the correspondence probabilities to indicate these corrupted/pathological areas. Copyright © 2017 Elsevier B.V. All rights reserved.

  16. Adapting to change: The role of the right hemisphere in mental model building and updating.

    PubMed

    Filipowicz, Alex; Anderson, Britt; Danckert, James

    2016-09-01

    We recently proposed that the right hemisphere plays a crucial role in the processes underlying mental model building and updating. Here, we review the evidence we and others have garnered to support this novel account of right hemisphere function. We begin by presenting evidence from patient work that suggests a critical role for the right hemisphere in the ability to learn from the statistics in the environment (model building) and adapt to environmental change (model updating). We then provide a review of neuroimaging research that highlights a network of brain regions involved in mental model updating. Next, we outline specific roles for particular regions within the network such that the anterior insula is purported to maintain the current model of the environment, the medial prefrontal cortex determines when to explore new or alternative models, and the inferior parietal lobule represents salient and surprising information with respect to the current model. We conclude by proposing some future directions that address some of the outstanding questions in the field of mental model building and updating. (PsycINFO Database Record (c) 2016 APA, all rights reserved).

  17. Revised Models and Conceptualisation of Successful School Principalship for Improved Student Outcomes

    ERIC Educational Resources Information Center

    Mulford, Bill; Silins, Halia

    2011-01-01

    Purpose: This study aims to present revised models and a reconceptualisation of successful school principalship for improved student outcomes. Design/methodology/approach: The study's approach is qualitative and quantitative, culminating in model building and multi-level statistical analyses. Findings: Principals who promote both capacity building…

  18. Statistical Ensemble of Large Eddy Simulations

    NASA Technical Reports Server (NTRS)

    Carati, Daniele; Rogers, Michael M.; Wray, Alan A.; Mansour, Nagi N. (Technical Monitor)

    2001-01-01

    A statistical ensemble of large eddy simulations (LES) is run simultaneously for the same flow. The information provided by the different large scale velocity fields is used to propose an ensemble averaged version of the dynamic model. This produces local model parameters that only depend on the statistical properties of the flow. An important property of the ensemble averaged dynamic procedure is that it does not require any spatial averaging and can thus be used in fully inhomogeneous flows. Also, the ensemble of LES's provides statistics of the large scale velocity that can be used for building new models for the subgrid-scale stress tensor. The ensemble averaged dynamic procedure has been implemented with various models for three flows: decaying isotropic turbulence, forced isotropic turbulence, and the time developing plane wake. It is found that the results are almost independent of the number of LES's in the statistical ensemble provided that the ensemble contains at least 16 realizations.

  19. Tropical geometry of statistical models.

    PubMed

    Pachter, Lior; Sturmfels, Bernd

    2004-11-16

    This article presents a unified mathematical framework for inference in graphical models, building on the observation that graphical models are algebraic varieties. From this geometric viewpoint, observations generated from a model are coordinates of a point in the variety, and the sum-product algorithm is an efficient tool for evaluating specific coordinates. Here, we address the question of how the solutions to various inference problems depend on the model parameters. The proposed answer is expressed in terms of tropical algebraic geometry. The Newton polytope of a statistical model plays a key role. Our results are applied to the hidden Markov model and the general Markov model on a binary tree.

  20. An automated process for building reliable and optimal in vitro/in vivo correlation models based on Monte Carlo simulations.

    PubMed

    Sutton, Steven C; Hu, Mingxiu

    2006-05-05

    Many mathematical models have been proposed for establishing an in vitro/in vivo correlation (IVIVC). The traditional IVIVC model building process consists of 5 steps: deconvolution, model fitting, convolution, prediction error evaluation, and cross-validation. This is a time-consuming process, and typically only a few models at most are tested for any given data set. The objectives of this work were to (1) propose a statistical tool to screen models for further development of an IVIVC, (2) evaluate the performance of each model under different circumstances, and (3) investigate the effectiveness of common statistical model selection criteria for choosing IVIVC models. A computer program was developed to explore which model(s) would be most likely to work well with a random variation from the original formulation. The process used Monte Carlo simulation techniques to build IVIVC models. Data-based model selection criteria (Akaike Information Criterion [AIC], R2) and the probability of passing the Food and Drug Administration "prediction error" requirement were calculated. Several real data sets representing a broad range of release profiles are used to illustrate the process and to demonstrate the advantages of this automated approach over the traditional one. The Hixson-Crowell and Weibull models were often preferred over the linear model. When evaluating whether a Level A IVIVC model was possible, the AIC model selection criterion generally selected the best model. We believe that the approach we proposed may be a rapid tool to determine which IVIVC model (if any) is the most applicable.

  1. Modeling and Measurement Constraints in Fault Diagnostics for HVAC Systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Najafi, Massieh; Auslander, David M.; Bartlett, Peter L.

    2010-05-30

    Many studies have shown that energy savings of five to fifteen percent are achievable in commercial buildings by detecting and correcting building faults, and optimizing building control systems. However, in spite of good progress in developing tools for determining HVAC diagnostics, methods to detect faults in HVAC systems are still generally undeveloped. Most approaches use numerical filtering or parameter estimation methods to compare data from energy meters and building sensors to predictions from mathematical or statistical models. They are effective when models are relatively accurate and data contain few errors. In this paper, we address the case where models are imperfect and data are variable, uncertain, and can contain error. We apply a Bayesian updating approach that is systematic in managing and accounting for most forms of model and data errors. The proposed method uses both knowledge of first principle modeling and empirical results to analyze the system performance within the boundaries defined by practical constraints. We demonstrate the approach by detecting faults in commercial building air handling units. We find that the limitations that exist in air handling unit diagnostics due to practical constraints can generally be effectively addressed through the proposed approach.

  2. Data mining and statistical inference in selective laser melting

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kamath, Chandrika

    Selective laser melting (SLM) is an additive manufacturing process that builds a complex three-dimensional part, layer-by-layer, using a laser beam to fuse fine metal powder together. The design freedom afforded by SLM comes associated with complexity. As the physical phenomena occur over a broad range of length and time scales, the computational cost of modeling the process is high. At the same time, the large number of parameters that control the quality of a part make experiments expensive. In this paper, we describe ways in which we can use data mining and statistical inference techniques to intelligently combine simulations and experiments to build parts with desired properties. We start with a brief summary of prior work in finding process parameters for high-density parts. We then expand on this work to show how we can improve the approach by using feature selection techniques to identify important variables, data-driven surrogate models to reduce computational costs, improved sampling techniques to cover the design space adequately, and uncertainty analysis for statistical inference. Here, our results indicate that techniques from data mining and statistics can complement those from physical modeling to provide greater insight into complex processes such as selective laser melting.

  3. Data mining and statistical inference in selective laser melting

    DOE PAGES

    Kamath, Chandrika

    2016-01-11

    Selective laser melting (SLM) is an additive manufacturing process that builds a complex three-dimensional part, layer-by-layer, using a laser beam to fuse fine metal powder together. The design freedom afforded by SLM comes associated with complexity. As the physical phenomena occur over a broad range of length and time scales, the computational cost of modeling the process is high. At the same time, the large number of parameters that control the quality of a part make experiments expensive. In this paper, we describe ways in which we can use data mining and statistical inference techniques to intelligently combine simulations and experiments to build parts with desired properties. We start with a brief summary of prior work in finding process parameters for high-density parts. We then expand on this work to show how we can improve the approach by using feature selection techniques to identify important variables, data-driven surrogate models to reduce computational costs, improved sampling techniques to cover the design space adequately, and uncertainty analysis for statistical inference. Here, our results indicate that techniques from data mining and statistics can complement those from physical modeling to provide greater insight into complex processes such as selective laser melting.

  4. Models of dyadic social interaction.

    PubMed Central

    Griffin, Dale; Gonzalez, Richard

    2003-01-01

    We discuss the logic of research designs for dyadic interaction and present statistical models with parameters that are tied to psychologically relevant constructs. Building on Karl Pearson's classic nineteenth-century statistical analysis of within-organism similarity, we describe several approaches to indexing dyadic interdependence and provide graphical methods for visualizing dyadic data. We also describe several statistical and conceptual solutions to the 'levels of analysis' problem in analysing dyadic data. These analytic strategies allow the researcher to examine and measure psychological questions of interdependence and social influence. We provide illustrative data from casually interacting and romantic dyads. PMID:12689382

  5. Statistical Surrogate Modeling of Atmospheric Dispersion Events Using Bayesian Adaptive Splines

    NASA Astrophysics Data System (ADS)

    Francom, D.; Sansó, B.; Bulaevskaya, V.; Lucas, D. D.

    2016-12-01

    Uncertainty in the inputs of complex computer models, including atmospheric dispersion and transport codes, is often assessed via statistical surrogate models. Surrogate models are computationally efficient statistical approximations of expensive computer models that enable uncertainty analysis. We introduce Bayesian adaptive spline methods for producing surrogate models that capture the major spatiotemporal patterns of the parent model, while satisfying all the necessities of flexibility, accuracy and computational feasibility. We present novel methodological and computational approaches motivated by a controlled atmospheric tracer release experiment conducted at the Diablo Canyon nuclear power plant in California. Traditional methods for building statistical surrogate models often do not scale well to experiments with large amounts of data. Our approach is well suited to experiments involving large numbers of model inputs, large numbers of simulations, and functional output for each simulation. Our approach allows us to perform global sensitivity analysis with ease. We also present an approach to calibration of simulators using field data.

  6. Sensitivity of the Hydrogen Epoch of Reionization Array and its build-out stages to one-point statistics from redshifted 21 cm observations

    NASA Astrophysics Data System (ADS)

    Kittiwisit, Piyanat; Bowman, Judd D.; Jacobs, Daniel C.; Beardsley, Adam P.; Thyagarajan, Nithyanandan

    2018-03-01

    We present a baseline sensitivity analysis of the Hydrogen Epoch of Reionization Array (HERA) and its build-out stages to one-point statistics (variance, skewness, and kurtosis) of redshifted 21 cm intensity fluctuation from the Epoch of Reionization (EoR) based on realistic mock observations. By developing a full-sky 21 cm light-cone model, taking into account the proper field of view and frequency bandwidth, utilizing a realistic measurement scheme, and assuming perfect foreground removal, we show that HERA will be able to recover statistics of the sky model with high sensitivity by averaging over measurements from multiple fields. All build-out stages will be able to detect variance, while skewness and kurtosis should be detectable for HERA128 and larger. We identify sample variance as the limiting constraint of the measurements at the end of reionization. The sensitivity can also be further improved by performing frequency windowing. In addition, we find that strong sample variance fluctuation in the kurtosis measured from an individual field of observation indicates the presence of outlying cold or hot regions in the underlying fluctuations, a feature that can potentially be used as an EoR bubble indicator.
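
    The three one-point statistics named above are straightforward to compute per frequency channel of an image cube; a minimal sketch on synthetic data follows. The cube dimensions and the toy non-Gaussian late-time field are assumptions, not the mock observations used in the paper.

```python
import numpy as np
from scipy.stats import skew, kurtosis

def one_point_statistics(cube):
    """Variance, skewness and kurtosis of each frequency slice of an
    image cube with shape (n_freq, n_pix, n_pix)."""
    flat = cube.reshape(cube.shape[0], -1)
    return {
        "variance": flat.var(axis=1),
        "skewness": skew(flat, axis=1),
        "kurtosis": kurtosis(flat, axis=1),   # excess kurtosis (Gaussian -> 0)
    }

# synthetic stand-in for 21 cm fluctuations: Gaussian early, non-Gaussian later
rng = np.random.default_rng(6)
cube = np.concatenate([rng.normal(0, 5, (8, 64, 64)),
                       rng.gamma(2.0, 3.0, (8, 64, 64)) - 6.0])
stats = one_point_statistics(cube)
print(np.round(stats["skewness"], 2))
```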

  7. Statistical Techniques to Explore the Quality of Constraints in Constraint-Based Modeling Environments

    ERIC Educational Resources Information Center

    Gálvez, Jaime; Conejo, Ricardo; Guzmán, Eduardo

    2013-01-01

    One of the most popular student modeling approaches is Constraint-Based Modeling (CBM). It is an efficient approach that can be easily applied inside an Intelligent Tutoring System (ITS). Even with these characteristics, building new ITSs requires carefully designing the domain model to be taught because different sources of errors could affect…

  8. Modeling longitudinal data, I: principles of multivariate analysis.

    PubMed

    Ravani, Pietro; Barrett, Brendan; Parfrey, Patrick

    2009-01-01

    Statistical models are used to study the relationship between exposure and disease while accounting for the potential impact of other factors on outcomes. This adjustment is useful to obtain unbiased estimates of true effects or to predict future outcomes. Statistical models include a systematic component and an error component. The systematic component explains the variability of the response variable as a function of the predictors and is summarized in the effect estimates (model coefficients). The error component of the model represents the variability in the data unexplained by the model and is used to build measures of precision around the point estimates (confidence intervals).
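
    A minimal worked example of the two components on synthetic exposure-outcome data: the fitted coefficients are the systematic component, and the residual variability feeds the confidence intervals around them. The variable names and effect sizes are invented for illustration.

```python
import numpy as np
import statsmodels.api as sm

# synthetic exposure-outcome data: outcome depends on exposure and one confounder
rng = np.random.default_rng(7)
exposure = rng.binomial(1, 0.4, 200)
confounder = rng.normal(50, 10, 200)
outcome = 2.0 * exposure + 0.1 * confounder + rng.normal(0, 1, 200)  # error component

X = sm.add_constant(np.column_stack([exposure, confounder]))
fit = sm.OLS(outcome, X).fit()

# systematic component: adjusted effect estimates (model coefficients)
print(fit.params)        # intercept, exposure effect, confounder effect
# error component: residual variability -> precision (confidence intervals)
print(fit.conf_int())    # 95% CIs around each coefficient
```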

  9. Estimating rooftop solar technical potential across the US using a combination of GIS-based methods, lidar data, and statistical modeling

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gagnon, Pieter; Margolis, Robert; Melius, Jennifer

    We provide a detailed estimate of the technical potential of rooftop solar photovoltaic (PV) electricity generation throughout the contiguous United States. This national estimate is based on an analysis of select US cities that combines light detection and ranging (lidar) data with a validated analytical method for determining rooftop PV suitability employing geographic information systems. We use statistical models to extend this analysis to estimate the quantity and characteristics of roofs in areas not covered by lidar data. Finally, we model PV generation for all rooftops to yield technical potential estimates. At the national level, 8.13 billion m2 of suitable roof area could host 1118 GW of PV capacity, generating 1432 TWh of electricity per year. This would equate to 38.6% of the electricity that was sold in the contiguous United States in 2013. This estimate is substantially higher than a previous estimate made by the National Renewable Energy Laboratory. The difference can be attributed to increases in PV module power density, improved estimation of building suitability, higher estimates of total number of buildings, and improvements in PV performance simulation tools that previously tended to underestimate productivity. Also notable, the nationwide percentage of buildings suitable for at least some PV deployment is high—82% for buildings smaller than 5000 ft2 and over 99% for buildings larger than that. In most states, rooftop PV could enable small, mostly residential buildings to offset the majority of average household electricity consumption. Even in some states with a relatively poor solar resource, such as those in the Northeast, the residential sector has the potential to offset around 100% of its total electricity consumption with rooftop PV.

  10. Estimating rooftop solar technical potential across the US using a combination of GIS-based methods, lidar data, and statistical modeling

    DOE PAGES

    Gagnon, Pieter; Margolis, Robert; Melius, Jennifer; ...

    2018-01-05

    We provide a detailed estimate of the technical potential of rooftop solar photovoltaic (PV) electricity generation throughout the contiguous United States. This national estimate is based on an analysis of select US cities that combines light detection and ranging (lidar) data with a validated analytical method for determining rooftop PV suitability employing geographic information systems. We use statistical models to extend this analysis to estimate the quantity and characteristics of roofs in areas not covered by lidar data. Finally, we model PV generation for all rooftops to yield technical potential estimates. At the national level, 8.13 billion m2 of suitable roof area could host 1118 GW of PV capacity, generating 1432 TWh of electricity per year. This would equate to 38.6% of the electricity that was sold in the contiguous United States in 2013. This estimate is substantially higher than a previous estimate made by the National Renewable Energy Laboratory. The difference can be attributed to increases in PV module power density, improved estimation of building suitability, higher estimates of total number of buildings, and improvements in PV performance simulation tools that previously tended to underestimate productivity. Also notable, the nationwide percentage of buildings suitable for at least some PV deployment is high—82% for buildings smaller than 5000 ft2 and over 99% for buildings larger than that. In most states, rooftop PV could enable small, mostly residential buildings to offset the majority of average household electricity consumption. Even in some states with a relatively poor solar resource, such as those in the Northeast, the residential sector has the potential to offset around 100% of its total electricity consumption with rooftop PV.

  11. Estimating rooftop solar technical potential across the US using a combination of GIS-based methods, lidar data, and statistical modeling

    NASA Astrophysics Data System (ADS)

    Gagnon, Pieter; Margolis, Robert; Melius, Jennifer; Phillips, Caleb; Elmore, Ryan

    2018-02-01

    We provide a detailed estimate of the technical potential of rooftop solar photovoltaic (PV) electricity generation throughout the contiguous United States. This national estimate is based on an analysis of select US cities that combines light detection and ranging (lidar) data with a validated analytical method for determining rooftop PV suitability employing geographic information systems. We use statistical models to extend this analysis to estimate the quantity and characteristics of roofs in areas not covered by lidar data. Finally, we model PV generation for all rooftops to yield technical potential estimates. At the national level, 8.13 billion m2 of suitable roof area could host 1118 GW of PV capacity, generating 1432 TWh of electricity per year. This would equate to 38.6% of the electricity that was sold in the contiguous United States in 2013. This estimate is substantially higher than a previous estimate made by the National Renewable Energy Laboratory. The difference can be attributed to increases in PV module power density, improved estimation of building suitability, higher estimates of total number of buildings, and improvements in PV performance simulation tools that previously tended to underestimate productivity. Also notable, the nationwide percentage of buildings suitable for at least some PV deployment is high—82% for buildings smaller than 5000 ft2 and over 99% for buildings larger than that. In most states, rooftop PV could enable small, mostly residential buildings to offset the majority of average household electricity consumption. Even in some states with a relatively poor solar resource, such as those in the Northeast, the residential sector has the potential to offset around 100% of its total electricity consumption with rooftop PV.
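
    A quick consistency check of the headline figures quoted above; the implied module power density, capacity factor, and 2013 sales total are back-calculated from the abstract's own numbers rather than taken from the report itself.

```python
# Back-of-the-envelope check of the figures quoted in the abstract.
suitable_roof_area_m2 = 8.13e9   # suitable roof area
capacity_gw = 1118.0             # installable PV capacity
generation_twh = 1432.0          # annual generation

power_density_w_per_m2 = capacity_gw * 1e9 / suitable_roof_area_m2
capacity_factor = generation_twh * 1e6 / (capacity_gw * 1e3 * 8760)  # MWh / (MW * h)
sales_2013_twh = generation_twh / 0.386   # implied contiguous-US retail sales in 2013

print(f"implied power density ~ {power_density_w_per_m2:.0f} W/m2")
print(f"implied average capacity factor ~ {capacity_factor:.1%}")
print(f"implied 2013 electricity sales ~ {sales_2013_twh:.0f} TWh")
```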

  12. Stochastic Modeling of Overtime Occupancy and Its Application in Building Energy Simulation and Calibration

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sun, Kaiyu; Yan, Da; Hong, Tianzhen

    2014-02-28

    Overtime is a common phenomenon around the world. Overtime drives both internal heat gains from occupants, lighting and plug-loads, and HVAC operation during overtime periods. Overtime leads to longer occupancy hours and extended operation of building services systems beyond normal working hours, thus overtime impacts total building energy use. Current literature lacks methods to model overtime occupancy because overtime is stochastic in nature and varies by individual occupants and by time. To address this gap in the literature, this study aims to develop a new stochastic model based on the statistical analysis of measured overtime occupancy data from an office building. A binomial distribution is used to represent the total number of occupants working overtime, while an exponential distribution is used to represent the duration of overtime periods. The overtime model is used to generate overtime occupancy schedules as an input to the energy model of a second office building. The measured and simulated cooling energy use during the overtime period is compared in order to validate the overtime model. A hybrid approach to energy model calibration is proposed and tested, which combines ASHRAE Guideline 14 for the calibration of the energy model during normal working hours, and a proposed KS test for the calibration of the energy model during overtime. The developed stochastic overtime model and the hybrid calibration approach can be used in building energy simulations to improve the accuracy of results, and better understand the characteristics of overtime in office buildings.
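
    A minimal sketch of the two distributional choices described: a binomial draw for how many occupants work overtime on a given day and exponential draws for how long each stays. The occupant count, overtime probability, and mean duration below are illustrative assumptions, not the calibrated values from the measured building.

```python
import numpy as np

def simulate_overtime(n_occupants, p_overtime, mean_duration_h, n_days, seed=0):
    """Daily overtime schedules: Binomial count of overtime workers,
    Exponential duration (hours past normal closing) for each of them."""
    rng = np.random.default_rng(seed)
    schedules = []
    for _ in range(n_days):
        k = rng.binomial(n_occupants, p_overtime)        # how many stay late
        durations = rng.exponential(mean_duration_h, k)  # how long each stays
        schedules.append(durations)
    return schedules

days = simulate_overtime(n_occupants=80, p_overtime=0.15, mean_duration_h=1.5, n_days=250)
avg_people = np.mean([len(d) for d in days])
avg_hours = np.mean([d.sum() for d in days])
print(f"average {avg_people:.1f} occupants/evening, {avg_hours:.1f} overtime person-hours/day")
```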

  13. Waste generated in high-rise buildings construction: a quantification model based on statistical multiple regression.

    PubMed

    Parisi Kern, Andrea; Ferreira Dias, Michele; Piva Kulakowski, Marlova; Paulo Gomes, Luciana

    2015-05-01

    Reducing construction waste is becoming a key environmental issue in the construction industry. The quantification of waste generation rates in the construction sector is an invaluable management tool in supporting mitigation actions. However, the quantification of waste can be a difficult process because of the specific characteristics and the wide range of materials used in different construction projects. Large variations are observed in the methods used to predict the amount of waste generated because of the range of variables involved in construction processes and the different contexts in which these methods are employed. This paper proposes a statistical model to determine the amount of waste generated in the construction of high-rise buildings by assessing the influence of the design process and production system, often mentioned as the major culprits behind the generation of waste in construction. Multiple regression was used to conduct a case study based on multiple sources of data on eighteen residential buildings. The resulting statistical model relates the dependent variable (the amount of waste generated) to independent variables associated with the design and the production system used. The best regression model obtained from the sample data resulted in an adjusted R2 value of 0.694, which means that it explains approximately 69% of the variation in waste generation in similar constructions. Most independent variables showed a low determination coefficient when assessed in isolation, which emphasizes the importance of assessing their joint influence on the response (dependent) variable. Copyright © 2015 Elsevier Ltd. All rights reserved.
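
    The adjusted R2 reported above can be reproduced mechanically from any ordinary least-squares fit; the sketch below shows the computation on a synthetic stand-in for an eighteen-building dataset. The predictors and coefficients are invented and are not the study's variables.

```python
import numpy as np

def adjusted_r2(y, y_hat, n_predictors):
    """Adjusted coefficient of determination for an OLS fit."""
    n = len(y)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    r2 = 1 - ss_res / ss_tot
    return 1 - (1 - r2) * (n - 1) / (n - n_predictors - 1)

# synthetic stand-in: waste (m3) vs. built area, number of floors, production-system flag
rng = np.random.default_rng(8)
n = 18                                           # eighteen buildings, as in the study
X = np.column_stack([rng.uniform(5e3, 3e4, n),   # built area
                     rng.integers(8, 25, n),     # number of floors
                     rng.binomial(1, 0.5, n)])   # production-system indicator
y = 0.002 * X[:, 0] + 1.5 * X[:, 1] + 10 * X[:, 2] + rng.normal(0, 8, n)

A = np.column_stack([np.ones(n), X])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
print(f"adjusted R2 = {adjusted_r2(y, A @ beta, X.shape[1]):.3f}")
```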

  14. DOE Office of Scientific and Technical Information (OSTI.GOV)

    None

    Assessing the impact of energy efficiency technologies at a district or city scale is of great interest to local governments, real estate developers, utility companies, and policymakers. This paper describes a flexible framework that can be used to create and run district and city scale building energy simulations. The framework is built around the new OpenStudio City Database (CityDB). Building footprints, building height, building type, and other data can be imported from public records or other sources. Missing data can be inferred or assigned from a statistical sampling of other datasets. Once all required data is available, OpenStudio Measures are used to create starting point energy models and to model energy efficiency measures for each building. Together this framework allows a user to pose several scenarios such as 'what if 30% of the commercial retail buildings added rooftop solar' or 'what if all elementary schools converted to ground source heat pumps' and then visualize the impacts at a district or city scale. This paper focuses on modeling existing building stock using public records. However, the framework is capable of supporting the evaluation of new construction, district systems, and the use of proprietary data sources.

  15. Building Capacity for Developing Statistical Literacy in a Developing Country: Lessons Learned from an Intervention

    ERIC Educational Resources Information Center

    North, Delia; Gal, Iddo; Zewotir, Temesgen

    2014-01-01

    This paper aims to contribute to the emerging literature on capacity-building in statistics education by examining issues pertaining to the readiness of teachers in a developing country to teach basic statistical topics. The paper reflects on challenges and barriers to building statistics capacity at grass-roots level in a developing country,…

  16. A new concept in seismic landslide hazard analysis for practical application

    NASA Astrophysics Data System (ADS)

    Lee, Chyi-Tyi

    2017-04-01

    A seismic landslide hazard model can be constructed using a deterministic approach (Jibson et al., 2000) or a statistical approach (Lee, 2014). Both approaches yield the landslide spatial probability under a certain return-period earthquake. In the statistical approach, our recent study found that there are common patterns among different landslide susceptibility models of the same region. The common susceptibility reflects the relative stability of slopes in a region; higher susceptibility indicates lower stability. Using the common susceptibility together with an earthquake event landslide inventory and a map of topographically corrected Arias intensity, we can build the relationship among the probability of failure, the Arias intensity, and the susceptibility. This relationship can immediately be used to construct a seismic landslide hazard map for the region in which the empirical relationship was built. If the common susceptibility model is further normalized and the empirical relationship is built with the normalized susceptibility, then the empirical relationship may be practically applied to different regions with similar tectonic environments and climate conditions. This is feasible when a region has no existing earthquake-induced landslide data with which to train the susceptibility model and build the relationship. It is worth mentioning that a rain-induced landslide susceptibility model has a common pattern similar to the earthquake-induced landslide susceptibility of the same region, and can be used to build the relationship with an earthquake event landslide inventory and a map of Arias intensity. These will be introduced with examples in the meeting.
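
    One plausible way to encode the relationship among failure probability, susceptibility, and Arias intensity is a logistic regression over grid cells labelled by an event landslide inventory; the sketch below illustrates that idea on synthetic cells. The functional form and coefficients are assumptions, not the relationship actually calibrated in the study.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(9)
n_cells = 5000
susceptibility = rng.uniform(0, 1, n_cells)   # normalized common susceptibility
arias = rng.lognormal(-1.0, 0.8, n_cells)     # topographically corrected Arias intensity (m/s)

# synthetic event inventory: failure odds rise with susceptibility and shaking
logit = -4.0 + 3.0 * susceptibility + 1.5 * np.log(arias)
failed = rng.uniform(size=n_cells) < 1 / (1 + np.exp(-logit))

X = np.column_stack([susceptibility, np.log(arias)])
model = LogisticRegression().fit(X, failed)

# probability of failure for a highly susceptible cell under strong shaking
print(model.predict_proba([[0.9, np.log(1.0)]])[0, 1])
```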

  17. Probabilistic Modeling and Visualization of the Flexibility in Morphable Models

    NASA Astrophysics Data System (ADS)

    Lüthi, M.; Albrecht, T.; Vetter, T.

    Statistical shape models, and in particular morphable models, have gained widespread use in computer vision, computer graphics and medical imaging. Researchers have started to build models of almost any anatomical structure in the human body. While these models provide a useful prior for many image analysis tasks, relatively little information about the shape represented by the morphable model is exploited. We propose a method for computing and visualizing the remaining flexibility when a part of the shape is fixed. Our method, which is based on Probabilistic PCA, not only leads to an approach for reconstructing the full shape from partial information, but also allows us to investigate and visualize the uncertainty of a reconstruction. To show the feasibility of our approach we performed experiments on a statistical model of the human face and the femur bone. The visualization of the remaining flexibility allows for greater insight into the statistical properties of the shape.

  18. Machine Learning Predictions of a Multiresolution Climate Model Ensemble

    NASA Astrophysics Data System (ADS)

    Anderson, Gemma J.; Lucas, Donald D.

    2018-05-01

    Statistical models of high-resolution climate models are useful for many purposes, including sensitivity and uncertainty analyses, but building them can be computationally prohibitive. We generated a unique multiresolution perturbed parameter ensemble of a global climate model. We use a novel application of a machine learning technique known as random forests to train a statistical model on the ensemble to make high-resolution model predictions of two important quantities: global mean top-of-atmosphere energy flux and precipitation. The random forests leverage cheaper low-resolution simulations, greatly reducing the number of high-resolution simulations required to train the statistical model. We demonstrate that high-resolution predictions of these quantities can be obtained by training on an ensemble that includes only a small number of high-resolution simulations. We also find that global annually averaged precipitation is more sensitive to resolution changes than to any of the model parameters considered.
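
    As an illustration only (the parameter names, data and the stand-in simulator below are hypothetical), a random forest emulator that pools low- and high-resolution ensemble members by treating resolution as an extra predictor could be set up as follows:

    ```python
    # Hedged sketch: a random forest emulator trained on a mixed-resolution ensemble,
    # with model resolution included as a predictor alongside the perturbed parameters.
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    rng = np.random.default_rng(2)

    def run_climate_model(params, is_high_res):
        """Stand-in for an expensive simulation; returns e.g. a TOA energy flux."""
        return params @ np.array([1.0, -0.5, 0.3]) + (0.2 if is_high_res else 0.0) + rng.normal(0, 0.05)

    # Many cheap low-resolution members, few expensive high-resolution members.
    low = np.array([[*p, 0] for p in rng.uniform(0, 1, size=(200, 3))])
    high = np.array([[*p, 1] for p in rng.uniform(0, 1, size=(15, 3))])
    X = np.vstack([low, high])
    y = np.array([run_climate_model(row[:3], row[3] == 1) for row in X])

    rf = RandomForestRegressor(n_estimators=300, random_state=0).fit(X, y)

    # Predict the high-resolution response for a new parameter setting.
    print(rf.predict([[0.4, 0.6, 0.2, 1]]))
    ```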

  19. Decompression models: review, relevance and validation capabilities.

    PubMed

    Hugon, J

    2014-01-01

    For more than a century, several types of mathematical models have been proposed to describe tissue desaturation mechanisms in order to limit decompression sickness. These models are statistically assessed against DCS cases, and, over time, have gradually included bubble formation biophysics. This paper proposes to review this evolution and discuss its limitations. The review is organized around a comparison of decompression model biophysical criteria and theoretical foundations. The DCS-predictive capability is then analyzed to assess whether it could be improved by combining different approaches. Most of the operational decompression models have a neo-Haldanian form. Nevertheless, bubble modeling has been gaining popularity, and the circulating bubble amount has become a major output. By merging both views, it seems possible to build a relevant global decompression model that simulates bubble production while predicting DCS risks for all types of exposures and decompression profiles. A statistical approach combining both DCS and bubble detection databases has to be developed to calibrate a global decompression model. Doppler ultrasound and DCS data are essential: i. to make correlation and validation phases reliable; ii. to adjust biophysical criteria to best fit the observed bubble kinetics; and iii. to build a relevant risk function.

  20. Assessment of Automated Measurement and Verification (M&V) Methods

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Granderson, Jessica; Touzani, Samir; Custodio, Claudine

    This report documents the application of a general statistical methodology to assess the accuracy of baseline energy models, focusing on its application to Measurement and Verification (M&V) of whole-building energy savings.

  1. Modeling climate change impact in hospitality sector, using building resources consumption signature

    NASA Astrophysics Data System (ADS)

    Pinto, Armando; Bernardino, Mariana; Silva Santos, António; Pimpão Silva, Álvaro; Espírito Santo, Fátima

    2016-04-01

    Hotels are among the building types that consume the most energy and water per person, and they are vulnerable to climate change because, during extreme events (heat waves, water stress), failures could compromise hotel services (comfort) and increase energy costs, or compromise the landscape and amenities due to water-use restrictions. Climate impact assessments and the development of adaptation strategies require knowledge of the critical climatic variables and of building behaviour. To study the risk and vulnerability of buildings and hotels to climate change regarding resource consumption (energy and water), previous studies used building energy modelling simulation (BEMS) tools to study the variation in energy and water consumption. In general, the climate change impact on a building is evaluated by studying the energy and water demand of the building under future climate scenarios. However, hotels are complex buildings, quite different from each other, and the assumptions made in simplified BEMS are not calibrated and usually neglect some important hotel features, leading to projected estimates that do not usually match hotel-sector understanding and practice. Taking all these uncertainties into account, the use of the building signature (a statistical method) can help assess more clearly the impact of climate change on the hospitality sector, using a broad sample. Statistical analysis of the global energy consumption obtained from bills shows that energy consumption may be predicted within a 90% confidence interval using outdoor temperature alone. In this article a simplified methodology is presented and applied to identify the climate change impact on the hospitality sector using the building energy and water signature. The methodology is applied to sixteen hotels (nine in Lisbon and seven in Algarve) with four- and five-star ratings. The results show that an increase in water and electricity consumption is expected (mainly due to the increase in cooling) along with a decrease in gas consumption (for heating). The hotels in Algarve are more vulnerable than those in Lisbon.
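
    The simplest form of such an energy signature is a regression of consumption on outdoor temperature, applied to a shifted future temperature series. A hedged sketch with hypothetical monthly data (the paper's actual signature may include further terms):

    ```python
    # Hedged sketch: a monthly energy "signature" regressed on outdoor temperature,
    # then applied to a projected future temperature series.
    import numpy as np

    rng = np.random.default_rng(3)

    # Hypothetical monthly observations for one hotel.
    t_out = rng.uniform(8, 30, size=36)                   # mean outdoor temperature (deg C)
    energy = 120 + 6.5 * t_out + rng.normal(0, 10, 36)    # electricity use (MWh/month), cooling-driven

    # Linear signature: energy = a + b * temperature.
    b, a = np.polyfit(t_out, energy, 1)
    resid_sd = np.std(energy - (a + b * t_out), ddof=2)

    # Apply the signature to a climate-change scenario (e.g. +2 deg C in every month).
    t_future = t_out + 2.0
    pred = a + b * t_future
    print(f"slope = {b:.2f} MWh per deg C; projected mean change = {pred.mean() - energy.mean():+.1f} MWh/month")
    print(f"approx. 90% prediction band half-width: {1.64 * resid_sd:.1f} MWh")
    ```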

  2. Use of Machine Learning Algorithms to Propose a New Methodology to Conduct, Critique and Validate Urban Scale Building Energy Modeling

    NASA Astrophysics Data System (ADS)

    Pathak, Maharshi

    City administrators and real-estate developers have been setting rather aggressive energy efficiency targets. This, in turn, has led building science research groups across the globe to focus on urban-scale building performance studies and the level of abstraction associated with such simulations. The increasing maturity of stakeholders towards energy efficiency and creating comfortable working environments has led researchers to develop methodologies and tools for addressing policy-driven interventions, whether urban-level energy systems, buildings' operational optimization, or retrofit guidelines. Typically, these large-scale simulations are carried out by grouping buildings based on their design similarities, i.e. standardization of the buildings. Such an approach does not necessarily produce inputs that make decision-making effective. To address this, a novel approach is proposed in the present study. The principal objective of this study is to propose, define and evaluate a methodology that utilizes machine learning algorithms to define representative building archetypes for Stock-level Building Energy Modeling (SBEM) based on an operational parameter database. The study uses "Phoenix-climate"-based CBECS-2012 survey microdata for analysis and validation. Using the database, parameter correlations are studied to understand the relation between input parameters and energy performance. Contrary to precedent, the study establishes that energy performance is better explained by non-linear models, whose behavior is captured by advanced learning algorithms. Based on these algorithms, the buildings under study are grouped into meaningful clusters. The cluster medoids (the buildings that best represent the center of each cluster) are established statistically to identify the level of abstraction that is acceptable for whole-building energy simulations and, subsequently, for retrofit decision-making. Further, the methodology is validated by conducting Monte Carlo simulations on 13 key input simulation parameters. The sensitivity analysis of these 13 parameters is used to identify the optimum retrofits. From the sample analysis, the envelope parameters are found to be the most sensitive with respect to the EUI of the building, and retrofit packages should therefore be directed to maximize the reduction in energy usage.
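
    A minimal sketch of the archetype-selection step, assuming a k-means clustering followed by picking each cluster's medoid (the real building nearest the cluster center); the feature set below is a hypothetical stand-in for CBECS-style operational parameters:

    ```python
    # Hedged sketch: cluster buildings on operational parameters and pick each cluster's
    # medoid (the real building closest to the cluster center) as the archetype.
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.preprocessing import StandardScaler

    rng = np.random.default_rng(4)

    # Hypothetical operational data: floor area, operating hours, plug-load density, EUI.
    X_raw = rng.normal(size=(500, 4)) * [2000, 10, 2, 15] + [5000, 60, 8, 70]
    X = StandardScaler().fit_transform(X_raw)

    kmeans = KMeans(n_clusters=5, n_init=10, random_state=0).fit(X)

    # Medoid = the actual building nearest to each cluster centroid.
    archetypes = []
    for k, centre in enumerate(kmeans.cluster_centers_):
        members = np.where(kmeans.labels_ == k)[0]
        dists = np.linalg.norm(X[members] - centre, axis=1)
        archetypes.append(int(members[np.argmin(dists)]))

    print("archetype building indices:", archetypes)
    ```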

  3. Tailored high-resolution numerical weather forecasts for energy efficient predictive building control

    NASA Astrophysics Data System (ADS)

    Stauch, V. J.; Gwerder, M.; Gyalistras, D.; Oldewurtel, F.; Schubiger, F.; Steiner, P.

    2010-09-01

    The high proportion of the total primary energy consumption by buildings has increased the public interest in the optimisation of buildings' operation and is also driving the development of novel control approaches for the indoor climate. In this context, the use of weather forecasts presents an interesting and - thanks to advances in information and predictive control technologies and the continuous improvement of numerical weather prediction (NWP) models - an increasingly attractive option for improved building control. Within the research project OptiControl (www.opticontrol.ethz.ch) predictive control strategies for a wide range of buildings, heating, ventilation and air conditioning (HVAC) systems, and representative locations in Europe are being investigated with the aid of newly developed modelling and simulation tools. Grid point predictions for radiation, temperature and humidity of the high-resolution limited area NWP model COSMO-7 (see www.cosmo-model.org) and local measurements are used as disturbances and inputs into the building system. The control task considered consists in minimizing energy consumption whilst maintaining occupant comfort. In this presentation, we use the simulation-based OptiControl methodology to investigate the impact of COSMO-7 forecasts on the performance of predictive building control and the resulting energy savings. For this, we have selected building cases that were shown to benefit from a prediction horizon of up to 3 days and therefore, are particularly suitable for the use of numerical weather forecasts. We show that the controller performance is sensitive to the quality of the weather predictions, most importantly of the incident radiation on differently oriented façades. However, radiation is characterised by a high temporal and spatial variability in part caused by small scale and fast changing cloud formation and dissolution processes being only partially represented in the COSMO-7 grid point predictions. On the other hand, buildings are affected by particularly local weather conditions at the building site. To overcome this discrepancy, we make use of local measurements to statistically adapt the COSMO-7 model output to the meteorological conditions at the building. For this, we have developed a general correction algorithm that exploits systematic properties of the COSMO-7 prediction error and explicitly estimates the degree of temporal autocorrelation using online recursive estimation. The resulting corrected predictions are improved especially for the first few hours being the most crucial for the predictive controller and, ultimately for the reduction of primary energy consumption using predictive control. The use of numerical weather forecasts in predictive building automation is one example in a wide field of weather dependent advanced energy saving technologies. Our work particularly highlights the need for the development of specifically tailored weather forecast products by (statistical) postprocessing in order to meet the requirements of specific applications.
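
    The paper's correction algorithm is not reproduced here; the following is only a simplified sketch of the idea of recursively estimating the forecast-error bias and its temporal autocorrelation online, and using both to correct the next forecast (all data are hypothetical):

    ```python
    # Hedged sketch: online correction of NWP grid-point forecasts using local measurements.
    # It recursively tracks the mean forecast error (bias) and its lag-1 autocorrelation,
    # then corrects a new forecast by propagating the last observed error anomaly.
    import numpy as np

    class RecursiveErrorCorrector:
        def __init__(self, forget=0.98):
            self.forget = forget       # exponential forgetting factor
            self.bias = 0.0            # running mean of forecast error
            self.prev_err = None
            self.num = 0.0             # recursive sums for lag-1 autocorrelation
            self.den = 1e-9

        def update(self, forecast, measurement):
            err = measurement - forecast
            self.bias = self.forget * self.bias + (1 - self.forget) * err
            if self.prev_err is not None:
                d_prev = self.prev_err - self.bias
                d_curr = err - self.bias
                self.num = self.forget * self.num + d_prev * d_curr
                self.den = self.forget * self.den + d_prev * d_prev
            self.prev_err = err

        def correct(self, forecast, lead_steps=1):
            phi = float(np.clip(self.num / self.den, 0.0, 0.99))   # estimated autocorrelation
            last_anom = 0.0 if self.prev_err is None else self.prev_err - self.bias
            return forecast + self.bias + (phi ** lead_steps) * last_anom

    # Usage with hypothetical hourly temperature forecasts and local measurements.
    corr = RecursiveErrorCorrector()
    for fc, obs in [(12.0, 13.1), (12.5, 13.4), (13.0, 13.8)]:
        corr.update(fc, obs)
    print(f"corrected next-hour forecast: {corr.correct(13.5, lead_steps=1):.2f} deg C")
    ```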

  4. Environmental and Energy Aspects of Construction Industry and Green Buildings

    NASA Astrophysics Data System (ADS)

    Kauskale, L.; Geipele, I.; Zeltins, N.; Lecis, I.

    2017-04-01

    Green building is an important component of sustainable real estate market development, and one of the reasons is that the construction industry consumes a high amount of resources. Energy consumption by the construction industry results in greenhouse gas emissions, so green buildings, energy systems, building technologies and other aspects play an important role in the sustainable development of the real estate market, construction and the environment. The aim of the research is to analyse environmental aspects of sustainable real estate market development, focusing on the importance of green buildings at the industry level and related energy aspects. Literature review, historical and statistical data analysis, and logical access methods have been used in the research. The research confirmed the strong environmental rationale for, and importance of, environment-friendly buildings, and the many benefits that green buildings provide over the building life cycle. A future research direction is the environmental information process and its models.

  5. Reproducible research in vadose zone sciences

    USDA-ARS?s Scientific Manuscript database

    A significant portion of present-day soil and Earth science research is computational, involving complex data analysis pipelines, advanced mathematical and statistical models, and sophisticated computer codes. Opportunities for scientific progress are greatly diminished if reproducing and building o...

  6. What do we gain from simplicity versus complexity in species distribution models?

    USGS Publications Warehouse

    Merow, Cory; Smith, Matthew J.; Edwards, Thomas C.; Guisan, Antoine; McMahon, Sean M.; Normand, Signe; Thuiller, Wilfried; Wuest, Rafael O.; Zimmermann, Niklaus E.; Elith, Jane

    2014-01-01

    Species distribution models (SDMs) are widely used to explain and predict species ranges and environmental niches. They are most commonly constructed by inferring species' occurrence–environment relationships using statistical and machine-learning methods. The variety of methods that can be used to construct SDMs (e.g. generalized linear/additive models, tree-based models, maximum entropy, etc.), and the variety of ways that such models can be implemented, permits substantial flexibility in SDM complexity. Building models with an appropriate amount of complexity for the study objectives is critical for robust inference. We characterize complexity as the shape of the inferred occurrence–environment relationships and the number of parameters used to describe them, and search for insights into whether additional complexity is informative or superfluous. By building ‘under fit’ models, having insufficient flexibility to describe observed occurrence–environment relationships, we risk misunderstanding the factors shaping species distributions. By building ‘over fit’ models, with excessive flexibility, we risk inadvertently ascribing pattern to noise or building opaque models. However, model selection can be challenging, especially when comparing models constructed under different modeling approaches. Here we argue for a more pragmatic approach: researchers should constrain the complexity of their models based on study objective, attributes of the data, and an understanding of how these interact with the underlying biological processes. We discuss guidelines for balancing under fitting with over fitting and consequently how complexity affects decisions made during model building. Although some generalities are possible, our discussion reflects differences in opinions that favor simpler versus more complex models. We conclude that combining insights from both simple and complex SDM building approaches best advances our knowledge of current and future species ranges.

  7. A polynomial-chaos-expansion-based building block approach for stochastic analysis of photonic circuits

    NASA Astrophysics Data System (ADS)

    Waqas, Abi; Melati, Daniele; Manfredi, Paolo; Grassi, Flavia; Melloni, Andrea

    2018-02-01

    The Building Block (BB) approach has recently emerged in photonics as a suitable strategy for the analysis and design of complex circuits. Each BB can be foundry related and contains a mathematical macro-model of its functionality. As is well known, statistical variations in fabrication processes can have a strong effect on BB functionality and ultimately affect the yield. In order to predict the statistical behavior of the circuit, proper analysis of the effects of these uncertainties is crucial. This paper presents a method to build a novel class of Stochastic Process Design Kits for the analysis of photonic circuits. The proposed design kits directly store the information on the stochastic behavior of each building block in the form of a generalized-polynomial-chaos-based augmented macro-model obtained by properly exploiting stochastic collocation and Galerkin methods. Using this approach, we demonstrate that the augmented macro-models of the BBs can be calculated once, stored in a BB (foundry-dependent) library, and then used for the analysis of any desired circuit. The main advantage of this approach, shown here for the first time in photonics, is that the stochastic moments of an arbitrary photonic circuit can be evaluated by a single simulation only, without the need for repeated simulations. The accuracy and the significant speed-up with respect to classical Monte Carlo analysis are verified by means of a classical photonic circuit example with multiple uncertain variables.
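
    As a hedged, one-variable illustration of the polynomial-chaos idea (the real design kits use multivariate expansions and Galerkin projection, and the building-block response below is hypothetical), the stochastic moments of a macro-model can be obtained from a handful of collocation points:

    ```python
    # Hedged sketch: a 1-D polynomial-chaos macro-model of a building-block response
    # under one Gaussian fabrication parameter, built by stochastic collocation
    # (Gauss-Hermite quadrature). Moments follow directly from the coefficients.
    import numpy as np
    from numpy.polynomial.hermite_e import hermegauss, hermeval
    from math import factorial, sqrt, pi

    def bb_response(xi):
        """Stand-in for a building-block simulation; xi is the standard-normal parameter."""
        width = 450e-9 + 5e-9 * xi                 # waveguide width with fabrication variation
        return 2.4 - 1.0e6 * (width - 450e-9)      # hypothetical effective-index model

    order = 4
    nodes, weights = hermegauss(order + 1)          # quadrature for weight exp(-x^2/2)

    # PCE coefficients c_n = E[f(xi) He_n(xi)] / n!
    coeffs = []
    for n in range(order + 1):
        basis = hermeval(nodes, [0] * n + [1])
        cn = np.sum(weights * bb_response(nodes) * basis) / (sqrt(2 * pi) * factorial(n))
        coeffs.append(cn)

    mean = coeffs[0]
    var = sum(c**2 * factorial(n) for n, c in enumerate(coeffs[1:], start=1))
    print(f"mean = {mean:.4f}, std = {np.sqrt(var):.4f}")   # stochastic moments without Monte Carlo
    ```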

  8. Lessons learned while integrating habitat, dispersal, disturbance, and life-history traits into species habitat models under climate change

    Treesearch

    Louis R. Iverson; Anantha M. Prasad; Stephen N. Matthews; Matthew P. Peters

    2011-01-01

    We present an approach to modeling potential climate-driven changes in habitat for tree and bird species in the eastern United States. First, we took an empirical-statistical modeling approach, using randomForest, with species abundance data from national inventories combined with soil, climate, and landscape variables, to build abundance-based habitat models for 134...

  9. Patch-Based Generative Shape Model and MDL Model Selection for Statistical Analysis of Archipelagos

    NASA Astrophysics Data System (ADS)

    Ganz, Melanie; Nielsen, Mads; Brandt, Sami

    We propose a statistical generative shape model for archipelago-like structures. These kinds of structures occur, for instance, in medical images, where our intention is to model the appearance and shapes of calcifications in x-ray radiographs. The generative model is constructed by (1) learning a patch-based dictionary for possible shapes, (2) building up a time-homogeneous Markov model to model the neighbourhood correlations between the patches, and (3) automatic selection of the model complexity by the minimum description length principle. The generative shape model is proposed as a probability distribution of a binary image, where the model is intended to facilitate sequential simulation. Our results show that a relatively simple model is able to generate structures visually similar to calcifications. Furthermore, we used the shape model as a shape prior in the statistical segmentation of calcifications, where the area overlap with the ground truth shapes improved significantly compared to the case where the prior was not used.

  10. A prediction model of signal degradation in LMSS for urban areas

    NASA Technical Reports Server (NTRS)

    Matsudo, Takashi; Minamisono, Kenichi; Karasawa, Yoshio; Shiokawa, Takayasu

    1993-01-01

    A prediction model of signal degradation in a Land Mobile Satellite Service (LMSS) for urban areas is proposed. This model treats shadowing effects caused by buildings statistically and can predict a Cumulative Distribution Function (CDF) of signal diffraction losses in urban areas as a function of system parameters such as frequency and elevation angle and environmental parameters such as number of building stories and so on. In order to examine the validity of the model, we compared the percentage of locations where diffraction losses were smaller than 6 dB obtained by the CDF with satellite visibility measured by a radiometer. As a result, it was found that this proposed model is useful for estimating the feasibility of providing LMSS in urban areas.

  11. Demographic Accounting and Model-Building. Education and Development Technical Reports.

    ERIC Educational Resources Information Center

    Stone, Richard

    This report describes and develops a model for coordinating a variety of demographic and social statistics within a single framework. The framework proposed, together with its associated methods of analysis, serves both general and specific functions. The general aim of these functions is to give numerical definition to the pattern of society and…

  12. Proceedings: Conference on Computers in Chemical Education and Research, Dekalb, Illinois, 19-23 July 1971.

    ERIC Educational Resources Information Center

    1971

    Computers have effected a comprehensive transformation of chemistry. Computers have greatly enhanced the chemist's ability to do model building, simulations, data refinement and reduction, analysis of data in terms of models, on-line data logging, automated control of experiments, quantum chemistry and statistical and mechanical calculations, and…

  13. Building out a Measurement Model to Incorporate Complexities of Testing in the Language Domain

    ERIC Educational Resources Information Center

    Wilson, Mark; Moore, Stephen

    2011-01-01

    This paper provides a summary of a novel and integrated way to think about the item response models (most often used in measurement applications in social science areas such as psychology, education, and especially testing of various kinds) from the viewpoint of the statistical theory of generalized linear and nonlinear mixed models. In addition,…

  14. Statistically Modeling I-V Characteristics of CNT-FET with LASSO

    NASA Astrophysics Data System (ADS)

    Ma, Dongsheng; Ye, Zuochang; Wang, Yan

    2017-08-01

    With the advent of the internet of things (IoT), the need to study new materials and devices for various applications is increasing. Traditionally, compact models for transistors are built on the basis of physics, but physical models are expensive and need a very long time to adjust for non-ideal effects. When the vision for the application of a novel device is uncertain or the manufacturing process is not mature, deriving generalized, accurate physical models for such devices is very strenuous, whereas statistical modeling is becoming a potential alternative because of its data-oriented nature and fast implementation. In this paper, one classical statistical regression method, LASSO, is used to model the I-V characteristics of a CNT-FET, and a pseudo-PMOS inverter simulation based on the trained model is implemented in Cadence. The normalized relative mean-square prediction error of the trained model against the experimental sample data, together with the simulation results, shows that the model is acceptable for digital circuit static simulation. Such a modeling methodology can be extended to other devices.
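
    A hedged sketch of this kind of workflow, using scikit-learn's Lasso on polynomial features of the terminal voltages; the synthetic I-V data below are only a stand-in for measured CNT-FET samples:

    ```python
    # Hedged sketch: LASSO regression on polynomial features of (Vgs, Vds) to model
    # drain current. Synthetic data stand in for measured I-V samples.
    import numpy as np
    from sklearn.linear_model import Lasso
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import PolynomialFeatures, StandardScaler

    rng = np.random.default_rng(5)

    # Hypothetical measurement grid.
    vgs = rng.uniform(0.0, 1.2, size=400)
    vds = rng.uniform(0.0, 1.2, size=400)
    ids = 1e-6 * np.maximum(vgs - 0.3, 0) ** 2 * np.tanh(3 * vds) + rng.normal(0, 1e-8, 400)

    X = np.column_stack([vgs, vds])
    model = make_pipeline(
        PolynomialFeatures(degree=5, include_bias=False),
        StandardScaler(),
        Lasso(alpha=1e-4, max_iter=50_000),
    )
    model.fit(X, ids * 1e6)        # fit in microamps for better conditioning

    pred = model.predict([[0.9, 0.6]])[0]
    print(f"predicted Ids at Vgs=0.9 V, Vds=0.6 V: {pred:.3f} uA")
    ```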

  15. An Object-Relational Ifc Storage Model Based on Oracle Database

    NASA Astrophysics Data System (ADS)

    Li, Hang; Liu, Hua; Liu, Yong; Wang, Yuan

    2016-06-01

    As building models become increasingly complicated, the level of collaboration across professions attracts more attention in the architecture, engineering and construction (AEC) industry. To adapt to this change, buildingSMART developed the Industry Foundation Classes (IFC) to facilitate interoperability between software platforms. However, IFC data are currently shared in the form of text files, which has drawbacks. In this paper, considering the object-based inheritance hierarchy of IFC and the storage features of different database management systems (DBMS), we propose a novel object-relational storage model that uses an Oracle database to store IFC data. First, mapping rules are established between the data types in the IFC specification and the Oracle database. Second, the IFC database is designed according to the relationships among IFC entities. Third, the IFC file is parsed and the IFC data extracted. Lastly, the IFC data are stored in the corresponding tables of the IFC database. In the experiments, three different building models are selected to demonstrate the effectiveness of our storage model. The comparison of experimental statistics proves that IFC data are lossless during data exchange.

  16. Gender Integration on U.S. Navy Submarines: Views of the First Wave

    DTIC Science & Technology

    2015-06-01

    Previous studies have attempted to build statistical models based on surface fleet data to forecast female sustainability in the submarine fleet, yet … their integration? Such questions cannot be answered by collecting the type of quantitative data that can be analyzed using statistical methods. Complex…

  17. Accuracy Assessment of a Complex Building 3d Model Reconstructed from Images Acquired with a Low-Cost Uas

    NASA Astrophysics Data System (ADS)

    Oniga, E.; Chirilă, C.; Stătescu, F.

    2017-02-01

    Nowadays, Unmanned Aerial Systems (UASs) are widely used for image acquisition to create building 3D models, providing a high number of very-high-resolution images or video sequences in a very short time. Since low-cost UASs are preferred, the accuracy of a building 3D model created using these platforms must be evaluated. To this end, the dean's office building of the Faculty of "Hydrotechnical Engineering, Geodesy and Environmental Engineering" of Iasi, Romania, was chosen; it is a complex-shaped building whose roof is formed of two hyperbolic paraboloids. Seven points were placed on the ground around the building, three of them being used as GCPs, while the remaining four served as check points (CPs) for accuracy assessment. Additionally, the coordinates of 10 natural CPs representing the building's characteristic points were measured with a Leica TCR 405 total station. The building 3D model was created as a point cloud, automatically generated from the digital images acquired with the low-cost UASs using image matching algorithms and different software packages, namely 3DF Zephyr, Visual SfM, PhotoModeler Scanner and Drone2Map for ArcGIS. Except for the PhotoModeler Scanner software, the interior and exterior orientation parameters were determined simultaneously by solving a self-calibrating bundle adjustment. Based on the UAS point clouds automatically generated by the above-mentioned software and on GNSS data, respectively, the parameters of the east-side hyperbolic paraboloid were calculated using the least squares method and statistical blunder detection. Then, in order to assess the accuracy of the building 3D model, several comparisons were made for the facades and the roof against reference data considered to have minimum errors: a TLS mesh for the facades and a GNSS mesh for the roof. Finally, the front facade of the building was created in 3D based on its characteristic points using the PhotoModeler Scanner software, resulting in a CAD (Computer Aided Design) model. The results showed the high potential of using low-cost UASs for building 3D model creation; furthermore, if the building 3D model is created based on its characteristic points, the accuracy is significantly improved.

  18. Modeling the Risk of Radiation-Induced Acute Esophagitis for Combined Washington University and RTOG Trial 93-11 Lung Cancer Patients

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Huang, Ellen X.; Bradley, Jeffrey D.; El Naqa, Issam

    2012-04-01

    Purpose: To construct a maximally predictive model of the risk of severe acute esophagitis (AE) for patients who receive definitive radiation therapy (RT) for non-small-cell lung cancer. Methods and Materials: The dataset includes Washington University and RTOG 93-11 clinical trial data (events/patients: 120/374, WUSTL = 101/237, RTOG9311 = 19/137). Statistical model building was performed based on dosimetric and clinical parameters (patient age, sex, weight loss, pretreatment chemotherapy, concurrent chemotherapy, fraction size). A wide range of dose-volume parameters were extracted from de-archived treatment plans, including Dx, Vx, MOHx (mean of hottest x% volume), MOCx (mean of coldest x% volume), and gEUD (generalized equivalent uniform dose) values. Results: The most significant single parameters for predicting acute esophagitis (RTOG Grade 2 or greater) were MOH85, mean esophagus dose (MED), and V30. A superior-inferior weighted dose-center position was derived but not found to be significant. Fraction size was found to be significant on univariate logistic analysis (Spearman R = 0.421, p < 0.00001) but not multivariate logistic modeling. Cross-validation model building was used to determine that an optimal model size needed only two parameters (MOH85 and concurrent chemotherapy, robustly selected on bootstrap model-rebuilding). Mean esophagus dose (MED) is preferred instead of MOH85, as it gives nearly the same statistical performance and is easier to compute. AE risk is given as a logistic function of (0.0688 × MED + 1.50 × ConChemo − 3.13), where MED is in Gy and ConChemo is either 1 (yes) if concurrent chemotherapy was given, or 0 (no). This model correlates to the observed risk of AE with a Spearman coefficient of 0.629 (p < 0.000001). Conclusions: Multivariate statistical model building with cross-validation suggests that a two-variable logistic model based on mean dose and the use of concurrent chemotherapy robustly predicts acute esophagitis risk in the combined WUSTL and RTOG 93-11 trial datasets.
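
    Since the abstract reports the fitted coefficients, the two-variable model can be evaluated directly; the sketch below simply implements the published logistic function (it is not a re-derivation of the model):

    ```python
    # Sketch evaluating the reported two-variable logistic model:
    # risk = logistic(0.0688 * MED + 1.50 * ConChemo - 3.13),
    # where MED is the mean esophagus dose in Gy and ConChemo is 1 if concurrent
    # chemotherapy was given, else 0. Coefficients are taken from the abstract.
    import math

    def acute_esophagitis_risk(mean_esophagus_dose_gy: float, concurrent_chemo: bool) -> float:
        z = 0.0688 * mean_esophagus_dose_gy + 1.50 * (1 if concurrent_chemo else 0) - 3.13
        return 1.0 / (1.0 + math.exp(-z))

    for med, chemo in [(20, False), (20, True), (35, True)]:
        print(f"MED = {med} Gy, concurrent chemo = {chemo}: risk = {acute_esophagitis_risk(med, chemo):.2f}")
    ```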

  19. Data-Driven Benchmarking of Building Energy Efficiency Utilizing Statistical Frontier Models

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kavousian, A; Rajagopal, R

    2014-01-01

    Frontier methods quantify the energy efficiency of buildings by forming an efficient frontier (best-practice technology) and by comparing all buildings against that frontier. Because energy consumption fluctuates over time, the efficiency scores are stochastic random variables. Existing applications of frontier methods in energy efficiency either treat efficiency scores as deterministic values or estimate their uncertainty by resampling from one set of measurements. Availability of smart meter data (repeated measurements of energy consumption of buildings) enables using actual data to estimate the uncertainty in efficiency scores. Additionally, existing applications assume a linear form for an efficient frontier; i.e., they assume that the best-practice technology scales up and down proportionally with building characteristics. However, previous research shows that buildings are nonlinear systems. This paper proposes a statistical method called stochastic energy efficiency frontier (SEEF) to estimate a bias-corrected efficiency score and its confidence intervals from measured data. The paper proposes an algorithm to specify the functional form of the frontier, identify the probability distribution of the efficiency score of each building using measured data, and rank buildings based on their energy efficiency. To illustrate the power of SEEF, this paper presents the results from applying SEEF on a smart meter data set of 307 residential buildings in the United States. SEEF efficiency scores are used to rank individual buildings based on energy efficiency, to compare subpopulations of buildings, and to identify irregular behavior of buildings across different time-of-use periods. SEEF is an improvement to the energy-intensity method (comparing kWh/sq.ft.): whereas SEEF identifies efficient buildings across the entire spectrum of building sizes, the energy-intensity method showed bias toward smaller buildings. The results of this research are expected to assist researchers and practitioners compare and rank (i.e., benchmark) buildings more robustly and over a wider range of building types and sizes. Eventually, doing so is expected to result in improved resource allocation in energy-efficiency programs.

  20. Future projections of insured losses in the German private building sector following the A1B climatic change scenario

    NASA Astrophysics Data System (ADS)

    Held, H.; Gerstengarbe, F.-W.; Hattermann, F.; Pinto, J. G.; Ulbrich, U.; Böhm, U.; Born, K.; Büchner, M.; Donat, M. G.; Kücken, M.; Leckebusch, G. C.; Nissen, K.; Nocke, T.; Österle, H.; Pardowitz, T.; Werner, P. C.; Burghoff, O.; Broecker, U.; Kubik, A.

    2012-04-01

    We present an overview of an impact project with complementary approaches dealing with the consequences of climate change for the natural hazard branch of the insurance industry in Germany. The project was conducted by four academic institutions together with the German Insurance Association (GDV) and finalized in autumn 2011. A causal chain is modeled that goes from global warming projections over regional meteorological impacts to regional economic losses for private buildings, thereby fully covering the area of Germany. This presentation focuses on wind storm related losses, although the method developed has also been applied in part to hail and flood impact losses. For the first time, the GDV supplied its collected set of insurance cases, dating back decades, for such an impact study. These data were used to calibrate and validate event-based damage functions, which in turn were driven by three different types of regional climate models to generate storm loss projections. The regional models were driven by a triplet of ECHAM5 experiments following the A1B scenario, which were found to be representative in the recent ENSEMBLES intercomparison study. In our multi-modeling approach we used two types of regional climate models that are conceptually as different as possible: a dynamical model (CCLM) and a statistical model based on the idea of biased bootstrapping (STARS). As a third option we pursued a hybrid approach (statistical-dynamical downscaling). For the assessment of climate change impacts, the buildings' infrastructure and its economic value are kept at current values. For all three approaches, a significant increase in average storm losses and extreme-event return levels in the German private building sector is found for future decades assuming an A1B scenario. However, the three projections differ somewhat in terms of magnitude and regional differentiation. We have developed a formalism that allows us to express the combined effect of multi-source uncertainty on return levels within the framework of a generalized Pareto distribution.

  1. The NBS Energy Model Assessment project: Summary and overview

    NASA Astrophysics Data System (ADS)

    Gass, S. I.; Hoffman, K. L.; Jackson, R. H. F.; Joel, L. S.; Saunders, P. B.

    1980-09-01

    The activities and technical reports for the project are summarized. The reports cover: assessment of the documentation of Midterm Oil and Gas Supply Modeling System; analysis of the model methodology characteristics of the input and other supporting data; statistical procedures undergirding construction of the model and sensitivity of the outputs to variations in input, as well as guidelines and recommendations for the role of these in model building and developing procedures for their evaluation.

  2. Exact computation of the maximum-entropy potential of spiking neural-network models.

    PubMed

    Cofré, R; Cessac, B

    2014-05-01

    Understanding how stimuli and synaptic connectivity influence the statistics of spike patterns in neural networks is a central question in computational neuroscience. The maximum-entropy approach has been successfully used to characterize the statistical response of simultaneously recorded spiking neurons responding to stimuli. However, in spite of good performance in terms of prediction, the fitting parameters do not explain the underlying mechanistic causes of the observed correlations. On the other hand, mathematical models of spiking neurons (neuromimetic models) provide a probabilistic mapping between the stimulus, network architecture, and spike patterns in terms of conditional probabilities. In this paper we build an exact analytical mapping between neuromimetic and maximum-entropy models.

  3. Climate change presents increased potential for very large fires in the contiguous United States

    Treesearch

    R. Barbero; J. T. Abatzoglou; Sim Larkin; C. A. Kolden; B. Stocks

    2015-01-01

    Very large fires (VLFs) have important implications for communities, ecosystems, air quality and fire suppression expenditures. VLFs over the contiguous US have been strongly linked with meteorological and climatological variability. Building on prior modelling of VLFs (>5000 ha), an ensemble of 17 global climate models were statistically downscaled over the US...

  4. Tuition at PhD-Granting Institutions: A Supply and Demand Model.

    ERIC Educational Resources Information Center

    Koshal, Rajindar K.; And Others

    1994-01-01

    Builds and estimates a model that explains educational supply and demand behavior at PhD-granting institutions in the United States. The statistical analysis based on 1988-89 data suggests that student quantity, educational costs, average SAT score, class size, percentage of faculty with a PhD, graduation rate, ranking, and existence of a medical…

  5. From Theory to Air Force Practice: Applications and Non-Binary Extensions of Probabilistic Model-Building Genetic Algorithms

    DTIC Science & Technology

    2006-05-31

    dynamics (MD) and kinetic Monte Carlo (KMC) procedures. In 2D surface modeling our calculations project speedups of 9 orders of magnitude at 300 degrees… programming is used to perform customized statistical mechanics by bridging the different time scales of MD and KMC quickly and well. Speedups in…

  6. Time series modeling and forecasting using memetic algorithms for regime-switching models.

    PubMed

    Bergmeir, Christoph; Triguero, Isaac; Molina, Daniel; Aznarte, José Luis; Benitez, José Manuel

    2012-11-01

    In this brief, we present a novel model fitting procedure for the neuro-coefficient smooth transition autoregressive model (NCSTAR), as presented by Medeiros and Veiga. The model is endowed with a statistically founded iterative building procedure and can be interpreted in terms of fuzzy rule-based systems. The interpretability of the generated models and a mathematically sound building procedure are two very important properties of forecasting models. The model fitting procedure employed by the original NCSTAR is a combination of initial parameter estimation by a grid search procedure with a traditional local search algorithm. We propose a different fitting procedure, using a memetic algorithm, in order to obtain more accurate models. An empirical evaluation of the method is performed, applying it to various real-world time series originating from three forecasting competitions. The results indicate that we can significantly enhance the accuracy of the models, making them competitive with models commonly used in the field.

  7. Snug as a Bug: Goodness of Fit and Quality of Models.

    PubMed

    Jupiter, Daniel C

    In elucidating risk factors, or attempting to make predictions about the behavior of subjects in our biomedical studies, we often build statistical models. These models are meant to capture some aspect of reality, or some real-world process underlying the phenomena we are examining. However, no model is perfect, and it is thus important to have tools to assess how accurate models are. In this commentary, we delve into the various roles that our models can play. Then we introduce the notion of the goodness of fit of models and lay the groundwork for further study of diagnostic tests for assessing both the fidelity of our models and the statistical assumptions underlying them. Copyright © 2017 American College of Foot and Ankle Surgeons. Published by Elsevier Inc. All rights reserved.

  8. Development of a statistical model for cervical cancer cell death with irreversible electroporation in vitro.

    PubMed

    Yang, Yongji; Moser, Michael A J; Zhang, Edwin; Zhang, Wenjun; Zhang, Bing

    2018-01-01

    The aim of this study was to develop a statistical model for cell death by irreversible electroporation (IRE) and to show that the statistical model is more accurate than the electric field threshold model in the literature, using cervical cancer cells in vitro. The HeLa cell line was cultured and treated with different IRE protocols in order to obtain data for modeling the statistical relationship between cell death and the pulse-setting parameters. In total, 340 in vitro experiments were performed with a commercial IRE pulse system, including a pulse generator and an electric cuvette. The trypan blue staining technique was used to evaluate cell death after 4 hours of incubation following IRE treatment. The Peleg-Fermi model was used in this study to build the statistical relationship using the cell viability data obtained from the in vitro experiments. A finite element model of IRE for the electric field distribution was also built. Comparison of ablation zones between the statistical model and the electric field threshold model (drawn from the finite element model) was used to show the accuracy of the proposed statistical model in describing the ablation zone and its applicability across different pulse-setting parameters. The statistical models describing the relationships between HeLa cell death and pulse length and the number of pulses, respectively, were built. The values of the curve-fitting parameters were obtained using the Peleg-Fermi model for the treatment of cervical cancer with IRE. The difference in the ablation zone between the statistical model and the electric field threshold model was also illustrated to show the accuracy of the proposed statistical model in representing the ablation zone in IRE. This study concluded that: (1) the proposed statistical model accurately described the ablation zone of IRE with cervical cancer cells and was more accurate than the electric field model; (2) the proposed statistical model was able to estimate the value of the electric field threshold for the computer simulation of IRE in the treatment of cervical cancer; and (3) the proposed statistical model was able to express the change in ablation zone with the change in pulse-setting parameters.
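
    For illustration, a commonly cited form of the Peleg-Fermi survival model can be sketched as follows; the functional form is assumed from the wider IRE literature and the parameter values are invented, not the fitted values reported in this study:

    ```python
    # Hedged sketch of a commonly cited Peleg-Fermi survival form used in IRE modeling:
    # S(E, n) = 1 / (1 + exp[(E - Ec(n)) / A(n)]), with Ec and A decaying exponentially
    # with the number of pulses n. Parameter values are illustrative only.
    import numpy as np

    def peleg_fermi_survival(E, n, Ec0=1500.0, k1=0.03, A0=300.0, k2=0.06):
        """Surviving cell fraction at field strength E (V/cm) after n pulses."""
        Ec = Ec0 * np.exp(-k1 * n)     # critical field decreases with more pulses
        A = A0 * np.exp(-k2 * n)       # transition sharpens with more pulses
        z = np.clip((E - Ec) / A, -50.0, 50.0)   # clip to avoid overflow in exp
        return 1.0 / (1.0 + np.exp(z))

    E = np.array([500.0, 1000.0, 1500.0, 2000.0])
    for n in (10, 50, 90):
        print(n, np.round(peleg_fermi_survival(E, n), 3))
    ```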

  9. Statistical molecular design of balanced compound libraries for QSAR modeling.

    PubMed

    Linusson, A; Elofsson, M; Andersson, I E; Dahlgren, M K

    2010-01-01

    A fundamental step in preclinical drug development is the computation of quantitative structure-activity relationship (QSAR) models, i.e. models that link chemical features of compounds with activities towards a target macromolecule associated with the initiation or progression of a disease. QSAR models are computed by combining information on the physicochemical and structural features of a library of congeneric compounds, typically assembled from two or more building blocks, and biological data from one or more in vitro assays. Since the models provide information on features affecting the compounds' biological activity they can be used as guides for further optimization. However, in order for a QSAR model to be relevant to the targeted disease, and drug development in general, the compound library used must contain molecules with balanced variation of the features spanning the chemical space believed to be important for interaction with the biological target. In addition, the assays used must be robust and deliver high quality data that are directly related to the function of the biological target and the associated disease state. In this review, we discuss and exemplify the concept of statistical molecular design (SMD) in the selection of building blocks and final synthetic targets (i.e. compounds to synthesize) to generate information-rich, balanced libraries for biological testing and computation of QSAR models.

  10. Using regression equations built from summary data in the psychological assessment of the individual case: extension to multiple regression.

    PubMed

    Crawford, John R; Garthwaite, Paul H; Denham, Annie K; Chelune, Gordon J

    2012-12-01

    Regression equations have many useful roles in psychological assessment. Moreover, there is a large reservoir of published data that could be used to build regression equations; these equations could then be employed to test a wide variety of hypotheses concerning the functioning of individual cases. This resource is currently underused because (a) not all psychologists are aware that regression equations can be built not only from raw data but also using only basic summary data for a sample, and (b) the computations involved are tedious and prone to error. In an attempt to overcome these barriers, Crawford and Garthwaite (2007) provided methods to build and apply simple linear regression models using summary statistics as data. In the present study, we extend this work to set out the steps required to build multiple regression models from sample summary statistics and the further steps required to compute the associated statistics for drawing inferences concerning an individual case. We also develop, describe, and make available a computer program that implements these methods. Although there are caveats associated with the use of the methods, these need to be balanced against pragmatic considerations and against the alternative of either entirely ignoring a pertinent data set or using it informally to provide a clinical "guesstimate." Upgraded versions of earlier programs for regression in the single case are also provided; these add the point and interval estimates of effect size developed in the present article.
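
    For the single-predictor case, the idea can be illustrated with the standard textbook formulas (this is a sketch, not the authors' program; the normative values below are hypothetical):

    ```python
    # Sketch: building a simple linear regression equation from sample summary statistics
    # (means, SDs, correlation, n) and computing a predicted score plus the residual SD.
    # These are standard formulas, not the authors' software.
    import math

    def regression_from_summary(mean_x, sd_x, mean_y, sd_y, r, n):
        slope = r * sd_y / sd_x
        intercept = mean_y - slope * mean_x
        # residual standard deviation, using n - 2 degrees of freedom
        see = sd_y * math.sqrt((1 - r**2) * (n - 1) / (n - 2))
        return slope, intercept, see

    # Hypothetical normative sample: predict a memory score from IQ.
    slope, intercept, see = regression_from_summary(mean_x=100, sd_x=15, mean_y=50, sd_y=10, r=0.6, n=200)
    predicted = intercept + slope * 85          # predicted score for an individual with IQ 85
    print(f"y_hat = {predicted:.1f}, residual SD = {see:.1f}")
    ```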

  11. Genomic similarity and kernel methods I: advancements by building on mathematical and statistical foundations.

    PubMed

    Schaid, Daniel J

    2010-01-01

    Measures of genomic similarity are the basis of many statistical analytic methods. We review the mathematical and statistical basis of similarity methods, particularly based on kernel methods. A kernel function converts information for a pair of subjects to a quantitative value representing either similarity (larger values meaning more similar) or distance (smaller values meaning more similar), with the requirement that it must create a positive semidefinite matrix when applied to all pairs of subjects. This review emphasizes the wide range of statistical methods and software that can be used when similarity is based on kernel methods, such as nonparametric regression, linear mixed models and generalized linear mixed models, hierarchical models, score statistics, and support vector machines. The mathematical rigor for these methods is summarized, as is the mathematical framework for making kernels. This review provides a framework to move from intuitive and heuristic approaches to define genomic similarities to more rigorous methods that can take advantage of powerful statistical modeling and existing software. A companion paper reviews novel approaches to creating kernels that might be useful for genomic analyses, providing insights with examples [1]. Copyright © 2010 S. Karger AG, Basel.
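
    A minimal sketch of the kernel idea, building a linear genomic-similarity matrix from hypothetical genotype data and verifying the positive-semidefiniteness requirement:

    ```python
    # Hedged sketch: building a genomic similarity (kernel) matrix and checking that it is
    # positive semidefinite. Uses a simple linear kernel on centered genotype counts;
    # many other kernels (IBS, weighted, Gaussian) are possible.
    import numpy as np

    rng = np.random.default_rng(6)

    # Hypothetical genotypes: 50 subjects x 200 SNPs coded 0/1/2.
    G = rng.integers(0, 3, size=(50, 200)).astype(float)
    Z = G - G.mean(axis=0)                 # center each SNP

    K = Z @ Z.T / Z.shape[1]               # linear kernel: larger value = more similar pair

    eigvals = np.linalg.eigvalsh(K)
    print("min eigenvalue:", eigvals.min())           # >= 0 (up to rounding) => valid kernel
    print("similarity of subjects 0 and 1:", K[0, 1])
    ```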

  12. Courthouse Prototype Building

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Malhotra, Mini; New, Joshua Ryan; Im, Piljae

    As part of DOE's support of ANSI/ASHRAE/IES Standard 90.1 and IECC, researchers at Pacific Northwest National Laboratory (PNNL) apply a suite of prototype buildings covering 80% of the commercial building floor area in the U.S. for new construction. Efforts have started on expanding the prototype building suite to cover 90% of the commercial building floor area in the U.S. by developing prototype models for additional building types, including place of worship, public order and safety, and public assembly. Courthouse is a sub-category under the "Public Order and Safety" building type category; other sub-categories include police station, fire station, and jail, reformatory or penitentiary. ORNL used building design guides, databases, and documented courthouse projects, supplemented by personal communication with courthouse facility planning and design experts, to systematically conduct research on courthouse building and system characteristics. This report documents the research conducted for the courthouse building type and proposes building and system characteristics for developing a prototype building energy model to be included in the Commercial Building Prototype Model suite. According to the 2012 CBECS, courthouses occupy a total of 436 million sqft of floor space, or 0.5% of the total floor space in all commercial buildings in the US, next to the fast food (0.35%), grocery store or food market (0.88%), and restaurant or cafeteria (1.2%) building types currently included in the Commercial Prototype Building Model suite. Considering aggregated averages, courthouses fall among the larger building types, with a mean floor area of 69,400 sqft, and have an average fuel consumption intensity of 94.7 kBtu/sqft, compared to 77.8 kBtu/sqft for offices and 80 kBtu/sqft for all commercial buildings. Courthouses range in size from 1,000 sqft to over a million sqft of gross building area, and from 1 courtroom to over 100 courtrooms. Small courthouses represent a majority of courthouse buildings; however, collectively they comprise a small fraction of total courthouse floor area in the US. Spaces and operation of courthouses also vary depending on the court type (federal court vs. state court; district, appellate, or supreme court) and jurisdiction (general jurisdiction, general jurisdiction trial, or special courts). Based on the statistics on courthouses, a general jurisdiction trial court is considered for the prototype model. The model is assumed to be a 4-courtroom, small, 72,000 sqft three-story building including a ground level/basement.

  13. The Building Game: From Enumerative Combinatorics to Conformational Diffusion

    NASA Astrophysics Data System (ADS)

    Johnson-Chyzhykov, Daniel; Menon, Govind

    2016-08-01

    We study a discrete attachment model for the self-assembly of polyhedra called the building game. We investigate two distinct aspects of the model: (i) enumerative combinatorics of the intermediate states and (ii) a notion of Brownian motion for the polyhedral linkage defined by each intermediate that we term conformational diffusion. The combinatorial configuration space of the model is computed for the Platonic, Archimedean, and Catalan solids of up to 30 faces, and several novel enumerative results are generated. These represent the most exhaustive computations of this nature to date. We further extend the building game to include geometric information. The combinatorial structure of each intermediate yields a system of constraints specifying a polyhedral linkage and its moduli space. We use a random walk to simulate a reflected Brownian motion in each moduli space. Empirical statistics of the random walk may be used to define the rates of transition for a Markov process modeling the process of self-assembly.

  14. What's Missing in Teaching Probability and Statistics: Building Cognitive Schema for Understanding Random Phenomena

    ERIC Educational Resources Information Center

    Kuzmak, Sylvia

    2016-01-01

    Teaching probability and statistics is more than teaching the mathematics itself. Historically, the mathematics of probability and statistics was first developed through analyzing games of chance such as the rolling of dice. This article makes the case that the understanding of probability and statistics is dependent upon building a…

  15. A model to predict radon exhalation from walls to indoor air based on the exhalation from building material samples.

    PubMed

    Sahoo, B K; Sapra, B K; Gaware, J J; Kanse, S D; Mayya, Y S

    2011-06-01

    In recognition of the fact that building materials are an important source of indoor radon, second only to soil, surface radon exhalation fluxes have been extensively measured from samples of these materials. Based on this flux data, several researchers have attempted to predict the inhalation dose attributable to radon emitted from walls and ceilings made of these materials. However, an important aspect not considered in this methodology is the enhancement of the radon flux when the wall or ceiling is constructed from the same building material. This enhancement occurs mainly because the radon diffusion process changes between the sample and the wall configuration. To predict the true radon flux from a wall based on the flux data of building material samples, we propose a semi-empirical model involving the radon diffusion length, the physical dimensions of the samples, and the wall thickness as input parameters. This model was established by statistically fitting, to a simple mathematical function, the ratio of the solutions of the radon diffusion equation for a three-dimensional cuboid-shaped building material sample (such as a brick or concrete block) and for a one-dimensional wall system. The model predictions have been validated against measurements made at a new construction site. This model provides an alternative tool (a substitute for the conventional 1-D model) to estimate radon flux from a wall without relying on the ²²⁶Ra content, radon emanation factor and bulk density of the samples. Moreover, it may be very useful in the context of developing building codes for radon regulation in new buildings. Copyright © 2011 Elsevier B.V. All rights reserved.

  16. Statistics of acoustic emissions and stress drops during granular shearing using a stick-slip fiber bundle model

    NASA Astrophysics Data System (ADS)

    Cohen, D.; Michlmayr, G.; Or, D.

    2012-04-01

    Shearing of dense granular materials appears in many engineering and Earth science applications. Under a constant strain rate, the shearing stress at steady state oscillates, with slow rises followed by rapid drops that are linked to the build-up and failure of force chains. Experiments indicate that these drops display exponential statistics. Measurements of acoustic emissions during shearing indicate that the energy liberated by failure of these force chains has power-law statistics. Representing force chains as fibers, we use a stick-slip fiber bundle model to obtain analytical solutions for the statistical distributions of stress drops and failure energy. In the model, fibers stretch, fail, and regain strength during deformation. Fibers have Weibull-distributed threshold strengths with either quenched or annealed disorder. The shapes of the distributions of drops and energy obtained from the model are similar to those measured during shearing experiments. This simple model may be useful to identify failure events linked to force chain failures. Future generalizations of the model that include different types of fiber failure may also allow identification of different types of granular failures that have distinct statistical acoustic emission signatures.
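
    A minimal sketch of a stick-slip bundle of this kind, assuming equal load sharing, Weibull thresholds and annealed disorder (the actual model and parameters in the paper will differ):

    ```python
    # Hedged sketch: a minimal stick-slip fiber bundle under constant strain rate with
    # Weibull-distributed thresholds (annealed disorder: failed fibers immediately regain
    # strength with a new random threshold). It records the stress-drop sizes whose
    # statistics the paper analyzes.
    import numpy as np

    rng = np.random.default_rng(7)

    n_fibers = 10_000
    shape, scale = 2.0, 1.0                        # Weibull threshold parameters
    thresholds = scale * rng.weibull(shape, n_fibers)
    stretch = np.zeros(n_fibers)                   # elastic stretch carried by each fiber

    d_strain = 1e-3
    drops = []
    for _ in range(5_000):
        stretch += d_strain                        # constant strain-rate loading
        broken = stretch > thresholds
        if broken.any():
            drops.append(stretch[broken].sum())    # stress released by the failing fibers
            stretch[broken] = 0.0                  # slipped fibers unload...
            thresholds[broken] = scale * rng.weibull(shape, broken.sum())  # ...and regain strength

    drops = np.array(drops)
    print(f"{drops.size} drops, mean size {drops.mean():.3f}, max size {drops.max():.3f}")
    ```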

  17. Statistical label fusion with hierarchical performance models

    PubMed Central

    Asman, Andrew J.; Dagley, Alexander S.; Landman, Bennett A.

    2014-01-01

    Label fusion is a critical step in many image segmentation frameworks (e.g., multi-atlas segmentation) as it provides a mechanism for generalizing a collection of labeled examples into a single estimate of the underlying segmentation. In the multi-label case, typical label fusion algorithms treat all labels equally – fully neglecting the known, yet complex, anatomical relationships exhibited in the data. To address this problem, we propose a generalized statistical fusion framework using hierarchical models of rater performance. Building on the seminal work in statistical fusion, we reformulate the traditional rater performance model from a multi-tiered hierarchical perspective. This new approach provides a natural framework for leveraging known anatomical relationships and accurately modeling the types of errors that raters (or atlases) make within a hierarchically consistent formulation. Herein, we describe several contributions. First, we derive a theoretical advancement to the statistical fusion framework that enables the simultaneous estimation of multiple (hierarchical) performance models within the statistical fusion context. Second, we demonstrate that the proposed hierarchical formulation is highly amenable to the state-of-the-art advancements that have been made to the statistical fusion framework. Lastly, in an empirical whole-brain segmentation task we demonstrate substantial qualitative and significant quantitative improvement in overall segmentation accuracy. PMID:24817809

  18. Initial Systematic Investigations of the Weakly Coupled Free Fermionic Heterotic String Landscape Statistics

    NASA Astrophysics Data System (ADS)

    Renner, Timothy

    2011-12-01

    A C++ framework was constructed with the explicit purpose of systematically generating string models using the Weakly Coupled Free Fermionic Heterotic String (WCFFHS) method. The software, optimized for speed, generality, and ease of use, has been used to conduct preliminary systematic investigations of WCFFHS vacua. Documentation for this framework is provided in the Appendix. After an introduction to theoretical and computational aspects of WCFFHS model building, a study of ten-dimensional WCFFHS models is presented. Degeneracies among equivalent expressions of each of the known models are investigated and classified. A study of more phenomenologically realistic four-dimensional models based on the well-known "NAHE" set is then presented, with statistics being reported on gauge content, matter representations, and space-time supersymmetries. The final study is a parallel to the NAHE study in which a variation of the NAHE set is systematically extended and examined statistically. Special attention is paid to models with "mirroring": identical observable and hidden sector gauge groups and matter representations.

  19. Building Coherent Validation Arguments for the Measurement of Latent Constructs with Unified Statistical Frameworks

    ERIC Educational Resources Information Center

    Rupp, Andre A.

    2012-01-01

    In the focus article of this issue, von Davier, Naemi, and Roberts essentially coupled: (1) a short methodological review of structural similarities of latent variable models with discrete and continuous latent variables; and (2) 2 short empirical case studies that show how these models can be applied to real, rather than simulated, large-scale…

  20. Landowner interest in multifunctional agroforestry riparian buffers.

    Treesearch

    Katie Trozzo; John Munsell; James Chamberlain

    2014-01-01

    Adoption of temperate agroforestry practices generally remains limited despite considerable advances in basic science. This study builds on temperate agroforestry adoption research by empirically testing a statistical model of interest in native fruit and nut tree riparian buffers using technology and agroforestry adoption theory. Data...

  1. Statistical modeling of 4D respiratory lung motion using diffeomorphic image registration.

    PubMed

    Ehrhardt, Jan; Werner, René; Schmidt-Richberg, Alexander; Handels, Heinz

    2011-02-01

    Modeling of respiratory motion has become increasingly important in various applications of medical imaging (e.g., radiation therapy of lung cancer). Current modeling approaches are usually confined to intra-patient registration of 3D image data representing the individual patient's anatomy at different breathing phases. We propose an approach to generate a mean motion model of the lung based on thoracic 4D computed tomography (CT) data of different patients to extend the motion modeling capabilities. Our modeling process consists of three steps: an intra-subject registration to generate subject-specific motion models, the generation of an average shape and intensity atlas of the lung as anatomical reference frame, and the registration of the subject-specific motion models to the atlas in order to build a statistical 4D mean motion model (4D-MMM). Furthermore, we present methods to adapt the 4D mean motion model to a patient-specific lung geometry. In all steps, a symmetric diffeomorphic nonlinear intensity-based registration method was employed. The Log-Euclidean framework was used to compute statistics on the diffeomorphic transformations. The presented methods are then used to build a mean motion model of respiratory lung motion using thoracic 4D CT data sets of 17 patients. We evaluate the model by applying it for estimating respiratory motion of ten lung cancer patients. The prediction is evaluated with respect to landmark and tumor motion, and the quantitative analysis results in a mean target registration error (TRE) of 3.3 ±1.6 mm if lung dynamics are not impaired by large lung tumors or other lung disorders (e.g., emphysema). With regard to lung tumor motion, we show that prediction accuracy is independent of tumor size and tumor motion amplitude in the considered data set. However, tumors adhering to non-lung structures degrade local lung dynamics significantly and the model-based prediction accuracy is lower in these cases. The statistical respiratory motion model is capable of providing valuable prior knowledge in many fields of applications. We present two examples of possible applications in radiation therapy and image guided diagnosis.

  2. Building Analytic Capacity and Statistical Literacy Among Title IV-E MSW Students

    PubMed Central

    LERY, BRIDGETTE; PUTNAM-HORNSTEIN, EMILY; WIEGMANN, WENDY; KING, BRYN

    2016-01-01

    Building and sustaining effective child welfare practice requires an infrastructure of social work professionals trained to use data to identify target populations, connect interventions to outcomes, adapt practice to varying contexts and dynamic populations, and assess their own effectiveness. Increasingly, public agencies are implementing models of self-assessment in which administrative data are used to guide and continuously evaluate the implementation of programs and policies. The research curriculum described in the article was developed to provide Title IV-E and other students interested in public child welfare systems with hands-on opportunities to become experienced and “statistically literate” users of aggregated public child welfare data from California’s administrative child welfare system, attending to the often missing link between data/research and practice improvement. PMID:27429600

  3. Regional analyses of labor markets and demography: a model based Norwegian example.

    PubMed

    Stambol, L S; Stolen, N M; Avitsland, T

    1998-01-01

    The authors discuss the regional REGARD model, developed by Statistics Norway to analyze the regional implications of macroeconomic development of employment, labor force, and unemployment. "In building the model, empirical analyses of regional producer behavior in manufacturing industries have been performed, and the relation between labor market development and regional migration has been investigated. Apart from providing a short description of the REGARD model, this article demonstrates the functioning of the model, and presents some results of an application." excerpt

  4. An Improved Snake Model for Refinement of Lidar-Derived Building Roof Contours Using Aerial Images

    NASA Astrophysics Data System (ADS)

    Chen, Qi; Wang, Shugen; Liu, Xiuguo

    2016-06-01

    Building roof contours are considered very important geometric data and have been widely applied in many fields, including but not limited to urban planning, land investigation, change detection and military reconnaissance. Currently, the demand for building contours at a finer scale (especially in urban areas) has been raised in a growing number of studies such as urban environment quality assessment, urban sprawl monitoring and urban air pollution modelling. LiDAR is known as an effective means of acquiring 3D roof points with high elevation accuracy. However, the precision of the building contour obtained from LiDAR data is restricted by its relatively low scanning resolution. With the use of the texture information from high-resolution imagery, the precision can be improved. In this study, an improved snake model is proposed to refine the initial building contours extracted from LiDAR. First, an improved snake model is constructed with the constraints of the deviation angle, image gradient, and area. Then, the nodes of the contour are moved within a certain range to find the best optimized result using a greedy algorithm. Considering both precision and efficiency, the candidate shift positions of the contour nodes are constrained, and the searching strategy for the candidate nodes is explicitly designed. The experiments on three datasets indicate that the proposed method for building contour refinement is effective and feasible. The average quality index is improved from 91.66% to 93.34%. The statistics of the evaluation results for every single building demonstrate that 77.0% of the total number of contours are updated with a higher quality index.

  5. Classical Statistics and Statistical Learning in Imaging Neuroscience

    PubMed Central

    Bzdok, Danilo

    2017-01-01

    Brain-imaging research has predominantly generated insight by means of classical statistics, including regression-type analyses and null-hypothesis testing using t-test and ANOVA. Throughout recent years, statistical learning methods enjoy increasing popularity especially for applications in rich and complex data, including cross-validated out-of-sample prediction using pattern classification and sparsity-inducing regression. This concept paper discusses the implications of inferential justifications and algorithmic methodologies in common data analysis scenarios in neuroimaging. It is retraced how classical statistics and statistical learning originated from different historical contexts, build on different theoretical foundations, make different assumptions, and evaluate different outcome metrics to permit differently nuanced conclusions. The present considerations should help reduce current confusion between model-driven classical hypothesis testing and data-driven learning algorithms for investigating the brain with imaging techniques. PMID:29056896

  6. Modeling Age-Friendly Environment, Active Aging, and Social Connectedness in an Emerging Asian Economy.

    PubMed

    Lai, Ming-Ming; Lein, Shi-Ying; Lau, Siok-Hwa; Lai, Ming-Ling

    2016-01-01

    This paper empirically tested eight key features of the WHO guidelines for age-friendly communities by surveying 211 informal caregivers and 402 self-care adults (aged 45 to 85 and above) in Malaysia. We examined the associations of these eight features with active aging and social connectedness through exploratory and confirmatory factor analyses. A structural model with satisfactory goodness-of-fit indices (CMIN/df = 1.11, RMSEA = 0.02, NFI = 0.97, TLI = 1.00, CFI = 1.00, and GFI = 0.96) indicates that transportation and housing, community support and health services, and outdoor spaces and buildings are statistically significant in creating an age-friendly environment. We found a statistically significant positive relationship between an age-friendly environment and active aging. This relationship is mediated by social connectedness. The results indicate that built environments such as accessible public transportation and housing, affordable and accessible healthcare services, and elderly-friendly outdoor spaces and buildings have to be put into place before the social environment when building an age-friendly environment. Otherwise, structural barriers would hinder social interactions for the aged. The removal of these environmental barriers and improved public transportation services provide short-term solutions to meet the varied and growing needs of the older population.

  7. Modeling Age-Friendly Environment, Active Aging, and Social Connectedness in an Emerging Asian Economy

    PubMed Central

    Lai, Ming-Ming; Lein, Shi-Ying; Lau, Siok-Hwa; Lai, Ming-Ling

    2016-01-01

    This paper empirically tested eight key features of the WHO guidelines for age-friendly communities by surveying 211 informal caregivers and 402 self-care adults (aged 45 to 85 and above) in Malaysia. We examined the associations of these eight features with active aging and social connectedness through exploratory and confirmatory factor analyses. A structural model with satisfactory goodness-of-fit indices (CMIN/df = 1.11, RMSEA = 0.02, NFI = 0.97, TLI = 1.00, CFI = 1.00, and GFI = 0.96) indicates that transportation and housing, community support and health services, and outdoor spaces and buildings are statistically significant in creating an age-friendly environment. We found a statistically significant positive relationship between an age-friendly environment and active aging. This relationship is mediated by social connectedness. The results indicate that built environments such as accessible public transportation and housing, affordable and accessible healthcare services, and elderly-friendly outdoor spaces and buildings have to be put into place before the social environment when building an age-friendly environment. Otherwise, structural barriers would hinder social interactions for the aged. The removal of these environmental barriers and improved public transportation services provide short-term solutions to meet the varied and growing needs of the older population. PMID:27293889

  8. How to interpret the results of medical time series data analysis: Classical statistical approaches versus dynamic Bayesian network modeling.

    PubMed

    Onisko, Agnieszka; Druzdzel, Marek J; Austin, R Marshall

    2016-01-01

    Classical statistics is a well-established approach in the analysis of medical data. While the medical community seems to be familiar with the concept of a statistical analysis and its interpretation, the Bayesian approach, argued by many of its proponents to be superior to the classical frequentist approach, is still not well-recognized in the analysis of medical data. The goal of this study is to encourage data analysts to use the Bayesian approach, such as modeling with graphical probabilistic networks, as an insightful alternative to classical statistical analysis of medical data. This paper offers a comparison of two approaches to analysis of medical time series data: (1) classical statistical approaches, such as the Kaplan-Meier estimator and the Cox proportional hazards regression model, and (2) dynamic Bayesian network modeling. Our comparison is based on time series cervical cancer screening data collected at Magee-Womens Hospital, University of Pittsburgh Medical Center over 10 years. The main outcomes of our comparison are the cervical cancer risk assessments produced by the three approaches. However, our analysis also discusses several aspects of the comparison, such as modeling assumptions, model building, dealing with incomplete data, individualized risk assessment, results interpretation, and model validation. Our study shows that the Bayesian approach (1) is much more flexible in terms of modeling effort, and (2) offers an individualized risk assessment, which is more cumbersome for classical statistical approaches.
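
    The classical half of this comparison can be sketched with the lifelines package; the bundled Rossi recidivism dataset is used here purely as a stand-in, since the screening data are not public, so the columns and covariates below are illustrative rather than the study's variables.

```python
from lifelines import KaplanMeierFitter, CoxPHFitter
from lifelines.datasets import load_rossi

# Stand-in survival data: 'week' = follow-up time, 'arrest' = event indicator,
# remaining columns are covariates.
df = load_rossi()

# Kaplan-Meier estimate of the event-free (survival) curve
kmf = KaplanMeierFitter()
kmf.fit(df["week"], event_observed=df["arrest"])
print(kmf.survival_function_.tail())

# Cox proportional hazards regression: hazard ratios for each covariate
cph = CoxPHFitter()
cph.fit(df, duration_col="week", event_col="arrest")
cph.print_summary()

# Individualized risk: predicted survival curves for the first two subjects
covariates = df.drop(columns=["week", "arrest"])
print(cph.predict_survival_function(covariates.iloc[:2]).iloc[::10])
```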

  9. Defense Acquisition Research Journal. Volume 21, Number 1, Issue 68

    DTIC Science & Technology

    2014-01-01

    Harrison's game theory model of competition examines the bidding behavior of two equal competitors, but it does not address characteristics that...analysis examines a series of outcomes in both competitive and sole-source acquisition programs, using a statistical model that builds on a game theory ...modeling, within a game theory framework developed by Todd Harrison, to show that the DoD may actually incur increased costs from competition

  10. Improving Domain-specific Machine Translation by Constraining the Language Model

    DTIC Science & Technology

    2012-07-01

    performance. To make up for the lack of parallel training data, one assumption is that more monolingual target language data should be used in building the...target language model. Prior work on domain-specific MT has focused on training target language models with monolingual domain-specific data...showed that using a large dictionary extracted from medical domain documents in a statistical MT system to generalize the training data significantly

  11. Case study of odor and indoor air quality assessment in the dewatering building at the Stickney Water Reclamation Plant.

    PubMed

    Sharma, Manju; O'Connell, Susan; Garelli, Brett; Sattayatewa, Chakkrid; Moschandreas, Demetrios; Pagilla, Krishna

    2012-01-01

    Indoor air quality (IAQ) and odors were determined using sampling/monitoring, measurement, and modeling methods in a large dewatering building at a very large water reclamation plant. The ultimate goal was to determine control strategies to reduce the sensory impacts on the workforce and achieve odor reduction within the building. Study approaches included: (1) investigation of air mixing by using CO2 as an indicator, (2) measurement of airflow capacity of ventilation fans, (3) measurement of odors and odorants, (4) development of statistical and IAQ models, and (5) recommendation of control strategies. The results showed that air quality in the building complies with occupational safety and health guidelines; however, nuisance odors that can increase stress and productivity loss still persist. Excess roof fan capacity induced odor dispersion to the upper levels. Lack of a local air exhaust system of sufficient capacity and optimum design was found to be the contributor to occasionally less-than-adequate indoor air quality and odors. Overall, the air ventilation rate has less effect on the persistence of odors in the building. Odor/odorant emission rates from centrifuge drops were approximately 100 times higher than those from the open conveyors. Based on measurements and modeling, the key control strategies recommended include increasing local air exhaust system capacity and relocating exhaust hoods closer to the centrifuge drops.

  12. Electricity Markets, Smart Grids and Smart Buildings

    NASA Astrophysics Data System (ADS)

    Falcey, Jonathan M.

    A smart grid is an electricity network that accommodates two-way power flows and utilizes two-way communications and increased measurement in order to provide more information to customers and aid in the development of a more efficient electricity market. The current electrical network is outdated and has many shortcomings relating to power flows, inefficient electricity markets, generation/supply balance, a lack of information for the consumer and insufficient consumer interaction with electricity markets. Many of these challenges can be addressed with a smart grid, but there remain significant barriers to the implementation of a smart grid. This paper proposes a novel method for the development of a smart grid utilizing a bottom-up approach (starting with smart buildings/campuses) with the goal of providing the framework and infrastructure necessary for a smart grid, instead of the more traditional approach (installing many smart meters and hoping a smart grid emerges). This novel approach involves combining deterministic and statistical methods in order to accurately estimate building electricity use down to the device level. It provides model users with a cheaper alternative to energy audits and extensive sensor networks (the current methods of quantifying electrical use at this level), which increases their ability to modify energy consumption and respond to price signals. The results of this method are promising, but they are still preliminary. As a result, there is still room for improvement. On days when there were no missing or inaccurate data, this approach has an R2 of about 0.84, sometimes as high as 0.94, when compared to measured results. However, there were many days where missing data brought overall accuracy down significantly. In addition, the development and implementation of the calibration process is still underway and some functional additions must be made in order to maximize accuracy. The calibration process must be completed before a reliable accuracy can be determined. While this work shows that a combination of deterministic and statistical methods can accurately forecast building energy usage, the ability to produce accurate results is heavily dependent upon software availability, accurate data and the proper calibration of the model. Creating the software required for a smart building model is time consuming and expensive. Bad or missing data have significant negative impacts on the accuracy of the results and can be caused by a hodgepodge of equipment and communication protocols. Proper calibration of the model is essential to ensure that the device level estimations are sufficiently accurate. Any building model that is to be successful at creating a smart building must be able to overcome these challenges.

  13. Techniques in teaching statistics : linking research production and research use.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Martinez-Moyano, I .; Smith, A.; Univ. of Massachusetts at Boston)

    In the spirit of closing the 'research-practice gap,' the authors extend evidence-based principles to statistics instruction in social science graduate education. The authors employ a Delphi method to survey experienced statistics instructors to identify teaching techniques to overcome the challenges inherent in teaching statistics to students enrolled in practitioner-oriented master's degree programs. Among the teaching techniques identified as essential are using real-life examples, requiring data collection exercises, and emphasizing interpretation rather than results. Building on existing research, preliminary interviews, and the findings from the study, the authors develop a model describing antecedents to the strength of the link between research and practice.

  14. Urban weather data and building models for the inclusion of the urban heat island effect in building performance simulation.

    PubMed

    Palme, M; Inostroza, L; Villacreses, G; Lobato, A; Carrasco, C

    2017-10-01

    This data article presents files supporting the calculations for urban heat island (UHI) inclusion in building performance simulation (BPS). The methodology is used in the research article "From urban climate to energy consumption. Enhancing building performance simulation by including the urban heat island effect" (Palme et al., 2017) [1]. In this research, a Geographical Information System (GIS) study is done in order to statistically represent the most important urban scenarios of four South-American cities (Guayaquil, Lima, Antofagasta and Valparaíso). Then, a Principal Component Analysis (PCA) is done to obtain reference Urban Tissue Categories (UTC) to be used in urban weather simulation. The urban weather files are generated by using the Urban Weather Generator (UWG) software (version 4.1 beta). Finally, BPS is run with the Transient System Simulation (TRNSYS) software (version 17). In this data paper, four sets of data are presented: 1) PCA data (excel) to explain how to group different urban samples into representative UTC; 2) UWG data (text) to reproduce the urban weather generation for the UTC used in the four cities (4 UTC in Lima, Guayaquil, Antofagasta and 5 UTC in Valparaíso); 3) weather data (text) with the resulting rural and urban weather; 4) BPS models (text) data containing the TRNSYS models (four building models).

  15. Book review of Wildlife 2000: Modeling relationships of terrestrial vertebrates, edited by J. Verner, M.L. Morrison, and C.J. Ralph

    USGS Publications Warehouse

    Cooper, Robert J.

    1988-01-01

    "Wildlife 2000" is the proceedings of a conference held 7-11 October 1984, near Lake Tahoe, California, the objective of which was to present an up-to-date synthesis of models that predict the responses of wildlife to habitat change. This extremely attractive, well-produced volume has been well received by the wildlife management profession; the editors received an outstanding publication award from The Wildlife Society for this publication. The accolades are deserved. The symposium was purposely integrated in terms of research and management perspectives. Each of the six sections is summarized by both research and management points of view. A majority of the 60 papers presented deal with birds. Although many chapters require a strong quantitative background, especially in multivariate statistics, many others do not. When one compares this publication with previous habitat-modeling symposia proceedings, one realizes what a superior contribution "Wildlife 2000" is, and how incredible far wildlife-habitat modelers have come in a short time. There are very few redundant papers of "nonpapers" in this volume. The wide array of modeling procedures, statistical methods, and computer software developed and used by the authors is impressive; we have indeed learned how to build models. Whether or not we have learned how to build good models is another question.

  16. Modeling urban expansion in Yangon, Myanmar using Landsat time-series and stereo GeoEye Images

    NASA Astrophysics Data System (ADS)

    Sritarapipat, Tanakorn; Takeuchi, Wataru

    2016-06-01

    This research proposes a methodology to model urban expansion with a dynamic statistical model using Landsat and GeoEye images. Landsat time series from 1978 to 2010 have been used to extract land cover from the past to the present. Stereo GeoEye images have been employed to obtain building heights. The class translation was obtained by observing land cover change from the past to the present. The building heights can be used to detect the centers of the urban area (mainly commercial areas). It was assumed that the class translation, the distance to the multiple centers of the urban area, and the distance to the roads affect urban growth. The urban expansion model based on the dynamic statistical model was therefore defined in terms of three factors: (1) the class translation, (2) the distance to the multiple centers of the urban areas, and (3) the distance from the roads. Estimation and prediction of urban expansion using our model are formulated and presented in this research. The experimental area was set in Yangon, Myanmar, since it is the country's major economic center, with a population of more than five million, and its urban areas have increased rapidly. The experimental results indicate that our model estimates urban growth efficiently in both the estimation and prediction steps.

  17. Automated structure solution, density modification and model building.

    PubMed

    Terwilliger, Thomas C

    2002-11-01

    The approaches that form the basis of automated structure solution in SOLVE and RESOLVE are described. The use of a scoring scheme to convert decision making in macromolecular structure solution to an optimization problem has proven very useful and in many cases a single clear heavy-atom solution can be obtained and used for phasing. Statistical density modification is well suited to an automated approach to structure solution because the method is relatively insensitive to choices of numbers of cycles and solvent content. The detection of non-crystallographic symmetry (NCS) in heavy-atom sites and checking of potential NCS operations against the electron-density map has proven to be a reliable method for identification of NCS in most cases. Automated model building beginning with an FFT-based search for helices and sheets has been successful in automated model building for maps with resolutions as low as 3 Å. The entire process can be carried out in a fully automatic fashion in many cases.

  18. Statistical emulators of maize, rice, soybean and wheat yields from global gridded crop models

    DOE PAGES

    Blanc, Élodie

    2017-01-26

    This study provides statistical emulators of crop yields based on global gridded crop model simulations from the Inter-Sectoral Impact Model Intercomparison Project Fast Track project. The ensemble of simulations is used to build a panel of annual crop yields from five crop models and corresponding monthly summer weather variables for over a century at the grid cell level globally. This dataset is then used to estimate, for each crop and gridded crop model, the statistical relationship between yields, temperature, precipitation and carbon dioxide. This study considers a new functional form to better capture the non-linear response of yields to weather, especially for extreme temperature and precipitation events, and now accounts for the effect of soil type. In- and out-of-sample validations show that the statistical emulators are able to replicate spatial patterns of crop yield levels and changes over time projected by crop models reasonably well, although the accuracy of the emulators varies by model and by region. This study therefore provides a reliable and accessible alternative to global gridded crop yield models. By emulating crop yields for several models using parsimonious equations, the tools provide a computationally efficient method to account for uncertainty in climate change impact assessments.
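
    A heavily simplified emulator of this kind can be sketched as an ordinary least-squares fit with quadratic weather terms; the data below are synthetic stand-ins for one grid cell of one crop model, and the assumed response surface is invented for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

rng = np.random.default_rng(1)

# Synthetic stand-in for gridded crop-model output: summer temperature (degC),
# precipitation (mm), CO2 (ppm) and the simulated yield (t/ha) they produce.
n = 500
temp = rng.normal(24, 3, n)
prec = rng.gamma(4, 60, n)
co2 = rng.uniform(350, 700, n)
yield_sim = (6 - 0.05 * (temp - 24) ** 2 + 0.004 * prec
             - 1e-6 * (prec - 240) ** 2 + 0.003 * (co2 - 350)
             + rng.normal(0, 0.3, n))

# Emulator: linear + quadratic weather terms to capture the non-linear yield response
X = np.column_stack([temp, temp ** 2, prec, prec ** 2, co2])
emulator = LinearRegression().fit(X, yield_sim)
print("in-sample R^2:", round(r2_score(yield_sim, emulator.predict(X)), 3))
```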

  19. Statistical emulators of maize, rice, soybean and wheat yields from global gridded crop models

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Blanc, Élodie

    This study provides statistical emulators of crop yields based on global gridded crop model simulations from the Inter-Sectoral Impact Model Intercomparison Project Fast Track project. The ensemble of simulations is used to build a panel of annual crop yields from five crop models and corresponding monthly summer weather variables for over a century at the grid cell level globally. This dataset is then used to estimate, for each crop and gridded crop model, the statistical relationship between yields, temperature, precipitation and carbon dioxide. This study considers a new functional form to better capture the non-linear response of yields to weather, especially for extreme temperature and precipitation events, and now accounts for the effect of soil type. In- and out-of-sample validations show that the statistical emulators are able to replicate spatial patterns of crop yield levels and changes over time projected by crop models reasonably well, although the accuracy of the emulators varies by model and by region. This study therefore provides a reliable and accessible alternative to global gridded crop yield models. By emulating crop yields for several models using parsimonious equations, the tools provide a computationally efficient method to account for uncertainty in climate change impact assessments.

  20. Regression Commonality Analysis: A Technique for Quantitative Theory Building

    ERIC Educational Resources Information Center

    Nimon, Kim; Reio, Thomas G., Jr.

    2011-01-01

    When it comes to multiple linear regression analysis (MLR), it is common for social and behavioral science researchers to rely predominately on beta weights when evaluating how predictors contribute to a regression model. Presenting an underutilized statistical technique, this article describes how organizational researchers can use commonality…

  1. Maximum entropy models as a tool for building precise neural controls.

    PubMed

    Savin, Cristina; Tkačik, Gašper

    2017-10-01

    Neural responses are highly structured, with population activity restricted to a small subset of the astronomical range of possible activity patterns. Characterizing these statistical regularities is important for understanding circuit computation, but challenging in practice. Here we review recent approaches based on the maximum entropy principle used for quantifying collective behavior in neural activity. We highlight recent models that capture population-level statistics of neural data, yielding insights into the organization of the neural code and its biological substrate. Furthermore, the MaxEnt framework provides a general recipe for constructing surrogate ensembles that preserve aspects of the data, but are otherwise maximally unstructured. This idea can be used to generate a hierarchy of controls against which rigorous statistical tests are possible. Copyright © 2017 Elsevier Ltd. All rights reserved.
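
    The surrogate-ensemble idea can be illustrated with the simplest possible control: shuffling each neuron's spikes in time preserves single-neuron firing rates while destroying correlations, giving a first-order, "maximally unstructured" null against which pairwise structure can be tested. The raster below is a toy stand-in, not real data, and a full pairwise MaxEnt fit is deliberately omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy binary population raster: neurons x time bins
n_neurons, n_bins = 20, 5000
rates = rng.uniform(0.02, 0.2, n_neurons)
raster = (rng.random((n_neurons, n_bins)) < rates[:, None]).astype(int)

def independent_surrogate(raster, rng):
    """Shuffle each neuron's spike train in time: rates preserved, correlations destroyed."""
    surrogate = raster.copy()
    for row in surrogate:
        rng.shuffle(row)
    return surrogate

def mean_abs_corr(r):
    c = np.corrcoef(r)
    return np.nanmean(np.abs(c[~np.eye(len(c), dtype=bool)]))

null = np.array([mean_abs_corr(independent_surrogate(raster, rng)) for _ in range(100)])
print("data:", round(mean_abs_corr(raster), 4),
      " null 95th percentile:", round(np.percentile(null, 95), 4))
```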

  2. Efficient Geological Modelling of Large AEM Surveys

    NASA Astrophysics Data System (ADS)

    Bach, Torben; Martlev Pallesen, Tom; Jørgensen, Flemming; Lundh Gulbrandsen, Mats; Mejer Hansen, Thomas

    2014-05-01

    Combining geological expert knowledge with geophysical observations into a final 3D geological model is, in most cases, not a straightforward process. It typically involves many types of data and requires both an understanding of the data and the geological target. When dealing with very large areas, such as the modelling of large AEM surveys, the manual task for the geologist of correctly evaluating and properly utilising all the data available in the survey area becomes overwhelming. In the ERGO project (Efficient High-Resolution Geological Modelling) we address these issues and propose a new modelling methodology enabling fast and consistent modelling of very large areas. The vision of the project is to build a user-friendly expert system that enables the combination of very large amounts of geological and geophysical data with geological expert knowledge. This is done in an "auto-pilot" type functionality, named Smart Interpretation, designed to aid the geologist in the interpretation process. The core of the expert system is a statistical model that describes the relation between data and the geological interpretations made by a geological expert. This facilitates fast and consistent modelling of very large areas. It will enable the construction of models with high resolution, as the system will "learn" the geology of an area directly from interpretations made by a geological expert and instantly apply it to all hard data in the survey area, ensuring the utilisation of all the data available in the geological model. Another feature is that the statistical model the system creates for one area can be used in another area with similar data and geology. This feature can be useful as an aid to an untrained geologist in building a geological model, guided by the experienced geologist's way of interpretation, as quantified by the expert system in the core statistical model. In this project presentation we provide some examples of the problems we are aiming to address in the project and show some preliminary results.

  3. EEGLAB, SIFT, NFT, BCILAB, and ERICA: new tools for advanced EEG processing.

    PubMed

    Delorme, Arnaud; Mullen, Tim; Kothe, Christian; Akalin Acar, Zeynep; Bigdely-Shamlo, Nima; Vankov, Andrey; Makeig, Scott

    2011-01-01

    We describe a set of complementary EEG data collection and processing tools recently developed at the Swartz Center for Computational Neuroscience (SCCN) that connect to and extend the EEGLAB software environment, a freely available and readily extensible processing environment running under Matlab. The new tools include (1) a new and flexible EEGLAB STUDY design facility for framing and performing statistical analyses on data from multiple subjects; (2) a neuroelectromagnetic forward head modeling toolbox (NFT) for building realistic electrical head models from available data; (3) a source information flow toolbox (SIFT) for modeling ongoing or event-related effective connectivity between cortical areas; (4) a BCILAB toolbox for building online brain-computer interface (BCI) models from available data, and (5) an experimental real-time interactive control and analysis (ERICA) environment for real-time production and coordination of interactive, multimodal experiments.

  4. A new model in achieving Green Accounting at hotels in Bali

    NASA Astrophysics Data System (ADS)

    Astawa, I. P.; Ardina, C.; Yasa, I. M. S.; Parnata, I. K.

    2018-01-01

    The concept of green accounting remains debated in terms of its implementation in a company. The results of previous studies indicate that there is no standard model for its implementation to support performance. This research aims to create a green accounting model that differs from other models by using local cultural elements as the variables in building it. The research is conducted in two steps. The first step is designing the model based on theoretical studies by considering the main and supporting elements in building the concept of green accounting. The second step is conducting a model test at 60 five-star hotels, starting with data collection through a questionnaire and followed by data processing using descriptive statistics. The results indicate that the hotels' owners have implemented green accounting attributes, which supports previous studies. Another result, which is a new finding, shows that the presence of local culture, government regulation, and the awareness of hotel owners play an important role in the development of the green accounting concept. The results of the research contribute to accounting science in terms of green reporting. Hotel management should adopt local culture in building the character of the accountants hired in the accounting department.

  5. Polynomial Chaos decomposition applied to stochastic dosimetry: study of the influence of the magnetic field orientation on the pregnant woman exposure at 50 Hz.

    PubMed

    Liorni, I; Parazzini, M; Fiocchi, S; Guadagnin, V; Ravazzani, P

    2014-01-01

    Polynomial Chaos (PC) is a decomposition method used to build a meta-model that approximates the unknown response of a model. In this paper the PC method is applied to stochastic dosimetry to assess the variability of human exposure due to changes in the orientation of the B-field vector with respect to the human body. In detail, the analysis of the exposure of a pregnant woman at 7 months of gestational age is carried out to build a statistical meta-model of the induced electric field for each fetal tissue and for the fetal whole body by means of the PC expansion, as a function of the B-field orientation, considering uniform exposure at 50 Hz.
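
    A minimal non-intrusive PC sketch for a single uniform input (the orientation angle) is given below; a placeholder analytic function stands in for the dosimetry solver, and a Legendre basis is used because it matches a uniform input distribution. Everything here, including the truncation degree and sample sizes, is an assumption for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder for the deterministic dosimetry solver: induced E-field as a
# function of the B-field orientation angle theta (illustrative only).
def induced_field(theta):
    return 1.0 + 0.4 * np.cos(theta) + 0.15 * np.cos(2 * theta) ** 2

# Uncertain input theta ~ U(0, 2*pi), standardized to xi in [-1, 1]
n_train, degree = 40, 6
theta = rng.uniform(0, 2 * np.pi, n_train)
xi = theta / np.pi - 1.0
y = induced_field(theta)

# Non-intrusive PC: least-squares projection onto the Legendre basis
Phi = np.polynomial.legendre.legvander(xi, degree)
coeffs, *_ = np.linalg.lstsq(Phi, y, rcond=None)

# Evaluate the meta-model on unseen orientations and check its accuracy
theta_test = rng.uniform(0, 2 * np.pi, 1000)
y_hat = np.polynomial.legendre.legvander(theta_test / np.pi - 1.0, degree) @ coeffs
print("max abs error of the meta-model:", np.abs(y_hat - induced_field(theta_test)).max())
```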

  6. Atomistic Cohesive Zone Models for Interface Decohesion in Metals

    NASA Technical Reports Server (NTRS)

    Yamakov, Vesselin I.; Saether, Erik; Glaessgen, Edward H.

    2009-01-01

    Using a statistical mechanics approach, a cohesive-zone law in the form of a traction-displacement constitutive relationship characterizing the load transfer across the plane of a growing edge crack is extracted from atomistic simulations for use within a continuum finite element model. The methodology for the atomistic derivation of a cohesive-zone law is presented. This procedure can be implemented to build cohesive-zone finite element models for simulating fracture in nanocrystalline or ultrafine grained materials.

  7. Modeling Composite Assessment Data Using Item Response Theory

    PubMed Central

    Ueckert, Sebastian

    2018-01-01

    Composite assessments aim to combine different aspects of a disease in a single score and are utilized in a variety of therapeutic areas. The data arising from these evaluations are inherently discrete with distinct statistical properties. This tutorial presents the framework of the item response theory (IRT) for the analysis of this data type in a pharmacometric context. The article considers both conceptual (terms and assumptions) and practical questions (modeling software, data requirements, and model building). PMID:29493119
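
    As a toy illustration of the item-level structure an IRT analysis keeps (and a sum score discards), the sketch below simulates responses from a two-parameter-logistic model; all item parameters and sample sizes are invented.

```python
import numpy as np

rng = np.random.default_rng(0)

def p_2pl(theta, a, b):
    """2PL item response function: P(item endorsed | latent trait theta)."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

n_subjects, n_items = 500, 10
theta = rng.normal(0, 1, n_subjects)       # latent severity / ability per subject
a = rng.uniform(0.8, 2.0, n_items)         # item discriminations
b = rng.normal(0, 1, n_items)              # item difficulties

P = p_2pl(theta[:, None], a[None, :], b[None, :])
responses = (rng.random((n_subjects, n_items)) < P).astype(int)

# A composite total score is only a coarse summary of the latent trait
total = responses.sum(axis=1)
print("correlation of total score with latent trait:",
      round(np.corrcoef(total, theta)[0, 1], 3))
```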

  8. IsoMAP (Isoscape Modeling, Analysis, and Prediction)

    NASA Astrophysics Data System (ADS)

    Miller, C. C.; Bowen, G. J.; Zhang, T.; Zhao, L.; West, J. B.; Liu, Z.; Rapolu, N.

    2009-12-01

    IsoMAP is a TeraGrid-based web portal aimed at building the infrastructure that brings together distributed multi-scale and multi-format geospatial datasets to enable statistical analysis and modeling of environmental isotopes. A typical workflow enabled by the portal includes (1) data source exploration and selection, (2) statistical analysis and model development; (3) predictive simulation of isotope distributions using models developed in (1) and (2); (4) analysis and interpretation of simulated spatial isotope distributions (e.g., comparison with independent observations, pattern analysis). The gridded models and data products created by one user can be shared and reused among users within the portal, enabling collaboration and knowledge transfer. This infrastructure and the research it fosters can lead to fundamental changes in our knowledge of the water cycle and ecological and biogeochemical processes through analysis of network-based isotope data, but it will be important A) that those with whom the data and models are shared can be sure of the origin, quality, inputs, and processing history of these products, and B) the system is agile and intuitive enough to facilitate this sharing (rather than just ‘allow’ it). IsoMAP researchers are therefore building into the portal’s architecture several components meant to increase the amount of metadata about users’ products and to repurpose those metadata to make sharing and discovery more intuitive and robust to both expected, professional users as well as unforeseeable populations from other sectors.

  9. Bootstrap data methodology for sequential hybrid model building

    NASA Technical Reports Server (NTRS)

    Volponi, Allan J. (Inventor); Brotherton, Thomas (Inventor)

    2007-01-01

    A method for modeling engine operation comprising the steps of: 1. collecting a first plurality of sensory data, 2. partitioning a flight envelope into a plurality of sub-regions, 3. assigning the first plurality of sensory data into the plurality of sub-regions, 4. generating an empirical model of at least one of the plurality of sub-regions, 5. generating a statistical summary model for at least one of the plurality of sub-regions, 6. collecting an additional plurality of sensory data, 7. partitioning the second plurality of sensory data into the plurality of sub-regions, 8. generating a plurality of pseudo-data using the empirical model, and 9. concatenating the plurality of pseudo-data and the additional plurality of sensory data to generate an updated empirical model and an updated statistical summary model for at least one of the plurality of sub-regions.
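
    The claimed sequence of steps can be sketched as follows; the quadratic per-sub-region model, the altitude-based envelope partition and the toy exhaust-gas-temperature signal are all stand-ins for the unspecified empirical model and sensory data.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_empirical(x, y):
    """Per-sub-region empirical model; a quadratic fit is assumed here."""
    return np.polyfit(x, y, 2)

def summarize(x, y, coeffs):
    """Statistical summary model: residual statistics of the empirical fit."""
    resid = y - np.polyval(coeffs, x)
    return {"resid_mean": resid.mean(), "resid_std": resid.std()}

# Steps 1-5: initial sensory data, partitioned into two envelope sub-regions by altitude
alt = rng.uniform(0, 40000, 300)
egt = 500 + 0.002 * alt + rng.normal(0, 5, 300)      # toy engine signal
models = {}
for name, mask in [("low", alt < 20000), ("high", alt >= 20000)]:
    coeffs = fit_empirical(alt[mask], egt[mask])
    models[name] = (coeffs, summarize(alt[mask], egt[mask], coeffs))

# Steps 6-9: new data arrive for the low sub-region; pseudo-data generated from the
# existing model are concatenated with them so the update retains the old history
new_alt = rng.uniform(0, 20000, 50)
new_egt = 502 + 0.002 * new_alt + rng.normal(0, 5, 50)
coeffs, summary = models["low"]
pseudo_alt = rng.uniform(0, 20000, 300)
pseudo_egt = np.polyval(coeffs, pseudo_alt) + rng.normal(0, summary["resid_std"], 300)

x_all = np.concatenate([pseudo_alt, new_alt])
y_all = np.concatenate([pseudo_egt, new_egt])
new_coeffs = fit_empirical(x_all, y_all)
models["low"] = (new_coeffs, summarize(x_all, y_all, new_coeffs))
print("updated low-region coefficients:", np.round(new_coeffs, 5))
```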

  10. SEGMENTING CT PROSTATE IMAGES USING POPULATION AND PATIENT-SPECIFIC STATISTICS FOR RADIOTHERAPY.

    PubMed

    Feng, Qianjin; Foskey, Mark; Tang, Songyuan; Chen, Wufan; Shen, Dinggang

    2009-08-07

    This paper presents a new deformable model using both population and patient-specific statistics to segment the prostate from CT images. There are two novelties in the proposed method. First, a modified scale invariant feature transform (SIFT) local descriptor, which is more distinctive than general intensity and gradient features, is used to characterize the image features. Second, an online training approach is used to build the shape statistics for accurately capturing intra-patient variation, which is more important than inter-patient variation for prostate segmentation in clinical radiotherapy. Experimental results show that the proposed method is robust and accurate, suitable for clinical application.

  11. SEGMENTING CT PROSTATE IMAGES USING POPULATION AND PATIENT-SPECIFIC STATISTICS FOR RADIOTHERAPY

    PubMed Central

    Feng, Qianjin; Foskey, Mark; Tang, Songyuan; Chen, Wufan; Shen, Dinggang

    2010-01-01

    This paper presents a new deformable model using both population and patient-specific statistics to segment the prostate from CT images. There are two novelties in the proposed method. First, a modified scale invariant feature transform (SIFT) local descriptor, which is more distinctive than general intensity and gradient features, is used to characterize the image features. Second, an online training approach is used to build the shape statistics for accurately capturing intra-patient variation, which is more important than inter-patient variation for prostate segmentation in clinical radiotherapy. Experimental results show that the proposed method is robust and accurate, suitable for clinical application. PMID:21197416

  12. Multi-region statistical shape model for cochlear implantation

    NASA Astrophysics Data System (ADS)

    Romera, Jordi; Kjer, H. Martin; Piella, Gemma; Ceresa, Mario; González Ballester, Miguel A.

    2016-03-01

    Statistical shape models are commonly used to analyze the variability between similar anatomical structures and their use is established as a tool for analysis and segmentation of medical images. However, using a global model to capture the variability of complex structures is not enough to achieve the best results. The complexity of a proper global model increases even more when the amount of data available is limited to a small number of datasets. Typically, the anatomical variability between structures is associated with the variability of their physiological regions. In this paper, a complete pipeline is proposed for building a multi-region statistical shape model to study the entire variability from locally identified physiological regions of the inner ear. The proposed model, which is based on an extension of the Point Distribution Model (PDM), is built from a training set of 17 high-resolution images (24.5 μm voxels) of the inner ear. The model is evaluated according to its generalization ability and specificity. The results are compared with those of a global model built directly using the standard PDM approach. The evaluation results suggest that better accuracy can be achieved using a regional modeling of the inner ear.
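
    The Point Distribution Model at the core of such a pipeline can be sketched as a PCA over aligned landmark coordinates; the toy shapes below stand in for the 17 segmented inner-ear surfaces, and Procrustes alignment is assumed to have been done already. A multi-region model would repeat this construction for each physiological region and then combine the regional models.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy training set: 17 shapes, each described by 50 aligned 3-D landmarks
n_shapes, n_landmarks = 17, 50
base = rng.normal(size=(n_landmarks, 3))
shapes = base[None] + 0.05 * rng.normal(size=(n_shapes, n_landmarks, 3))

X = shapes.reshape(n_shapes, -1)          # one row per training shape
mean_shape = X.mean(axis=0)
Xc = X - mean_shape

# PDM: principal components of the landmark covariance
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
eigvals = s ** 2 / (n_shapes - 1)
explained = np.cumsum(eigvals) / eigvals.sum()
n_modes = int(np.searchsorted(explained, 0.95) + 1)   # keep 95% of the variance

# A plausible new shape: mean + sum_k b_k * phi_k with |b_k| <= 3 sqrt(lambda_k)
b = rng.uniform(-3, 3, n_modes) * np.sqrt(eigvals[:n_modes])
new_shape = (mean_shape + b @ Vt[:n_modes]).reshape(n_landmarks, 3)
print("modes kept:", n_modes,
      " variance share of first mode:", round(eigvals[0] / eigvals.sum(), 3))
```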

  13. Examining statewide capacity for school health and mental health promotion: a post hoc application of a district capacity-building framework.

    PubMed

    Maras, Melissa A; Weston, Karen J; Blacksmith, Jennifer; Brophy, Chelsey

    2015-03-01

    Schools must possess a variety of capacities to effectively support comprehensive and coordinated school health promotion activities, and researchers have developed a district-level capacity-building framework specific to school health promotion. State-level school health coalitions often support such capacity-building efforts and should embed this work within a data-based, decision-making model. However, there is a lack of guidance for state school health coalitions on how they should collect and use data. This article uses a district-level capacity-building framework to interpret findings from a statewide coordinated school health needs/resource assessment in order to examine statewide capacity for school health promotion. Participants included school personnel (N = 643) from one state. Descriptive statistics were calculated for survey items, with further examination of subgroup differences among school administrators and nurses. Results were then interpreted via a post hoc application of a district-level capacity-building framework. Findings across districts revealed statewide strengths and gaps with regard to leadership and management capacities, internal and external supports, and an indicator of global capacity. Findings support the utility of using a common framework across local and state levels to align efforts and embed capacity-building activities within a data-driven, continuous improvement model. © 2014 Society for Public Health Education.

  14. Gradient boosting machine for modeling the energy consumption of commercial buildings

    DOE PAGES

    Touzani, Samir; Granderson, Jessica; Fernandes, Samuel

    2017-11-26

    Accurate savings estimations are important to promote energy efficiency projects and demonstrate their cost-effectiveness. The increasing presence of advanced metering infrastructure (AMI) in commercial buildings has resulted in a rising availability of high frequency interval data. These data can be used for a variety of energy efficiency applications such as demand response, fault detection and diagnosis, and heating, ventilation, and air conditioning (HVAC) optimization. This large amount of data has also opened the door to the use of advanced statistical learning models, which hold promise for providing accurate building baseline energy consumption predictions, and thus accurate saving estimations. The gradient boosting machine is a powerful machine learning algorithm that is gaining considerable traction in a wide range of data driven applications, such as ecology, computer vision, and biology. In the present work an energy consumption baseline modeling method based on a gradient boosting machine was proposed. To assess the performance of this method, a recently published testing procedure was used on a large dataset of 410 commercial buildings. The model training periods were varied and several prediction accuracy metrics were used to evaluate the model's performance. The results show that using the gradient boosting machine model improved the R-squared prediction accuracy and the CV(RMSE) in more than 80 percent of the cases, when compared to an industry best practice model that is based on piecewise linear regression, and to a random forest algorithm.
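
    A bare-bones version of such a baseline model can be sketched with scikit-learn's GradientBoostingRegressor; the interval data, predictors and training split below are synthetic and purely illustrative, not the paper's 410-building dataset or testing procedure.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

# Toy interval data for one building: outdoor temperature, hour of day and a
# weekday flag as predictors of whole-building load (kW).
n = 8000
temp = rng.normal(18, 7, n)
hour = rng.integers(0, 24, n)
weekday = rng.integers(0, 2, n)
load = (30 + 1.2 * np.clip(temp - 18, 0, None)
        + 8 * weekday * ((hour > 7) & (hour < 19))
        + rng.normal(0, 3, n))

X, y = np.column_stack([temp, hour, weekday]), load
split = int(0.75 * n)                       # "training period" vs. prediction period
model = GradientBoostingRegressor(n_estimators=300, learning_rate=0.05, max_depth=3)
model.fit(X[:split], y[:split])

resid = y[split:] - model.predict(X[split:])
cv_rmse = np.sqrt(np.mean(resid ** 2)) / y[split:].mean()   # CV(RMSE) accuracy metric
print(f"CV(RMSE): {cv_rmse:.3f}")
```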

  15. Gradient boosting machine for modeling the energy consumption of commercial buildings

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Touzani, Samir; Granderson, Jessica; Fernandes, Samuel

    Accurate savings estimations are important to promote energy efficiency projects and demonstrate their cost-effectiveness. The increasing presence of advanced metering infrastructure (AMI) in commercial buildings has resulted in a rising availability of high frequency interval data. These data can be used for a variety of energy efficiency applications such as demand response, fault detection and diagnosis, and heating, ventilation, and air conditioning (HVAC) optimization. This large amount of data has also opened the door to the use of advanced statistical learning models, which hold promise for providing accurate building baseline energy consumption predictions, and thus accurate saving estimations. The gradient boosting machine is a powerful machine learning algorithm that is gaining considerable traction in a wide range of data driven applications, such as ecology, computer vision, and biology. In the present work an energy consumption baseline modeling method based on a gradient boosting machine was proposed. To assess the performance of this method, a recently published testing procedure was used on a large dataset of 410 commercial buildings. The model training periods were varied and several prediction accuracy metrics were used to evaluate the model's performance. The results show that using the gradient boosting machine model improved the R-squared prediction accuracy and the CV(RMSE) in more than 80 percent of the cases, when compared to an industry best practice model that is based on piecewise linear regression, and to a random forest algorithm.

  16. Leveraging non-targeted metabolite profiling via statistical genomics

    USDA-ARS?s Scientific Manuscript database

    One of the challenges of systems biology is to integrate multiple sources of data in order to build a cohesive view of the system of study. Here we describe the mass spectrometry based profiling of maize kernels, a model system for genomic studies and a cornerstone of the agroeconomy. Using a networ...

  17. Multivariate geomorphic analysis of forest streams: Implications for assessment of land use impacts on channel condition

    Treesearch

    Richard. D. Wood-Smith; John M. Buffington

    1996-01-01

    Multivariate statistical analyses of geomorphic variables from 23 forest stream reaches in southeast Alaska result in successful discrimination between pristine streams and those disturbed by land management, specifically timber harvesting and associated road building. Results of discriminant function analysis indicate that a three-variable model discriminates 10...

  18. Statistical Power for a Simultaneous Test of Factorial and Predictive Invariance

    ERIC Educational Resources Information Center

    Olivera-Aguilar, Margarita; Millsap, Roger E.

    2013-01-01

    A common finding in studies of differential prediction across groups is that although regression slopes are the same or similar across groups, group differences exist in regression intercepts. Building on earlier work by Birnbaum (1979), Millsap (1998) presented an invariant factor model that would explain such intercept differences as arising due…

  19. Understanding Distributions by Modeling Them

    ERIC Educational Resources Information Center

    Konold, Cliff; Harradine, Anthony; Kazak, Sibel

    2007-01-01

    In current curriculum materials for middle school students in the US, data and chance are considered as separate topics. They are then ideally brought together in the minds of high school or university students when they learn about statistical inference. In recent studies we have been attempting to build connections between data and chance in the…

  20. It's a Girl! Random Numbers, Simulations, and the Law of Large Numbers

    ERIC Educational Resources Information Center

    Goodwin, Chris; Ortiz, Enrique

    2015-01-01

    Modeling using mathematics and making inferences about mathematical situations are becoming more prevalent in most fields of study. Descriptive statistics cannot be used to generalize about a population or make predictions of what can occur. Instead, inference must be used. Simulation and sampling are essential in building a foundation for…

  1. Statistical methods for quantitative mass spectrometry proteomic experiments with labeling.

    PubMed

    Oberg, Ann L; Mahoney, Douglas W

    2012-01-01

    Mass spectrometry utilizing labeling allows multiple specimens to be subjected to mass spectrometry simultaneously. As a result, between-experiment variability is reduced. Here we describe the use of fundamental concepts of statistical experimental design in the labeling framework in order to minimize variability and avoid biases. We demonstrate how to export data in the format that is most efficient for statistical analysis. We demonstrate how to assess the need for normalization, perform normalization, and check whether it worked. We describe how to build a model explaining the observed values and test for differential protein abundance, along with descriptive statistics and measures of the reliability of the findings. Concepts are illustrated through the use of three case studies utilizing the iTRAQ 4-plex labeling protocol.
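
    The normalization and differential-abundance steps can be sketched on a toy intensity matrix as below; the 2-versus-2 channel design, the log2/median normalization and the per-protein t-test are simplifying assumptions for illustration, not the chapter's full modeling approach.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Toy labeled-quantification matrix: proteins x channels, with the first two
# channels taken as condition A and the last two as condition B.
n_proteins = 200
intensities = rng.lognormal(mean=10, sigma=1, size=(n_proteins, 4))
intensities[:20, 2:] *= 2.0                 # a block of truly up-regulated proteins

log2 = np.log2(intensities)

# Normalization: align channel medians, then check that it worked
log2_norm = log2 - np.median(log2, axis=0) + np.median(log2)
print("channel medians after normalization:", np.round(np.median(log2_norm, axis=0), 3))

# Differential abundance: per-protein t-test between the two conditions
t, p = stats.ttest_ind(log2_norm[:, :2], log2_norm[:, 2:], axis=1)
print("proteins with p < 0.01:", int((p < 0.01).sum()))
```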

  2. Building vulnerability to hydro-geomorphic hazards: Estimating damage probability from qualitative vulnerability assessment using logistic regression

    NASA Astrophysics Data System (ADS)

    Ettinger, Susanne; Mounaud, Loïc; Magill, Christina; Yao-Lafourcade, Anne-Françoise; Thouret, Jean-Claude; Manville, Vern; Negulescu, Caterina; Zuccaro, Giulio; De Gregorio, Daniela; Nardone, Stefano; Uchuchoque, Juan Alexis Luque; Arguedas, Anita; Macedo, Luisa; Manrique Llerena, Nélida

    2016-10-01

    The focus of this study is an analysis of building vulnerability through investigating impacts from the 8 February 2013 flash flood event along the Avenida Venezuela channel in the city of Arequipa, Peru. On this day, 124.5 mm of rain fell within 3 h (monthly mean: 29.3 mm) triggering a flash flood that inundated at least 0.4 km2 of urban settlements along the channel, affecting more than 280 buildings, 23 of a total of 53 bridges (pedestrian, vehicle and railway), and leading to the partial collapse of sections of the main road, paralyzing central parts of the city for more than one week. This study assesses the aspects of building design and site-specific environmental characteristics that render a building vulnerable by considering the example of this flash flood event. A statistical methodology is developed that enables estimation of damage probability for buildings. The applied method uses observed inundation height as a hazard proxy in areas where more detailed hydrodynamic modeling data are not available. Building design and site-specific environmental conditions determine the physical vulnerability. The mathematical approach considers both physical vulnerability and hazard-related parameters and helps to reduce uncertainty in the determination of descriptive parameters, parameter interdependency and respective contributions to damage. This study aims to (1) enable the estimation of damage probability for a certain hazard intensity, and (2) obtain data to visualize variations in damage susceptibility for buildings in flood-prone areas. Data collection is based on a post-flood event field survey and the analysis of high (sub-metric) spatial resolution images (Pléiades 2012, 2013). An inventory of 30 city blocks was collated in a GIS database in order to estimate the physical vulnerability of buildings. As many as 1103 buildings were surveyed along the affected drainage and 898 buildings were included in the statistical analysis. Univariate and bivariate analyses were applied to better characterize each vulnerability parameter. Multiple correspondence analysis revealed strong relationships between the "Distance to channel or bridges", "Structural building type", "Building footprint" and the observed damage. Logistic regression enabled quantification of the contribution of each explanatory parameter to potential damage, and determination of the significant parameters that express the damage susceptibility of a building. The model was applied 200 times on different calibration and validation data sets in order to examine performance. Results show that 90% of these tests have a success rate of more than 67%. Probabilities (at building scale) of experiencing different damage levels during a future event similar to the 8 February 2013 flash flood are the major outcomes of this study.
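
    A schematic version of the damage-probability model can be sketched with scikit-learn's logistic regression; the explanatory variables below are synthetic stand-ins loosely named after the surveyed parameters, and the coefficients used to generate the data are invented.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic building inventory: distance to channel/bridge (m), footprint (m2),
# masonry flag, and observed damage (1 = damaged) generated from an assumed model.
n = 898
dist = rng.uniform(0, 120, n)
footprint = rng.uniform(40, 400, n)
masonry = rng.integers(0, 2, n)
true_logit = 1.5 - 0.04 * dist - 0.004 * footprint + 0.8 * masonry
damage = (rng.random(n) < 1 / (1 + np.exp(-true_logit))).astype(int)

X = pd.DataFrame({"dist_channel": dist, "footprint": footprint, "masonry": masonry})
X_tr, X_te, y_tr, y_te = train_test_split(X, damage, test_size=0.3, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("coefficients:", dict(zip(X.columns, np.round(clf.coef_[0], 3))))
print("hold-out success rate:", round(clf.score(X_te, y_te), 3))
print("P(damage) for one building:", round(clf.predict_proba(X_te.iloc[[0]])[0, 1], 3))
```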

  3. Complex network theory for the identification and assessment of candidate protein targets.

    PubMed

    McGarry, Ken; McDonald, Sharon

    2018-06-01

    In this work we use complex network theory to provide a statistical model of the connectivity patterns of human proteins and their interaction partners. Our intention is to identify important proteins that may be potential candidates as drug targets for therapeutic interventions. Target proteins usually have more interaction partners than non-target proteins, but there are no hard-and-fast rules for defining the actual number of interactions. We devise a statistical measure for identifying hub proteins and score our target proteins with gene ontology annotations. The important druggable protein targets are likely to have similar biological functions that can be assessed for their potential therapeutic value. Our system provides a statistical analysis of the local and distant neighborhood protein interactions of the potential targets using complex network measures. This approach builds a more accurate model of drug-to-target activity and therefore the likely impact on treating diseases. We integrate high-quality protein interaction data from the HINT database and disease-associated proteins from the DrugTarget database. Other sources include biological knowledge from Gene Ontology and drug information from DrugBank. The problem is a very challenging one since the data are highly imbalanced between target proteins and the more numerous nontargets. We use undersampling on the training data and build Random Forest classifier models which are used to identify previously unclassified target proteins. We validate and corroborate these findings against the available literature. Copyright © 2018 Elsevier Ltd. All rights reserved.
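
    A minimal sketch, on assumed data, of the undersample-then-classify step: balance the targets against the more numerous non-targets before fitting a Random Forest. The five feature columns are placeholders for network- and GO-derived scores.

    ```python
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import classification_report

    rng = np.random.default_rng(2)
    n_pos, n_neg = 300, 3000                      # targets vs non-targets (imbalanced)
    X_pos = rng.normal(1.0, 1.0, (n_pos, 5))      # e.g. degree, betweenness, GO scores
    X_neg = rng.normal(0.0, 1.0, (n_neg, 5))
    X = np.vstack([X_pos, X_neg])
    y = np.array([1] * n_pos + [0] * n_neg)

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

    # Undersample the majority class in the training data only
    pos_idx = np.where(y_tr == 1)[0]
    neg_idx = rng.choice(np.where(y_tr == 0)[0], size=len(pos_idx), replace=False)
    keep = np.concatenate([pos_idx, neg_idx])

    clf = RandomForestClassifier(n_estimators=300, random_state=0)
    clf.fit(X_tr[keep], y_tr[keep])
    print(classification_report(y_te, clf.predict(X_te)))
    ```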

  4. Statistical ensembles for money and debt

    NASA Astrophysics Data System (ADS)

    Viaggiu, Stefano; Lionetto, Andrea; Bargigli, Leonardo; Longo, Michele

    2012-10-01

    We build a statistical ensemble representation of two economic models describing respectively, in simplified terms, a payment system and a credit market. To this purpose we adopt the Boltzmann-Gibbs distribution where the role of the Hamiltonian is taken by the total money supply (i.e. including money created from debt) of a set of interacting economic agents. As a result, we can read the main thermodynamic quantities in terms of monetary ones. In particular, we define for the credit market model a work term which is related to the impact of monetary policy on credit creation. Furthermore, with our formalism we recover and extend some results concerning the temperature of an economic system, previously presented in the literature by considering only the monetary base as a conserved quantity. Finally, we study the statistical ensemble for the Pareto distribution.

  5. Statistical Mechanics of Node-perturbation Learning with Noisy Baseline

    NASA Astrophysics Data System (ADS)

    Hara, Kazuyuki; Katahira, Kentaro; Okada, Masato

    2017-02-01

    Node-perturbation learning is a type of statistical gradient descent algorithm that can be applied to problems where the objective function is not explicitly formulated, including reinforcement learning. It estimates the gradient of an objective function by using the change in the objective function in response to a perturbation. The value of the objective function for an unperturbed output is called a baseline. Cho et al. proposed node-perturbation learning with a noisy baseline. In this paper, we report on building the statistical mechanics of Cho's model and on deriving coupled differential equations of order parameters that depict learning dynamics. We also show how to derive the generalization error by solving the differential equations of order parameters. On the basis of the results, we show that Cho's results also apply in general cases and characterize the general performance of Cho's model.
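
    A toy sketch of node-perturbation learning with a (noisy) baseline for a linear student-teacher setup, illustrating the update rule rather than the paper's statistical-mechanical analysis. All parameter values are assumptions chosen for the demonstration.

    ```python
    import numpy as np

    rng = np.random.default_rng(3)
    n_in, sigma, eta = 20, 0.1, 0.05
    w_true = rng.normal(size=n_in)          # teacher weights
    w = np.zeros(n_in)                      # student weights

    for step in range(5000):
        x = rng.normal(size=n_in) / np.sqrt(n_in)
        y_teacher = w_true @ x
        y = w @ x
        xi = rng.normal(0.0, sigma)                               # output perturbation
        baseline = -0.5 * (y - y_teacher) ** 2 + rng.normal(0.0, 0.01)   # noisy baseline
        perturbed = -0.5 * (y + xi - y_teacher) ** 2
        # The perturbation-weighted change in the objective approximates the gradient
        w += eta * (perturbed - baseline) / sigma ** 2 * xi * x

    print("generalization error:", float(np.mean((w - w_true) ** 2)))
    ```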

  6. A 3D model retrieval approach based on Bayesian networks lightfield descriptor

    NASA Astrophysics Data System (ADS)

    Xiao, Qinhan; Li, Yanjun

    2009-12-01

    A new 3D model retrieval methodology is proposed that exploits a novel Bayesian networks lightfield descriptor (BNLD). There are two key novelties in our approach: (1) a BN-based method for building the lightfield descriptor; and (2) a 3D model retrieval scheme based on the proposed BNLD. To overcome the disadvantages of existing 3D model retrieval methods, we explore BNs for building a new lightfield descriptor. First, the 3D model is placed in a lightfield and about 300 binary views are obtained along a sphere; Fourier descriptors and Zernike moment descriptors are then calculated from these binary views, and the resulting shape feature sequence is learned into a BN model using a BN learning algorithm. Second, we propose a new 3D model retrieval method that calculates the Kullback-Leibler divergence (KLD) between BNLDs. Benefiting from statistical learning, our BNLD is more robust to noise than existing methods. A comparison between our method and the lightfield descriptor-based approach is conducted to demonstrate the effectiveness of the proposed methodology.

  7. Functional annotation of regulatory pathways.

    PubMed

    Pandey, Jayesh; Koyutürk, Mehmet; Kim, Yohan; Szpankowski, Wojciech; Subramaniam, Shankar; Grama, Ananth

    2007-07-01

    Standardized annotations of biomolecules in interaction networks (e.g. Gene Ontology) provide comprehensive understanding of the function of individual molecules. Extending such annotations to pathways is a critical component of functional characterization of cellular signaling at the systems level. We propose a framework for projecting gene regulatory networks onto the space of functional attributes using multigraph models, with the objective of deriving statistically significant pathway annotations. We first demonstrate that annotations of pairwise interactions do not generalize to indirect relationships between processes. Motivated by this result, we formalize the problem of identifying statistically overrepresented pathways of functional attributes. We establish the hardness of this problem by demonstrating the non-monotonicity of common statistical significance measures. We propose a statistical model that emphasizes the modularity of a pathway, evaluating its significance based on the coupling of its building blocks. We complement the statistical model by an efficient algorithm and software, Narada, for computing significant pathways in large regulatory networks. Comprehensive results from our methods applied to the Escherichia coli transcription network demonstrate that our approach is effective in identifying known, as well as novel biological pathway annotations. Narada is implemented in Java and is available at http://www.cs.purdue.edu/homes/jpandey/narada/.

  8. EEGLAB, SIFT, NFT, BCILAB, and ERICA: New Tools for Advanced EEG Processing

    PubMed Central

    Delorme, Arnaud; Mullen, Tim; Kothe, Christian; Akalin Acar, Zeynep; Bigdely-Shamlo, Nima; Vankov, Andrey; Makeig, Scott

    2011-01-01

    We describe a set of complementary EEG data collection and processing tools recently developed at the Swartz Center for Computational Neuroscience (SCCN) that connect to and extend the EEGLAB software environment, a freely available and readily extensible processing environment running under Matlab. The new tools include (1) a new and flexible EEGLAB STUDY design facility for framing and performing statistical analyses on data from multiple subjects; (2) a neuroelectromagnetic forward head modeling toolbox (NFT) for building realistic electrical head models from available data; (3) a source information flow toolbox (SIFT) for modeling ongoing or event-related effective connectivity between cortical areas; (4) a BCILAB toolbox for building online brain-computer interface (BCI) models from available data, and (5) an experimental real-time interactive control and analysis (ERICA) environment for real-time production and coordination of interactive, multimodal experiments. PMID:21687590

  9. Building unified geospatial data for land-change modeling—A case study in the area of Richmond, Virginia

    USGS Publications Warehouse

    Donato, David I.; Shapiro, Jason L.

    2016-12-13

    An effort to build a unified collection of geospatial data for use in land-change modeling (LCM) led to new insights into the requirements and challenges of building an LCM data infrastructure. A case study of data compilation and unification for the Richmond, Va., Metropolitan Statistical Area (MSA) delineated the problems of combining and unifying heterogeneous data from many independent localities such as counties and cities. The study also produced conclusions and recommendations for use by the national LCM community, emphasizing the critical need for simple, practical data standards and conventions for use by localities. This report contributes an uncopyrighted core glossary and a much needed operational definition of data unification.

  10. Predicting the Ability of Marine Mammal Populations to Compensate for Behavioral Disturbances

    DTIC Science & Technology

    2015-09-30

    approaches, including simple theoretical models as well as statistical analysis of data rich conditions. Building on models developed for PCoD [2,3], we...conditions is population trajectory most likely to be affected (the central aim of PCoD ). For the revised model presented here, we include a population...averaged condition individuals (here used as a proxy for individual health as defined in PCoD ), and E is the quality of the environment in which the

  11. Inflammatory markers as predictors of depression and anxiety in adolescents: Statistical model building with component-wise gradient boosting.

    PubMed

    Walss-Bass, Consuelo; Suchting, Robert; Olvera, Rene L; Williamson, Douglas E

    2018-07-01

    Immune system abnormalities have been repeatedly observed in several psychiatric disorders, including severe depression and anxiety. However, whether specific immune mediators play an early role in the etiopathogenesis of these disorders remains unknown. In a longitudinal design, component-wise gradient boosting was used to build models of depression, assessed by the Mood-Feelings Questionnaire-Child (MFQC), and anxiety, assessed by the Screen for Child Anxiety Related Emotional Disorders (SCARED), in 254 adolescents from a large set of candidate predictors, including sex, race, 39 inflammatory proteins, and the interactions between those proteins and time. Each model was reduced via backward elimination to maximize parsimony and generalizability. Component-wise gradient boosting and model reduction found that female sex, growth-regulated oncogene (GRO), and transforming growth factor alpha (TGF-alpha) predicted depression, while female sex predicted anxiety. Differential onset of puberty as well as a lack of control for menstrual cycle may also have been responsible for differences between males and females in the present study. In addition, investigation of all possible nonlinear relationships between the predictors and the outcomes was beyond the computational capacity and scope of the present research. This study highlights the need for novel statistical modeling to identify reliable biological predictors of aberrant psychological behavior. Copyright © 2018 Elsevier B.V. All rights reserved.
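
    A compact sketch of component-wise (L2) gradient boosting: at every iteration only the single predictor that best fits the current residuals receives an update, which performs implicit variable selection. The data and predictors below are synthetic stand-ins for the inflammatory-marker panel, not the study's data.

    ```python
    import numpy as np

    rng = np.random.default_rng(4)
    n, p = 254, 10
    X = rng.normal(size=(n, p))
    y = 1.2 * X[:, 0] - 0.8 * X[:, 3] + rng.normal(0.0, 1.0, n)   # only two true signals

    nu, n_iter = 0.1, 500                 # learning rate and number of boosting steps
    coef = np.zeros(p)
    offset = y.mean()

    for _ in range(n_iter):
        resid = y - (offset + X @ coef)                # negative gradient for L2 loss
        # univariate least-squares fit of each component to the residuals
        betas = (X * resid[:, None]).sum(axis=0) / (X ** 2).sum(axis=0)
        losses = ((resid[:, None] - X * betas) ** 2).sum(axis=0)
        j = int(np.argmin(losses))                     # best-fitting component
        coef[j] += nu * betas[j]                       # update only that component

    selected = np.nonzero(coef)[0]
    print("selected predictors:", selected, "coefficients:", coef[selected].round(2))
    ```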

  12. Advanced statistics: linear regression, part II: multiple linear regression.

    PubMed

    Marill, Keith A

    2004-01-01

    The applications of simple linear regression in medical research are limited, because in most situations, there are multiple relevant predictor variables. Univariate statistical techniques such as simple linear regression use a single predictor variable, and they often may be mathematically correct but clinically misleading. Multiple linear regression is a mathematical technique used to model the relationship between multiple independent predictor variables and a single dependent outcome variable. It is used in medical research to model observational data, as well as in diagnostic and therapeutic studies in which the outcome is dependent on more than one factor. Although the technique generally is limited to data that can be expressed with a linear function, it benefits from a well-developed mathematical framework that yields unique solutions and exact confidence intervals for regression coefficients. Building on Part I of this series, this article acquaints the reader with some of the important concepts in multiple regression analysis. These include multicollinearity, interaction effects, and an expansion of the discussion of inference testing, leverage, and variable transformations to multivariate models. Examples from the first article in this series are expanded on using a primarily graphic, rather than mathematical, approach. The importance of the relationships among the predictor variables and the dependence of the multivariate model coefficients on the choice of these variables are stressed. Finally, concepts in regression model building are discussed.
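
    A brief sketch in the spirit of the article: several predictors, one continuous outcome, an interaction term, and exact confidence intervals for the coefficients. The clinical variables and coefficients are hypothetical.

    ```python
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(5)
    n = 200
    df = pd.DataFrame({
        "age": rng.uniform(20, 80, n),
        "weight": rng.normal(75, 12, n),
        "treated": rng.integers(0, 2, n),
    })
    df["outcome"] = (0.4 * df.age + 0.2 * df.weight - 5 * df.treated
                     + 0.05 * df.age * df.treated + rng.normal(0, 5, n))

    # The age:treated interaction allows the age effect to differ by treatment group
    fit = smf.ols("outcome ~ age + weight + treated + age:treated", data=df).fit()
    print(fit.params.round(3))
    print(fit.conf_int().round(3))   # exact confidence intervals for the coefficients
    ```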

  13. Selecting statistical model and optimum maintenance policy: a case study of hydraulic pump.

    PubMed

    Ruhi, S; Karim, M R

    2016-01-01

    Proper maintenance policy can play a vital role in the effective investigation of product reliability. Every engineered object, such as a product, plant or infrastructure, needs preventive and corrective maintenance. In this paper we look at a real case study. It deals with the maintenance of hydraulic pumps used in excavators by a mining company. We obtain the data that the owner had collected and carry out an analysis and build models for pump failures. The data consist of both failure and censored lifetimes of the hydraulic pump. Different competitive mixture models are applied to analyze a set of maintenance data of a hydraulic pump. Various characteristics of the mixture models, such as the cumulative distribution function, reliability function, mean time to failure, etc., are estimated to assess the reliability of the pump. The Akaike Information Criterion, adjusted Anderson-Darling test statistic, Kolmogorov-Smirnov test statistic and root mean square error are considered to select suitable models among the set of competitive models. The maximum likelihood estimation method via the EM algorithm is applied mainly for estimating the parameters of the models and reliability-related quantities. In this study, it is found that a threefold mixture model (Weibull-Normal-Exponential) fits the hydraulic pump failure data set well. This paper also illustrates how a suitable statistical model can be applied to estimate the optimum maintenance period of a hydraulic pump at minimum cost.
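
    A simplified sketch of the model-selection step: fit several candidate lifetime distributions to (uncensored, synthetic) failure times and rank them by AIC and the Kolmogorov-Smirnov statistic. The censored-data mixture fitting via EM used in the paper is more involved and is not reproduced here.

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(6)
    failures = stats.weibull_min(c=1.8, scale=2000).rvs(120, random_state=rng)

    candidates = {
        "weibull": stats.weibull_min,
        "lognormal": stats.lognorm,
        "exponential": stats.expon,
    }
    for name, dist in candidates.items():
        params = dist.fit(failures, floc=0)   # ML estimation, location fixed at 0
        k = len(params) - 1                   # free parameters (loc is fixed)
        loglik = np.sum(dist.logpdf(failures, *params))
        aic = 2 * k - 2 * loglik
        ks = stats.kstest(failures, dist.cdf, args=params).statistic
        print(f"{name:12s} AIC={aic:9.1f}  KS={ks:.3f}")
    ```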

  14. Statistical fluctuations in pedestrian evacuation times and the effect of social contagion

    NASA Astrophysics Data System (ADS)

    Nicolas, Alexandre; Bouzat, Sebastián; Kuperman, Marcelo N.

    2016-08-01

    Mathematical models of pedestrian evacuation and the associated simulation software have become essential tools for the assessment of the safety of public facilities and buildings. While a variety of models is now available, their calibration and test against empirical data are generally restricted to global averaged quantities; the statistics compiled from the time series of individual escapes ("microscopic" statistics) measured in recent experiments are thus overlooked. In the same spirit, much research has primarily focused on the average global evacuation time, whereas the whole distribution of evacuation times over some set of realizations should matter. In the present paper we propose and discuss the validity of a simple relation between this distribution and the microscopic statistics, which is theoretically valid in the absence of correlations. To this purpose, we develop a minimal cellular automaton, with features that afford a semiquantitative reproduction of the experimental microscopic statistics. We then introduce a process of social contagion of impatient behavior in the model and show that the simple relation under test may dramatically fail at high contagion strengths, the latter being responsible for the emergence of strong correlations in the system. We conclude with comments on the potential practical relevance for safety science of calculations based on microscopic statistics.

  15. Towards a General Turbulence Model for Planetary Boundary Layers Based on Direct Statistical Simulation

    NASA Astrophysics Data System (ADS)

    Skitka, J.; Marston, B.; Fox-Kemper, B.

    2016-02-01

    Sub-grid turbulence models for planetary boundary layers are typically constructed additively, starting with local flow properties and including non-local (KPP) or higher order (Mellor-Yamada) parameters until a desired level of predictive capacity is achieved or a manageable threshold of complexity is surpassed. Such approaches are necessarily limited in general circumstances, like global circulation models, by their being optimized for particular flow phenomena. By building a model reductively, starting with the infinite hierarchy of turbulence statistics, truncating at a given order, and stripping degrees of freedom from the flow, we offer the prospect of a turbulence model and investigative tool that is equally applicable to all flow types and able to take full advantage of the wealth of nonlocal information in any flow. Direct statistical simulation (DSS) that is based upon expansion in equal-time cumulants can be used to compute flow statistics of arbitrary order. We investigate the feasibility of a second-order closure (CE2) by performing simulations of the ocean boundary layer in a quasi-linear approximation for which CE2 is exact. As oceanographic examples, wind-driven Langmuir turbulence and thermal convection are studied by comparison of the quasi-linear and fully nonlinear statistics. We also characterize the computational advantages and physical uncertainties of CE2 defined on a reduced basis determined via proper orthogonal decomposition (POD) of the flow fields.

  16. Probabilistic registration of an unbiased statistical shape model to ultrasound images of the spine

    NASA Astrophysics Data System (ADS)

    Rasoulian, Abtin; Rohling, Robert N.; Abolmaesumi, Purang

    2012-02-01

    The placement of an epidural needle is among the most difficult regional anesthetic techniques. Ultrasound has been proposed to improve success of placement. However, it has not become the standard-of-care because of limitations in the depictions and interpretation of the key anatomical features. We propose to augment the ultrasound images with a registered statistical shape model of the spine to aid interpretation. The model is created with a novel deformable group-wise registration method which utilizes a probabilistic approach to register groups of point sets. The method is compared to a volume-based model building technique and it demonstrates better generalization and compactness. We instantiate and register the shape model to a spine surface probability map extracted from the ultrasound images. Validation is performed on human subjects. The achieved registration accuracy (2-4 mm) is sufficient to guide the choice of puncture site and trajectory of an epidural needle.

  17. Anyonic braiding in optical lattices

    PubMed Central

    Zhang, Chuanwei; Scarola, V. W.; Tewari, Sumanta; Das Sarma, S.

    2007-01-01

    Topological quantum states of matter, both Abelian and non-Abelian, are characterized by excitations whose wavefunctions undergo nontrivial statistical transformations as one excitation is moved (braided) around another. Topological quantum computation proposes to use the topological protection and the braiding statistics of a non-Abelian topological state to perform quantum computation. The enormous technological prospect of topological quantum computation provides new motivation for experimentally observing a topological state. Here, we explicitly work out a realistic experimental scheme to create and braid the Abelian topological excitations in the Kitaev model built on a tunable robust system, a cold atom optical lattice. We also demonstrate how to detect the key feature of these excitations: their braiding statistics. Observation of this statistics would directly establish the existence of anyons, quantum particles that are neither fermions nor bosons. In addition to establishing topological matter, the experimental scheme we develop here can also be adapted to a non-Abelian topological state, supported by the same Kitaev model but in a different parameter regime, to eventually build topologically protected quantum gates. PMID:18000038

  18. Improving PAGER's real-time earthquake casualty and loss estimation toolkit: a challenge

    USGS Publications Warehouse

    Jaiswal, K.S.; Wald, D.J.

    2012-01-01

    We describe the on-going developments of PAGER’s loss estimation models, and discuss value-added web content that can be generated related to exposure, damage and loss outputs for a variety of PAGER users. These developments include identifying vulnerable building types in any given area, estimating earthquake-induced damage and loss statistics by building type, and developing visualization aids that help locate areas of concern for improving post-earthquake response efforts. While detailed exposure and damage information is highly useful and desirable, significant improvements are still necessary in order to improve underlying building stock and vulnerability data at a global scale. Existing efforts with the GEM’s GED4GEM and GVC consortia will help achieve some of these objectives. This will benefit PAGER especially in regions where PAGER’s empirical model is less-well constrained; there, the semi-empirical and analytical models will provide robust estimates of damage and losses. Finally, we outline some of the challenges associated with rapid casualty and loss estimation that we experienced while responding to recent large earthquakes worldwide.

  19. A new in silico classification model for ready biodegradability, based on molecular fragments.

    PubMed

    Lombardo, Anna; Pizzo, Fabiola; Benfenati, Emilio; Manganaro, Alberto; Ferrari, Thomas; Gini, Giuseppina

    2014-08-01

    Regulations such as the European REACH (Registration, Evaluation, Authorization and restriction of Chemicals) often require chemicals to be evaluated for ready biodegradability, to assess the potential risk for environmental and human health. Because not all chemicals can be tested, there is an increasing demand for tools for quick and inexpensive biodegradability screening, such as computer-based (in silico) theoretical models. We developed an in silico model starting from a dataset of 728 chemicals with ready biodegradability data (MITI test, Ministry of International Trade and Industry). We used the novel software SARpy to automatically extract, through a structural fragmentation process, a set of substructures statistically related to ready biodegradability. Then, we analysed these substructures in order to build some general rules. The model consists of a rule-set made up of the combination of the statistically relevant fragments and of the expert-based rules. The model gives good statistical performance with 92%, 82% and 76% accuracy on the training, test and external set respectively. These results are comparable with other in silico models like BIOWIN developed by the United States Environmental Protection Agency (EPA); moreover this new model includes an easily understandable explanation. Copyright © 2014 Elsevier Ltd. All rights reserved.

  20. Survivability Versus Time

    NASA Technical Reports Server (NTRS)

    Joyner, James J., Sr.

    2014-01-01

    Develop a Survivability-versus-Time model as a decision-evaluation tool to assess various emergency egress methods used at Launch Complex 39B (LC 39B) and in the Vehicle Assembly Building (VAB) at NASA's Kennedy Space Center. For each hazard scenario, develop probability distributions to address statistical uncertainty, resulting in survivability plots over time and composite survivability plots encompassing multiple hazard scenarios.

  1. Mourning dove hunting regulation strategy based on annual harvest statistics and banding data

    USGS Publications Warehouse

    Otis, D.L.

    2006-01-01

    Although managers should strive to base game bird harvest management strategies on mechanistic population models, monitoring programs required to build and continuously update these models may not be in place. Alternatively, if estimates of total harvest and harvest rates are available, then population estimates derived from these harvest data can serve as the basis for making hunting regulation decisions based on population growth rates derived from these estimates. I present a statistically rigorous approach for regulation decision-making using a hypothesis-testing framework and an assumed framework of 3 hunting regulation alternatives. I illustrate and evaluate the technique with historical data on the mid-continent mallard (Anas platyrhynchos) population. I evaluate the statistical properties of the hypothesis-testing framework using the best available data on mourning doves (Zenaida macroura). I use these results to discuss practical implementation of the technique as an interim harvest strategy for mourning doves until reliable mechanistic population models and associated monitoring programs are developed.

  2. Preliminary work toward the development of a dimensional tolerance standard for rapid prototyping

    NASA Technical Reports Server (NTRS)

    Kennedy, W. J.

    1996-01-01

    Rapid prototyping is a new technology for building parts quickly from CAD models. It works by slicing a CAD model into layers, then by building a model of the part one layer at a time. Since most parts can be sliced, most parts can be modeled using rapid prototyping. The layers themselves are created in a number of different ways - by using a laser to cure a layer of an epoxy or a resin, by depositing a layer of plastic or wax upon a surface, by using a laser to sinter a layer of powder, or by using a laser to cut a layer of paper. Rapid prototyping (RP) is new, and a standard part for use in comparing dimensional tolerances has not yet been chosen and accepted by ASTM (the American Society for Testing and Materials). Such a part is needed when RP is used to build parts for investment casting or for direct use. The objective of this project was to start the development of a standard part by using statistical techniques to choose the features of the part which show curl - the vertical deviation of a part from its intended horizontal plane.

  3. Collaborative filtering on a family of biological targets.

    PubMed

    Erhan, Dumitru; L'heureux, Pierre-Jean; Yue, Shi Yi; Bengio, Yoshua

    2006-01-01

    Building a QSAR model of a new biological target for which few screening data are available is a statistical challenge. However, the new target may be part of a bigger family, for which we have more screening data. Collaborative filtering or, more generally, multi-task learning, is a machine learning approach that improves the generalization performance of an algorithm by using information from related tasks as an inductive bias. We use collaborative filtering techniques for building predictive models that link multiple targets to multiple examples. The more commonalities between the targets, the better the multi-target model that can be built. We show an example of a multi-target neural network that can use family information to produce a predictive model of an undersampled target. We evaluate JRank, a kernel-based method designed for collaborative filtering. We show the performance of both methods on compound prioritization for an HTS campaign and examine the underlying shared representation between targets. JRank outperformed the neural network both in the single- and multi-target models.

  4. Datamining approaches for modeling tumor control probability.

    PubMed

    Naqa, Issam El; Deasy, Joseph O; Mu, Yi; Huang, Ellen; Hope, Andrew J; Lindsay, Patricia E; Apte, Aditya; Alaly, James; Bradley, Jeffrey D

    2010-11-01

    Tumor control probability (TCP) to radiotherapy is determined by complex interactions between tumor biology, tumor microenvironment, radiation dosimetry, and patient-related variables. The complexity of these heterogeneous variable interactions constitutes a challenge for building predictive models for routine clinical practice. We describe a datamining framework that can unravel the higher order relationships among dosimetric dose-volume prognostic variables, interrogate various radiobiological processes, and generalize to unseen data when applied prospectively. Several datamining approaches are discussed that include dose-volume metrics, equivalent uniform dose, mechanistic Poisson model, and model building methods using statistical regression and machine learning techniques. Institutional datasets of non-small cell lung cancer (NSCLC) patients are used to demonstrate these methods. The performance of the different methods was evaluated using bivariate Spearman rank correlations (rs). Over-fitting was controlled via resampling methods. Using a dataset of 56 patients with primary NSCLC tumors and 23 candidate variables, we estimated GTV volume and V75 to be the best model parameters for predicting TCP using statistical resampling and a logistic model. Using these variables, the support vector machine (SVM) kernel method provided superior performance for TCP prediction with an rs=0.68 on leave-one-out testing compared to logistic regression (rs=0.4), Poisson-based TCP (rs=0.33), and cell kill equivalent uniform dose model (rs=0.17). The prediction of treatment response can be improved by utilizing datamining approaches, which are able to unravel important non-linear complex interactions among model variables and have the capacity to predict on unseen data for prospective clinical applications.
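
    A schematic illustration, on simulated data, of comparing a logistic model and an SVM for TCP-style prediction with leave-one-out testing and Spearman's rank correlation. The predictor names gtv_volume and v75 mimic the dosimetric variables, but the data and any resulting correlations are synthetic.

    ```python
    import numpy as np
    from scipy.stats import spearmanr
    from sklearn.linear_model import LogisticRegression
    from sklearn.svm import SVC
    from sklearn.model_selection import LeaveOneOut, cross_val_predict

    rng = np.random.default_rng(7)
    n = 56
    gtv_volume = rng.lognormal(3.5, 0.6, n)
    v75 = rng.uniform(0, 60, n)
    X = np.column_stack([gtv_volume, v75])
    p = 1 / (1 + np.exp(0.02 * gtv_volume - 0.05 * v75))
    y = rng.binomial(1, p)                    # tumor control (1) vs failure (0)

    loo = LeaveOneOut()
    for name, model in [("logistic", LogisticRegression(max_iter=1000)),
                        ("SVM (RBF)", SVC(probability=True, gamma="scale"))]:
        pred = cross_val_predict(model, X, y, cv=loo, method="predict_proba")[:, 1]
        rs, _ = spearmanr(pred, y)
        print(f"{name:10s} leave-one-out Spearman rs = {rs:.2f}")
    ```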

  5. Motoneuron membrane potentials follow a time inhomogeneous jump diffusion process.

    PubMed

    Jahn, Patrick; Berg, Rune W; Hounsgaard, Jørn; Ditlevsen, Susanne

    2011-11-01

    Stochastic leaky integrate-and-fire models are popular due to their simplicity and statistical tractability. They have been widely applied to gain understanding of the underlying mechanisms for spike timing in neurons, and have served as building blocks for more elaborate models. Especially the Ornstein-Uhlenbeck process is popular to describe the stochastic fluctuations in the membrane potential of a neuron, but also other models like the square-root model or models with a non-linear drift are sometimes applied. Data that can be described by such models have to be stationary and thus, the simple models can only be applied over short time windows. However, experimental data show varying time constants, state dependent noise, a graded firing threshold and time-inhomogeneous input. In the present study we build a jump diffusion model that incorporates these features, and introduce a firing mechanism with a state dependent intensity. In addition, we suggest statistical methods to estimate all unknown quantities and apply these to analyze turtle motoneuron membrane potentials. Finally, simulated and real data are compared and discussed. We find that a square-root diffusion describes the data much better than an Ornstein-Uhlenbeck process with constant diffusion coefficient. Further, the membrane time constant decreases with increasing depolarization, as expected from the increase in synaptic conductance. The network activity, which the neuron is exposed to, can be reasonably estimated to be a threshold version of the nerve output from the network. Moreover, the spiking characteristics are well described by a Poisson spike train with an intensity depending exponentially on the membrane potential.

  6. Statistical Maps of Ground Magnetic Disturbance Derived from Global Geospace Models

    NASA Astrophysics Data System (ADS)

    Rigler, E. J.; Wiltberger, M. J.; Love, J. J.

    2017-12-01

    Electric currents in space are the principal driver of magnetic variations measured at Earth's surface. These in turn induce geoelectric fields that present a natural hazard for technological systems like high-voltage power distribution networks. Modern global geospace models can reasonably simulate large-scale geomagnetic response to solar wind variations, but they are less successful at deterministic predictions of intense localized geomagnetic activity that most impacts technological systems on the ground. Still, recent studies have shown that these models can accurately reproduce the spatial statistical distributions of geomagnetic activity, suggesting that their physics are largely correct. Since the magnetosphere is a largely externally driven system, most model-measurement discrepancies probably arise from uncertain boundary conditions. So, with realistic distributions of solar wind parameters to establish its boundary conditions, we use the Lyon-Fedder-Mobarry (LFM) geospace model to build a synthetic multivariate statistical model of gridded ground magnetic disturbance. From this, we analyze the spatial modes of geomagnetic response, regress on available measurements to fill in unsampled locations on the grid, and estimate the global probability distribution of extreme magnetic disturbance. The latter offers a prototype geomagnetic "hazard map", similar to those used to characterize better-known geophysical hazards like earthquakes and floods.

  7. SOCR: Statistics Online Computational Resource

    PubMed Central

    Dinov, Ivo D.

    2011-01-01

    The need for hands-on computer laboratory experience in undergraduate and graduate statistics education has been firmly established in the past decade. As a result a number of attempts have been undertaken to develop novel approaches for problem-driven statistical thinking, data analysis and result interpretation. In this paper we describe an integrated educational web-based framework for: interactive distribution modeling, virtual online probability experimentation, statistical data analysis, visualization and integration. Following years of experience in statistical teaching at all college levels using established licensed statistical software packages, like STATA, S-PLUS, R, SPSS, SAS, Systat, etc., we have attempted to engineer a new statistics education environment, the Statistics Online Computational Resource (SOCR). This resource performs many of the standard types of statistical analysis, much like other classical tools. In addition, it is designed in a plug-in object-oriented architecture and is completely platform independent, web-based, interactive, extensible and secure. Over the past 4 years we have tested, fine-tuned and reanalyzed the SOCR framework in many of our undergraduate and graduate probability and statistics courses and have evidence that SOCR resources build students' intuition and enhance their learning. PMID:21451741

  8. Impact of statistical learning methods on the predictive power of multivariate normal tissue complication probability models.

    PubMed

    Xu, Cheng-Jian; van der Schaaf, Arjen; Schilstra, Cornelis; Langendijk, Johannes A; van't Veld, Aart A

    2012-03-15

    To study the impact of different statistical learning methods on the prediction performance of multivariate normal tissue complication probability (NTCP) models. In this study, three learning methods, stepwise selection, least absolute shrinkage and selection operator (LASSO), and Bayesian model averaging (BMA), were used to build NTCP models of xerostomia following radiotherapy treatment for head and neck cancer. Performance of each learning method was evaluated by a repeated cross-validation scheme in order to obtain a fair comparison among methods. It was found that the LASSO and BMA methods produced models with significantly better predictive power than that of the stepwise selection method. Furthermore, the LASSO method yields an easily interpretable model as the stepwise method does, in contrast to the less intuitive BMA method. The commonly used stepwise selection method, which is simple to execute, may be insufficient for NTCP modeling. The LASSO method is recommended. Copyright © 2012 Elsevier Inc. All rights reserved.
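
    A small sketch, on synthetic data, of the recommended approach: an L1-penalized (LASSO) logistic regression for an NTCP-style endpoint, with the penalty strength chosen by repeated cross-validation so that variable selection and fitting happen together. Sample size and predictors are assumptions for illustration.

    ```python
    import numpy as np
    from sklearn.linear_model import LogisticRegressionCV
    from sklearn.model_selection import RepeatedStratifiedKFold

    rng = np.random.default_rng(8)
    n, p = 200, 15                     # patients, candidate dose/clinical factors
    X = rng.normal(size=(n, p))
    logit = 1.2 * X[:, 0] + 0.8 * X[:, 1] - 1.0
    y = rng.binomial(1, 1 / (1 + np.exp(-logit)))   # complication yes/no

    cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=10, random_state=0)
    lasso = LogisticRegressionCV(penalty="l1", solver="liblinear",
                                 Cs=20, cv=cv, scoring="roc_auc").fit(X, y)
    print("selected predictors:", np.nonzero(lasso.coef_.ravel())[0])
    print("cross-validated AUC grid shape:", lasso.scores_[1].shape)
    ```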

  9. Building the Community Online Resource for Statistical Seismicity Analysis (CORSSA)

    NASA Astrophysics Data System (ADS)

    Michael, A. J.; Wiemer, S.; Zechar, J. D.; Hardebeck, J. L.; Naylor, M.; Zhuang, J.; Steacy, S.; Corssa Executive Committee

    2010-12-01

    Statistical seismology is critical to the understanding of seismicity, the testing of proposed earthquake prediction and forecasting methods, and the assessment of seismic hazard. Unfortunately, despite its importance to seismology - especially to those aspects with great impact on public policy - statistical seismology is mostly ignored in the education of seismologists, and there is no central repository for the existing open-source software tools. To remedy these deficiencies, and with the broader goal to enhance the quality of statistical seismology research, we have begun building the Community Online Resource for Statistical Seismicity Analysis (CORSSA). CORSSA is a web-based educational platform that is authoritative, up-to-date, prominent, and user-friendly. We anticipate that the users of CORSSA will range from beginning graduate students to experienced researchers. More than 20 scientists from around the world met for a week in Zurich in May 2010 to kick-start the creation of CORSSA: the format and initial table of contents were defined; a governing structure was organized; and workshop participants began drafting articles. CORSSA materials are organized with respect to six themes, each containing between four and eight articles. The CORSSA web page, www.corssa.org, officially unveiled on September 6, 2010, debuts with an initial set of approximately 10 to 15 articles available online for viewing and commenting with additional articles to be added over the coming months. Each article will be peer-reviewed and will present a balanced discussion, including illustrative examples and code snippets. Topics in the initial set of articles will include: introductions to both CORSSA and statistical seismology, basic statistical tests and their role in seismology; understanding seismicity catalogs and their problems; basic techniques for modeling seismicity; and methods for testing earthquake predictability hypotheses. A special article will compare and review available statistical seismology software packages.

  10. Using Patient Demographics and Statistical Modeling to Predict Knee Tibia Component Sizing in Total Knee Arthroplasty.

    PubMed

    Ren, Anna N; Neher, Robert E; Bell, Tyler; Grimm, James

    2018-06-01

    Preoperative planning is important to achieve successful implantation in primary total knee arthroplasty (TKA). However, traditional TKA templating techniques are not accurate enough to predict the component size to a very close range. With the goal of developing a general predictive statistical model using patient demographic information, ordinal logistic regression was applied to build a proportional odds model to predict the tibia component size. The study retrospectively collected the data of 1992 primary Persona Knee System TKA procedures. Of them, 199 procedures were randomly selected as testing data and the rest of the data were randomly partitioned between model training data and model evaluation data with a ratio of 7:3. Different models were trained and evaluated on the training and validation data sets after data exploration. The final model had patient gender, age, weight, and height as independent variables and predicted the tibia size within 1 size difference 96% of the time on the validation data, 94% of the time on the testing data, and 92% on a prospective cadaver data set. The study results indicated the statistical model built by ordinal logistic regression can increase the accuracy of tibia sizing information for Persona Knee preoperative templating. This research shows statistical modeling may be used with radiographs to dramatically enhance the templating accuracy, efficiency, and quality. In general, this methodology can be applied to other TKA products when the data are applicable. Copyright © 2018 Elsevier Inc. All rights reserved.
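
    A hedged sketch of a proportional-odds (ordinal logistic) model predicting an ordered implant size from demographics, in the spirit of the study. It uses statsmodels' OrderedModel (available in recent statsmodels releases); the data and size categories are simulated, not the Persona Knee data.

    ```python
    import numpy as np
    import pandas as pd
    from statsmodels.miscmodels.ordinal_model import OrderedModel

    rng = np.random.default_rng(9)
    n = 1000
    df = pd.DataFrame({
        "female": rng.integers(0, 2, n),
        "age": rng.uniform(50, 85, n),
        "height_cm": rng.normal(170, 9, n),
        "weight_kg": rng.normal(82, 14, n),
    })
    latent = 0.12 * df.height_cm + 0.03 * df.weight_kg - 2.0 * df.female
    df["tibia_size"] = pd.cut(latent + rng.logistic(0, 1, n), bins=6, labels=False)

    exog = df[["female", "age", "height_cm", "weight_kg"]]
    res = OrderedModel(df["tibia_size"], exog, distr="logit").fit(method="bfgs", disp=False)

    probs = np.asarray(res.predict(exog))       # probability of each size category
    pred = probs.argmax(axis=1)
    within_one = np.mean(np.abs(pred - df["tibia_size"].to_numpy()) <= 1)
    print(f"predicted within one size of actual: {within_one:.1%}")
    ```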

  11. Microstructure characterization of multi-phase composites and utilization of phase change materials and recycled rubbers in cementitious materials

    NASA Astrophysics Data System (ADS)

    Meshgin, Pania

    2011-12-01

    This research focuses on two important subjects: (1) Characterization of heterogeneous microstructure of multi-phase composites and the effect of microstructural features on effective properties of the material. (2) Utilizations of phase change materials and recycled rubber particles from waste tires to improve thermal properties of insulation materials used in building envelopes. Spatial pattern of multi-phase and multidimensional internal structures of most composite materials are highly random. Quantitative description of the spatial distribution should be developed based on proper statistical models, which characterize the morphological features. For a composite material with multi-phases, the volume fraction of the phases as well as the morphological parameters of the phases have very strong influences on the effective property of the composite. These morphological parameters depend on the microstructure of each phase. This study intends to include the effect of higher order morphological details of the microstructure in the composite models. The higher order statistics, called two-point correlation functions characterize various behaviors of the composite at any two points in a stochastic field. Specifically, correlation functions of mosaic patterns are used in the study for characterizing transport properties of composite materials. One of the most effective methods to improve energy efficiency of buildings is to enhance thermal properties of insulation materials. The idea of using phase change materials and recycled rubber particles such as scrap tires in insulation materials for building envelopes has been studied.

  12. Motifs in triadic random graphs based on Steiner triple systems

    NASA Astrophysics Data System (ADS)

    Winkler, Marco; Reichardt, Jörg

    2013-08-01

    Conventionally, pairwise relationships between nodes are considered to be the fundamental building blocks of complex networks. However, over the last decade, the overabundance of certain subnetwork patterns, i.e., the so-called motifs, has attracted much attention. It has been hypothesized that these motifs, instead of links, serve as the building blocks of network structures. Although the relation between a network's topology and the general properties of the system, such as its function, its robustness against perturbations, or its efficiency in spreading information, is the central theme of network science, there is still a lack of sound generative models needed for testing the functional role of subgraph motifs. Our work aims to overcome this limitation. We employ the framework of exponential random graph models (ERGMs) to define models based on triadic substructures. The fact that only a small portion of triads can actually be set independently poses a challenge for the formulation of such models. To overcome this obstacle, we use Steiner triple systems (STSs). These are partitions of sets of nodes into pair-disjoint triads, which thus can be specified independently. Combining the concepts of ERGMs and STSs, we suggest generative models capable of generating ensembles of networks with nontrivial triadic Z-score profiles. Further, we discover inevitable correlations between the abundance of triad patterns, which occur solely for statistical reasons and need to be taken into account when discussing the functional implications of motif statistics. Moreover, we calculate the degree distributions of our triadic random graphs analytically.

  13. Parametric Study of Urban-Like Topographic Statistical Moments Relevant to a Priori Modelling of Bulk Aerodynamic Parameters

    NASA Astrophysics Data System (ADS)

    Zhu, Xiaowei; Iungo, G. Valerio; Leonardi, Stefano; Anderson, William

    2017-02-01

    For a horizontally homogeneous, neutrally stratified atmospheric boundary layer (ABL), aerodynamic roughness length, z_0, is the effective elevation at which the streamwise component of mean velocity is zero. A priori prediction of z_0 based on topographic attributes remains an open line of inquiry in planetary boundary-layer research. Urban topographies - the topic of this study - exhibit spatial heterogeneities associated with variability of building height, width, and proximity with adjacent buildings; such variability renders a priori, prognostic z_0 models appealing. Here, large-eddy simulation (LES) has been used in an extensive parametric study to characterize the ABL response (and z_0) to a range of synthetic, urban-like topographies wherein statistical moments of the topography have been systematically varied. Using LES results, we determined the hierarchical influence of topographic moments relevant to setting z_0. We demonstrate that standard deviation and skewness are important, while kurtosis is negligible. This finding is reconciled with a model recently proposed by Flack and Schultz (J Fluids Eng 132:041203-1-041203-10, 2010), who demonstrate that z_0 can be modelled with standard deviation and skewness, and two empirical coefficients (one for each moment). We find that the empirical coefficient related to skewness is not constant, but exhibits a dependence on standard deviation over certain ranges. For idealized, quasi-uniform cubic topographies and for complex, fully random urban-like topographies, we demonstrate strong performance of the generalized Flack and Schultz model against contemporary roughness correlations.

  14. Development of a funding, cost, and spending model for satellite projects

    NASA Technical Reports Server (NTRS)

    Johnson, Jesse P.

    1989-01-01

    The need for a predictive budget/funding model is obvious. The current models used by the Resource Analysis Office (RAO) are used to predict the total costs of satellite projects. An effort to extend the modeling capabilities from total budget analysis to total budget and budget outlays over time analysis was conducted. A statistically based and data-driven methodology was used to derive and develop the model. The budget data for the last 18 GSFC-sponsored satellite projects were analyzed and used to build a funding model which would describe the historical spending patterns. This raw data consisted of dollars spent in that specific year and their 1989 dollar equivalent. This data was converted to the standard format used by the RAO group and placed in a database. A simple statistical analysis was performed to calculate the gross statistics associated with project length and project cost and the conditional statistics on project length and project cost. The modeling approach used is derived from the theory of embedded statistics which states that properly analyzed data will produce the underlying generating function. The process of funding large-scale projects over extended periods of time is described by Life Cycle Cost Models (LCCM). The data was analyzed to find a model in the generic form of an LCCM. The model developed is based on a Weibull function whose parameters are found by both nonlinear optimization and nonlinear regression. In order to use this model it is necessary to transform the problem from a dollar/time space to a percentage of total budget/time space. This transformation is equivalent to moving to a probability space. By using the basic rules of probability, the validity of both the optimization and the regression steps is ensured. This statistically significant model is then integrated and inverted. The resulting output represents a project schedule which relates the amount of money spent to the percentage of project completion.
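
    A minimal sketch of the core idea: convert yearly outlays to a cumulative fraction of total budget (the "percentage of total budget/time space"), then fit a Weibull cumulative distribution function to that spending profile by nonlinear least squares. The yearly figures are invented for illustration.

    ```python
    import numpy as np
    from scipy.optimize import curve_fit

    def weibull_cdf(t, k, lam):
        """Fraction of total budget spent by time t (years since project start)."""
        return 1.0 - np.exp(-(t / lam) ** k)

    years = np.arange(1, 9)
    outlays = np.array([2.0, 6.0, 14.0, 22.0, 24.0, 18.0, 10.0, 4.0])  # $M per year
    cum_fraction = np.cumsum(outlays) / outlays.sum()   # percentage-of-budget space

    (k, lam), _ = curve_fit(weibull_cdf, years, cum_fraction, p0=(2.0, 4.0))
    print(f"shape k = {k:.2f}, scale lambda = {lam:.2f} years")
    print("predicted fraction spent by year 5:", round(weibull_cdf(5, k, lam), 3))
    ```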

  15. Forecasting volatility with neural regression: a contribution to model adequacy.

    PubMed

    Refenes, A N; Holt, W T

    2001-01-01

    Neural nets' usefulness for forecasting is limited by problems of overfitting and the lack of rigorous procedures for model identification, selection and adequacy testing. This paper describes a methodology for neural model misspecification testing. We introduce a generalization of the Durbin-Watson statistic for neural regression and discuss the general issues of misspecification testing using residual analysis. We derive a generalized influence matrix for neural estimators which enables us to evaluate the distribution of the statistic. We deploy Monte Carlo simulation to compare the power of the test for neural and linear regressors. While residual testing is not a sufficient condition for model adequacy, it is nevertheless a necessary condition to demonstrate that the model is a good approximation to the data generating process, particularly as neural-network estimation procedures are susceptible to partial convergence. The work is also an important step toward developing rigorous procedures for neural model identification, selection and adequacy testing which have started to appear in the literature. We demonstrate its applicability in the nontrivial problem of forecasting implied volatility innovations using high-frequency stock index options. Each step of the model building process is validated using statistical tests to verify variable significance and model adequacy with the results confirming the presence of nonlinear relationships in implied volatility innovations.
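
    A short sketch of the residual diagnostic discussed here: compute the Durbin-Watson statistic on regression residuals to check for first-order serial correlation. For simplicity the fit below is ordinary linear regression on simulated data; the paper's generalization replaces the linear fit with a neural regressor, but the statistic has the same form.

    ```python
    import numpy as np
    from statsmodels.stats.stattools import durbin_watson

    rng = np.random.default_rng(10)
    n = 300
    x = rng.normal(size=n)
    # AR(1) errors induce the serial correlation the test should flag
    e = np.zeros(n)
    for t in range(1, n):
        e[t] = 0.6 * e[t - 1] + rng.normal(0, 0.5)
    y = 1.0 + 2.0 * x + e

    beta, alpha = np.polyfit(x, y, 1)          # slope, intercept
    resid = y - (alpha + beta * x)
    dw = durbin_watson(resid)                  # ~2 means no autocorrelation, <2 positive
    print(f"Durbin-Watson statistic: {dw:.2f}")
    ```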

  16. Statistical Mechanics of the US Supreme Court

    NASA Astrophysics Data System (ADS)

    Lee, Edward D.; Broedersz, Chase P.; Bialek, William

    2015-07-01

    We build simple models for the distribution of voting patterns in a group, using the Supreme Court of the United States as an example. The maximum entropy model consistent with the observed pairwise correlations among justices' votes, an Ising spin glass, agrees quantitatively with the data. While all correlations (perhaps surprisingly) are positive, the effective pairwise interactions in the spin glass model have both signs, recovering the intuition that ideologically opposite justices negatively influence one another. Despite the competing interactions, a strong tendency toward unanimity emerges from the model, organizing the voting patterns in a relatively simple "energy landscape." Besides unanimity, other energy minima in this landscape, or maxima in probability, correspond to prototypical voting states, such as the ideological split or a tightly correlated, conservative core. The model correctly predicts the correlation of justices with the majority and gives us a measure of their influence on the majority decision. These results suggest that simple models, grounded in statistical physics, can capture essential features of collective decision making quantitatively, even in a complex political context.
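
    A toy sketch of the kind of pairwise maximum-entropy (Ising) model used here: given fields h and couplings J (chosen arbitrarily below, not fitted to Court data), enumerate all voting states of a nine-member group and compute their Boltzmann probabilities, illustrating how an "energy landscape" over vote patterns arises.

    ```python
    import itertools
    import numpy as np

    rng = np.random.default_rng(11)
    n = 9                                       # nine justices
    h = rng.normal(0.0, 0.2, n)                 # individual voting biases
    J = rng.normal(0.3, 0.2, (n, n))
    J = (J + J.T) / 2
    np.fill_diagonal(J, 0.0)                    # symmetric couplings, no self-interaction

    states = np.array(list(itertools.product([-1, 1], repeat=n)))  # all 2^9 vote patterns
    energy = -states @ h - 0.5 * np.einsum("si,ij,sj->s", states, J, states)
    prob = np.exp(-energy)
    prob /= prob.sum()                          # Boltzmann distribution over vote patterns

    for s in np.argsort(prob)[::-1][:3]:        # three most probable voting states
        print("votes:", states[s], f"P = {prob[s]:.3f}")
    print("probability of unanimity:",
          float(prob[np.abs(states.sum(axis=1)) == n].sum()))
    ```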

  17. Interactive classification and content-based retrieval of tissue images

    NASA Astrophysics Data System (ADS)

    Aksoy, Selim; Marchisio, Giovanni B.; Tusk, Carsten; Koperski, Krzysztof

    2002-11-01

    We describe a system for interactive classification and retrieval of microscopic tissue images. Our system models tissues in pixel, region and image levels. Pixel level features are generated using unsupervised clustering of color and texture values. Region level features include shape information and statistics of pixel level feature values. Image level features include statistics and spatial relationships of regions. To reduce the gap between low-level features and high-level expert knowledge, we define the concept of prototype regions. The system learns the prototype regions in an image collection using model-based clustering and density estimation. Different tissue types are modeled using spatial relationships of these regions. Spatial relationships are represented by fuzzy membership functions. The system automatically selects significant relationships from training data and builds models which can also be updated using user relevance feedback. A Bayesian framework is used to classify tissues based on these models. Preliminary experiments show that the spatial relationship models we developed provide a flexible and powerful framework for classification and retrieval of tissue images.

  18. Derivation of the Statistical Distribution of the Mass Peak Centroids of Mass Spectrometers Employing Analog-to-Digital Converters and Electron Multipliers

    DOE PAGES

    Ipsen, Andreas

    2017-02-03

    Here, the mass peak centroid is a quantity that is at the core of mass spectrometry (MS). However, despite its central status in the field, models of its statistical distribution are often chosen quite arbitrarily and without attempts at establishing a proper theoretical justification for their use. Recent work has demonstrated that for mass spectrometers employing analog-to-digital converters (ADCs) and electron multipliers, the statistical distribution of the mass peak intensity can be described via a relatively simple model derived essentially from first principles. Building on this result, the following article derives the corresponding statistical distribution for the mass peak centroids of such instruments. It is found that for increasing signal strength, the centroid distribution converges to a Gaussian distribution whose mean and variance are determined by physically meaningful parameters and which in turn determine bias and variability of the m/z measurements of the instrument. Through the introduction of the concept of “pulse-peak correlation”, the model also elucidates the complicated relationship between the shape of the voltage pulses produced by the preamplifier and the mean and variance of the centroid distribution. The predictions of the model are validated with empirical data and with Monte Carlo simulations.

  19. Derivation of the Statistical Distribution of the Mass Peak Centroids of Mass Spectrometers Employing Analog-to-Digital Converters and Electron Multipliers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ipsen, Andreas

    Here, the mass peak centroid is a quantity that is at the core of mass spectrometry (MS). However, despite its central status in the field, models of its statistical distribution are often chosen quite arbitrarily and without attempts at establishing a proper theoretical justification for their use. Recent work has demonstrated that for mass spectrometers employing analog-to-digital converters (ADCs) and electron multipliers, the statistical distribution of the mass peak intensity can be described via a relatively simple model derived essentially from first principles. Building on this result, the following article derives the corresponding statistical distribution for the mass peak centroids of such instruments. It is found that for increasing signal strength, the centroid distribution converges to a Gaussian distribution whose mean and variance are determined by physically meaningful parameters and which in turn determine bias and variability of the m/z measurements of the instrument. Through the introduction of the concept of “pulse-peak correlation”, the model also elucidates the complicated relationship between the shape of the voltage pulses produced by the preamplifier and the mean and variance of the centroid distribution. The predictions of the model are validated with empirical data and with Monte Carlo simulations.

  20. Inevitable end-of-21st-century trends toward earlier surface runoff timing in California's Sierra Nevada Mountains

    NASA Astrophysics Data System (ADS)

    Schwartz, M. A.; Hall, A. D.; Sun, F.; Walton, D.; Berg, N.

    2015-12-01

    Hybrid dynamical-statistical downscaling is used to produce surface runoff timing projections for California's Sierra Nevada, a high-elevation mountain range with significant seasonal snow cover. First, future climate change projections (RCP8.5 forcing scenario, 2081-2100 period) from five CMIP5 global climate models (GCMs) are dynamically downscaled. These projections reveal that future warming leads to a shift toward earlier snowmelt and surface runoff timing throughout the Sierra Nevada region. Relationships between warming and surface runoff timing from the dynamical simulations are used to build a simple statistical model that mimics the dynamical model's projected surface runoff timing changes given GCM input or other statistically-downscaled input. This statistical model can be used to produce surface runoff timing projections for other GCMs, periods, and forcing scenarios to quantify ensemble-mean changes, uncertainty due to intermodel variability and consequences stemming from choice of forcing scenario. For all CMIP5 GCMs and forcing scenarios, significant trends toward earlier surface runoff timing occur at elevations below 2500m. Thus, we conclude that trends toward earlier surface runoff timing by the end-of-the-21st century are inevitable. The changes to surface runoff timing diagnosed in this study have implications for many dimensions of climate change, including impacts on surface hydrology, water resources, and ecosystems.

  1. Decision tree analysis of factors influencing rainfall-related building damage

    NASA Astrophysics Data System (ADS)

    Spekkers, M. H.; Kok, M.; Clemens, F. H. L. R.; ten Veldhuis, J. A. E.

    2014-04-01

    Flood damage prediction models are essential building blocks in flood risk assessments. Little research has been dedicated so far to damage from small-scale urban floods caused by heavy rainfall, while there is a need for reliable damage models for this flood type among insurers and water authorities. The aim of this paper is to investigate a wide range of damage-influencing factors and their relationships with rainfall-related damage, using decision tree analysis. For this, district-aggregated claim data from private property insurance companies in the Netherlands were analysed, for the period of 1998-2011. The databases include claims of water-related damage, for example, damage related to rainwater intrusion through roofs and pluvial flood water entering buildings at the ground floor. Response variables being modelled are average claim size and claim frequency, per district per day. The set of predictors includes rainfall-related variables derived from weather radar images, topographic variables from a digital terrain model, building-related variables and socioeconomic indicators of households. Analyses were made separately for property and content damage claim data. Results of decision tree analysis show that claim frequency is most strongly associated with maximum hourly rainfall intensity, followed by real estate value, ground floor area, household income, season (property data only), building age (property data only), ownership structure (content data only) and fraction of low-rise buildings (content data only). It was not possible to develop statistically acceptable trees for average claim size, which suggests that variability in average claim size is related to explanatory variables that cannot be defined at the district scale. Cross-validation results show that decision trees were able to predict 22-26% of variance in claim frequency, considerably better than results from global multiple regression models (11-18% of variance explained). Still, a large part of the variance in claim frequency is left unexplained, which is likely to be caused by variations in data at subdistrict scale and missing explanatory variables.
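
    For readers unfamiliar with this style of analysis, the short Python sketch below shows the general mechanics of regression-tree fitting and cross-validated variance explained on synthetic district-day data; the predictor names and the data-generating process are purely hypothetical and are not the insurance data used in the study.

        import numpy as np
        from sklearn.tree import DecisionTreeRegressor
        from sklearn.model_selection import cross_val_score

        rng = np.random.default_rng(1)
        n = 1000
        # Hypothetical district-day predictors (names illustrative only)
        max_hourly_rain = rng.gamma(2.0, 3.0, n)          # mm/h
        real_estate_value = rng.normal(250, 60, n)        # k-euro
        ground_floor_area = rng.normal(90, 20, n)         # m^2
        X = np.column_stack([max_hourly_rain, real_estate_value, ground_floor_area])
        # Synthetic claim frequency dominated by rainfall intensity, plus noise
        y = 0.05 * max_hourly_rain ** 1.5 + 0.002 * real_estate_value + rng.normal(0, 1, n)

        tree = DecisionTreeRegressor(max_depth=4, min_samples_leaf=50)
        r2_scores = cross_val_score(tree, X, y, cv=10, scoring="r2")
        print("cross-validated variance explained:", r2_scores.mean())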

  2. Finite-sample and asymptotic sign-based tests for parameters of non-linear quantile regression with Markov noise

    NASA Astrophysics Data System (ADS)

    Sirenko, M. A.; Tarasenko, P. F.; Pushkarev, M. I.

    2017-01-01

    One of the most noticeable features of sign-based statistical procedures is the opportunity to build an exact test for simple hypothesis testing of parameters in a regression model. In this article, we expanded the sign-based approach to the nonlinear case with dependent noise. The examined model is a multi-quantile regression, which makes it possible to test hypotheses not only about regression parameters but also about noise parameters.
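
    As a rough illustration of the sign-based idea in its simplest iid, single-quantile form, rather than the Markov-noise, multi-quantile setting studied here, the Python sketch below tests a hypothesized parameter vector by noting that, under the null, residual signs are independent Bernoulli draws, so the number of positive residuals follows an exact binomial law. The data-generating model and parameter values are invented for illustration.

        import numpy as np
        from scipy.stats import binomtest

        rng = np.random.default_rng(2)
        tau = 0.5                                             # quantile of interest (median)
        x = rng.uniform(0, 10, 200)
        y = 1.0 + 0.5 * x + rng.standard_t(df=3, size=200)    # heavy-tailed noise

        def sign_test_pvalue(beta0, beta1):
            # Under H0 the residual signs are iid Bernoulli(1 - tau); this ignores
            # the dependent (Markov) noise handled by the article's procedures.
            residuals = y - (beta0 + beta1 * x)
            k = int(np.sum(residuals > 0))
            return binomtest(k, n=len(y), p=1 - tau).pvalue

        print("p-value at true parameters:", sign_test_pvalue(1.0, 0.5))
        print("p-value at a wrong slope:  ", sign_test_pvalue(1.0, 0.8))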

  3. Best practices for evaluating the capability of nondestructive evaluation (NDE) and structural health monitoring (SHM) techniques for damage characterization

    NASA Astrophysics Data System (ADS)

    Aldrin, John C.; Annis, Charles; Sabbagh, Harold A.; Lindgren, Eric A.

    2016-02-01

    A comprehensive approach to NDE and SHM characterization error (CE) evaluation is presented that follows the framework of the `ahat-versus-a' regression analysis for POD assessment. Characterization capability evaluation is typically more complex than current POD evaluations and thus requires engineering and statistical expertise in the model-building process to ensure all key effects and interactions are addressed. Justifying the statistical model choice with underlying assumptions is key. Several sizing case studies are presented with detailed evaluations of the most appropriate statistical model for each data set. The use of a model-assisted approach is introduced to help assess the reliability of NDE and SHM characterization capability under a wide range of part, environmental and damage conditions. Best practices for using models are presented for both eddy current NDE sizing and vibration-based SHM case studies. The results of these studies highlight the general protocol feasibility, emphasize the importance of evaluating key application characteristics prior to the study, and demonstrate an approach to quantify the role of varying SHM sensor durability and environmental conditions on characterization performance.

  4. LATENT SPACE MODELS FOR MULTIVIEW NETWORK DATA

    PubMed Central

    Salter-Townshend, Michael; McCormick, Tyler H.

    2018-01-01

    Social relationships consist of interactions along multiple dimensions. In social networks, this means that individuals form multiple types of relationships with the same person (e.g., an individual will not trust all of his/her acquaintances). Statistical models for these data require understanding two related types of dependence structure: (i) structure within each relationship type, or network view, and (ii) the association between views. In this paper, we propose a statistical framework that parsimoniously represents dependence between relationship types while also maintaining enough flexibility to allow individuals to serve different roles in different relationship types. Our approach builds on work on latent space models for networks [see, e.g., J. Amer. Statist. Assoc. 97 (2002) 1090–1098]. These models represent the propensity for two individuals to form edges as conditionally independent given the distance between the individuals in an unobserved social space. Our work departs from previous work in this area by representing dependence structure between network views through a multivariate Bernoulli likelihood, providing a representation of between-view association. This approach infers correlations between views not explained by the latent space model. Using our method, we explore 6 multiview network structures across 75 villages in rural southern Karnataka, India [Banerjee et al. (2013)]. PMID:29721127

  5. LATENT SPACE MODELS FOR MULTIVIEW NETWORK DATA.

    PubMed

    Salter-Townshend, Michael; McCormick, Tyler H

    2017-09-01

    Social relationships consist of interactions along multiple dimensions. In social networks, this means that individuals form multiple types of relationships with the same person (e.g., an individual will not trust all of his/her acquaintances). Statistical models for these data require understanding two related types of dependence structure: (i) structure within each relationship type, or network view, and (ii) the association between views. In this paper, we propose a statistical framework that parsimoniously represents dependence between relationship types while also maintaining enough flexibility to allow individuals to serve different roles in different relationship types. Our approach builds on work on latent space models for networks [see, e.g., J. Amer. Statist. Assoc. 97 (2002) 1090-1098]. These models represent the propensity for two individuals to form edges as conditionally independent given the distance between the individuals in an unobserved social space. Our work departs from previous work in this area by representing dependence structure between network views through a multivariate Bernoulli likelihood, providing a representation of between-view association. This approach infers correlations between views not explained by the latent space model. Using our method, we explore 6 multiview network structures across 75 villages in rural southern Karnataka, India [Banerjee et al. (2013)].

  6. Steganalysis of recorded speech

    NASA Astrophysics Data System (ADS)

    Johnson, Micah K.; Lyu, Siwei; Farid, Hany

    2005-03-01

    Digital audio provides a suitable cover for high-throughput steganography. At 16 bits per sample and sampled at a rate of 44,100 Hz, digital audio has the bit-rate to support large messages. In addition, audio is often transient and unpredictable, facilitating the hiding of messages. Using an approach similar to our universal image steganalysis, we show that hidden messages alter the underlying statistics of audio signals. Our statistical model begins by building a linear basis that captures certain statistical properties of audio signals. A low-dimensional statistical feature vector is extracted from this basis representation and used by a non-linear support vector machine for classification. We show the efficacy of this approach on LSB embedding and Hide4PGP. While no explicit assumptions about the content of the audio are made, our technique has been developed and tested on high-quality recorded speech.

  7. Overcoming Student Disengagement and Anxiety in Theory, Methods, and Statistics Courses by Building a Community of Learners

    ERIC Educational Resources Information Center

    Macheski, Ginger E.; Buhrmann, Jan; Lowney, Kathleen S.; Bush, Melanie E. L.

    2008-01-01

    Participants in the 2007 American Sociological Association teaching workshop, "Innovative Teaching Practices for Difficult Subjects," shared concerns about teaching statistics, research methods, and theory. Strategies for addressing these concerns center on building a community of learners by creating three processes throughout the course: 1) an…

  8. Subcellular localization for Gram positive and Gram negative bacterial proteins using linear interpolation smoothing model.

    PubMed

    Saini, Harsh; Raicar, Gaurav; Dehzangi, Abdollah; Lal, Sunil; Sharma, Alok

    2015-12-07

    Protein subcellular localization is an important topic in proteomics since it is related to a protein's overall function, helps in the understanding of metabolic pathways, and aids in drug design and discovery. In this paper, a basic approximation technique from natural language processing called the linear interpolation smoothing model is applied for predicting protein subcellular localizations. The proposed approach extracts features from syntactical information in protein sequences to build probabilistic profiles using dependency models, which are used in linear interpolation to determine how likely a sequence is to belong to a particular subcellular location. This technique builds a statistical model based on maximum likelihood. It is able to deal effectively with high dimensionality that hinders other traditional classifiers such as Support Vector Machines or k-Nearest Neighbours without sacrificing performance. This approach has been evaluated by predicting subcellular localizations of Gram positive and Gram negative bacterial proteins. Copyright © 2015 Elsevier Ltd. All rights reserved.
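
    A toy Python sketch of the underlying idea, linear interpolation between a higher-order and a lower-order dependency model (here a bigram and a unigram over sequence characters), is given below; the sequences, interpolation weight and two-class setup are invented for illustration and are far simpler than the profiles used in the paper.

        import math
        from collections import Counter

        def build_profile(sequences, lam=0.7):
            """Unigram/bigram profile for one class (e.g., one subcellular location)."""
            uni, bi = Counter(), Counter()
            for seq in sequences:
                uni.update(seq)
                bi.update(zip(seq, seq[1:]))
            total = sum(uni.values())

            def log_likelihood(seq):
                lp = 0.0
                for prev, cur in zip(seq, seq[1:]):
                    p_uni = uni[cur] / total if total else 0.0
                    p_bi = bi[(prev, cur)] / uni[prev] if uni[prev] else 0.0
                    lp += math.log(lam * p_bi + (1 - lam) * p_uni + 1e-12)  # floor avoids log(0)
                return lp

            return log_likelihood

        # Toy usage: two "locations" with different residue statistics
        cyto = build_profile(["MKKLLA", "MKKAAL"])
        memb = build_profile(["MLLLVV", "MLVVLL"])
        query = "MKKLAA"
        print("predicted:", "cytoplasm" if cyto(query) > memb(query) else "membrane")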

  9. Cross-Validation of Survival Bump Hunting by Recursive Peeling Methods.

    PubMed

    Dazard, Jean-Eudes; Choe, Michael; LeBlanc, Michael; Rao, J Sunil

    2014-08-01

    We introduce a survival/risk bump hunting framework to build a bump hunting model with a possibly censored time-to-event type of response and to validate model estimates. First, we describe the use of adequate survival peeling criteria to build a survival/risk bump hunting model based on recursive peeling methods. Our method called "Patient Recursive Survival Peeling" is a rule-induction method that makes use of specific peeling criteria such as hazard ratio or log-rank statistics. Second, to validate our model estimates and improve survival prediction accuracy, we describe a resampling-based validation technique specifically designed for the joint task of decision rule making by recursive peeling (i.e. decision-box) and survival estimation. This alternative technique, called "combined" cross-validation is done by combining test samples over the cross-validation loops, a design allowing for bump hunting by recursive peeling in a survival setting. We provide empirical results showing the importance of cross-validation and replication.

  10. Cross-Validation of Survival Bump Hunting by Recursive Peeling Methods

    PubMed Central

    Dazard, Jean-Eudes; Choe, Michael; LeBlanc, Michael; Rao, J. Sunil

    2015-01-01

    We introduce a survival/risk bump hunting framework to build a bump hunting model with a possibly censored time-to-event type of response and to validate model estimates. First, we describe the use of adequate survival peeling criteria to build a survival/risk bump hunting model based on recursive peeling methods. Our method called “Patient Recursive Survival Peeling” is a rule-induction method that makes use of specific peeling criteria such as hazard ratio or log-rank statistics. Second, to validate our model estimates and improve survival prediction accuracy, we describe a resampling-based validation technique specifically designed for the joint task of decision rule making by recursive peeling (i.e. decision-box) and survival estimation. This alternative technique, called “combined” cross-validation is done by combining test samples over the cross-validation loops, a design allowing for bump hunting by recursive peeling in a survival setting. We provide empirical results showing the importance of cross-validation and replication. PMID:26997922

  11. Impact of the buildings areas on the fire incidence.

    PubMed

    Srekl, Jože; Golob, Janvit

    2010-03-01

    A survey of statistical studies shows that the probability of fires is expressed by the equation P(A) = KA^α, where A is the total floor area of the building and K and α are constants for an individual group, or risk category. This equation, which is based on the statistical data on fires in Great Britain, does not include impact factors such as the number of employees and the activities carried out in these buildings. In order to find out possible correlations between the activities carried out in buildings, the characteristics of buildings and the number of fires, we used a random sample of 134 buildings, including industrial facilities, hotels, restaurants, warehouses and shopping malls. Our study shows that the floor area of buildings has a low impact on the incidence of fires. After analysing the sample of buildings using multivariate analysis, we demonstrated a correlation between the number of fires, the floor area of the objects, the work operation period (per day) and the number of employees in the objects.
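
    A minimal Python sketch of fitting the cited power law P(A) = KA^α by ordinary least squares on the log-log scale is shown below; the area/probability pairs are made-up numbers for one hypothetical risk category, not the British fire statistics behind the original relation.

        import numpy as np

        # Hypothetical (floor area, estimated fire probability) pairs for one risk category
        A = np.array([200., 500., 1200., 3000., 8000., 20000.])   # floor area, m^2
        p = np.array([0.004, 0.008, 0.015, 0.027, 0.050, 0.090])  # estimated P(fire)

        # Fit P(A) = K * A**alpha by least squares on log P = log K + alpha * log A
        alpha, logK = np.polyfit(np.log(A), np.log(p), 1)
        K = np.exp(logK)
        print(f"K = {K:.4g}, alpha = {alpha:.3f}")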

  12. High resolution tempo-spatial ozone prediction with SVM and LSTM

    NASA Astrophysics Data System (ADS)

    Gao, D.; Zhang, Y.; Qu, Z.; Sadighi, K.; Coffey, E.; LIU, Q.; Hannigan, M.; Henze, D. K.; Dick, R.; Shang, L.; Lv, Q.

    2017-12-01

    To investigate and predict the exposure of ozone and other pollutants in urban areas, we utilize data from various infrastructures including EPA, NOAA and RIITS from the government of Los Angeles and construct statistical models to predict ozone concentration in the Los Angeles area at finer spatial and temporal granularity. Our work involves cyber data such as traffic, road and population data as features for prediction. Two statistical models, Support Vector Machine (SVM) and Long Short-term Memory (LSTM, a deep learning method), are used for prediction. Our experiments show that kernelized SVM gains better prediction performance when taking traffic counts, road density and population density as features, with a prediction RMSE of 7.99 ppb for all-time ozone and 6.92 ppb for peak-value ozone. With simulated NOx from a Chemical Transport Model (CTM) as features, SVM generates even better prediction performance, with a prediction RMSE of 6.69 ppb. We also build an LSTM, which has shown great advantages in dealing with temporal sequences, to predict ozone concentration by treating ozone concentration as a spatial-temporal sequence. Trained on ozone concentration measurements from the 13 EPA stations in the LA area, the model achieves 4.45 ppb RMSE. In addition, we build a variant of this model which adds spatial dynamics in the form of a transition matrix that reveals new knowledge on pollutant transition. The forgetting gate of the trained LSTM is consistent with the delay effect of ozone concentration, and the trained transition matrix shows spatial consistency with the common direction of winds in the LA area.
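
    The sketch below illustrates, in Python with scikit-learn, the general shape of such an SVM approach (here support vector regression with an RBF kernel) trained on cyber features and evaluated by RMSE; the synthetic traffic, road-density and population inputs and the noise level are illustrative assumptions, not the Los Angeles data.

        import numpy as np
        from sklearn.svm import SVR
        from sklearn.preprocessing import StandardScaler
        from sklearn.pipeline import make_pipeline
        from sklearn.model_selection import train_test_split
        from sklearn.metrics import mean_squared_error

        rng = np.random.default_rng(3)
        n = 800
        traffic = rng.gamma(2.0, 500, n)            # hypothetical traffic counts
        road_density = rng.uniform(0, 20, n)        # hypothetical km of road per cell
        population = rng.gamma(2.0, 2000, n)        # hypothetical population per cell
        X = np.column_stack([traffic, road_density, population])
        ozone = 40 + 0.004 * traffic - 0.3 * road_density + rng.normal(0, 8, n)  # ppb

        X_tr, X_te, y_tr, y_te = train_test_split(X, ozone, test_size=0.25, random_state=0)
        model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=1.0))
        model.fit(X_tr, y_tr)
        rmse = mean_squared_error(y_te, model.predict(X_te)) ** 0.5
        print(f"RMSE: {rmse:.2f} ppb")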

  13. Dose response explorer: an integrated open-source tool for exploring and modelling radiotherapy dose volume outcome relationships

    NASA Astrophysics Data System (ADS)

    El Naqa, I.; Suneja, G.; Lindsay, P. E.; Hope, A. J.; Alaly, J. R.; Vicic, M.; Bradley, J. D.; Apte, A.; Deasy, J. O.

    2006-11-01

    Radiotherapy treatment outcome models are a complicated function of treatment, clinical and biological factors. Our objective is to provide clinicians and scientists with an accurate, flexible and user-friendly software tool to explore radiotherapy outcomes data and build statistical tumour control or normal tissue complications models. The software tool, called the dose response explorer system (DREES), is based on Matlab, and uses a named-field structure array data type. DREES/Matlab in combination with another open-source tool (CERR) provides an environment for analysing treatment outcomes. DREES provides many radiotherapy outcome modelling features, including (1) fitting of analytical normal tissue complication probability (NTCP) and tumour control probability (TCP) models, (2) combined modelling of multiple dose-volume variables (e.g., mean dose, max dose, etc) and clinical factors (age, gender, stage, etc) using multi-term regression modelling, (3) manual or automated selection of logistic or actuarial model variables using bootstrap statistical resampling, (4) estimation of uncertainty in model parameters, (5) performance assessment of univariate and multivariate analyses using Spearman's rank correlation and chi-square statistics, boxplots, nomograms, Kaplan-Meier survival plots, and receiver operating characteristics curves, and (6) graphical capabilities to visualize NTCP or TCP prediction versus selected variable models using various plots. DREES provides clinical researchers with a tool customized for radiotherapy outcome modelling. DREES is freely distributed. We expect to continue developing DREES based on user feedback.
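
    As a hedged sketch of what fitting an analytical NTCP model involves (not DREES itself, which is a Matlab tool), the Python snippet below fits a two-parameter logistic NTCP curve, parameterized by D50 and gamma50, to hypothetical mean-dose/complication data with a simple least-squares fit; a fuller analysis would use maximum-likelihood estimation and assess parameter uncertainty, as the abstract describes.

        import numpy as np
        from scipy.optimize import curve_fit

        def ntcp_logistic(dose, d50, gamma50):
            # Logistic NTCP: 0.5 at dose = D50, normalized slope gamma50 at D50
            return 1.0 / (1.0 + np.exp(4.0 * gamma50 * (1.0 - dose / d50)))

        # Hypothetical per-patient mean organ dose (Gy) and complication outcomes (0/1)
        dose = np.array([5, 8, 11, 14, 17, 20, 23, 26, 29, 32], dtype=float)
        outcome = np.array([0, 0, 0, 0, 1, 0, 1, 1, 1, 1], dtype=float)

        params, _ = curve_fit(ntcp_logistic, dose, outcome, p0=[20.0, 1.0])
        print("D50 = %.1f Gy, gamma50 = %.2f" % tuple(params))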

  14. Forecasting daily source air quality using multivariate statistical analysis and radial basis function networks.

    PubMed

    Sun, Gang; Hoff, Steven J; Zelle, Brian C; Nelson, Minda A

    2008-12-01

    It is vital to forecast gas and particle matter concentrations and emission rates (GPCER) from livestock production facilities to assess the impact of airborne pollutants on human health, ecological environment, and global warming. Modeling source air quality is a complex process because of abundant nonlinear interactions between GPCER and other factors. The objective of this study was to introduce statistical methods and radial basis function (RBF) neural network to predict daily source air quality in Iowa swine deep-pit finishing buildings. The results show that four variables (outdoor and indoor temperature, animal units, and ventilation rates) were identified as relative important model inputs using statistical methods. It can be further demonstrated that only two factors, the environment factor and the animal factor, were capable of explaining more than 94% of the total variability after performing principal component analysis. The introduction of fewer uncorrelated variables to the neural network would result in the reduction of the model structure complexity, minimize computation cost, and eliminate model overfitting problems. The obtained results of RBF network prediction were in good agreement with the actual measurements, with values of the correlation coefficient between 0.741 and 0.995 and very low values of systemic performance indexes for all the models. The good results indicated the RBF network could be trained to model these highly nonlinear relationships. Thus, the RBF neural network technology combined with multivariate statistical methods is a promising tool for air pollutant emissions modeling.

  15. Single-trabecula building block for large-scale finite element models of cancellous bone.

    PubMed

    Dagan, D; Be'ery, M; Gefen, A

    2004-07-01

    Recent development of high-resolution imaging of cancellous bone allows finite element (FE) analysis of bone tissue stresses and strains in individual trabeculae. However, specimen-specific stress/strain analyses can include effects of anatomical variations and local damage that can bias the interpretation of the results from individual specimens with respect to large populations. This study developed a standard (generic) 'building-block' of a trabecula for large-scale FE models. Being parametric and based on statistics of dimensions of ovine trabeculae, this building block can be scaled for trabecular thickness and length and be used in commercial or custom-made FE codes to construct generic, large-scale FE models of bone, using less computer power than that currently required to reproduce the accurate micro-architecture of trabecular bone. Orthogonal lattices constructed with this building block, after it was scaled to trabeculae of the human proximal femur, provided apparent elastic moduli of approximately 150 MPa, in good agreement with experimental data for the stiffness of cancellous bone from this site. Likewise, lattices with thinner, osteoporotic-like trabeculae could predict a reduction of approximately 30% in the apparent elastic modulus, as reported in experimental studies of osteoporotic femora. Based on these comparisons, it is concluded that the single-trabecula element developed in the present study is well-suited for representing cancellous bone in large-scale generic FE simulations.

  16. Statistical Mechanics of Disordered Systems - Series: Cambridge Series in Statistical and Probabilistic Mathematics (No. 18)

    NASA Astrophysics Data System (ADS)

    Bovier, Anton

    2006-06-01

    Our mathematical understanding of the statistical mechanics of disordered systems is going through a period of stunning progress. This self-contained book is a graduate-level introduction for mathematicians and for physicists interested in the mathematical foundations of the field, and can be used as a textbook for a two-semester course on mathematical statistical mechanics. It assumes only basic knowledge of classical physics and, on the mathematics side, a good working knowledge of graduate-level probability theory. The book starts with a concise introduction to statistical mechanics, proceeds to disordered lattice spin systems, and concludes with a presentation of the latest developments in the mathematical understanding of mean-field spin glass models. In particular, recent progress towards a rigorous understanding of the replica symmetry-breaking solutions of the Sherrington-Kirkpatrick spin glass models, due to Guerra, Aizenman-Sims-Starr and Talagrand, is reviewed in some detail. The book offers a comprehensive introduction to an active and fascinating area of research, with a clear exposition that builds to the state of the art in the mathematics of spin glasses, and is written by a well-known and active researcher in the field.

  17. Statistical classification of drug incidents due to look-alike sound-alike mix-ups.

    PubMed

    Wong, Zoie Shui Yee

    2016-06-01

    It has been recognised that medication names that look or sound similar are a cause of medication errors. This study builds statistical classifiers for identifying medication incidents due to look-alike sound-alike mix-ups. A total of 227 patient safety incident advisories related to medication were obtained from the Canadian Patient Safety Institute's Global Patient Safety Alerts system. Eight feature selection strategies based on frequent terms, frequent drug terms and constituent terms were applied. Statistical text classifiers based on logistic regression, support vector machines with linear, polynomial, radial-basis and sigmoid kernels, and decision trees were trained and tested. The models developed achieved an average accuracy of above 0.8 across all the model settings. The receiver operating characteristic curves indicated the classifiers performed reasonably well. The results obtained in this study suggest that statistical text classification can be a feasible method for identifying medication incidents due to look-alike sound-alike mix-ups based on a database of advisories from Global Patient Safety Alerts. © The Author(s) 2014.
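
    To make the pipeline concrete, the Python sketch below trains a bag-of-words text classifier (TF-IDF features plus logistic regression, one of the model families compared in the study) on a handful of invented advisory snippets; the example texts and labels are purely illustrative and are not drawn from the Global Patient Safety Alerts data.

        from sklearn.pipeline import make_pipeline
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.linear_model import LogisticRegression
        from sklearn.model_selection import cross_val_score

        # Tiny hypothetical advisories (the real study used 227 alerts)
        texts = [
            "hydromorphone dispensed instead of morphine due to similar name",
            "patient given celebrex instead of celexa",
            "dose omitted because infusion pump alarm ignored",
            "wrong patient identified during transfer",
            "losec administered in place of lasix",
            "fall from bed during night shift",
        ]
        labels = [1, 1, 0, 0, 1, 0]   # 1 = look-alike sound-alike mix-up

        clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                            LogisticRegression(max_iter=1000))
        scores = cross_val_score(clf, texts, labels, cv=3, scoring="accuracy")
        print("cross-validated accuracy:", scores.mean())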

  18. Statistical Field Estimation for Complex Coastal Regions and Archipelagos (PREPRINT)

    DTIC Science & Technology

    2011-04-09

    We extend a multiscale Objective Analysis (OA) approach to complex coastal regions and archipelagos and study the computational properties of these schemes. The multiscale free-surface code builds on the primitive-equation model of the Harvard Ocean Prediction System (HOPS, Haley et al. (2009)).

  19. Modeling and Simulation of HVAC Faulty Operations and Performance Degradation due to Maintenance Issues

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wang, Liping; Hong, Tianzhen

    Almost half of the total energy used in U.S. buildings is consumed by heating, ventilation and air conditioning (HVAC), according to EIA statistics. Among the various factors driving the energy performance of buildings, operations and maintenance play a significant role. Much research has been done on design efficiencies and operational controls for improving the energy performance of buildings, but very few studies examine the impacts of HVAC system maintenance. Different practices of HVAC system maintenance can result in substantial differences in building energy use. If a piece of HVAC equipment is not well maintained, its performance will degrade. If sensors used for control purposes are not calibrated, not only can building energy usage increase dramatically, but mechanical systems may also fail to satisfy indoor thermal comfort. Properly maintained HVAC systems can operate efficiently, improve occupant comfort, and prolong equipment service life. In the paper, maintenance practices for HVAC systems are presented based on literature reviews and discussions with HVAC engineers, building operators, facility managers, and commissioning agents. We categorize the maintenance practices into three levels depending on the maintenance effort and coverage: 1) proactive, performance-monitored maintenance; 2) preventive, scheduled maintenance; and 3) reactive, unplanned or no maintenance. A sampled list of maintenance issues, including cooling tower fouling, boiler/chiller fouling, refrigerant over- or under-charge, temperature sensor offset, outdoor air damper leakage, outdoor air screen blockage, outdoor air damper stuck at the fully open position, and dirty filters, is investigated in this study using field survey data and detailed simulation models. The energy impacts of both individual maintenance issues and combined scenarios for an office building with central VAV systems and a central plant were evaluated by EnergyPlus simulations using three approaches: 1) direct modeling with EnergyPlus, 2) using the energy management system feature of EnergyPlus, and 3) modifying the EnergyPlus source code. The results demonstrate the importance of HVAC system maintenance for the energy performance of buildings. The research is intended to provide a guideline to help practitioners and building operators gain the knowledge needed to keep HVAC systems operating efficiently, and to prioritize the HVAC maintenance work plan. The paper also discusses challenges of modeling building maintenance issues using energy simulation programs.

  20. The skeletal maturation status estimated by statistical shape analysis: axial images of Japanese cervical vertebra.

    PubMed

    Shin, S M; Kim, Y-I; Choi, Y-S; Yamaguchi, T; Maki, K; Cho, B-H; Park, S-B

    2015-01-01

    To evaluate axial cervical vertebral (ACV) shape quantitatively and to build a prediction model for skeletal maturation level using statistical shape analysis for Japanese individuals. The sample included 24 female and 19 male patients with hand-wrist radiographs and CBCT images. Through generalized Procrustes analysis and principal components (PCs) analysis, the meaningful PCs were extracted from each ACV shape and analysed for the estimation regression model. Each ACV shape had meaningful PCs, except for the second axial cervical vertebra. Based on these models, the smallest prediction intervals (PIs) were from the combination of the shape space PCs, age and gender. Overall, the PIs of the male group were smaller than those of the female group. There was no significant correlation between centroid size as a size factor and skeletal maturation level. Our findings suggest that the ACV maturation method, which was applied by statistical shape analysis, could confirm information about skeletal maturation in Japanese individuals as an available quantifier of skeletal maturation and could be as useful a quantitative method as the skeletal maturation index.

  1. The skeletal maturation status estimated by statistical shape analysis: axial images of Japanese cervical vertebra

    PubMed Central

    Shin, S M; Choi, Y-S; Yamaguchi, T; Maki, K; Cho, B-H; Park, S-B

    2015-01-01

    Objectives: To evaluate axial cervical vertebral (ACV) shape quantitatively and to build a prediction model for skeletal maturation level using statistical shape analysis for Japanese individuals. Methods: The sample included 24 female and 19 male patients with hand–wrist radiographs and CBCT images. Through generalized Procrustes analysis and principal components (PCs) analysis, the meaningful PCs were extracted from each ACV shape and analysed for the estimation regression model. Results: Each ACV shape had meaningful PCs, except for the second axial cervical vertebra. Based on these models, the smallest prediction intervals (PIs) were from the combination of the shape space PCs, age and gender. Overall, the PIs of the male group were smaller than those of the female group. There was no significant correlation between centroid size as a size factor and skeletal maturation level. Conclusions: Our findings suggest that the ACV maturation method, which was applied by statistical shape analysis, could confirm information about skeletal maturation in Japanese individuals as an available quantifier of skeletal maturation and could be as useful a quantitative method as the skeletal maturation index. PMID:25411713

  2. Energy consumption quota management of Wanda commercial buildings in China

    NASA Astrophysics Data System (ADS)

    Sun, D. B.; Xiao, H.; Wang, X.; Liu, J. J.; Wang, X.; Jin, X. Q.; Wang, J.; Xie, X. K.

    2016-08-01

    To date there has been limited research on commercial buildings’ energy use in China based on practical data analysis. Energy consumption quota tools like Energy Star in the U.S. or VDI 3807 in Germany have limitations in China's building sector. This study introduces an innovative methodology for applying an energy use quota model and empirical management to commercial buildings, covering more than one hundred operating shopping centers of a real estate group in China. On the basis of statistical benchmarking, a new concept of a “Modified coefficient”, which considers weather, occupancy, business layout, operation schedule and HVAC efficiency, is introduced in this paper. Our study shows that the average energy use quota increases from north to south. The average energy use quota of the sample buildings is 159 kWh/(m2.a) in the severe cold climate zone, 179 kWh/(m2.a) in the cold zone, 188 kWh/(m2.a) in the hot summer and cold winter zone, and 200 kWh/(m2.a) in the hot summer and warm winter zone. The energy use quota model has been validated in property management for the year 2016, providing a new method of commercial building energy management to the industry. As a key result, energy quota management indicates an energy-saving potential of 180 million in 2016, equal to a 6.2% saving rate relative to actual energy use in 2015.

  3. Activated desorption at heterogeneous interfaces and long-time kinetics of hydrocarbon recovery from nanoporous media.

    PubMed

    Lee, Thomas; Bocquet, Lydéric; Coasne, Benoit

    2016-06-21

    Hydrocarbon recovery from unconventional reservoirs (shale gas) is debated due to its environmental impact and uncertainties about its predictability, but a lack of scientific knowledge impedes the proposal of reliable alternatives. The requirement for hydrofracking, the fast recovery decay and the ultra-low permeability inherent to their nanoporosity are specificities of these reservoirs that challenge existing frameworks. Here we use molecular simulation and statistical models to show that recovery is hampered by interfacial effects at the wet kerogen surface. Recovery is shown to be thermally activated, with an energy barrier modelled from the interface wetting properties. We build a statistical model of the recovery kinetics with a two-regime decline that is consistent with published data: a short-time decay, consistent with the Darcy description, followed by a fast algebraic decay resulting from increasingly unreachable energy barriers. Replacing water by CO2 or propane eliminates the barriers, therefore raising hopes for clean and efficient recovery.

  4. Basics of Bayesian methods.

    PubMed

    Ghosh, Sujit K

    2010-01-01

    Bayesian methods are rapidly becoming popular tools for making statistical inference in various fields of science including biology, engineering, finance, and genetics. One of the key aspects of Bayesian inferential method is its logical foundation that provides a coherent framework to utilize not only empirical but also scientific information available to a researcher. Prior knowledge arising from scientific background, expert judgment, or previously collected data is used to build a prior distribution which is then combined with current data via the likelihood function to characterize the current state of knowledge using the so-called posterior distribution. Bayesian methods allow the use of models of complex physical phenomena that were previously too difficult to estimate (e.g., using asymptotic approximations). Bayesian methods offer a means of more fully understanding issues that are central to many practical problems by allowing researchers to build integrated models based on hierarchical conditional distributions that can be estimated even with limited amounts of data. Furthermore, advances in numerical integration methods, particularly those based on Monte Carlo methods, have made it possible to compute the optimal Bayes estimators. However, there is a reasonably wide gap between the background of the empirically trained scientists and the full weight of Bayesian statistical inference. Hence, one of the goals of this chapter is to bridge the gap by offering elementary to advanced concepts that emphasize linkages between standard approaches and full probability modeling via Bayesian methods.
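
    A minimal worked example of the prior-to-posterior update described here, using the conjugate Beta-Binomial pair in Python, is shown below; the prior parameters and trial counts are invented for illustration.

        from scipy.stats import beta

        # Prior knowledge: earlier studies suggest a response rate near 30%
        a_prior, b_prior = 3, 7                  # Beta(3, 7), prior mean 0.3

        # Current data: 12 responders out of 25 patients
        successes, n = 12, 25

        # Conjugate update: posterior is Beta(a + successes, b + failures)
        a_post, b_post = a_prior + successes, b_prior + (n - successes)
        posterior = beta(a_post, b_post)
        print("posterior mean:", posterior.mean())
        print("95% credible interval:", posterior.interval(0.95))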

  5. Universal Capacitance Model for Real-Time Biomass in Cell Culture.

    PubMed

    Konakovsky, Viktor; Yagtu, Ali Civan; Clemens, Christoph; Müller, Markus Michael; Berger, Martina; Schlatter, Stefan; Herwig, Christoph

    2015-09-02

    Capacitance probes have the potential to revolutionize bioprocess control due to their safe and robust use and their ability to detect even the smallest capacitors in the form of biological cells. Several techniques have evolved to model biomass statistically; however, there are problems with model transfer between cell lines and process conditions. For linear models, the errors of transferred models in the declining phase of the culture are around +100% or worse, causing unnecessary delays with test runs during bioprocess development. The goal of this work was to develop a single universal model which can be adapted by considering a potentially mechanistic factor to estimate biomass in yet untested clones and scales. The novelty of this work is a methodology to select sensitive frequencies to build a statistical model which can be shared among fermentations with an error between 9% and 38% (mean error around 20%) for the whole process, including the declining phase. A simple linear factor was found to be responsible for the transferability of biomass models between cell lines, indicating a link to their phenotype or physiology.

  6. Three-Dimensional City Determinants of the Urban Heat Island: A Statistical Approach

    NASA Astrophysics Data System (ADS)

    Chun, Bum Seok

    There is no doubt that the Urban Heat Island (UHI) is a mounting problem in built-up environments, due to the energy retention by the surface materials of dense buildings, leading to increased temperatures, air pollution, and energy consumption. Much of the earlier research on the UHI has used two-dimensional (2-D) information, such as land uses and the distribution of vegetation. In the case of homogeneous land uses, it is possible to predict surface temperatures with reasonable accuracy with 2-D information. However, three-dimensional (3-D) information is necessary to analyze more complex sites, including dense building clusters. Recent research on the UHI has started to consider multi-dimensional models. The purpose of this research is to explore the urban determinants of the UHI, using 2-D/3-D urban information with statistical modeling. The research includes the following stages: (a) estimating urban temperature, using satellite images, (b) developing a 3-D city model by LiDAR data, (c) generating geometric parameters with regard to 2-/3-D geospatial information, and (d) conducting different statistical analyses: OLS and spatial regressions. The research area is part of the City of Columbus, Ohio. To effectively and systematically analyze the UHI, hierarchical grid scales (480m, 240m, 120m, 60m, and 30m) are proposed, together with linear and the log-linear regression models. The non-linear OLS models with Log(AST) as dependent variable have the highest R2 among all the OLS-estimated models. However, both SAR and GSM models are estimated for the 480m, 240m, 120m, and 60m grids to reduce their spatial dependency. Most GSM models have R2s higher than 0.9, except for the 240m grid. Overall, the urban characteristics having high impacts in all grids are embodied in solar radiation, 3-D open space, greenery, and water streams. These results demonstrate that it is possible to mitigate the UHI, providing guidelines for policies aiming to reduce the UHI.

  7. Long-term Trends and Variability of Eddy Activities in the South China Sea

    NASA Astrophysics Data System (ADS)

    Zhang, M.; von Storch, H.

    2017-12-01

    For constructing empirical downscaling models and projecting possible future states of eddy activity in the South China Sea (SCS), long-term statistical characteristics of SCS eddies are needed. We use a daily global eddy-resolving model product named STORM covering the period 1950-2010. This simulation employed the MPI-OM model with a mean horizontal resolution of 10 km and was driven by the NCEP reanalysis-1 data set. An eddy detection and tracking algorithm operating on the gridded sea surface height anomaly (SSHA) fields was developed. A set of parameters for the criteria in the SCS was determined through sensitivity tests. Our method detected more than 6000 eddy tracks in the South China Sea. For all of them, eddy diameters, track length, eddy intensity, eddy lifetime and eddy frequency were determined. The long-term trends and variability of those properties have also been derived. Most of the eddies propagate westward. Nearly 100 eddies travel longer than 1000 km, and over 800 eddies have a lifespan of more than 2 months. Furthermore, for building the statistical empirical model, the relationship between the SCS eddy statistics and large-scale atmospheric and oceanic phenomena has been investigated.

  8. Towards a comprehensive city emission function (CCEF)

    NASA Astrophysics Data System (ADS)

    Kocifaj, Miroslav

    2018-01-01

    The comprehensive city emission function (CCEF) is developed for heterogeneous light-emitting or light-blocking urban environments, embracing any combination of input parameters that characterize linear dimensions in the system (size of and distances between buildings or luminaires), properties of light-emitting elements (such as luminous building façades and street lighting), ground reflectance and total uplight fraction, all of these defined for an arbitrarily sized 2D area. The analytical formula obtained is not restricted to a single model class, as it can capture any specific light-emission feature for a wide range of cities. The CCEF method is numerically fast, in contrast to what can be expected of other probabilistic approaches that rely on repeated random sampling. Hence the present solution has great potential in light-pollution modeling and can be included in larger numerical models. Our theoretical findings promise great progress in light-pollution modeling, as this is the first time an analytical solution to the city emission function (CEF) has been developed that depends on the statistical mean size and height of city buildings, inter-building separation, prevailing heights of light fixtures, lighting density, and other factors such as luminaire light output and light distribution, including the amount of uplight, and representative city size. The model is validated for sensitivity and specificity pertinent to combinations of input parameters in order to test its behavior under various conditions, including those that can occur in complex urban environments. It is demonstrated that the solution model succeeds in reproducing a light emission peak at some elevated zenith angles and is consistent with reduced rather than enhanced emission in directions nearly parallel to the ground.

  9. The costs and benefits of reconstruction options in Nepal using the CEDIM FDA modelled and empirical analysis following the 2015 earthquake

    NASA Astrophysics Data System (ADS)

    Daniell, James; Schaefer, Andreas; Wenzel, Friedemann; Khazai, Bijan; Girard, Trevor; Kunz-Plapp, Tina; Kunz, Michael; Muehr, Bernhard

    2016-04-01

    Over the days following the 2015 Nepal earthquake, rapid loss estimates of deaths, economic loss and reconstruction cost were undertaken by our research group in conjunction with the World Bank. This modelling relied on historic losses from other Nepal earthquakes as well as detailed socioeconomic data and earthquake loss information via CATDAT. The modelled results were very close to the final death toll and reconstruction cost for the 2015 earthquake of around 9000 deaths and a direct building loss of ca. 3 billion (a). The process undertaken to produce these loss estimates is described, along with its potential for analysing reconstruction costs from future Nepal earthquakes in rapid time post-event. The reconstruction cost and death toll model is then used as the base model for examining the effect of spending money on earthquake retrofitting of buildings versus complete reconstruction of buildings. This is undertaken for future events using empirical statistics from past events along with further analytical modelling. The effect of investment versus the timing of a future event is also explored. Preliminary low-cost options (b) along the lines of other country studies for retrofitting (ca. 100) are examined versus the option of different building typologies in Nepal, as well as investment in various sectors of construction. The effect of public vs. private capital expenditure post-earthquake is also explored as part of this analysis, as well as spending on other components outside of earthquakes. a) http://www.scientificamerican.com/article/experts-calculate-new-loss-predictions-for-nepal-quake/ b) http://www.aees.org.au/wp-content/uploads/2015/06/23-Daniell.pdf

  10. Model of mobile agents for sexual interactions networks

    NASA Astrophysics Data System (ADS)

    González, M. C.; Lind, P. G.; Herrmann, H. J.

    2006-02-01

    We present a novel model to simulate real social networks of complex interactions, based on a system of colliding particles (agents). The network is built by keeping track of the collisions and evolves in time with correlations which emerge due to the mobility of the agents. Therefore, statistical features are a consequence only of local collisions among its individual agents. Agent dynamics is realized by an event-driven algorithm of collisions where energy is gained, as opposed to physical systems, which have dissipation. The model reproduces empirical data from networks of sexual interactions, not previously obtained with other approaches.

  11. A study of two statistical methods as applied to shuttle solid rocket booster expenditures

    NASA Technical Reports Server (NTRS)

    Perlmutter, M.; Huang, Y.; Graves, M.

    1974-01-01

    The state probability technique and the Monte Carlo technique are applied to finding shuttle solid rocket booster expenditure statistics. For a given attrition rate per launch, the probable number of boosters needed for a given mission of 440 launches is calculated. Several cases are considered, including the elimination of the booster after a maximum of 20 consecutive launches. Also considered is the case where the booster is composed of replaceable components with independent attrition rates. A simple cost analysis is carried out to indicate the number of boosters to build initially, depending on booster costs. Two statistical methods were applied in the analysis: (1) state probability method which consists of defining an appropriate state space for the outcome of the random trials, and (2) model simulation method or the Monte Carlo technique. It was found that the model simulation method was easier to formulate while the state probability method required less computing time and was more accurate.
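
    A compact Python sketch of the Monte Carlo (model simulation) approach described above is given below; the 440-launch campaign and the 20-launch retirement rule follow the abstract, while the 5% loss probability per launch and the specific bookkeeping are illustrative assumptions.

        import numpy as np

        def boosters_needed(n_launches=440, attrition=0.05, max_uses=20, rng=None):
            """One Monte Carlo trial: count boosters consumed over a launch campaign."""
            if rng is None:
                rng = np.random.default_rng()
            built, uses = 1, 0                       # start the campaign with one booster
            for _ in range(n_launches):
                uses += 1
                lost = rng.random() < attrition      # booster not recovered this launch
                if lost or uses >= max_uses:         # replace if lost, or retire at 20 uses
                    built += 1
                    uses = 0
            return built

        rng = np.random.default_rng(4)
        trials = np.array([boosters_needed(rng=rng) for _ in range(5000)])
        print("mean boosters needed:", trials.mean())
        print("95th percentile:     ", np.percentile(trials, 95))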

  12. INDUCTIVE SYSTEM HEALTH MONITORING WITH STATISTICAL METRICS

    NASA Technical Reports Server (NTRS)

    Iverson, David L.

    2005-01-01

    Model-based reasoning is a powerful method for performing system monitoring and diagnosis. Building models for model-based reasoning is often a difficult and time consuming process. The Inductive Monitoring System (IMS) software was developed to provide a technique to automatically produce health monitoring knowledge bases for systems that are either difficult to model (simulate) with a computer or which require computer models that are too complex to use for real time monitoring. IMS processes nominal data sets collected either directly from the system or from simulations to build a knowledge base that can be used to detect anomalous behavior in the system. Machine learning and data mining techniques are used to characterize typical system behavior by extracting general classes of nominal data from archived data sets. In particular, a clustering algorithm forms groups of nominal values for sets of related parameters. This establishes constraints on those parameter values that should hold during nominal operation. During monitoring, IMS provides a statistically weighted measure of the deviation of current system behavior from the established normal baseline. If the deviation increases beyond the expected level, an anomaly is suspected, prompting further investigation by an operator or automated system. IMS has shown potential to be an effective, low cost technique to produce system monitoring capability for a variety of applications. We describe the training and system health monitoring techniques of IMS. We also present the application of IMS to a data set from the Space Shuttle Columbia STS-107 flight. IMS was able to detect an anomaly in the launch telemetry shortly after a foam impact damaged Columbia's thermal protection system.
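
    The Python sketch below illustrates the general clustering-and-deviation idea in a generic form (it is not the IMS software itself): nominal sensor data are clustered, a tolerance is derived from the nominal spread, and new observations are flagged when their distance to the nearest cluster centre exceeds it. The two synthetic channels, the cluster count and the 99th-percentile threshold are illustrative choices.

        import numpy as np
        from sklearn.cluster import KMeans

        rng = np.random.default_rng(5)
        # Nominal training data: two sensor channels under normal operation
        nominal = np.column_stack([rng.normal(50, 2, 500), rng.normal(120, 5, 500)])

        km = KMeans(n_clusters=8, n_init=10, random_state=0).fit(nominal)
        # Deviation measure: distance from an observation to its nearest cluster centre
        baseline = km.transform(nominal).min(axis=1)
        threshold = np.percentile(baseline, 99)          # tolerance from nominal spread

        new_obs = np.array([[50.5, 121.0],               # nominal-looking reading
                            [58.0, 150.0]])              # off-nominal reading
        deviation = km.transform(new_obs).min(axis=1)
        print(deviation > threshold)                     # expect [False  True]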

  13. Data-driven modeling, control and tools for cyber-physical energy systems

    NASA Astrophysics Data System (ADS)

    Behl, Madhur

    Energy systems are experiencing a gradual but substantial change in moving away from being non-interactive and manually-controlled systems to utilizing tight integration of both cyber (computation, communications, and control) and physical representations guided by first principles based models, at all scales and levels. Furthermore, peak power reduction programs like demand response (DR) are becoming increasingly important as the volatility on the grid continues to increase due to regulation, integration of renewables and extreme weather conditions. In order to shield themselves from the risk of price volatility, end-user electricity consumers must monitor electricity prices and be flexible in the ways they choose to use electricity. This requires the use of control-oriented predictive models of an energy system's dynamics and energy consumption. Such models are needed for understanding and improving the overall energy efficiency and operating costs. However, learning dynamical models using grey/white box approaches is very cost and time prohibitive since it often requires significant financial investments in retrofitting the system with several sensors and hiring domain experts for building the model. We present the use of data-driven methods for making model capture easy and efficient for cyber-physical energy systems. We develop Model-IQ, a methodology for analysis of uncertainty propagation for building inverse modeling and controls. Given a grey-box model structure and real input data from a temporary set of sensors, Model-IQ evaluates the effect of the uncertainty propagation from sensor data to model accuracy and to closed-loop control performance. We also developed a statistical method to quantify the bias in the sensor measurement and to determine near optimal sensor placement and density for accurate data collection for model training and control. Using a real building test-bed, we show how performing an uncertainty analysis can reveal trends about inverse model accuracy and control performance, which can be used to make informed decisions about sensor requirements and data accuracy. We also present DR-Advisor, a data-driven demand response recommender system for the building's facilities manager which provides suitable control actions to meet the desired load curtailment while maintaining operations and maximizing the economic reward. We develop a model based control with regression trees algorithm (mbCRT), which allows us to perform closed-loop control for DR strategy synthesis for large commercial buildings. Our data-driven control synthesis algorithm outperforms rule-based demand response methods for a large DoE commercial reference building and leads to a significant amount of load curtailment (of 380kW) and over $45,000 in savings which is 37.9% of the summer energy bill for the building. The performance of DR-Advisor is also evaluated for 8 buildings on Penn's campus; where it achieves 92.8% to 98.9% prediction accuracy. We also compare DR-Advisor with other data driven methods and rank 2nd on ASHRAE's benchmarking data-set for energy prediction.

  14. Evolution in Cloud Population Statistics of the MJO: From AMIE Field Observations to Global Cloud-Permitting Models

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Chidong

    Motivated by the success of the AMIE/DYNAMO field campaign, which collected unprecedented observations of cloud and precipitation from the tropical Indian Ocean in October 2011 – March 2012, this project explored how such observations can be applied to assist the development of global cloud-permitting models through evaluating and correcting model biases in cloud statistics. The main accomplishments of this project were made in four categories: generating observational products for model evaluation, using AMIE/DYNAMO observations to validate global model simulations, using AMIE/DYNAMO observations in numerical studies of cloud-permitting models, and providing leadership in the field. Results from this project provide valuable information for building a seamless bridge between the DOE ASR program’s component on process-level understanding of cloud processes in the tropics and the RGCM focus on global variability and regional extremes. In particular, experience gained from this project would be directly applicable to the evaluation and improvement of ACME, especially as it transitions to a non-hydrostatic variable-resolution model.

  15. Lagrangian acceleration statistics in a turbulent channel flow

    NASA Astrophysics Data System (ADS)

    Stelzenmuller, Nickolas; Polanco, Juan Ignacio; Vignal, Laure; Vinkovic, Ivana; Mordant, Nicolas

    2017-05-01

    Lagrangian acceleration statistics in a fully developed turbulent channel flow at Reτ=1440 are investigated, based on tracer particle tracking in experiments and direct numerical simulations. The evolution with wall distance of the Lagrangian velocity and acceleration time scales is analyzed. Dependency between acceleration components in the near-wall region is described using cross-correlations and joint probability density functions. The strong streamwise coherent vortices typical of wall-bounded turbulent flows are shown to have a significant impact on the dynamics. This results in a strong anisotropy at small scales in the near-wall region that remains present in most of the channel. Such statistical properties may be used as constraints in building advanced Lagrangian stochastic models to predict the dispersion and mixing of chemical components for combustion or environmental studies.

  16. Blinded Validation of Breath Biomarkers of Lung Cancer, a Potential Ancillary to Chest CT Screening

    PubMed Central

    Phillips, Michael; Bauer, Thomas L.; Cataneo, Renee N.; Lebauer, Cassie; Mundada, Mayur; Pass, Harvey I.; Ramakrishna, Naren; Rom, William N.; Vallières, Eric

    2015-01-01

    Background Breath volatile organic compounds (VOCs) have been reported as biomarkers of lung cancer, but it is not known if biomarkers identified in one group can identify disease in a separate independent cohort. Also, it is not known if combining breath biomarkers with chest CT has the potential to improve the sensitivity and specificity of lung cancer screening. Methods Model-building phase (unblinded): Breath VOCs were analyzed with gas chromatography mass spectrometry in 82 asymptomatic smokers having screening chest CT, 84 symptomatic high-risk subjects with a tissue diagnosis, 100 without a tissue diagnosis, and 35 healthy subjects. Multiple Monte Carlo simulations identified breath VOC mass ions with greater than random diagnostic accuracy for lung cancer, and these were combined in a multivariate predictive algorithm. Model-testing phase (blinded validation): We analyzed breath VOCs in an independent cohort of similar subjects (n = 70, 51, 75 and 19 respectively). The algorithm predicted discriminant function (DF) values in blinded replicate breath VOC samples analyzed independently at two laboratories (A and B). Outcome modeling: We modeled the expected effects of combining breath biomarkers with chest CT on the sensitivity and specificity of lung cancer screening. Results Unblinded model-building phase. The algorithm identified lung cancer with sensitivity 74.0%, specificity 70.7% and C-statistic 0.78. Blinded model-testing phase: The algorithm identified lung cancer at Laboratory A with sensitivity 68.0%, specificity 68.4%, C-statistic 0.71; and at Laboratory B with sensitivity 70.1%, specificity 68.0%, C-statistic 0.70, with linear correlation between replicates (r = 0.88). In a projected outcome model, breath biomarkers increased the sensitivity, specificity, and positive and negative predictive values of chest CT for lung cancer when the tests were combined in series or parallel. Conclusions Breath VOC mass ion biomarkers identified lung cancer in a separate independent cohort, in a blinded replicated study. Combining breath biomarkers with chest CT could potentially improve the sensitivity and specificity of lung cancer screening. Trial Registration ClinicalTrials.gov NCT00639067 PMID:26698306

  17. Project risk management in the construction of high-rise buildings

    NASA Astrophysics Data System (ADS)

    Titarenko, Boris; Hasnaoui, Amir; Titarenko, Roman; Buzuk, Liliya

    2018-03-01

    This paper presents project risk management methods which allow risks in the construction of high-rise buildings to be better identified and managed throughout the life cycle of the project. One of the project risk management processes is a quantitative analysis of risks. The quantitative analysis usually includes the assessment of the potential impact of project risks and their probabilities. This paper reviews the most popular methods of risk probability assessment and indicates the advantages of the robust approach over traditional methods. Within the framework of the project risk management model, the robust approach of P. Huber is applied and extended to the regression analysis of project data. The suggested algorithms used to assess the parameters of statistical models yield reliable estimates. The theoretical problems of developing robust models built on the methodology of minimax estimates are reviewed, and an algorithm for the situation of asymmetric "contamination" is developed.

  18. Metabolomics for organic food authentication: Results from a long-term field study in carrots.

    PubMed

    Cubero-Leon, Elena; De Rudder, Olivier; Maquet, Alain

    2018-01-15

    Increasing demand for organic products and their premium prices make them an attractive target for fraudulent malpractices. In this study, a large-scale comparative metabolomics approach was applied to investigate the effect of the agronomic production system on the metabolite composition of carrots and to build statistical models for prediction purposes. Orthogonal projections to latent structures-discriminant analysis (OPLS-DA) was applied successfully to predict the origin of the agricultural system of the harvested carrots on the basis of features determined by liquid chromatography-mass spectrometry. When the training set used to build the OPLS-DA models contained samples representative of each harvest year, the models were able to classify unknown samples correctly (100% correct classification). If a harvest year was left out of the training sets and used for predictions, the correct classification rates achieved ranged from 76% to 100%. The results therefore highlight the potential of metabolomic fingerprinting for organic food authentication purposes. Copyright © 2017 The Author(s). Published by Elsevier Ltd. All rights reserved.
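
    The leave-one-harvest-year-out validation described above can be sketched with scikit-learn. Note that scikit-learn has no OPLS-DA; a plain PLS regression on class labels (a simple PLS-DA stand-in) is used instead, and the feature matrix, labels and harvest years below are synthetic placeholders, so accuracies on these random data are near chance:

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import LeaveOneGroupOut

rng = np.random.default_rng(1)
X = rng.normal(size=(120, 200))              # placeholder LC-MS feature intensities
y = rng.integers(0, 2, size=120)             # 0 = conventional, 1 = organic (placeholder)
years = rng.choice([2013, 2014, 2015], 120)  # harvest year of each sample

logo = LeaveOneGroupOut()
for train, test in logo.split(X, y, groups=years):
    pls = PLSRegression(n_components=2).fit(X[train], y[train])
    y_pred = (pls.predict(X[test]).ravel() > 0.5).astype(int)  # threshold the PLS score
    acc = (y_pred == y[test]).mean()
    print(f"harvest year {years[test][0]} left out: {acc:.0%} correct")
```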

  19. Building electronic forms for elderly program: integrated care model for high risk elders in Hong Kong.

    PubMed

    Yiu, Rex; Fung, Vicky; Szeto, Karen; Hung, Veronica; Siu, Ricky; Lam, Johnny; Lai, Daniel; Maw, Christina; Cheung, Adah; Shea, Raman; Choy, Anna

    2013-01-01

    In Hong Kong, elderly patients discharged from hospital are at high risk of unplanned readmission. The Integrated Care Model (ICM) program was introduced to provide continuous and coordinated care for high-risk elders from hospital to community in order to prevent unplanned readmission. A multidisciplinary working group was set up to address the requirements for developing the electronic forms for the ICM program. Six forms were developed. These forms support ICM service delivery for high-risk elders, clinical documentation, statistical analysis and information sharing.

  20. Computer aided drug design

    NASA Astrophysics Data System (ADS)

    Jain, A.

    2017-08-01

    Computer-based methods can help in the discovery of leads and can potentially eliminate the chemical synthesis and screening of many irrelevant compounds, saving both time and cost. Molecular modeling systems are powerful tools for building, visualizing, analyzing and storing models of complex molecular structures that can help to interpret structure-activity relationships. The use of molecular mechanics and dynamics techniques and software in computer-aided drug design, together with statistical analysis, is a powerful tool for medicinal chemists seeking to synthesize effective therapeutic drugs with minimal side effects.

  1. EpiModel: An R Package for Mathematical Modeling of Infectious Disease over Networks.

    PubMed

    Jenness, Samuel M; Goodreau, Steven M; Morris, Martina

    2018-04-01

    Package EpiModel provides tools for building, simulating, and analyzing mathematical models for the population dynamics of infectious disease transmission in R. Several classes of models are included, but the unique contribution of this software package is a general stochastic framework for modeling the spread of epidemics on networks. EpiModel integrates recent advances in statistical methods for network analysis (temporal exponential random graph models) that allow the epidemic modeling to be grounded in empirical data on contacts that can spread infection. This article provides an overview of both the modeling tools built into EpiModel, designed to facilitate learning for students new to modeling, and the application programming interface for extending package EpiModel, designed to facilitate the exploration of novel research questions for advanced modelers.

  2. EpiModel: An R Package for Mathematical Modeling of Infectious Disease over Networks

    PubMed Central

    Jenness, Samuel M.; Goodreau, Steven M.; Morris, Martina

    2018-01-01

    Package EpiModel provides tools for building, simulating, and analyzing mathematical models for the population dynamics of infectious disease transmission in R. Several classes of models are included, but the unique contribution of this software package is a general stochastic framework for modeling the spread of epidemics on networks. EpiModel integrates recent advances in statistical methods for network analysis (temporal exponential random graph models) that allow the epidemic modeling to be grounded in empirical data on contacts that can spread infection. This article provides an overview of both the modeling tools built into EpiModel, designed to facilitate learning for students new to modeling, and the application programming interface for extending package EpiModel, designed to facilitate the exploration of novel research questions for advanced modelers. PMID:29731699
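
    EpiModel itself is R software; purely as an illustration of the core idea (a stochastic epidemic simulated over an explicit contact network), here is a minimal discrete-time SIR sketch in Python using networkx. The Erdős-Rényi contact graph, the rates and the seeding below are arbitrary assumptions and stand in for the temporal ERGM-based networks EpiModel actually uses:

```python
import random
import networkx as nx

random.seed(42)
G = nx.erdos_renyi_graph(n=500, p=0.02)   # stand-in contact network (not an ERGM)
beta, gamma = 0.08, 0.05                  # per-contact infection and recovery probabilities

state = {node: "S" for node in G}
for node in random.sample(list(G), 5):    # seed 5 initial infections
    state[node] = "I"

for step in range(100):                   # discrete time steps
    new_state = dict(state)
    for node in G:
        if state[node] == "I":
            if random.random() < gamma:
                new_state[node] = "R"
            for nbr in G.neighbors(node):  # transmission only along network edges
                if state[nbr] == "S" and random.random() < beta:
                    new_state[nbr] = "I"
    state = new_state

counts = {s: sum(v == s for v in state.values()) for s in ("S", "I", "R")}
print(counts)
```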

  3. Test Activities in the Langley Transonic Dynamics Tunnel and a Summary of Recent Facility Improvements

    NASA Technical Reports Server (NTRS)

    Cole, Stanley R.; Johnson, R. Keith; Piatak, David J.; Florance, Jennifer P.; Rivera, Jose A., Jr.

    2003-01-01

    The Langley Transonic Dynamics Tunnel (TDT) has provided a unique capability for aeroelastic testing for over forty years. The facility has a rich history of significant contributions to the design of many United States commercial transports, military aircraft, launch vehicles, and spacecraft. The facility has many features that contribute to its uniqueness for aeroelasticity testing, perhaps the most important being the use of a heavy gas test medium to achieve higher test densities compared to testing in air. Higher test medium densities substantially ease model-building requirements and therefore simplify the fabrication of aeroelastically scaled wind tunnel models. This paper describes TDT capabilities that make it particularly suited for aeroelasticity testing. The paper also discusses the nature of recent test activities in the TDT, including summaries of several specific tests. Finally, the paper documents recent facility improvement projects and the continuous statistical quality assessment effort for the TDT.

  4. Using MERRA, AMIP II, CMIP5 Outputs to Assess Actual and Potential Building Climate Zone Change and Variability From the Last 30 Years Through 2100

    NASA Astrophysics Data System (ADS)

    Stackhouse, P. W.; Westberg, D. J.; Hoell, J. M., Jr.; Chandler, W.; Zhang, T.

    2014-12-01

    In the US, residential and commercial building infrastructure combined consumes about 40% of total energy usage and emits about 39% of total CO2 emissions (DOE/EIA "Annual Energy Outlook 2013"). Thus, increasing the energy efficiency of buildings is paramount to reducing energy costs and emissions. Building codes, as used by local and state enforcement entities, are typically tied to the dominant climate within an enforcement jurisdiction, classified according to various climate zones. These climate zones are based upon a 30-year average of local surface observations and are developed by DOE and ASHRAE (formerly known as the American Society of Heating, Refrigerating and Air-Conditioning Engineers). A significant shortcoming of the methodology used in constructing such maps is the use of surface observations (located mainly near airports) that are unequally distributed and frequently have periods of missing data that need to be filled by various approximation schemes. This paper demonstrates the usefulness of NASA's Modern Era Retrospective-analysis for Research and Applications (MERRA) atmospheric data assimilation for deriving the ASHRAE climate zone maps and then using MERRA to define the last 30 years of variability in climate zones. These results show that there is a statistically significant increase in the area covered by warmer climate zones and some tendency, requiring longer time series to confirm, for a reduction of area in colder climate zones. Using the uncertainties of the basic surface temperature and precipitation parameters from MERRA, as determined by comparison to surface measurements, we first compare patterns and variability of ASHRAE climate zones from MERRA relative to present-day climate model runs from AMIP simulations to establish baseline sensitivity. Based upon these results, we assess the variability of the ASHRAE climate zones according to CMIP runs through 2100 using an ensemble analysis that classifies model output changes by percentiles. Estimates of statistical significance are then compared to the original model variability during the AMIP period. This work quantifies and tests for significance the changes seen in the various US regions and represents a potential contribution by NASA to the ongoing National Climate Assessment.

  5. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lewis, John R.; Brooks, Dusty Marie

    In pressurized water reactors, the prevention, detection, and repair of cracks within dissimilar metal welds is essential to ensure proper plant functionality and safety. Weld residual stresses, which are difficult to model and cannot be directly measured, contribute to the formation and growth of cracks due to primary water stress corrosion cracking. Additionally, the uncertainty in weld residual stress measurements and modeling predictions is not well understood, further complicating the prediction of crack evolution. The purpose of this document is to develop methodology to quantify the uncertainty associated with weld residual stress that can be applied to modeling predictions and experimental measurements. Ultimately, the results can be used to assess the current state of uncertainty and to build confidence in both modeling and experimental procedures. The methodology consists of statistically modeling the variation in the weld residual stress profiles using functional data analysis techniques. Uncertainty is quantified using statistical bounds (e.g. confidence and tolerance bounds) constructed with a semi-parametric bootstrap procedure. Such bounds describe the range in which quantities of interest, such as means, are expected to lie as evidenced by the data. The methodology is extended to provide direct comparisons between experimental measurements and modeling predictions by constructing statistical confidence bounds for the average difference between the two quantities. The statistical bounds on the average difference can be used to assess the level of agreement between measurements and predictions. The methodology is applied to experimental measurements of residual stress obtained using two strain relief measurement methods and predictions from seven finite element models developed by different organizations during a round robin study.
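
    A minimal sketch of the bootstrap idea behind such statistical bounds, applied to a set of synthetic residual stress profiles: this is a plain percentile confidence band for the pointwise mean, an illustrative simplification of the semi-parametric functional-data procedure described in the report, and the profiles themselves are invented:

```python
import numpy as np

rng = np.random.default_rng(3)
depth = np.linspace(0.0, 1.0, 50)                        # normalized through-wall depth
profiles = (200 * np.cos(np.pi * depth)                  # synthetic mean stress shape (MPa)
            + rng.normal(0, 40, size=(12, depth.size)))  # 12 measured/predicted profiles

n_boot = 2000
boot_means = np.empty((n_boot, depth.size))
for b in range(n_boot):
    idx = rng.integers(0, profiles.shape[0], profiles.shape[0])  # resample whole profiles
    boot_means[b] = profiles[idx].mean(axis=0)

lower, upper = np.percentile(boot_means, [2.5, 97.5], axis=0)    # pointwise 95% band for the mean
print(lower[:5], upper[:5])
```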

  6. Tracing the source of numerical climate model uncertainties in precipitation simulations using a feature-oriented statistical model

    NASA Astrophysics Data System (ADS)

    Xu, Y.; Jones, A. D.; Rhoades, A.

    2017-12-01

    Precipitation is a key component in hydrologic cycles, and changing precipitation regimes contribute to more intense and frequent drought and flood events around the world. Numerical climate modeling is a powerful tool to study climatology and to predict future changes. Despite continuous improvement in numerical models, long-term precipitation prediction remains a challenge, especially at regional scales. To improve numerical simulations of precipitation, it is important to find out where the uncertainty in precipitation simulations comes from. There are two types of uncertainty in numerical model predictions. One is related to uncertainty in the input data, such as the model's boundary and initial conditions. These uncertainties would propagate to the final model outcomes even if the numerical model exactly replicated the true world. But a numerical model cannot exactly replicate the true world. Therefore, the other type of model uncertainty is related to errors in the model physics, such as the parameterization of sub-grid scale processes, i.e., given precise input conditions, how much error could be generated by the imprecise model. Here, we build two statistical models based on a neural network algorithm to predict the long-term variation of precipitation over California: one uses "true world" information derived from observations, and the other uses "modeled world" information using model inputs and outputs from the North America Coordinated Regional Downscaling Project (NA CORDEX). We derive multiple climate feature metrics as predictors for the statistical model to represent the impact of global climate on local hydrology, and include topography as a predictor to represent the local control. We first compare the predictors between the true world and the modeled world to determine the errors contained in the input data. By perturbing the predictors in the statistical model, we estimate how much uncertainty in the model's final outcomes is accounted for by each predictor. By comparing the statistical models derived from true world information and modeled world information, we assess the errors lying in the physics of the numerical models. This work provides a unique insight for assessing the performance of numerical climate models, and can be used to guide the improvement of precipitation prediction.
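
    One simple way to approximate "perturbing the predictors" is to train a statistical emulator and measure how much predictive skill is lost when each predictor is shuffled. The sketch below uses a small neural network and permutation importance on synthetic data; the predictor names and the data are assumptions standing in for the NA-CORDEX-derived climate feature metrics and topography:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(7)
n = 1000
X = rng.normal(size=(n, 4))                               # placeholder feature metrics + topography
y = 2 * X[:, 0] - 1.5 * X[:, 2] + rng.normal(0, 0.5, n)   # synthetic "precipitation" response

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = MLPRegressor(hidden_layer_sizes=(32, 16), max_iter=2000,
                     random_state=0).fit(X_tr, y_tr)

imp = permutation_importance(model, X_te, y_te, n_repeats=20, random_state=0)
for name, mean in zip(["metric_1", "metric_2", "metric_3", "topography"],
                      imp.importances_mean):
    print(f"{name}: skill drop when shuffled = {mean:.3f}")
```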

  7. NOx dispersion modelling around roundabout in a small city, example from Hungary

    NASA Astrophysics Data System (ADS)

    Farkas, Orsolya; Rákai, Anikó; Czáder, Károly; Török, Ákos

    2013-04-01

    The present paper focuses on the modelling of pollutant distribution and dispersion in an urban region of Székesfehérvár, a moderately industrialized Hungarian town with a population of 100,000. The study area is located close to the city centre, with different housing styles and different building elevations; buildings range from 10-storey high-rises to small houses with gardens. The roundabout has 5 access roads: three major ones and two minor ones, with different geometries and traffic loads. The traffic load of the roads was defined by traffic counts, while wind statistics were created for the meteorological characteristics. Additional input parameters were the ground plan and the elevation of the buildings. To simulate the airflow and the dispersion of pollutants, a Computational Fluid Dynamics code called MISKAM was used. The background concentration was taken from the dataset of a nearby air quality monitoring station. According to the vehicle counts, the 5 roads of the roundabout carry very different loads, from 12 to more than 412 vehicles per hour. Three different grid systems were applied, ranging from half a million to 5 million cells, and the difference in the results related to grid density was also evaluated. Wind speed distribution, wind turbulence and building wake flow patterns were identified using the model. With the help of the simulation, the NOx flow and the dispersion of pollutants around the roundabout can be estimated, and the critical locations with higher pollution concentrations are presented. The results of the modelling can be generalized and used in the design of the layout, development, traffic control and environmental aspects of roundabouts located in small urban areas.

  8. Modelling 1-minute directional observations of the global irradiance.

    NASA Astrophysics Data System (ADS)

    Thejll, Peter; Pagh Nielsen, Kristian; Andersen, Elsa; Furbo, Simon

    2016-04-01

    Direct and diffuse irradiances from the sky have been collected at 1-minute intervals for about a year at the experimental station of the Technical University of Denmark for the IEA project "Solar Resource Assessment and Forecasting". These data were gathered by pyrheliometers tracking the Sun, as well as with apertured pyranometers gathering 1/8th and 1/16th of the light from the sky in 45-degree azimuthal ranges pointed around the compass. The data are gathered in order to develop detailed models of the potentially available solar energy and its variations at high temporal resolution, and thereby to gain a more detailed understanding of the solar resource. This is important for a better understanding of the sub-grid scale cloud variation that cannot be resolved with climate and weather models. It is also important for optimizing the operation of active solar energy systems such as photovoltaic plants and thermal solar collector arrays, and for passive solar energy and lighting of buildings. We present regression-based modelling of the observed data and focus here on the statistical properties of the model fits. Using models based on the one hand on the literature and physical expectations, and on the other hand on purely statistical formulations, we find solutions that can explain up to 90% of the variance in global radiation. The models leaning on physical insights include terms for the direct solar radiation, a term for the circum-solar radiation, a diffuse term and a term for horizon brightening/darkening. The purely statistical model is found using data- and formula-validation approaches, picking model expressions from a general catalogue of possible formulae. The method allows nesting of expressions, and the results found are dependent on, and heavily constrained by, the cross-validation carried out on statistically independent testing and training data sets. Slightly better fits, in terms of variance explained, are found using the purely statistical fitting/searching approach. We describe the methods applied and the results found, and discuss the different potentials of the physics-based and statistics-only model searches.

  9. Providing peak river flow statistics and forecasting in the Niger River basin

    NASA Astrophysics Data System (ADS)

    Andersson, Jafet C. M.; Ali, Abdou; Arheimer, Berit; Gustafsson, David; Minoungou, Bernard

    2017-08-01

    Flooding is a growing concern in West Africa. Improved quantification of discharge extremes and associated uncertainties is needed to improve infrastructure design, and operational forecasting is needed to provide timely warnings. In this study, we use discharge observations, a hydrological model (Niger-HYPE) and extreme value analysis to estimate peak river flow statistics (e.g. the discharge magnitude with a 100-year return period) across the Niger River basin. To test the model's capacity for predicting peak flows, we compared 30-year maximum discharge and peak flow statistics derived from the model with those derived from nine observation stations. The results indicate that the model simulates peak discharge reasonably well (on average +20%). However, the peak flow statistics have a large uncertainty range, which ought to be considered in infrastructure design. We then applied the methodology to derive basin-wide maps of peak flow statistics and their associated uncertainty. The results indicate that the method is applicable across the hydrologically active part of the river basin, and that the uncertainty varies substantially depending on location. Subsequently, we used the most recent bias-corrected climate projections to analyze potential changes in peak flow statistics in a changed climate. The results are generally ambiguous, with consistent changes only in very few areas. To test the forecasting capacity, we ran Niger-HYPE with a combination of meteorological data sets for the 2008 high-flow season and compared with observations. The results indicate reasonable forecasting capacity (on average 17% deviation), but additional years should also be evaluated. We finish by presenting a strategy and pilot project which will develop an operational flood monitoring and forecasting system based on in-situ data, earth observations, modelling, and extreme statistics. In this way we aim to build capacity to ultimately improve resilience toward floods, protecting lives and infrastructure in the region.
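
    Peak flow statistics such as the 100-year discharge are typically obtained by fitting an extreme value distribution to a series of annual maxima. A minimal sketch with a GEV fit in scipy; the annual maxima are synthetic and the plain bootstrap used to illustrate the uncertainty range is an assumption, not the study's exact procedure:

```python
import numpy as np
from scipy.stats import genextreme

rng = np.random.default_rng(11)
# synthetic 30-year record of annual maximum discharge (m3/s)
annual_max = genextreme.rvs(c=-0.1, loc=1500, scale=400, size=30, random_state=rng)

c, loc, scale = genextreme.fit(annual_max)
q100 = genextreme.ppf(1 - 1 / 100, c, loc, scale)   # 100-year return level
print(f"estimated 100-year discharge: {q100:.0f} m3/s")

# crude bootstrap to illustrate the (large) uncertainty range of the estimate
boot = []
for _ in range(500):
    sample = rng.choice(annual_max, size=annual_max.size, replace=True)
    bc, bloc, bscale = genextreme.fit(sample)
    boot.append(genextreme.ppf(0.99, bc, bloc, bscale))
print("approximate 95% interval:", np.percentile(boot, [2.5, 97.5]))
```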

  10. Associations of indoor carbon dioxide concentrations and environmental susceptibilities with mucous membrane and lower respiratory building related symptoms in the BASE study: Analyses of the 100 building dataset

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Erdmann, Christine A.; Apte, Michael G.

    Using the US EPA 100 office-building BASE Study dataset, they conducted multivariate logistic regression analyses to quantify the relationship between indoor CO2 concentrations (dCO2) and mucous membrane (MM) and lower respiratory system (LResp) building related symptoms, adjusting for age, sex, smoking status, presence of carpet in workspace, thermal exposure, relative humidity, and a marker for entrained automobile exhaust. In addition, they tested the hypothesis that certain environmentally-mediated health conditions (e.g., allergies and asthma) confer increased susceptibility to building related symptoms within office buildings. Adjusted odds ratios (ORs) for statistically significant, dose-dependent associations (p < 0.05) for dry eyes, sore throat, nose/sinus congestion, and wheeze symptoms with 100 ppm increases in dCO2 ranged from 1.1 to 1.2. These results suggest that increases in the ventilation rates per person among typical office buildings will, on average, reduce the prevalence of several building related symptoms by up to 70%, even when these buildings meet the existing ASHRAE ventilation standards for office buildings. Building occupants with certain environmentally-mediated health conditions are more likely to experience building related symptoms than those without these conditions (statistically significant ORs ranged from 2 to 11).
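
    The adjusted odds ratios per 100 ppm increase in dCO2 come from a multivariate logistic regression of symptom presence on dCO2 and covariates. A minimal sketch of how such an OR is obtained with statsmodels; the variable names, the synthetic data, and the reduced covariate set are illustrative assumptions, not the BASE analysis:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(5)
n = 2000
df = pd.DataFrame({
    "dco2_100ppm": rng.uniform(0, 6, n),   # indoor-outdoor CO2 difference, in 100 ppm units
    "age": rng.uniform(20, 65, n),
    "female": rng.integers(0, 2, n),
})
logit_p = -2.0 + 0.15 * df["dco2_100ppm"] + 0.01 * (df["age"] - 40) + 0.3 * df["female"]
df["sore_throat"] = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))   # synthetic symptom outcome

X = sm.add_constant(df[["dco2_100ppm", "age", "female"]])
res = sm.Logit(df["sore_throat"], X).fit(disp=0)
or_per_100ppm = np.exp(res.params["dco2_100ppm"])   # adjusted odds ratio per 100 ppm dCO2
print(f"adjusted OR per 100 ppm dCO2: {or_per_100ppm:.2f}")
```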

  11. The road map towards providing a robust Raman spectroscopy-based cancer diagnostic platform and integration into clinic

    NASA Astrophysics Data System (ADS)

    Lau, Katherine; Isabelle, Martin; Lloyd, Gavin R.; Old, Oliver; Shepherd, Neil; Bell, Ian M.; Dorney, Jennifer; Lewis, Aaran; Gaifulina, Riana; Rodriguez-Justo, Manuel; Kendall, Catherine; Stone, Nicolas; Thomas, Geraint; Reece, David

    2016-03-01

    Despite its demonstrated potential as an accurate cancer diagnostic tool, Raman spectroscopy (RS) is yet to be adopted by the clinic for histopathology reviews. The Stratified Medicine through Advanced Raman Technologies (SMART) consortium has begun to address some of the hurdles to its adoption for cancer diagnosis. These hurdles include awareness and acceptance of the technology, practicality of integration into the histopathology workflow, data reproducibility and availability of transferable models. We have formed a consortium to jointly develop optimised protocols for tissue sample preparation, data collection and analysis. These protocols will be supported by the provision of suitable hardware and software tools to allow statistically sound classification models to be built and transferred for use on different systems. In addition, we are building a validated gastrointestinal (GI) cancers model, which can be trialled as part of the histopathology workflow at hospitals, and a classification tool. At the end of the project, we aim to deliver a robust Raman-based diagnostic platform to enable clinical researchers to stage cancer, define tumour margins, build cancer diagnostic models and discover novel disease biomarkers.

  12. Building sector feedbacks lead to increased energy demands

    NASA Astrophysics Data System (ADS)

    Hartin, C.; Link, R. P.; Patel, P.; Horowitz, R.; Clarke, L.; Mundra, A.

    2017-12-01

    Typically in human-earth system modeling studies, feedbacks between the earth and human systems are analyzed by passing information between independent models, leading to data errors and poor reproducibility. In this study we explore the two-way feedbacks between the human and earth systems in the building sector of GCAM, an integrated assessment model, and its fully integrated climate component, Hector. While there is general agreement in the literature that increasing temperatures will increase cooling energy demands and decrease heating energy demands, there has been no fully coupled analysis of this dynamic that would, for example, account for the feedbacks on hydrofluorocarbons from increased cooling demands. Using a statistical relationship between global mean temperature change and heating and cooling degree days, we find that the feedbacks on hydrofluorocarbons lead to an increase in global mean temperature of between 0.16 and 0.27 °C in 2100. Demands for electricity increase by about 10% in Africa, while demands decrease in Canada by about 3.0% when these feedbacks are taken into account. While the feedbacks between building energy demand and global mean temperature are modest by themselves, this study prompts future research on coupled human-earth system feedbacks, in particular with regard to land, water, and other energy infrastructure.
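
    The coupling hinges on translating temperature change into heating and cooling degree days. A small sketch of the degree-day bookkeeping itself, using a common 18 °C base temperature, which is an assumption for illustration rather than GCAM's definition:

```python
import numpy as np

def degree_days(daily_mean_temp_c, base_c=18.0):
    """Return (heating, cooling) degree days for a series of daily mean temperatures (deg C)."""
    t = np.asarray(daily_mean_temp_c, dtype=float)
    hdd = np.clip(base_c - t, 0.0, None).sum()   # heating demand accumulates below the base
    cdd = np.clip(t - base_c, 0.0, None).sum()   # cooling demand accumulates above the base
    return hdd, cdd

# Illustrative: the same synthetic year of daily mean temperatures, shifted by +2 deg C of warming.
rng = np.random.default_rng(2)
year = 12 + 10 * np.sin(2 * np.pi * np.arange(365) / 365) + rng.normal(0, 3, 365)
print("baseline HDD/CDD:", degree_days(year))
print("+2 C HDD/CDD:    ", degree_days(year + 2.0))
```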

  13. Decision making based on analysis of benefit versus costs of preventive retrofit versus costs of repair after earthquake hazards

    NASA Astrophysics Data System (ADS)

    Bostenaru Dan, M.

    2012-04-01

    In this presentation, interventions on seismically vulnerable early reinforced concrete skeleton buildings from the interwar period are considered at different performance levels, from collapse prevention up to immediate post-earthquake functionality. Between these two poles lie degrees of damage that depend on the performance aim set, and the costs of retrofit and of post-earthquake repair differ according to the targeted performance. Not only the earthquake has an impact on a heritage building, but also the retrofit measure, for example on its appearance or its functional layout. Criteria of the structural engineer, the investor, the architect/conservator/urban planner and the owners/inhabitants of the neighbourhood are therefore all considered in reaching a benefit-cost decision. Decision making based on benefit-cost analysis is one element of a risk management process: a solution must be found for how much change to accept in retrofit and how much repairable damage to tolerate. There are two impact studies. Numerical simulation was run for the building typology considered, for successive earthquakes selected in a deterministic way (1977, 1986 and two events in 1991 from Vrancea, Romania, and 1978 Thessaloniki, Greece), including the case in which retrofit is carried out between two earthquakes. The building typology itself was studied not only for Greece and Romania but for numerous European countries, including Italy. The typology was compared with earlier reinforced concrete buildings using the Hennebique system, in order to assess to what extent these buildings belong to the structural heritage and to shape the criteria of the architect/conservator. Based on the typology study, two model buildings were designed; for one of them different retrofit measures (side walls, structural walls, steel braces, steel jacketing) were considered, while for the other a single retrofit technique (diagonal braces, which also permits adding active measures such as energy dissipaters) was considered at different extents and locations in the building. Device computations, a civil engineering method for building economics (which, before statistics existed, was also the method for computing the costs of general building upgrades), were carried out for the retrofit and repair measures; they can be applied in different countries, including those without a database of existing seismic retrofit projects. The building elements for which the device computations were done are termed "retrofit elements"; they can be new, modified or replaced elements of the initial building. The addition of the devices is straightforward, as it is for rows in project management; for comparison, complex project management computations from other works were also considered for innovative measures such as FRP (with glass and fibre). The theoretical costs of the model measures were compared with the way costs of real retrofits of this building type (with reinforced concrete jacketing and FRP) are computed in Greece. The theoretically proposed measures were also compared in general terms with those applied in practice in Romania and Italy; a further study will include these, since in Italy diagonal braces with dissipation have been used. The typology of braces is also relevant for the local seismic culture, and may carry over to another type of skeleton structure whose distribution has been studied: the timber skeleton. A subtype of Romanian reinforced concrete skeleton buildings includes diagonal braces.
In order to assess the costs of rebuilding or of a general upgrade without retrofit, architectural methods for building economics based on floor surface are considered. Diagrams have been built to show how the total costs vary as the sum of preventive retrofit and post-earthquake repair, and tables compare these with the costs of rebuilding, following the model of adding day-lighting in atria of buildings: the moment when a repair measure has to be applied, as a function of the recurrence period of earthquakes, is analogous to the depth of the atria. Depending on how strong the expected earthquake is, a more extensive retrofit is required in order to decrease repair costs. A further study would allow converting the device computations into floor-surface costs, enabling not only an implementation in an ICT environment by means of ontologies and BIM, but also a conversion to the urban scale. For the latter, the probabilistic application of structural mechanics models instead of observation-based statistics can be considered. But first the socio-economic models of construction management games will be considered, both computer games and hard-copy board games, starting with SimCity, which initially included the San Francisco 1906 earthquake, in order to see how the resources needed can be modelled. All criteria build the taxonomy of the decision. Among them, different ways to carry out the cost-benefit analysis exist, from weighted trees to pair-wise comparison. The taxonomy was modelled as a decision tree, which builds the basis for an ontology.

  14. Extreme heat event projections for a coastal megacity

    NASA Astrophysics Data System (ADS)

    Ortiz, L. E.; Gonzalez, J.

    2017-12-01

    As summers become warmer, extreme heat events are expected to increase in intensity, frequency, and duration. Large urban centers may affect these projections by introducing feedbacks between the atmosphere and the built environment through processes involving anthropogenic heat, wind modification, radiation blocking, and others. General circulation models are often run with spatial resolutions on the order of 100 km, limiting their skill at resolving local-scale processes and highly spatially varying features such as cities' heterogeneous landscapes and mountain topography. This study employs climate simulations using the Weather Research and Forecast (WRF) model coupled with a modified multi-layer urban canopy and building energy model to downscale CESM1 at 1 km horizontal resolution across three time slices (2006-2010, 2075-2079, and 2095-2099) and two projections (RCP 4.5 and 8.5). The New York City metropolitan area, with a population of over 20 million and a complex urban canopy, is used as a case study. The urban canopy model of WRF was modified to include a drag coefficient as a function of the building plant area fraction and the introduction of evaporative cooling systems at building roofs to reject the anthropogenic heat from the buildings, with urban canopy parameters computed from the New York City Property Land-Use Tax-lot Output (PLUTO). Model performance is evaluated against the input model and historical records from airport stations, showing improvement in the statistical characteristics of the downscaled model output. Projection results are presented as spatially distributed anomalies in heat wave frequency, duration, and maximum intensity from the 2006-2010 benchmark period. Results show that local sea-breeze circulations mitigate heat wave impacts, following a positive gradient with increasing distance from the coastline. However, end-of-century RCP 8.5 projections show the possibility of a reversal of this pattern, as sea surface temperatures increase and reduce the sea-land temperature gradient, thus reducing the sea-breeze magnitude. Impacts on human health and building energy demand are explored for future climate scenarios as key examples of anticipated societal consequences.

  15. PSSMSearch: a server for modeling, visualization, proteome-wide discovery and annotation of protein motif specificity determinants.

    PubMed

    Krystkowiak, Izabella; Manguy, Jean; Davey, Norman E

    2018-06-05

    There is a pressing need for in silico tools that can aid in the identification of the complete repertoire of protein binding (SLiMs, MoRFs, miniMotifs) and modification (moiety attachment/removal, isomerization, cleavage) motifs. We have created PSSMSearch, an interactive web-based tool for rapid statistical modeling, visualization, discovery and annotation of protein motif specificity determinants to discover novel motifs in a proteome-wide manner. PSSMSearch analyses proteomes for regions with significant similarity to a motif specificity determinant model built from a set of aligned motif-containing peptides. Multiple scoring methods are available to build a position-specific scoring matrix (PSSM) describing the motif specificity determinant model. This model can then be modified by a user to add prior knowledge of specificity determinants through an interactive PSSM heatmap. PSSMSearch includes a statistical framework to calculate the significance of specificity determinant model matches against a proteome of interest. PSSMSearch also includes the SLiMSearch framework's annotation, motif functional analysis and filtering tools to highlight relevant discriminatory information. Additional tools to annotate statistically significant shared keywords and GO terms, or experimental evidence of interaction with a motif-recognizing protein have been added. Finally, PSSM-based conservation metrics have been created for taxonomic range analyses. The PSSMSearch web server is available at http://slim.ucd.ie/pssmsearch/.
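
    The core of a specificity determinant model is a position-specific scoring matrix built from aligned motif instances. A minimal log-odds PSSM sketch with pseudocounts and a sliding-window scorer; the toy peptides, the flat background frequencies and the pseudocount are illustrative assumptions, not PSSMSearch's scoring methods:

```python
import math

AAS = "ACDEFGHIKLMNPQRSTVWY"
peptides = ["RRASLG", "RRPSVA", "KRASTG", "RRGSLS"]   # toy aligned motif instances
length = len(peptides[0])
background = {aa: 1.0 / len(AAS) for aa in AAS}       # flat background (assumption)
pseudocount = 0.5

pssm = []
for pos in range(length):
    column = [p[pos] for p in peptides]
    scores = {}
    for aa in AAS:
        freq = (column.count(aa) + pseudocount) / (len(peptides) + pseudocount * len(AAS))
        scores[aa] = math.log2(freq / background[aa])  # log-odds versus background
    pssm.append(scores)

def best_window(sequence):
    """Slide the PSSM along a protein sequence and return the best-scoring window."""
    best = max(range(len(sequence) - length + 1),
               key=lambda i: sum(pssm[j][sequence[i + j]] for j in range(length)))
    score = sum(pssm[j][sequence[best + j]] for j in range(length))
    return sequence[best:best + length], score

print(best_window("MKKRRASLGEEDPQ"))
```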

  16. Benchmarking Inverse Statistical Approaches for Protein Structure and Design with Exactly Solvable Models.

    PubMed

    Jacquin, Hugo; Gilson, Amy; Shakhnovich, Eugene; Cocco, Simona; Monasson, Rémi

    2016-05-01

    Inverse statistical approaches to determine protein structure and function from Multiple Sequence Alignments (MSA) are emerging as powerful tools in computational biology. However, the underlying assumptions of the relationship between the inferred effective Potts Hamiltonian and real protein structure and energetics remain untested so far. Here we use a lattice protein (LP) model to benchmark these inverse statistical approaches. We build MSAs of highly stable sequences in target LP structures, and infer the effective pairwise Potts Hamiltonians from those MSAs. We find that the inferred Potts Hamiltonians reproduce many important aspects of the 'true' LP structures and energetics. Careful analysis reveals that effective pairwise couplings in the inferred Potts Hamiltonians depend not only on the energetics of the native structure but also on competing folds; in particular, the coupling values reflect both positive design (stabilization of the native conformation) and negative design (destabilization of competing folds). In addition to providing detailed structural information, the inferred Potts models, used as protein Hamiltonians for the design of new sequences, are able to generate, with high probability, completely new sequences with the desired folds, which is not possible using independent-site models. These are remarkable results, as the effective LP Hamiltonians used to generate the MSAs are not simple pairwise models due to the competition between the folds. Our findings elucidate the reasons for the success of inverse approaches to the modelling of proteins from sequence data, and their limitations.

  17. From intuition to statistics in building subsurface structural models

    USGS Publications Warehouse

    Brandenburg, J.P.; Alpak, F.O.; Naruk, S.; Solum, J.

    2011-01-01

    Experts associated with the oil and gas exploration industry suggest that combining forward trishear models with stochastic global optimization algorithms allows a quantitative assessment of the uncertainty associated with a given structural model. The methodology is applied to incompletely imaged structures related to deepwater hydrocarbon reservoirs and results are compared to prior manual palinspastic restorations and borehole data. This methodology is also useful for extending structural interpretations into other areas of limited resolution, such as subsalt, in addition to extrapolating existing data into seismic data gaps. This technique can be used for rapid reservoir appraisal and potentially has other applications in seismic processing, well planning, and borehole stability analysis.

  18. Transport of Bacillus thuringiensis var. kurstaki from an outdoor release into buildings: pathways of infiltration and a rapid method to identify contaminated buildings.

    PubMed

    Van Cuyk, Sheila; Deshpande, Alina; Hollander, Attelia; Franco, David O; Teclemariam, Nerayo P; Layshock, Julie A; Ticknor, Lawrence O; Brown, Michael J; Omberg, Kristin M

    2012-06-01

    Understanding the fate and transport of biological agents into buildings will be critical to recovery and restoration efforts after a biological attack in an urban area. As part of the Interagency Biological Restoration Demonstration (IBRD), experiments were conducted in Fairfax County, VA, to study whether a biological agent can be expected to infiltrate into buildings following a wide-area release. Bacillus thuringiensis var. kurstaki is a common organic pesticide that has been sprayed in Fairfax County for a number of years to control the gypsy moth. Because the bacterium shares many physical and biological properties with Bacillus anthracis, the results from these studies can be extrapolated to a bioterrorist release. In 2009, samples were collected from inside buildings located immediately adjacent to a spray block. A combined probabilistic and targeted sampling strategy and modeling were conducted to provide insight into likely methods of infiltration. Both the simulations and the experimental results indicate sampling entryways and heating, ventilation, and air conditioning (HVAC) filters are reasonable methods for "ruling in" a building as contaminated. Following a biological attack, this method is likely to provide significant savings in time and labor compared to more rigorous, statistically based characterization. However, this method should never be used to "rule out," or clear, a building.

  19. Does Breast Cancer Drive the Building of Survival Probability Models among States? An Assessment of Goodness of Fit for Patient Data from SEER Registries

    PubMed

    Khan, Hafiz; Saxena, Anshul; Perisetti, Abhilash; Rafiq, Aamrin; Gabbidon, Kemesha; Mende, Sarah; Lyuksyutova, Maria; Quesada, Kandi; Blakely, Summre; Torres, Tiffany; Afesse, Mahlet

    2016-12-01

    Background: Breast cancer is a worldwide public health concern and is the most prevalent type of cancer in women in the United States. This study concerned the best fit of statistical probability models on the basis of survival times for nine state cancer registries: California, Connecticut, Georgia, Hawaii, Iowa, Michigan, New Mexico, Utah, and Washington. Materials and Methods: A probability random sampling method was applied to select and extract records of 2,000 breast cancer patients from the Surveillance Epidemiology and End Results (SEER) database for each of the nine state cancer registries used in this study. EasyFit software was utilized to identify the best probability models by using goodness of fit tests, and to estimate parameters for various statistical probability distributions that fit survival data. Results: Statistical analysis for the summary of statistics is reported for each of the states for the years 1973 to 2012. Kolmogorov-Smirnov, Anderson-Darling, and Chi-squared goodness of fit test values were used for survival data, the highest values of goodness of fit statistics being considered indicative of the best fit survival model for each state. Conclusions: It was found that California, Connecticut, Georgia, Iowa, New Mexico, and Washington followed the Burr probability distribution, while the Dagum probability distribution gave the best fit for Michigan and Utah, and Hawaii followed the Gamma probability distribution. These findings highlight differences between states through selected sociodemographic variables and also demonstrate probability modeling differences in breast cancer survival times. The results of this study can be used to guide healthcare providers and researchers for further investigations into social and environmental factors in order to reduce the occurrence of and mortality due to breast cancer. Creative Commons Attribution License
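
    A rough sketch of this distribution-fitting workflow with scipy, on synthetic survival times: the candidate list is illustrative (the Burr and Dagum families correspond roughly to scipy's burr12 and burr distributions), and in this sketch the fits are ranked by the smallest Kolmogorov-Smirnov statistic:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(9)
survival_months = stats.gamma.rvs(a=2.0, scale=30.0, size=500, random_state=rng)  # synthetic survival times

candidates = {
    "gamma": stats.gamma,
    "lognorm": stats.lognorm,
    "weibull_min": stats.weibull_min,
    "burr12": stats.burr12,      # Burr (Type XII) family
}

results = {}
for name, dist in candidates.items():
    params = dist.fit(survival_months)                       # maximum likelihood fit
    ks_stat, p_value = stats.kstest(survival_months, name, args=params)
    results[name] = (ks_stat, p_value)

for name, (ks, p) in sorted(results.items(), key=lambda kv: kv[1][0]):
    print(f"{name:12s} KS = {ks:.4f}  p = {p:.3f}")
```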

  20. Agent-based model to rural urban migration analysis

    NASA Astrophysics Data System (ADS)

    Silveira, Jaylson J.; Espíndola, Aquino L.; Penna, T. J. P.

    2006-05-01

    In this paper, we analyze the rural-urban migration phenomenon as it is usually observed in economies which are in the early stages of industrialization. The analysis is conducted by means of a statistical mechanics approach which builds a computational agent-based model. Agents are placed on a lattice and the connections among them are described via an Ising-like model. Simulations on this computational model show some emergent properties that are common in developing economies, such as a transitional dynamics characterized by continuous growth of urban population, followed by the equalization of expected wages between rural and urban sectors (Harris-Todaro equilibrium condition), urban concentration and increasing of per capita income.

  1. Alpha 2 LASSO Data Bundles

    DOE Data Explorer

    Gustafson, William Jr; Vogelmann, Andrew; Endo, Satoshi; Toto, Tami; Xiao, Heng; Li, Zhijin; Cheng, Xiaoping; Kim, Jinwon; Krishna, Bhargavi

    2015-08-31

    The Alpha 2 release is the second release from the LASSO Pilot Phase that builds upon the Alpha 1 release. Alpha 2 contains additional diagnostics in the data bundles and focuses on cases from spring-summer 2016. A data bundle is a unified package consisting of LASSO LES input and output, observations, evaluation diagnostics, and model skill scores. LES input include model configuration information and forcing data. LES output includes profile statistics and full domain fields of cloud and environmental variables. Model evaluation data consists of LES output and ARM observations co-registered on the same grid and sampling frequency. Model performance is quantified by skill scores and diagnostics in terms of cloud and environmental variables.

  2. [Fabrication and accuracy research on 3D printing dental model based on cone beam computed tomography digital modeling].

    PubMed

    Zhang, Hui-Rong; Yin, Le-Feng; Liu, Yan-Li; Yan, Li-Yi; Wang, Ning; Liu, Gang; An, Xiao-Li; Liu, Bin

    2018-04-01

    The aim of this study is to build a digital dental model with cone beam computed tomography (CBCT), to fabricate a virtual model via 3D printing, and to determine the accuracy of 3D printing dental model by comparing the result with a traditional dental cast. CBCT of orthodontic patients was obtained to build a digital dental model by using Mimics 10.01 and Geomagic studio software. The 3D virtual models were fabricated via fused deposition modeling technique (FDM). The 3D virtual models were compared with the traditional cast models by using a Vernier caliper. The measurements used for comparison included the width of each tooth, the length and width of the maxillary and mandibular arches, and the length of the posterior dental crest. 3D printing models had higher accuracy compared with the traditional cast models. The results of the paired t-test of all data showed that no statistically significant difference was observed between the two groups (P>0.05). Dental digital models built with CBCT realize the digital storage of patients' dental condition. The virtual dental model fabricated via 3D printing avoids traditional impression and simplifies the clinical examination process. The 3D printing dental models produced via FDM show a high degree of accuracy. Thus, these models are appropriate for clinical practice.

  3. Model identification using stochastic differential equation grey-box models in diabetes.

    PubMed

    Duun-Henriksen, Anne Katrine; Schmidt, Signe; Røge, Rikke Meldgaard; Møller, Jonas Bech; Nørgaard, Kirsten; Jørgensen, John Bagterp; Madsen, Henrik

    2013-03-01

    The acceptance of virtual preclinical testing of control algorithms is growing and thus also the need for robust and reliable models. Models based on ordinary differential equations (ODEs) can rarely be validated with standard statistical tools. Stochastic differential equations (SDEs) offer the possibility of building models that can be validated statistically and that are capable of predicting not only a realistic trajectory, but also the uncertainty of the prediction. In an SDE, the prediction error is split into two noise terms. This separation ensures that the errors are uncorrelated and provides the possibility to pinpoint model deficiencies. An identifiable model of the glucoregulatory system in a type 1 diabetes mellitus (T1DM) patient is used as the basis for development of a stochastic-differential-equation-based grey-box model (SDE-GB). The parameters are estimated on clinical data from four T1DM patients. The optimal SDE-GB is determined from likelihood-ratio tests. Finally, parameter tracking is used to track the variation in the "time to peak of meal response" parameter. We found that the transformation of the ODE model into an SDE-GB resulted in a significant improvement in the prediction and uncorrelated errors. Tracking of the "peak time of meal absorption" parameter showed that the absorption rate varied according to meal type. This study shows the potential of using SDE-GBs in diabetes modeling. Improved model predictions were obtained due to the separation of the prediction error. SDE-GBs offer a solid framework for using statistical tools for model validation and model development. © 2013 Diabetes Technology Society.
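
    The defining feature of an SDE grey-box model is the split of the prediction error into system (diffusion) noise and measurement noise. A minimal sketch simulating a one-state Ornstein-Uhlenbeck-type glucose deviation with both noise terms; the model form, the parameters and the Euler-Maruyama discretisation are illustrative assumptions, not the published T1DM model:

```python
import numpy as np

rng = np.random.default_rng(4)

# Latent dynamics: dx = -a * (x - mu) dt + sigma dW (system noise),
# observed as y = x + measurement noise.
a, mu, sigma = 0.05, 5.0, 0.2   # 1/min, mmol/L, mmol/L per sqrt(min)  (illustrative)
meas_sd = 0.3                   # sensor noise standard deviation (illustrative)
dt, n_steps = 1.0, 600          # 1-minute steps over 10 hours

x = np.empty(n_steps)
x[0] = 8.0
for k in range(1, n_steps):
    drift = -a * (x[k - 1] - mu)
    x[k] = x[k - 1] + drift * dt + sigma * np.sqrt(dt) * rng.normal()  # Euler-Maruyama step

y = x + rng.normal(0, meas_sd, n_steps)   # noisy observations of the latent state
print(f"latent state range: {x.min():.2f} to {x.max():.2f} mmol/L; "
      f"observation error sd: {np.std(y - x):.2f}")
```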

  4. Introduction to bioinformatics.

    PubMed

    Can, Tolga

    2014-01-01

    Bioinformatics is an interdisciplinary field mainly involving molecular biology and genetics, computer science, mathematics, and statistics. Data intensive, large-scale biological problems are addressed from a computational point of view. The most common problems are modeling biological processes at the molecular level and making inferences from collected data. A bioinformatics solution usually involves the following steps: Collect statistics from biological data. Build a computational model. Solve a computational modeling problem. Test and evaluate a computational algorithm. This chapter gives a brief introduction to bioinformatics by first providing an introduction to biological terminology and then discussing some classical bioinformatics problems organized by the types of data sources. Sequence analysis is the analysis of DNA and protein sequences for clues regarding function and includes subproblems such as identification of homologs, multiple sequence alignment, searching sequence patterns, and evolutionary analyses. Protein structures are three-dimensional data and the associated problems are structure prediction (secondary and tertiary), analysis of protein structures for clues regarding function, and structural alignment. Gene expression data is usually represented as matrices and analysis of microarray data mostly involves statistical analysis, classification, and clustering approaches. Biological networks such as gene regulatory networks, metabolic pathways, and protein-protein interaction networks are usually modeled as graphs and graph theoretic approaches are used to solve associated problems such as construction and analysis of large-scale networks.

  5. Leadership and Culture-Building in Schools: Quantitative and Qualitative Understandings.

    ERIC Educational Resources Information Center

    Sashkin, Marshall; Sashkin, Molly G.

    Understanding effective school leadership as a function of culture building through quantitative and qualitative analyses is the purpose of this paper. The two-part quantitative phase of the research focused on statistical measures of culture and leadership behavior directed toward culture building in the school. The first quantitative part…

  6. Optimization of Analytical Potentials for Coarse-Grained Biopolymer Models.

    PubMed

    Mereghetti, Paolo; Maccari, Giuseppe; Spampinato, Giulia Lia Beatrice; Tozzini, Valentina

    2016-08-25

    The increasing trend in the recent literature on coarse grained (CG) models testifies to their impact in the study of complex systems. However, the CG model landscape is variegated: even at a given resolution level, the force fields are very heterogeneous and are optimized with very different parametrization procedures. Along the road toward standardization of CG models for biopolymers, here we describe a strategy to aid the building and optimization of statistics-based analytical force fields and its implementation in the software package AsParaGS (Assisted Parameterization platform for coarse Grained modelS). Our method is based on the use and optimization of analytical potentials, optimized by targeting the statistical distributions of internal variables through a combination of different algorithms (i.e., relative-entropy-driven stochastic exploration of the parameter space and iterative Boltzmann inversion). This allows designing a custom model that endows the force field terms with a physically sound meaning. Furthermore, the level of transferability and accuracy can be tuned through the choice of the statistical data set composition. The method, illustrated by means of applications to helical polypeptides, also involves the analysis of two- and three-variable distributions, and allows handling issues related to correlations among force field terms. AsParaGS is interfaced with general-purpose molecular dynamics codes and currently implements the "minimalist" subclass of CG models (i.e., one bead per amino acid, Cα based). Extensions to nucleic acids and different levels of coarse graining are in progress.

  7. IFLA General Conference 1988. Division of Management and Technology. Section on Information Technology; Section on Statistics; Section on Library Buildings and Equipment; Section on Conservation; Round Table on Management of Library Associations; Round Table on AV Media.

    ERIC Educational Resources Information Center

    International Federation of Library Associations, The Hague (Netherlands).

    The 15 papers in this compilation focus on library management and technology, including information technology, statistics, buildings and equipment, and conservation: (1) "Information Control: OSI (Open Systems Interconnection) and Networking Strategies" (Neil McClean, United Kingdom); (2) "OSI in Australia: Potential, Planning,…

  8. The Impact of Measurement Noise in GPA Diagnostic Analysis of a Gas Turbine Engine

    NASA Astrophysics Data System (ADS)

    Ntantis, Efstratios L.; Li, Y. G.

    2013-12-01

    The performance diagnostic analysis of a gas turbine is accomplished by estimating a set of internal engine health parameters from available sensor measurements. No physical measuring instruments, however, can ever completely eliminate the presence of measurement uncertainties. Sensor measurements are often distorted by noise and bias, leading to inaccurate estimation results. This paper explores the impact of measurement noise on gas turbine GPA analysis. The analysis is demonstrated with a test case in which the gas turbine performance simulation and diagnostics code TURBOMATCH is used to build a performance model of an engine similar to the Rolls-Royce Trent 500 turbofan, and to carry out the diagnostic analysis in the presence of different levels of measurement noise. Finally, to improve the reliability of the diagnostic results, a statistical analysis of the data scattering caused by sensor uncertainties is made. The diagnostic tool used for the statistical analysis of the measurement noise impact is a model-based method utilizing non-linear GPA.

  9. Feature discrimination/identification based upon SAR return variations

    NASA Technical Reports Server (NTRS)

    Rasco, W. A., Sr.; Pietsch, R.

    1978-01-01

    The look-to-look variation statistics in the returns recorded in flight by a digital, real-time SAR system are analyzed. The determination that variations in the look-to-look returns from different classes carry information content unique to those classes was illustrated by a model based on four variants derived from the four-look in-flight SAR data under study. The model was limited to four classes of returns: mowed grass on an athletic field, rough unmowed grass and weeds on a large vacant field, young fruit trees in a large orchard, and metal mobile homes and storage buildings in a large mobile home park. The data population, in excess of 1000 returns, represented over 250 individual pixels from the four classes. The multivariant discriminant model operated on the set of returns for each pixel and assigned that pixel to one of the four classes, based on the target variants and the probability distribution function of the four variants for each class.

  10. Activated desorption at heterogeneous interfaces and long-time kinetics of hydrocarbon recovery from nanoporous media

    PubMed Central

    Lee, Thomas; Bocquet, Lydéric; Coasne, Benoit

    2016-01-01

    Hydrocarbon recovery from unconventional reservoirs (shale gas) is debated due to its environmental impact and uncertainties about its predictability. But a lack of scientific knowledge impedes the proposal of reliable alternatives. The requirement of hydrofracking, fast recovery decay and ultra-low permeability—inherent to their nanoporosity—are specificities of these reservoirs, which challenge existing frameworks. Here we use molecular simulation and statistical models to show that recovery is hampered by interfacial effects at the wet kerogen surface. Recovery is shown to be thermally activated with an energy barrier modelled from the interface wetting properties. We build a statistical model of the recovery kinetics with a two-regime decline that is consistent with published data: a short time decay, consistent with the Darcy description, followed by a fast algebraic decay resulting from increasingly unreachable energy barriers. Replacing water by CO2 or propane eliminates the barriers, therefore raising hopes for clean/efficient recovery. PMID:27327254

  11. Modeling of Yb3+/Er3+-codoped microring resonators

    NASA Astrophysics Data System (ADS)

    Vallés, Juan A.; Gălătuş, Ramona

    2015-03-01

    The performance of a highly Yb3+/Er3+-codoped phosphate glass add-drop microring resonator is numerically analyzed. The model assumes resonant behaviour of both pump and signal powers, and the dependences of the pump intensity build-up inside the microring resonator and of the signal transfer functions to the device's through and drop ports are evaluated. Detailed equations for the evolution of the rare-earth ion level population densities and the propagation of the optical powers inside the microring resonator are included in the model. Moreover, due to the high dopant concentrations considered, the microscopic statistical formalism based on the statistical average of the excitation probability of the Er3+ ion at the microscopic level has been used to describe energy-transfer inter-atomic mechanisms. Realistic parameters and working conditions are used for the calculations. The requirements to achieve amplification and laser oscillation within these devices are obtainable as a function of rare earth ion concentration and coupling losses.

  12. Statistical modelling of gaze behaviour as categorical time series: what you should watch to save soccer penalties.

    PubMed

    Button, C; Dicks, M; Haines, R; Barker, R; Davids, K

    2011-08-01

    Previous research on gaze behaviour in sport has typically reported summary fixation statistics, thereby largely ignoring the temporal sequencing of gaze. In the present study on penalty kicking in soccer, our aim was to apply a Markov chain modelling method to eye movement data obtained from goalkeepers. Building on the discrete analysis of gaze employed by Dicks et al. (Atten Percept Psychophys 72(3):706-720, 2010b), we wanted to statistically model the relative probabilities of the goalkeeper's gaze being directed to different locations throughout the penalty taker's approach. Examination of gaze behaviours under in situ and video-simulation task constraints reveals differences in information pickup for perception and action. The probabilities of fixating anatomical locations of the penalty taker were high under simulated movement response conditions. In contrast, when actually required to intercept kicks, the goalkeepers initially favoured watching the penalty taker's head but then rapidly shifted focus directly to the ball for approximately the final second prior to foot-ball contact. The increased spatio-temporal demands of in situ interceptive actions over laboratory-based simulated actions lead to different visual search strategies being used. When eye movement data are modelled as time series, it is possible to discern subtle but important behavioural characteristics that are less apparent with discrete summary statistics alone.
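    As a minimal sketch of the underlying idea (not the authors' model, which also handles covariates and uncertainty), the transition probabilities of a first-order Markov chain can be estimated from a categorical gaze sequence by counting transitions; the fixation labels and the short sequence below are hypothetical:

      # Minimal sketch: first-order Markov transition probabilities estimated from a
      # categorical gaze sequence; the fixation labels and sequence are hypothetical.
      import numpy as np

      gaze = ["head", "head", "ball", "ball", "ball", "leg", "ball", "head", "ball"]
      states = sorted(set(gaze))
      idx = {s: i for i, s in enumerate(states)}

      counts = np.zeros((len(states), len(states)))
      for a, b in zip(gaze[:-1], gaze[1:]):
          counts[idx[a], idx[b]] += 1

      # row-normalise: P(next fixation location | current fixation location)
      transition = counts / counts.sum(axis=1, keepdims=True)
      print(states)
      print(transition.round(2))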

  13. Predicting the potential distribution of invasive exotic species using GIS and information-theoretic approaches: A case of ragweed (Ambrosia artemisiifolia L.) distribution in China

    USGS Publications Warehouse

    Hao, Chen; LiJun, Chen; Albright, Thomas P.

    2007-01-01

    Invasive exotic species pose a growing threat to the economy, public health, and ecological integrity of nations worldwide. Explaining and predicting the spatial distribution of invasive exotic species is of great importance to prevention and early-warning efforts. We are investigating the potential distribution of invasive exotic species, the environmental factors that influence these distributions, and the ability to predict them using statistical and information-theoretic approaches. For some species, detailed presence/absence occurrence data are available, allowing the use of a variety of standard statistical techniques. However, for most species, absence data are not available. Presented with the challenge of developing a model based on presence-only information, we developed an improved logistic regression approach using information theory and frequency statistics to produce a relative suitability map. We generated a variety of distributions of ragweed (Ambrosia artemisiifolia L.) from logistic regression models applied to herbarium specimen location data and a suite of GIS layers including climatic, topographic, and land cover information. Our logistic regression model was selected using Akaike's Information Criterion (AIC) from a suite of ecologically reasonable predictor variables. Based on the results, we provided a new frequency-statistical method to compartmentalize habitat suitability in the native range. Finally, we used the model and the compartmentalized criterion developed in the native range to project a potential distribution onto the exotic range and build habitat-suitability maps. © Science in China Press 2007.
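    As a hedged sketch of AIC-based selection among candidate logistic regression models (illustrative only; the predictors, data, and candidate set are hypothetical, not the GIS layers used in the study):

      # Hedged sketch: compare candidate presence/absence logistic regressions by AIC.
      # Predictors, data, and the candidate set are hypothetical.
      import numpy as np
      import statsmodels.api as sm

      rng = np.random.default_rng(1)
      n = 200
      temp, precip, elev = rng.normal(size=(3, n))
      # synthetic presence/absence generated from temperature and precipitation only
      presence = (rng.random(n) < 1 / (1 + np.exp(-(0.8 * temp - 0.5 * precip)))).astype(int)

      candidates = {
          "temp": [temp],
          "temp+precip": [temp, precip],
          "temp+precip+elev": [temp, precip, elev],
      }
      for name, cols in candidates.items():
          X = sm.add_constant(np.column_stack(cols))
          fit = sm.Logit(presence, X).fit(disp=0)
          print(name, "AIC =", round(fit.aic, 1))  # lower AIC indicates the preferred model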

  14. Standardized data collection to build prediction models in oncology: a prototype for rectal cancer.

    PubMed

    Meldolesi, Elisa; van Soest, Johan; Damiani, Andrea; Dekker, Andre; Alitto, Anna Rita; Campitelli, Maura; Dinapoli, Nicola; Gatta, Roberto; Gambacorta, Maria Antonietta; Lanzotti, Vito; Lambin, Philippe; Valentini, Vincenzo

    2016-01-01

    Advances in diagnostic and treatment technology are responsible for a remarkable transformation of the internal medicine concept, with the establishment of a new idea of personalized medicine. Inter- and intra-patient tumor heterogeneity and the complexity of clinical outcomes and/or treatment toxicity justify the effort to develop predictive models for decision support systems. However, the number of evaluated variables, coming from multiple disciplines (oncology, computer science, bioinformatics, statistics, genomics, and imaging, among others), can be very large, making traditional statistical analysis difficult to exploit. Automated data-mining processes and machine learning approaches can be a solution for organizing the massive amount of data and trying to unravel important interactions. The purpose of this paper is to describe the strategy to collect and analyze data properly for decision support and to introduce the concept of an 'umbrella protocol' within the framework of 'rapid learning healthcare'.

  15. Combination of complementary data mining methods for geographical characterization of extra virgin olive oils based on mineral composition.

    PubMed

    Sayago, Ana; González-Domínguez, Raúl; Beltrán, Rafael; Fernández-Recamales, Ángeles

    2018-09-30

    This work explores the potential of multi-element fingerprinting in combination with advanced data mining strategies to assess the geographical origin of extra virgin olive oil samples. For this purpose, the concentrations of 55 elements were determined in 125 oil samples from multiple Spanish geographic areas. Several unsupervised and supervised multivariate statistical techniques were used to build classification models and investigate the relationship between mineral composition of olive oils and their provenance. Results showed that Spanish extra virgin olive oils exhibit characteristic element profiles, which can be differentiated on the basis of their origin in accordance with three geographical areas: Atlantic coast (Huelva province), Mediterranean coast and inland regions. Furthermore, statistical modelling yielded high sensitivity and specificity, principally when random forest and support vector machines were employed, thus demonstrating the utility of these techniques in food traceability and authenticity research. Copyright © 2018 Elsevier Ltd. All rights reserved.
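    A minimal sketch of this kind of supervised origin classification, assuming synthetic data in place of the measured element concentrations (the sample counts and class structure below are placeholders):

      # Illustrative sketch: classify oil samples into three geographical areas from
      # element concentrations; all data here are synthetic placeholders.
      import numpy as np
      from sklearn.ensemble import RandomForestClassifier
      from sklearn.model_selection import cross_val_score
      from sklearn.pipeline import make_pipeline
      from sklearn.preprocessing import StandardScaler
      from sklearn.svm import SVC

      rng = np.random.default_rng(2)
      X = rng.normal(size=(125, 55))    # 125 samples x 55 element concentrations
      y = rng.integers(0, 3, size=125)  # 0: Atlantic coast, 1: Mediterranean coast, 2: inland

      for name, clf in [("random forest", RandomForestClassifier(random_state=0)),
                        ("support vector machine", make_pipeline(StandardScaler(), SVC()))]:
          acc = cross_val_score(clf, X, y, cv=5).mean()
          print(f"{name}: cross-validated accuracy {acc:.2f}")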

  16. Rooftop Solar Photovoltaic Technical Potential in the United States. A Detailed Assessment

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gagnon, Pieter; Margolis, Robert; Melius, Jennifer

    2016-01-01

    How much energy could be generated if PV modules were installed on all of the suitable roof area in the nation? To answer this question, we first use GIS methods to process a lidar dataset and determine the amount of roof area that is suitable for PV deployment in 128 cities nationwide, containing 23% of U.S. buildings, and provide PV-generation results for a subset of those cities. We then extend the insights from that analysis to the entire continental United States. We develop two statistical models--one for small buildings and one for medium and large buildings--and populate them with geographic variables that correlate with rooftop's suitability for PV. We simulate the productivity of PV installed on the suitable roof area, and present the technical potential of PV on both small buildings and medium/large buildings for every state in the continental US. Within the 128 cities covered by lidar data, 83% of small buildings have a location suitable for a PV installation, but only 26% of the total rooftop area of small buildings is suitable for development. The sheer number of buildings in this class, however, gives small buildings the greatest technical potential. Small building rooftops could accommodate 731 GW of PV capacity and generate 926 TWh/year of PV energy, approximately 65% of rooftop PV's total technical potential. We conclude by summing the PV-generation results for all building sizes and therefore answering our original question, estimating that the total national technical potential of rooftop PV is 1,118 GW of installed capacity and 1,432 TWh of annual energy generation. This equates to 39% of total national electric-sector sales.

  17. Rooftop Solar Photovoltaic Technical Potential in the United States

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gagnon, Pieter; Margolis, Robert; Melius, Jennifer

    2016-01-01

    How much energy could we generate if PV modules were installed on all of the suitable roof area in the nation? To answer this question, we first use GIS methods to process a lidar dataset and determine the amount of roof area that is suitable for PV deployment in 128 cities nationwide, containing 23% of U.S. buildings, and provide PV-generation results for a subset of those cities. We then extend the insights from that analysis to the entire continental United States. We develop two statistical models -- one for small buildings and one for medium and large buildings -- and populate them with geographic variables that correlate with rooftop's suitability for PV. We simulate the productivity of PV installed on the suitable roof area, and present the technical potential of PV on both small buildings and medium/large buildings for every state in the continental US. Within the 128 cities covered by lidar data, 83% of small buildings have a location suitable for a PV installation, but only 26% of the total rooftop area of small buildings is suitable for development. The sheer number of buildings in this class, however, gives small buildings the greatest technical potential. Small building rooftops could accommodate 731 GW of PV capacity and generate 926 TWh/year of PV energy, approximately 65% of rooftop PV's total technical potential. We conclude by summing the PV-generation results for all building sizes and therefore answering our original question, estimating that the total national technical potential of rooftop PV is 1,118 GW of installed capacity and 1,432 TWh of annual energy generation. This equates to 39% of total national electric-sector sales.

  18. Disease clusters, exact distributions of maxima, and P-values.

    PubMed

    Grimson, R C

    1993-10-01

    This paper presents combinatorial (exact) methods that are useful in the analysis of disease cluster data obtained from small environments, such as buildings and neighbourhoods. Maxwell-Boltzmann and Fermi-Dirac occupancy models are compared in terms of appropriateness of representation of disease incidence patterns (space and/or time) in these environments. The methods are illustrated by a statistical analysis of the incidence pattern of bone fractures in a setting wherein fracture clustering was alleged to be occurring. One of the methodological results derived in this paper is the exact distribution of the maximum cell frequency in occupancy models.
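    A small illustration of one exact calculation of the kind described, assuming a Maxwell-Boltzmann occupancy model in which n cases fall independently and uniformly into k cells; the values of n and k are hypothetical, and brute-force enumeration is used for clarity rather than a closed-form expression:

      # Sketch: exact distribution of the maximum cell frequency when n cases fall
      # independently and uniformly into k cells (Maxwell-Boltzmann occupancy),
      # obtained by enumerating all occupancy vectors; n and k are hypothetical.
      from collections import Counter
      from math import factorial

      def max_cell_distribution(n, k):
          def compositions(total, cells):
              # all ordered ways to place `total` cases into `cells` cells
              if cells == 1:
                  yield (total,)
                  return
              for first in range(total + 1):
                  for rest in compositions(total - first, cells - 1):
                      yield (first,) + rest

          dist = Counter()
          for occ in compositions(n, k):
              weight = factorial(n)
              for c in occ:
                  weight //= factorial(c)
              dist[max(occ)] += weight / k ** n  # multinomial probability of this occupancy
          return dict(sorted(dist.items()))

      print(max_cell_distribution(6, 4))  # P(maximum cell count = m) for each possible m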

  19. Statistical Examination of the Resolution of a Block-Scale Urban Drainage Model

    NASA Astrophysics Data System (ADS)

    Goldstein, A.; Montalto, F. A.; Digiovanni, K. A.

    2009-12-01

    Stormwater drainage models are utilized by cities in order to plan retention systems, prevent combined sewer overflows, and design for development. These models aggregate subcatchments and ignore small pipelines, providing a coarse representation of a sewer network. This study evaluates the importance of resolution by comparing two models, developed at a neighborhood scale for predicting the total quantity and peak flow of runoff, against runoff observed at the site. The low and high resolution models were designed for a 2.6 ha block in the Bronx, NYC, in the EPA Stormwater Management Model (SWMM) using a single catchment and separate subcatchments based on surface cover, respectively. The surface covers represented included sidewalks, street, buildings, and backyards. Characteristics of the physical surfaces and the infrastructure in the high resolution model were determined from site visits, sewer pipe maps, aerial photographs, and GIS datasets provided by the NYC Department of City Planning. Since the low resolution model was depicted at a coarser scale, generalizations were made about the overall average characteristics of the catchment. Rainfall and runoff data were monitored over a four-month period during the summer rainy season. A total of 53 rainfall events were recorded, but only 29 storms produced significant amounts of runoff to be evaluated in the simulations. To determine which model was more accurate at predicting the observed runoff, three characteristics of each storm were compared: peak runoff, total runoff, and time to peak. Two statistical tests were used to determine the significance of the results: the percent difference for each storm and an overall chi-squared goodness-of-fit test for both the low and high resolution models. These tests evaluate whether there is a statistical difference depending on the resolution of the stormwater model. The scale of representation is being evaluated because it could have a profound impact on how low-impact development strategies are assessed. Rerouting flows to delay the time of entry into the combined sewer is the primary goal of stormwater source controls, which may be better differentiated in a high resolution as opposed to a low resolution model. The preliminary hypothesis is that the low resolution model oversimplifies the watershed by defining attributes uniformly across it. In the high resolution model, the physical flow paths can be depicted more accurately by connecting the various subcatchments; for example, the runoff from buildings can be routed directly to the backyard. The main drawback of the high resolution model is the risk of adding uncertainty due to the larger number of parameters.

  20. Development of a CFD Model Including Tree's Drag Parameterizations: Application to Pedestrian's Wind Comfort in an Urban Area

    NASA Astrophysics Data System (ADS)

    Kang, G.; Kim, J.

    2017-12-01

    This study investigated the effect of trees on wind comfort at pedestrian height in an urban area using a computational fluid dynamics (CFD) model. We implemented a tree-drag parameterization scheme in the CFD model and validated the simulated results against wind-tunnel measurement data as well as LES data via several statistical methods. The CFD model underestimated (overestimated) the concentrations on the leeward (windward) walls inside the street canyon in the presence of trees, because the CFD model cannot resolve the latticed cage and cannot reflect the concentration increases and decreases caused by the latticed cage in the simulations. However, the scalar pollutant dispersion simulated by the CFD model was, on the whole, quite similar in pattern and magnitude to that in the wind-tunnel measurement. The CFD model satisfied most of the statistical validation indices (root normalized mean square error, geometric mean variance, correlation coefficient, and FAC2) but failed to satisfy the fractional bias and geometric mean bias due to the underestimation on the leeward wall and overestimation on the windward wall, showing that its performance was comparable to that of the LES. We applied the CFD model to evaluate the effect of trees on pedestrian wind comfort in an urban area. To investigate sensory levels for human activities, wind-comfort criteria based on Beaufort wind-force scales (BWSs) were used. In the tree-free scenario, BWS 4 and 5 (unpleasant conditions for sitting long and sitting short, respectively) appeared in the narrow spaces between buildings, on the upwind side of buildings, and in unobstructed areas. In the tree scenario, BWSs decreased by 1-3 grades inside the campus of Pukyong National University located in the target area, which indicated that trees planted on the campus effectively improved pedestrian wind comfort.

  1. Application of statistical shape analysis for the estimation of bone and forensic age using the shapes of the 2nd, 3rd, and 4th cervical vertebrae in a young Japanese population.

    PubMed

    Rhee, Chang-Hoon; Shin, Sang Min; Choi, Yong-Seok; Yamaguchi, Tetsutaro; Maki, Koutaro; Kim, Yong-Il; Kim, Seong-Sik; Park, Soo-Byung; Son, Woo-Sung

    2015-12-01

    From computed tomographic images, the dentocentral synchondrosis can be identified in the second cervical vertebra. It demarcates the border between the odontoid process and the body of the 2nd cervical vertebra and serves as a good model for the prediction of bone and forensic age. Nevertheless, until now there has been no application of the 2nd cervical vertebra, delimited by the dentocentral synchondrosis, to such estimation. In this study, statistical shape analysis was used to build bone and forensic age estimation regression models. Following the principles of statistical shape analysis and principal components analysis, we used cone-beam computed tomography (CBCT) to evaluate a Japanese population (35 males and 45 females, from 5 to 19 years old). The narrowest prediction intervals among the multivariate regression models were 19.63 for bone age and 2.99 for forensic age. There was no significant difference between form space and shape space in the bone and forensic age estimation models. However, in the gender comparison, the bone and forensic age estimation models for males had higher explanatory power. This study provides an improved objective and quantitative method for bone and forensic age estimation based only on the shapes of the 2nd, 3rd, and 4th cervical vertebrae. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  2. Mathematical Capture of Human Data for Computer Model Building and Validation

    DTIC Science & Technology

    2014-04-03

    weapon. The Projectile, the VDE, and the IDE weapons had effects of financial loss for the targeted participant, while the MRAD yielded its own...for LE, Centroid, and TE for the baseline and the VDE weapon conditions, since p-values exceeded α. All other conditions rejected the null...hypothesis except the LE for the VDE weapon. The K-S statistics were correspondingly lower for the measures that failed to reject the null hypothesis. The CDF

  3. Building and using a statistical 3D motion atlas for analyzing myocardial contraction in MRI

    NASA Astrophysics Data System (ADS)

    Rougon, Nicolas F.; Petitjean, Caroline; Preteux, Francoise J.

    2004-05-01

    We address the issue of modeling and quantifying myocardial contraction from 4D MR sequences, and present an unsupervised approach for building and using a statistical 3D motion atlas for the normal heart. This approach relies on a state-of-the-art variational non-rigid registration (NRR) technique using generalized information measures, which allows for robust intra-subject motion estimation and inter-subject anatomical alignment. The atlas is built from a collection of jointly acquired tagged and cine MR exams in short- and long-axis views. Subject-specific non-parametric motion estimates are first obtained by incremental NRR of tagged images onto the end-diastolic (ED) frame. Individual motion data are then transformed into the coordinate system of a reference subject using subject-to-reference mappings derived by NRR of cine ED images. Finally, principal component analysis of the aligned motion data is performed for each cardiac phase, yielding a mean model and a set of eigenfields encoding kinematic variability. The latter define an organ-dedicated hierarchical motion basis which enables parametric motion measurement from arbitrary tagged MR exams. To this end, the atlas is transformed into subject coordinates by reference-to-subject NRR of ED cine frames. Atlas-based motion estimation is then achieved by parametric NRR of tagged images onto the ED frame, yielding a compact description of myocardial contraction during diastole.

  4. Artificial Neural Network Approach in Laboratory Test Reporting:  Learning Algorithms.

    PubMed

    Demirci, Ferhat; Akan, Pinar; Kume, Tuncay; Sisman, Ali Riza; Erbayraktar, Zubeyde; Sevinc, Suleyman

    2016-08-01

    In the field of laboratory medicine, minimizing errors and establishing standardization is only possible by predefined processes. The aim of this study was to build an experimental decision algorithm model open to improvement that would efficiently and rapidly evaluate the results of biochemical tests with critical values by evaluating multiple factors concurrently. The experimental model was built by Weka software (Weka, Waikato, New Zealand) based on the artificial neural network method. Data were received from Dokuz Eylül University Central Laboratory. "Training sets" were developed for our experimental model to teach the evaluation criteria. After training the system, "test sets" developed for different conditions were used to statistically assess the validity of the model. After developing the decision algorithm with three iterations of training, no result was verified that was refused by the laboratory specialist. The sensitivity of the model was 91% and specificity was 100%. The estimated κ score was 0.950. This is the first study based on an artificial neural network to build an experimental assessment and decision algorithm model. By integrating our trained algorithm model into a laboratory information system, it may be possible to reduce employees' workload without compromising patient safety. © American Society for Clinical Pathology, 2016. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  5. Statistical Analysis of Solar PV Power Frequency Spectrum for Optimal Employment of Building Loads

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Olama, Mohammed M; Sharma, Isha; Kuruganti, Teja

    In this paper, a statistical analysis of the frequency spectrum of solar photovoltaic (PV) power output is conducted. This analysis quantifies the frequency content that can be used for purposes such as developing optimal employment of building loads and distributed energy resources. One year of solar PV power output data was collected and analyzed using one-second resolution to find ideal bounds and levels for the different frequency components. The annual, seasonal, and monthly statistics of the PV frequency content are computed and illustrated in boxplot format. To examine the compatibility of building loads for PV consumption, a spectral analysis of building loads such as Heating, Ventilation and Air-Conditioning (HVAC) units and water heaters was performed. This defined the bandwidth over which these devices can operate. Results show that nearly all of the PV output (about 98%) is contained within frequencies lower than 1 mHz (equivalent to ~15 min), which is compatible for consumption with local building loads such as HVAC units and water heaters. Medium frequencies in the range of ~15 min to ~1 min are likely to be suitable for consumption by fan equipment of variable air volume HVAC systems that have time constants in the range of few seconds to few minutes. This study indicates that most of the PV generation can be consumed by building loads with the help of proper control strategies, thereby reducing impact on the grid and the size of storage systems.
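    A simplified sketch of the kind of spectral calculation described, assuming a synthetic one-second-resolution PV power series in place of the measured data:

      # Simplified sketch: share of PV output variability below 1 mHz (periods longer
      # than ~15 minutes) from a one-second-resolution power series; data are synthetic.
      import numpy as np

      rng = np.random.default_rng(3)
      t = np.arange(24 * 3600)                      # one day of seconds
      pv = np.clip(np.sin(np.pi * t / t.size), 0, None) + 0.05 * rng.normal(size=t.size)

      spectrum = np.abs(np.fft.rfft(pv - pv.mean())) ** 2   # power spectrum
      freqs = np.fft.rfftfreq(t.size, d=1.0)                # 1 Hz sampling

      low_share = spectrum[freqs < 1e-3].sum() / spectrum.sum()
      print(f"share of variability below 1 mHz: {low_share:.1%}")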

  6. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wurtz, R.; Kaplan, A.

    Pulse shape discrimination (PSD) is a variety of statistical classifier. Fully-realized statistical classifiers rely on a comprehensive set of tools for designing, building, and implementing them. PSD advances rely on improvements to the implemented algorithm, and such improvements can draw on conventional statistical classifier or machine learning methods. This paper provides the reader with a glossary of classifier-building elements and their functions in a fully-designed and operational classifier framework that can be used to discover opportunities for improving PSD classifier projects. This paper recommends reporting the PSD classifier's receiver operating characteristic (ROC) curve and its behavior at a gamma rejection rate (GRR) relevant for realistic applications.

  7. Frequency of urban building fires as related to daily weather conditions

    Treesearch

    Arthur R. Pirsko; Wallace L. Fons

    1956-01-01

    Daily weather elements of precipitation, wind, mean temperature, relative humidity, and dew-point temperature for selected urban areas (approximately 850,000 population) in the United States are statistically analyzed to determine their correlation with daily number of building fires. The frequency of urban building fires is found to be significantly correlated with...

  8. CAPACITY BUILDING PROCESS IN ENVIRONMENTAL AND HEALTH IMPACT ASSESSMENT FOR A THAI COMMUNITY.

    PubMed

    Chaithui, Suthat; Sithisarankul, Pornchai; Hengpraprom, Sarunya

    2017-03-01

    This research aimed to explore the development of the capacity-building process in environmental and health impact assessment, including consideration of subsequent capacity-building achievements. Data were gathered through questionnaires, participatory observations, in-depth interviews, focus group discussions, and capacity-building checklist forms. These data were analyzed using content analysis, descriptive statistics, and inferential statistics. Our study used the components of the final draft for capacity-building processes, consisting of ten steps that were formulated by synthesis from each respective process. Additionally, the evaluation of capacity-building levels was performed using 10-item evaluation criteria for nine communities. The results indicated that the communities performed well under these criteria. Finally, exploration of the factors influencing capacity building in environmental and health impact assessment indicated that the learning of community members through knowledge exchange via activities and study visits was the most influential factor in the capacity-building process. The final revised version of the capacity-building process in environmental and health impact assessment could serve as a basis for the consideration of interventions in similar areas, so as to increase capacity in environmental and health impact assessments.

  9. Energy Production Calculations with Field Flow Models and Windspeed Predictions with Statistical Methods

    NASA Astrophysics Data System (ADS)

    Rüstemoǧlu, Sevinç; Barutçu, Burak; Sibel Menteş, Ş.

    2010-05-01

    The continued use of fossil fuels as the primary energy source is the reason for CO2 emissions and for the weakened economies of countries affected by large fluctuations in the unit price of energy sources. In recent years, developments in the wind energy sector and the supporting renewable energy policies of many countries have encouraged existing and prospective wind farm owners to consider and invest in renewable sources. In this study, the annual production of the 1.8 kW and 30 kW turbines available to the Energy Institute of Istanbul Technical University is calculated with the WAsP and WindPro field flow models, and the wind characteristics of the area are analysed. The meteorological data used in the calculation cover the period between 2 March 2000 and 31 May 2004 and were taken from the meteorological mast in Istanbul Technical University's campus area. The measurement data were taken at 2 m and 10 m heights with hourly means. The topography, roughness classes, and shelter effects are defined in the models to make accurate extrapolation to the turbine sites. The region lies only about 3.5 km from the Istanbul Bosphorus, but as the WAsP and WindPro model results show, the Bosphorus effect is interrupted by new buildings and tall forestry. The shelter effect of these high buildings has a great influence on the wind flow and decreases the high wind energy potential produced by the Bosphorus effect. This study, which determines wind characteristics and expected annual production, is important for this project site and therefore gains importance before the construction of a wind energy system. Once the system is operating, however, developing energy management skills and forecasting wind speed and direction will become important. At this point, three statistical models, namely the Kalman filter, the AR model, and neural networks, are used to determine the success of each method for correct wind prediction. The statistical methods' predictions as time series are included, and the similarity rates are compared for each method. The algorithms, implemented in MATLAB, gave the similarity results of each model. According to the neural network results, which were found to be the most successful among the three statistical models, the wind speed similarity rate between the original measurements and the prediction set, which covers a one-year period between 2003 and 2004, is evaluated as 94.7%. For wind direction, the similarity rate is 81.61%. A high noise margin and the ability to learn the characteristics of the signal are important advantages of neural networks for producing wind speed and direction predictions compatible with measurements.

  10. IFLA General Conference, 1992. Division of Management and Technology: Audiovisual Media (RT); Section on Library Services to Multicultural Populations; Section on Library Buildings and Equipment; Section on Information Technology; Management of Library Associations (RT); Section on Statistics. Papers

    ERIC Educational Resources Information Center

    International Federation of Library Associations and Institutions, London (England).

    Eleven papers delivered at the annual meeting of the International Federation of Library Associations and Institutions for the Division of Management and Technology are presented. Some were presented at a roundtable on audiovisual media, and others are from sessions on library buildings and equipment, information management, and statistics in…

  11. Data Analysis & Statistical Methods for Command File Errors

    NASA Technical Reports Server (NTRS)

    Meshkat, Leila; Waggoner, Bruce; Bryant, Larry

    2014-01-01

    This paper describes current work on modeling for managing the risk of command file errors. It is focused on analyzing actual data from a JPL spaceflight mission to build models for evaluating and predicting error rates as a function of several key variables. We constructed a rich dataset by considering the number of errors and the number of files radiated, including the number of commands and blocks in each file, as well as subjective estimates of workload and operational novelty. We have assessed these data using different curve fitting and distribution fitting techniques, such as multiple regression analysis and maximum likelihood estimation, to see how much of the variability in the error rates can be explained by these variables. We have also used goodness-of-fit testing strategies and principal component analysis to further assess our data. Finally, we constructed a model of expected error rates based on what these statistics bore out as critical drivers of the error rate. This model allows project management to evaluate the error rate against a theoretically expected rate as well as anticipate future error rates.
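    An illustrative sketch of one of the techniques named above, multiple regression of error counts on candidate drivers; the predictor names, data, and coefficients are hypothetical and do not reproduce the JPL dataset or model:

      # Illustrative sketch: multiple regression of command-file error counts on
      # candidate drivers; predictor names and data are hypothetical.
      import numpy as np
      import statsmodels.api as sm

      rng = np.random.default_rng(4)
      n = 120
      files_radiated = rng.integers(1, 40, n)
      commands = rng.integers(10, 500, n)
      workload = rng.random(n)       # subjective workload estimate
      novelty = rng.random(n)        # subjective operational novelty estimate
      errors = rng.poisson(0.02 * files_radiated + 1.5 * workload)

      X = sm.add_constant(np.column_stack([files_radiated, commands, workload, novelty]))
      fit = sm.OLS(errors, X).fit()
      print(fit.params)     # fitted coefficients for each driver
      print(fit.rsquared)   # share of error-rate variability explained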

  12. Shilling Attacks Detection in Recommender Systems Based on Target Item Analysis

    PubMed Central

    Zhou, Wei; Wen, Junhao; Koh, Yun Sing; Xiong, Qingyu; Gao, Min; Dobbie, Gillian; Alam, Shafiq

    2015-01-01

    Recommender systems are highly vulnerable to shilling attacks, both by individuals and groups. Attackers who introduce biased ratings in order to affect recommendations have been shown to negatively affect collaborative filtering (CF) algorithms. Previous research focuses only on the differences between genuine profiles and attack profiles, ignoring the group characteristics in attack profiles. In this paper, we study the use of statistical metrics to detect rating patterns of attackers and group characteristics in attack profiles. A further issue is that most existing detection methods are model specific. Two metrics, Rating Deviation from Mean Agreement (RDMA) and Degree of Similarity with Top Neighbors (DegSim), are used for analyzing rating patterns between malicious profiles and genuine profiles in attack models. Building upon this, we also propose and evaluate a detection structure called RD-TIA for detecting shilling attacks in recommender systems using a statistical approach. In order to detect more complicated attack models, we propose a novel metric called DegSim' based on DegSim. The experimental results show that our detection model based on target item analysis is an effective approach for detecting shilling attacks. PMID:26222882

  13. Geographical origin discrimination of lentils (Lens culinaris Medik.) using 1H NMR fingerprinting and multivariate statistical analyses.

    PubMed

    Longobardi, Francesco; Innamorato, Valentina; Di Gioia, Annalisa; Ventrella, Andrea; Lippolis, Vincenzo; Logrieco, Antonio F; Catucci, Lucia; Agostiano, Angela

    2017-12-15

    Lentil samples coming from two different countries, i.e. Italy and Canada, were analysed using untargeted 1H NMR fingerprinting in combination with chemometrics in order to build models able to classify them according to their geographical origin. For this aim, Soft Independent Modelling of Class Analogy (SIMCA), k-Nearest Neighbor (k-NN), Principal Component Analysis followed by Linear Discriminant Analysis (PCA-LDA), and Partial Least Squares-Discriminant Analysis (PLS-DA) were applied to the NMR data and the results were compared. The best combination of average recognition (100%) and cross-validation prediction abilities (96.7%) was obtained for PCA-LDA. All the statistical models were validated both by using a test set and by carrying out a Monte Carlo cross-validation: the obtained performances were found to be satisfactory for all the models, with prediction abilities higher than 95%, demonstrating the suitability of the developed methods. Finally, the metabolites that contributed most to the lentil discrimination are indicated. Copyright © 2017 Elsevier Ltd. All rights reserved.
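    A rough sketch of the PCA-LDA step, assuming synthetic spectral variables in place of the 1H NMR fingerprints (sample counts, number of variables, and number of principal components are placeholders):

      # Rough sketch: PCA followed by LDA to classify samples by origin; the spectral
      # matrix, sample counts, and number of components are placeholders.
      import numpy as np
      from sklearn.decomposition import PCA
      from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
      from sklearn.model_selection import cross_val_score
      from sklearn.pipeline import make_pipeline

      rng = np.random.default_rng(5)
      X = rng.normal(size=(80, 200))                    # 80 samples x 200 spectral variables
      y = np.array(["Italy"] * 40 + ["Canada"] * 40)    # origin labels

      pca_lda = make_pipeline(PCA(n_components=10), LinearDiscriminantAnalysis())
      print("cross-validated prediction ability:", cross_val_score(pca_lda, X, y, cv=5).mean())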

  14. High-Density Signal Interface Electromagnetic Radiation Prediction for Electromagnetic Compatibility Evaluation.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Halligan, Matthew

    Radiated power calculation approaches are presented for practical scenarios with incomplete high-density interface characterization information and incomplete incident power information. The suggested approaches build upon a method that characterizes power losses through the definition of power loss constant matrices. Potential radiated power estimates include using total power loss information, partial radiated power loss information, worst case analysis, and statistical bounding analysis. A method is also proposed to calculate radiated power when incident power information is not fully known for non-periodic signals at the interface. Incident data signals are modeled from a two-state Markov chain from which bit state probabilities are derived. The total spectrum for windowed signals is postulated as the superposition of spectra from individual pulses in a data sequence. Statistical bounding methods are proposed as a basis for the radiated power calculation due to the complexity of the statistical calculation required to find a radiated power probability density function.

  15. MEASURE: An integrated data-analysis and model identification facility

    NASA Technical Reports Server (NTRS)

    Singh, Jaidip; Iyer, Ravi K.

    1990-01-01

    The first phase of the development of MEASURE, an integrated data-analysis and model-identification facility, is described. The facility takes system activity data as input and produces as output representative behavioral models of the system in near real time. In addition, a wide range of statistical characteristics of the measured system are also available. The usage of the system is illustrated on data collected via software instrumentation of a network of SUN workstations at the University of Illinois. Initially, statistical clustering is used to identify high-density regions of resource usage in a given environment. The identified regions form the states for building a state-transition model to evaluate system and program performance in real time. The model is then solved to obtain useful parameters such as the response-time distribution and the mean waiting time in each state. A graphical interface which displays the identified models and their characteristics (with real-time updates) was also developed. The results provide an understanding of the resource usage in the system under various workload conditions. This work is targeted for a testbed of UNIX workstations, with the initial phase ported to SUN workstations in the NASA Ames Research Center Advanced Automation Testbed.

  16. RANdom SAmple Consensus (RANSAC) algorithm for material-informatics: application to photovoltaic solar cells.

    PubMed

    Kaspi, Omer; Yosipof, Abraham; Senderowitz, Hanoch

    2017-06-06

    An important aspect of chemoinformatics and material-informatics is the usage of machine learning algorithms to build Quantitative Structure Activity Relationship (QSAR) models. The RANdom SAmple Consensus (RANSAC) algorithm is a predictive modeling tool widely used in the image processing field for cleaning datasets of noise. RANSAC could be used as a "one stop shop" algorithm for developing and validating QSAR models, performing outlier removal, descriptor selection, model development, and predictions for test set samples using an applicability domain. For "future" predictions (i.e., for samples not included in the original test set) RANSAC provides a statistical estimate of the probability of obtaining reliable predictions, i.e., predictions within a pre-defined number of standard deviations from the true values. In this work we describe the first application of RANSAC in material informatics, focusing on the analysis of solar cells. We demonstrate that for three datasets representing different metal oxide (MO) based solar cell libraries, RANSAC-derived models select descriptors previously shown to correlate with key photovoltaic properties and lead to good predictive statistics for these properties. These models were subsequently used to predict the properties of virtual solar cell libraries, highlighting interesting dependencies of PV properties on MO compositions.
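    A minimal sketch of RANSAC used as an outlier-robust regression step, in the spirit of the approach described (not the authors' full QSAR workflow); the data, threshold, and injected outliers are hypothetical:

      # Minimal sketch: RANSAC as an outlier-robust regression step; the data,
      # injected outliers, and residual threshold are hypothetical.
      import numpy as np
      from sklearn.linear_model import RANSACRegressor

      rng = np.random.default_rng(6)
      X = rng.uniform(0, 10, size=(100, 1))
      y = 2.0 * X.ravel() + rng.normal(scale=0.5, size=100)
      y[:10] += 20.0                      # inject a few gross outliers

      ransac = RANSACRegressor(residual_threshold=2.0).fit(X, y)   # linear model by default
      print("inliers kept:", ransac.inlier_mask_.sum(), "of", len(y))
      print("fitted slope:", ransac.estimator_.coef_[0])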

  17. Statistical Mechanics of US Supreme Court

    NASA Astrophysics Data System (ADS)

    Lee, Edward; Broedersz, Chase; Bialek, William; Biophysics Theory Group Team

    2014-03-01

    We build simple models for the distribution of voting patterns in a group, using the Supreme Court of the United States as an example. The least structured, or maximum entropy, model that is consistent with the observed pairwise correlations among justices' votes is equivalent to an Ising spin glass. While all correlations (perhaps surprisingly) are positive, the effective pairwise interactions in the spin glass model have both signs, recovering some of our intuition that justices on opposite sides of the ideological spectrum should have a negative influence on one another. Despite the competing interactions, a strong tendency toward unanimity emerges from the model, and this agrees quantitatively with the data. The model shows that voting patterns are organized in a relatively simple ``energy landscape,'' correctly predicts the extent to which each justice is correlated with the majority, and gives us a measure of the influence that justices exert on one another. These results suggest that simple models, grounded in statistical physics, can capture essential features of collective decision making quantitatively, even in a complex political context. Funded by National Science Foundation Grants PHY-0957573 and CCF-0939370, WM Keck Foundation, Lewis-Sigler Fellowship, Burroughs Wellcome Fund, and Winston Foundation.
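    As a very rough sketch of the idea (not the authors' inference procedure): with votes coded as +1/-1, a naive mean-field approximation to the maximum-entropy pairwise model estimates the couplings from the inverse covariance of the votes; the vote matrix below is synthetic:

      # Very rough sketch (naive mean-field, not the authors' inference): estimate
      # pairwise couplings of a maximum-entropy model from the covariance of votes
      # coded as +/-1; the synthetic vote matrix stands in for the real data.
      import numpy as np

      rng = np.random.default_rng(7)
      votes = np.sign(rng.normal(size=(500, 9)) + 0.6 * rng.normal(size=(500, 1)))

      C = np.cov(votes, rowvar=False)     # 9 x 9 covariance between justices
      J = -np.linalg.inv(C)               # mean-field estimate of pairwise couplings
      np.fill_diagonal(J, 0.0)
      print("couplings take both signs:", J.min() < 0 < J.max())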

  18. Building energy analysis tool

    DOEpatents

    Brackney, Larry; Parker, Andrew; Long, Nicholas; Metzger, Ian; Dean, Jesse; Lisell, Lars

    2016-04-12

    A building energy analysis system includes a building component library configured to store a plurality of building components, a modeling tool configured to access the building component library and create a building model of a building under analysis using building spatial data and using selected building components of the plurality of building components stored in the building component library, a building analysis engine configured to operate the building model and generate a baseline energy model of the building under analysis and further configured to apply one or more energy conservation measures to the baseline energy model in order to generate one or more corresponding optimized energy models, and a recommendation tool configured to assess the one or more optimized energy models against the baseline energy model and generate recommendations for substitute building components or modifications.

  19. 7 CFR Appendix A to Part 3600 - List of State Statistical Offices

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... 32803 Georgia, Stephens Federal Building, Suite 320, Athens, GA 30613 Hawaii, State Department of Agriculture Building, 1428 South King Street, Honolulu, HI 96814 Idaho, 2224 Old Penitentiary Road, Boise, ID...

  20. 7 CFR Appendix A to Part 3600 - List of State Statistical Offices

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... 32803 Georgia, Stephens Federal Building, Suite 320, Athens, GA 30613 Hawaii, State Department of Agriculture Building, 1428 South King Street, Honolulu, HI 96814 Idaho, 2224 Old Penitentiary Road, Boise, ID...

  1. 7 CFR Appendix A to Part 3600 - List of State Statistical Offices

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... 32803 Georgia, Stephens Federal Building, Suite 320, Athens, GA 30613 Hawaii, State Department of Agriculture Building, 1428 South King Street, Honolulu, HI 96814 Idaho, 2224 Old Penitentiary Road, Boise, ID...

  2. Estimating the impact of mineral aerosols on crop yields in food insecure regions using statistical crop models

    NASA Astrophysics Data System (ADS)

    Hoffman, A.; Forest, C. E.; Kemanian, A.

    2016-12-01

    A significant number of food-insecure nations exist in regions of the world where dust plays a large role in the climate system. While the impacts of common climate variables (e.g. temperature, precipitation, ozone, and carbon dioxide) on crop yields are relatively well understood, the impact of mineral aerosols on yields has not yet been thoroughly investigated. This research aims to develop the data and tools to advance our understanding of mineral aerosol impacts on crop yields. Suspended dust affects crop yields by altering the amount and type of radiation reaching the plant and by modifying local temperature and precipitation, while dust events (i.e. dust storms) affect crop yields by depleting the soil of nutrients or by defoliation via particle abrasion. The impact of dust on yields is modeled statistically because we are uncertain which impacts will dominate the response on the national and regional scales considered in this study. Multiple linear regression is used in a number of large-scale statistical crop modeling studies to estimate yield responses to various climate variables. In alignment with previous work, we develop linear crop models, but build upon this simple method of regression with machine-learning techniques (e.g. random forests) to identify important statistical predictors and isolate how dust affects yields on the scales of interest. To perform this analysis, we develop a crop-climate dataset for maize, soybean, groundnut, sorghum, rice, and wheat for West Africa, East Africa, South Africa, and the Sahel. Random forest regression models consistently model historic crop yields better than the linear models. In several instances, the random forest models accurately capture the temperature and precipitation threshold behavior in crops. Additionally, improving agricultural technology has caused a well-documented positive trend that dominates time series of global and regional yields. This trend is often removed before regression with traditional crop models, but likely at the cost of removing climate information. Our random forest models consistently discover the positive trend without removing any additional data. The application of random forests as a statistical crop model provides insight into understanding the impact of dust on yields in marginal food-producing regions.
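    An illustrative comparison in the spirit of the study (not its data or code): a linear model versus a random forest for a yield anomaly with a threshold-type climate and dust response; all variables and values are synthetic:

      # Illustrative sketch: linear model versus random forest for a yield anomaly
      # with a threshold-type response; all variables and values are synthetic.
      import numpy as np
      from sklearn.ensemble import RandomForestRegressor
      from sklearn.linear_model import LinearRegression
      from sklearn.model_selection import cross_val_score

      rng = np.random.default_rng(8)
      n = 300
      temp, precip, dust = rng.normal(size=(3, n))
      # sharp yield penalty above a temperature threshold, plus precipitation/dust effects
      yield_anom = np.where(temp < 1.0, 0.5 * precip - 0.3 * dust, -1.0) + 0.1 * rng.normal(size=n)
      X = np.column_stack([temp, precip, dust])

      for name, model in [("linear regression", LinearRegression()),
                          ("random forest", RandomForestRegressor(random_state=0))]:
          r2 = cross_val_score(model, X, yield_anom, cv=5).mean()
          print(f"{name}: cross-validated R^2 {r2:.2f}")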

  3. Efficient ensemble forecasting of marine ecology with clustered 1D models and statistical lateral exchange: application to the Red Sea

    NASA Astrophysics Data System (ADS)

    Dreano, Denis; Tsiaras, Kostas; Triantafyllou, George; Hoteit, Ibrahim

    2017-07-01

    Forecasting the state of large marine ecosystems is important for many economic and public health applications. However, advanced three-dimensional (3D) ecosystem models, such as the European Regional Seas Ecosystem Model (ERSEM), are computationally expensive, especially when implemented within an ensemble data assimilation system requiring several parallel integrations. As an alternative to 3D ecological forecasting systems, we propose to implement a set of regional one-dimensional (1D) water-column ecological models that run at a fraction of the computational cost. The 1D model domains are determined using a Gaussian mixture model (GMM)-based clustering method and satellite chlorophyll-a (Chl-a) data. Regionally averaged Chl-a data are assimilated into the 1D models using the singular evolutive interpolated Kalman (SEIK) filter. To laterally exchange information between subregions and improve the forecasting skill, we introduce a new correction step to the assimilation scheme, in which we assimilate a statistical forecast of future Chl-a observations based on information from neighbouring regions. We apply this approach to the Red Sea and show that the assimilative 1D ecological models can forecast surface Chl-a concentration with high accuracy. The statistical assimilation step further improves the forecasting skill by as much as 50%. This general approach of clustering large marine areas and running several interacting 1D ecological models is very flexible. It allows many combinations of clustering, filtering, and regression techniques to be used and can be applied to build efficient forecasting systems in other large marine ecosystems.
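    A minimal sketch of the clustering step, assuming hypothetical per-grid-point Chl-a features in place of the satellite data (the number of components and the feature definitions are placeholders):

      # Minimal sketch: group grid points into subregions with a Gaussian mixture
      # model of their Chl-a statistics; the features are hypothetical placeholders.
      import numpy as np
      from sklearn.mixture import GaussianMixture

      rng = np.random.default_rng(9)
      # one row per grid point, e.g. (mean, seasonal amplitude) of log Chl-a
      features = np.vstack([rng.normal([0.0, 1.0], 0.3, size=(200, 2)),
                            rng.normal([2.0, 0.0], 0.3, size=(200, 2))])

      gmm = GaussianMixture(n_components=2, random_state=0).fit(features)
      regions = gmm.predict(features)     # subregion label for each grid point
      print(np.bincount(regions))         # grid points assigned to each 1D model domain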

  4. Relationships between parents’ academic backgrounds and incomes and building students’ healthy eating habits

    PubMed Central

    Hoque, Kazi Fardinul; A/P Thanabalan, Revethy

    2018-01-01

    Background Building healthy eating habit is essential for all people. School and family are the prime institutions to instill this habit during early age. This study is aimed at understanding the impact of family such as parents’ educations and incomes on building students’ healthy eating habits. Methods A survey on building students’ eating habits was conducted among primary school students of grade 4 (11 years) and 5 (12 years) from Kulim district, Malaysia. Data from 318 respondents were analysed. Descriptive statistics were used to find the present scenario of their knowledge, attitude and practices towards their eating habits while one-way ANOVA and independent sample t-test were used to find the differences between their practices based on students’ gender, parents’ educations and incomes. Results The study finds that the students have a good knowledge of types of healthy food but yet their preferences are towards the unhealthy food. Though the students’ gender and parents’ educations are not found significantly related to students’ knowledge, attitude and practices towards healthy eating habits, parents’ incomes have significant influence on promoting the healthy eating habit. Discussion Findings of this study can be useful to guide parents in healthy food choices and suggest them to be models to their children in building healthy eating habits. PMID:29736328

  5. Relationships between parents' academic backgrounds and incomes and building students' healthy eating habits.

    PubMed

    Hoque, Kazi Enamul; Hoque, Kazi Fardinul; A/P Thanabalan, Revethy

    2018-01-01

    Building healthy eating habit is essential for all people. School and family are the prime institutions to instill this habit during early age. This study is aimed at understanding the impact of family such as parents' educations and incomes on building students' healthy eating habits. A survey on building students' eating habits was conducted among primary school students of grade 4 (11 years) and 5 (12 years) from Kulim district, Malaysia. Data from 318 respondents were analysed. Descriptive statistics were used to find the present scenario of their knowledge, attitude and practices towards their eating habits while one-way ANOVA and independent sample t -test were used to find the differences between their practices based on students' gender, parents' educations and incomes. The study finds that the students have a good knowledge of types of healthy food but yet their preferences are towards the unhealthy food. Though the students' gender and parents' educations are not found significantly related to students' knowledge, attitude and practices towards healthy eating habits, parents' incomes have significant influence on promoting the healthy eating habit. Findings of this study can be useful to guide parents in healthy food choices and suggest them to be models to their children in building healthy eating habits.

  6. Morphometrical study on senile larynx.

    PubMed

    Zieliński, R

    2001-01-01

    The aim of the study was a morphometrical macroscopic evaluation of senile larynges with regard to its usefulness for ORL diagnostic and operational methods. Larynx preparations were taken from cadavers of both sexes, aged 65 and over, about 24 hours after death. Clinically important laryngeal diameters were collected using common morphometrical methods. A few body features were also gathered. Computer statistical methods were used in the data assessment, including basic statistics and linear correlations between diameters and between diameters and body features. The data presented in the study may be very helpful in the evaluation of diagnostic methods. They may also help in the selection of the right operational tool sizes, the choice of the most appropriate operational technique, preoperative preparations, and the design and building of virtual and plastic models for physicians' training.

  7. Implementation of building information modeling in Malaysian construction industry

    NASA Astrophysics Data System (ADS)

    Memon, Aftab Hameed; Rahman, Ismail Abdul; Harman, Nur Melly Edora

    2014-10-01

    This study assessed the implementation level of Building Information Modeling (BIM) in the construction industry of Malaysia. It also investigated several computer software packages facilitating BIM and the challenges affecting its implementation. Data collection for this study was carried out through a questionnaire survey among construction practitioners. Of 150 questionnaire sets distributed to consultant, contractor, and client organizations, 95 completed forms were received and analyzed statistically. The analysis findings indicated that the level of implementation of BIM in the construction industry of Malaysia is very low. The average index method employed to assess the effectiveness of various BIM software packages highlighted that Bentley construction, AutoCAD, and ArchiCAD are the three most popular and effective software packages. Major challenges to BIM implementation are that it requires enhanced collaboration, adds work for designers, and raises interoperability issues. To improve the level of BIM implementation in the Malaysian industry, it is recommended that a flexible BIM training program for all practitioners be created.

  8. An Estimation of Construction and Demolition Debris in Seoul, Korea: Waste Amount, Type, and Estimating Model.

    PubMed

    Seo, Seongwon; Hwang, Yongwoo

    1999-08-01

    Construction and demolition (C&D) debris is generated at the site of various construction activities. However, the amount of the debris is usually so large that it is necessary to estimate it as accurately as possible for effective waste management and control in urban areas. In this paper, an effective estimation method using a statistical model was proposed. The estimation process comprised the following steps: estimation of the life span of buildings; estimation of the floor area of buildings to be constructed and demolished; calculation of individual intensity units of C&D debris; and estimation of future C&D debris production. This method was also applied to the city of Seoul as an actual case, and the estimated amount of C&D debris in Seoul in 2021 was approximately 24 million tons. Of this total amount, 98% was generated by demolition, and the main components of the debris were concrete and brick.

  9. Merging Digital Surface Models Implementing Bayesian Approaches

    NASA Astrophysics Data System (ADS)

    Sadeq, H.; Drummond, J.; Li, Z.

    2016-06-01

    In this research, DSMs from different sources have been merged. The merging is based on a probabilistic model using a Bayesian approach. The implemented data have been sourced from very high resolution satellite imagery sensors (e.g. WorldView-1 and Pleiades). A Bayesian approach is deemed preferable when the data obtained from the sensors are limited and it is difficult or very costly to obtain many measurements; the problem of the lack of data can then be alleviated by introducing a priori estimates. To infer the prior data, the roofs of the buildings are assumed to be smooth, and for that purpose local entropy has been implemented. In addition to the a priori estimates, GNSS RTK measurements have been collected in the field and are used as check points to assess the quality of the DSMs and to validate the merging result. The model has been applied in the West End of Glasgow, which contains different kinds of buildings, such as flat-roofed and hipped-roofed buildings. Both quantitative and qualitative methods have been employed to validate the merged DSM. The validation results have shown that the model was able to improve the quality of the DSMs and some of their characteristics, such as the roof surfaces, which consequently led to better representations. In addition, the developed model has been compared with the well-established maximum likelihood model and showed similar quantitative statistical results and better qualitative results. Although the proposed model has been applied to DSMs derived from satellite imagery, it can be applied to DSMs from any other source.
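    A simplified sketch of the underlying idea (not the paper's full model): under Gaussian assumptions, merging two DSM height estimates with a prior reduces to a precision-weighted average; the heights and standard deviations below are hypothetical:

      # Simplified sketch: precision-weighted merging of two DSM height estimates
      # with a smooth-roof prior; heights and standard deviations are hypothetical.
      import numpy as np

      dsm_a, sigma_a = np.array([102.3, 98.7]), 0.8   # heights (m) from sensor A
      dsm_b, sigma_b = np.array([101.8, 99.5]), 0.5   # heights (m) from sensor B
      prior, sigma_p = np.array([102.0, 99.0]), 2.0   # prior heights from the smoothness assumption

      w_a, w_b, w_p = 1 / sigma_a**2, 1 / sigma_b**2, 1 / sigma_p**2
      merged = (w_a * dsm_a + w_b * dsm_b + w_p * prior) / (w_a + w_b + w_p)
      print("merged heights:", merged.round(2))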

  10. CHIMERA: Top-down model for hierarchical, overlapping and directed cluster structures in directed and weighted complex networks

    NASA Astrophysics Data System (ADS)

    Franke, R.

    2016-11-01

    In many networks discovered in biology, medicine, neuroscience, and other disciplines, special properties such as a certain degree distribution and a hierarchical cluster structure (also called communities) can be observed as general organizing principles. Detecting the cluster structure of an unknown network promises to identify functional subdivisions, hierarchy, and interactions on a mesoscale. Choosing an appropriate detection algorithm is not trivial because there are multiple network, cluster, and algorithmic properties to be considered. Edges can be weighted and/or directed, and clusters can overlap or build a hierarchy in several ways. Algorithms differ not only in runtime and memory requirements but also in the network and cluster properties they allow. They are also based on specific definitions of what a cluster is. On the one hand, a comprehensive network creation model is needed to build a large variety of benchmark networks with different reasonable structures to compare algorithms. On the other hand, if a cluster structure is already known, it is desirable to separate the effects of this structure from other network properties. This can be done with null model networks that mimic an observed cluster structure to improve statistics on other network features. A third important application is the general study of properties in networks with different cluster structures, possibly evolving over time. Currently there are good benchmark and creation models available. What is left is a precise sandbox model to build hierarchical, overlapping, and directed clusters for undirected or directed, binary or weighted complex random networks on the basis of a sophisticated blueprint. This gap shall be closed by the model CHIMERA (Cluster Hierarchy Interconnection Model for Evaluation, Research and Analysis), which is introduced and described here for the first time.

  11. The Incoming Statistical Knowledge of Undergraduate Majors in a Department of Mathematics and Statistics

    ERIC Educational Resources Information Center

    Cook, Samuel A.; Fukawa-Connelly, Timothy

    2016-01-01

    Studies have shown that at the end of an introductory statistics course, students struggle with building block concepts, such as mean and standard deviation, and rely on procedural understandings of the concepts. This study aims to investigate the understandings of entering freshmen in a department of mathematics and statistics (including mathematics…

  12. Probabilistic inversion with graph cuts: Application to the Boise Hydrogeophysical Research Site

    NASA Astrophysics Data System (ADS)

    Pirot, Guillaume; Linde, Niklas; Mariethoz, Grégoire; Bradford, John H.

    2017-02-01

    Inversion methods that build on multiple-point statistics tools offer the possibility to obtain model realizations that are not only in agreement with field data, but also with conceptual geological models that are represented by training images. A recent inversion approach based on patch-based geostatistical resimulation using graph cuts outperforms state-of-the-art multiple-point statistics methods when applied to synthetic inversion examples featuring continuous and discontinuous property fields. Applications of multiple-point statistics tools to field data are challenging due to inevitable discrepancies between actual subsurface structure and the assumptions made in deriving the training image. We introduce several amendments to the original graph cut inversion algorithm and present a first-ever field application by addressing porosity estimation at the Boise Hydrogeophysical Research Site, Boise, Idaho. We consider both a classical multi-Gaussian and an outcrop-based prior model (training image) that are in agreement with available porosity data. When conditioning to available crosshole ground-penetrating radar data using Markov chain Monte Carlo, we find that the posterior realizations honor overall both the characteristics of the prior models and the geophysical data. The porosity field is inverted jointly with the measurement error and the petrophysical parameters that link dielectric permittivity to porosity. Even though the multi-Gaussian prior model leads to posterior realizations with higher likelihoods, the outcrop-based prior model shows better convergence. In addition, it offers geologically more realistic posterior realizations and it better preserves the full porosity range of the prior.
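
    The acceptance rule is not reproduced in the abstract; a generic extended-Metropolis step, in which the proposal resimulates part of the model from the prior (the role played by the patch-based graph-cut resimulation in the paper) so that the acceptance ratio reduces to a likelihood ratio, might look roughly as follows. The toy likelihood and resimulation function are assumptions for illustration only.

    import numpy as np

    def metropolis_resimulation(m0, log_likelihood, resimulate_patch, n_iter=1000, seed=0):
        """Extended Metropolis sampling where the proposal resimulates part of the
        model from the prior, so the acceptance ratio is a likelihood ratio."""
        rng = np.random.default_rng(seed)
        m, ll = m0, log_likelihood(m0)
        trace = []
        for _ in range(n_iter):
            m_prop = resimulate_patch(m, rng)          # prior-consistent proposal
            ll_prop = log_likelihood(m_prop)
            if np.log(rng.random()) < ll_prop - ll:    # accept with likelihood ratio
                m, ll = m_prop, ll_prop
            trace.append(ll)
        return m, np.array(trace)

    # Toy usage: recover a constant field observed with noise.
    rng0 = np.random.default_rng(1)
    data = 3.0 + 0.1 * rng0.normal(size=50)
    loglik = lambda m: -0.5 * np.sum((data - m.mean()) ** 2) / 0.01

    def resim(m, rng):
        m_new = m.copy()
        idx = rng.integers(0, m.size, size=5)   # "patch" resimulated from the prior
        m_new[idx] = rng.normal(3.0, 1.0, size=5)
        return m_new

    m_final, ll_trace = metropolis_resimulation(np.zeros(20), loglik, resim, n_iter=500)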

  13. University Leadership in Energy and Environmental Design: How Postsecondary Institutions Use the LEED[R] Green Building Rating System

    ERIC Educational Resources Information Center

    Chance, Shannon Massie

    2010-01-01

    This descriptive, exploratory study focused on how institutions of higher education have used the United States Green Building Council's (USGBC) Leadership in Energy and Environmental Design (LEED[R]) Green Building Rating system. It employed statistical methods to assess which types of universities have used LEED, what ratings they earned, and…

  14. Computational Process Modeling for Additive Manufacturing (OSU)

    NASA Technical Reports Server (NTRS)

    Bagg, Stacey; Zhang, Wei

    2015-01-01

    Powder-Bed Additive Manufacturing (AM) through Direct Metal Laser Sintering (DMLS) or Selective Laser Melting (SLM) is being used by NASA and the aerospace industry to "print" parts that traditionally are very complex, high cost, or long schedule lead items. The process spreads a thin layer of metal powder over a build platform, then melts the powder in a series of welds in a desired shape. The next layer of powder is applied, and the process is repeated until, layer by layer, a very complex part is built. This reduces cost and schedule by eliminating very complex tooling and processes traditionally used in aerospace component manufacturing. To use the process to print end-use items, NASA seeks to understand SLM material well enough to develop a method of qualifying parts for space flight operation. Traditionally, a new material process takes many years and high investment to generate statistical databases and experiential knowledge, but computational modeling can truncate the schedule and cost: many experiments can be run quickly in a model that would take years and high material cost to run empirically. This project seeks to optimize material build parameters with reduced time and cost through modeling.

  15. Learning Probabilistic Inference through Spike-Timing-Dependent Plasticity.

    PubMed

    Pecevski, Dejan; Maass, Wolfgang

    2016-01-01

    Numerous experimental data show that the brain is able to extract information from complex, uncertain, and often ambiguous experiences. Furthermore, it can use such learnt information for decision making through probabilistic inference. Several models have been proposed that aim at explaining how probabilistic inference could be performed by networks of neurons in the brain. We propose here a model that can also explain how such a neural network could acquire the necessary information for that from examples. We show that spike-timing-dependent plasticity in combination with intrinsic plasticity generates in ensembles of pyramidal cells with lateral inhibition a fundamental building block for that: probabilistic associations between neurons that represent through their firing current values of random variables. Furthermore, by combining such adaptive network motifs in a recursive manner the resulting network is enabled to extract statistical information from complex input streams, and to build an internal model for the distribution p* that generates the examples it receives. This holds even if p* contains higher-order moments. The analysis of this learning process is supported by a rigorous theoretical foundation. Furthermore, we show that the network can use the learnt internal model immediately for prediction, decision making, and other types of probabilistic inference.

  16. Learning Probabilistic Inference through Spike-Timing-Dependent Plasticity123

    PubMed Central

    Pecevski, Dejan

    2016-01-01

    Numerous experimental data show that the brain is able to extract information from complex, uncertain, and often ambiguous experiences. Furthermore, it can use such learnt information for decision making through probabilistic inference. Several models have been proposed that aim at explaining how probabilistic inference could be performed by networks of neurons in the brain. We propose here a model that can also explain how such a neural network could acquire the necessary information for that from examples. We show that spike-timing-dependent plasticity in combination with intrinsic plasticity generates in ensembles of pyramidal cells with lateral inhibition a fundamental building block for that: probabilistic associations between neurons that represent through their firing current values of random variables. Furthermore, by combining such adaptive network motifs in a recursive manner the resulting network is enabled to extract statistical information from complex input streams, and to build an internal model for the distribution p* that generates the examples it receives. This holds even if p* contains higher-order moments. The analysis of this learning process is supported by a rigorous theoretical foundation. Furthermore, we show that the network can use the learnt internal model immediately for prediction, decision making, and other types of probabilistic inference. PMID:27419214

  17. Regional climate model downscaling may improve the prediction of alien plant species distributions

    NASA Astrophysics Data System (ADS)

    Liu, Shuyan; Liang, Xin-Zhong; Gao, Wei; Stohlgren, Thomas J.

    2014-12-01

    Distributions of invasive species are commonly predicted with species distribution models that build upon the statistical relationships between observed species presence data and climate data. We used field observations, climate station data, and Maximum Entropy species distribution models for 13 invasive plant species in the United States, and then compared the models with inputs from a General Circulation Model (hereafter GCM-based models) and a downscaled Regional Climate Model (hereafter, RCM-based models). We also compared species distributions based on either GCM-based or RCM-based models for the present (1990-1999) to the future (2046-2055). RCM-based species distribution models replicated observed distributions remarkably better than GCM-based models for all invasive species under the current climate. This was shown for the presence locations of the species, and by using four common statistical metrics to compare modeled distributions. For two widespread invasive taxa (Bromus tectorum or cheatgrass, and Tamarix spp. or tamarisk), GCM-based models failed miserably to reproduce observed species distributions. In contrast, RCM-based species distribution models closely matched observations. Future species distributions may be significantly affected by using GCM-based inputs. Because invasive plant species often show high resilience and low rates of local extinction, RCM-based species distribution models may perform better than GCM-based species distribution models for planning containment programs for invasive species.
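
    The four comparison metrics are not named in the abstract; two common choices for scoring modeled species distributions against observed presences, the area under the ROC curve and Cohen's kappa, can be computed as in this assumed example (the suitability arrays are synthetic stand-ins for GCM- and RCM-driven predictions).

    import numpy as np
    from sklearn.metrics import roc_auc_score, cohen_kappa_score

    # Hypothetical arrays: observed presence/absence and modeled habitat suitability
    # from GCM-driven and RCM-driven species distribution models.
    rng = np.random.default_rng(1)
    observed = rng.integers(0, 2, 500)
    suit_gcm = np.clip(observed * 0.4 + rng.random(500) * 0.6, 0, 1)
    suit_rcm = np.clip(observed * 0.7 + rng.random(500) * 0.3, 0, 1)

    for name, suit in [("GCM-based", suit_gcm), ("RCM-based", suit_rcm)]:
        auc = roc_auc_score(observed, suit)
        kappa = cohen_kappa_score(observed, (suit > 0.5).astype(int))
        print(f"{name}: AUC = {auc:.2f}, kappa = {kappa:.2f}")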

  18. Statistical Analysis of Japanese Structural Damage Data

    DTIC Science & Technology

    1977-01-01

    buildings and no ready correlation between I-beam and lattice work columns could be established. The complete listing of the buildings contained in the final...subclassification efforts in this structure class. Of the 90 buildings in the data base, two have such light lattice work steel columns that they would...more properly be classified as Very Light Steel Frame Buildings; six have concrete panel walls; two have lattice steel columns that are filled with

  19. Understanding Summary Statistics and Graphical Techniques to Compare Michael Jordan versus LeBron James

    ERIC Educational Resources Information Center

    Williams, Immanuel James; Williams, Kelley Kim

    2016-01-01

    Understanding summary statistics and graphical techniques is a building block for comprehending concepts beyond basic statistics. It is known that motivated students perform better in school. Using examples that students find engaging allows them to understand the concepts at a deeper level.

  20. Protein structure modeling for CASP10 by multiple layers of global optimization.

    PubMed

    Joo, Keehyoung; Lee, Juyong; Sim, Sangjin; Lee, Sun Young; Lee, Kiho; Heo, Seungryong; Lee, In-Ho; Lee, Sung Jong; Lee, Jooyoung

    2014-02-01

    In the template-based modeling (TBM) category of CASP10 experiment, we introduced a new protocol called protein modeling system (PMS) to generate accurate protein structures in terms of side-chains as well as backbone trace. In the new protocol, a global optimization algorithm, called conformational space annealing (CSA), is applied to the three layers of TBM procedure: multiple sequence-structure alignment, 3D chain building, and side-chain re-modeling. For 3D chain building, we developed a new energy function which includes new distance restraint terms of Lorentzian type (derived from multiple templates), and new energy terms that combine (physical) energy terms such as dynamic fragment assembly (DFA) energy, DFIRE statistical potential energy, hydrogen bonding term, etc. These physical energy terms are expected to guide the structure modeling especially for loop regions where no template structures are available. In addition, we developed a new quality assessment method based on random forest machine learning algorithm to screen templates, multiple alignments, and final models. For TBM targets of CASP10, we find that, due to the combination of three stages of CSA global optimizations and quality assessment, the modeling accuracy of PMS improves at each additional stage of the protocol. It is especially noteworthy that the side-chains of the final PMS models are far more accurate than the models in the intermediate steps. Copyright © 2013 Wiley Periodicals, Inc.
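
    The exact functional form of the PMS restraint terms is not given in the abstract; a generic Lorentzian-type distance restraint, a bounded well centred on a template-derived distance so that restraints from poor templates contribute only weakly, can be sketched as below. The width and weight parameters are assumptions.

    import numpy as np

    def lorentzian_restraint_energy(d, d0, width=1.0, weight=1.0):
        """Generic Lorentzian-type distance restraint (assumed form, not the exact
        PMS term): a bounded well centred on the template-derived distance d0, so
        outliers from incorrect templates are penalized only weakly."""
        return -weight / (1.0 + ((d - d0) / width) ** 2)

    def total_restraint_energy(distances, targets, width=1.0, weight=1.0):
        """Sum the restraint energy over all template-derived atom pairs."""
        d = np.asarray(distances, dtype=float)
        d0 = np.asarray(targets, dtype=float)
        return lorentzian_restraint_energy(d, d0, width, weight).sum()

    # Example: three pairwise distances restrained toward template-derived values.
    print(total_restraint_energy([5.1, 7.9, 12.4], [5.0, 8.3, 9.0]))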

  1. Game Theoretic, Multi-agent Approach to Network Traffic Monitoring

    DTIC Science & Technology

    2012-01-16

    cases. That is why the symmetrized form of the Kullback-Leibler divergence is often used: D_skl = D_KL(P‖Q) + D_KL(Q‖P) (3.9). We use a similar metric. If both...as a sequence of single-stage games with no transfer of information between the stages. This model is used as a formalism for the regret minimization...content of the transmitted information, but use the statistics (Fig. 1.1) in the NetFlow/IPFIX format [15, 14] to build, maintain and combine behav

  2. QSAR modeling for anti-human African trypanosomiasis activity of substituted 2-Phenylimidazopyridines

    NASA Astrophysics Data System (ADS)

    Masand, Vijay H.; El-Sayed, Nahed N. E.; Mahajan, Devidas T.; Mercader, Andrew G.; Alafeefy, Ahmed M.; Shibi, I. G.

    2017-02-01

    In the present work, sixty substituted 2-Phenylimidazopyridines previously reported with potent anti-human African trypanosomiasis (HAT) activity were selected to build genetic algorithm (GA) based QSAR models to determine the structural features that have significant correlation with the activity. Multiple QSAR models were built using easily interpretable descriptors that are directly associated with the presence or the absence of a structural scaffold, or a specific atom. All the QSAR models have been thoroughly validated according to the OECD principles. All the QSAR models are statistically very robust (R2 = 0.80-0.87) with high external predictive ability (CCCex = 0.81-0.92). The QSAR analysis reveals that the HAT activity has good correlation with the presence of five membered rings in the molecule.
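
    The external validation statistic CCCext refers to Lin's concordance correlation coefficient; a standard implementation (not code from the study) is shown below, applied to hypothetical observed versus predicted activity values.

    import numpy as np

    def concordance_correlation(y_obs, y_pred):
        """Lin's concordance correlation coefficient (CCC), often reported as the
        external validation statistic CCCext in QSAR studies."""
        y_obs = np.asarray(y_obs, dtype=float)
        y_pred = np.asarray(y_pred, dtype=float)
        mo, mp = y_obs.mean(), y_pred.mean()
        vo, vp = y_obs.var(), y_pred.var()          # population variances
        cov = ((y_obs - mo) * (y_pred - mp)).mean()
        return 2 * cov / (vo + vp + (mo - mp) ** 2)

    # Hypothetical observed vs predicted activity values for an external test set.
    print(concordance_correlation([5.2, 6.1, 7.0, 6.4], [5.4, 5.9, 6.8, 6.6]))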

  3. GAMBIT: the global and modular beyond-the-standard-model inference tool

    NASA Astrophysics Data System (ADS)

    Athron, Peter; Balazs, Csaba; Bringmann, Torsten; Buckley, Andy; Chrząszcz, Marcin; Conrad, Jan; Cornell, Jonathan M.; Dal, Lars A.; Dickinson, Hugh; Edsjö, Joakim; Farmer, Ben; Gonzalo, Tomás E.; Jackson, Paul; Krislock, Abram; Kvellestad, Anders; Lundberg, Johan; McKay, James; Mahmoudi, Farvah; Martinez, Gregory D.; Putze, Antje; Raklev, Are; Ripken, Joachim; Rogan, Christopher; Saavedra, Aldo; Savage, Christopher; Scott, Pat; Seo, Seon-Hee; Serra, Nicola; Weniger, Christoph; White, Martin; Wild, Sebastian

    2017-11-01

    We describe the open-source global fitting package GAMBIT: the Global And Modular Beyond-the-Standard-Model Inference Tool. GAMBIT combines extensive calculations of observables and likelihoods in particle and astroparticle physics with a hierarchical model database, advanced tools for automatically building analyses of essentially any model, a flexible and powerful system for interfacing to external codes, a suite of different statistical methods and parameter scanning algorithms, and a host of other utilities designed to make scans faster, safer and more easily-extendible than in the past. Here we give a detailed description of the framework, its design and motivation, and the current models and other specific components presently implemented in GAMBIT. Accompanying papers deal with individual modules and present first GAMBIT results. GAMBIT can be downloaded from gambit.hepforge.org.

  4. Application of crowd-sourced data to multi-scale evolutionary exposure and vulnerability models

    NASA Astrophysics Data System (ADS)

    Pittore, Massimiliano

    2016-04-01

    Seismic exposure, defined as the assets (population, buildings, infrastructure) exposed to earthquake hazard and susceptible to damage, is a critical, but often neglected, component of seismic risk assessment. This partly stems from the burden associated with compiling a useful and reliable model over wide spatial areas. While detailed engineering data still have to be collected in order to constrain exposure and vulnerability models, the availability of increasingly large crowd-sourced datasets (e.g. OpenStreetMap) opens up the exciting possibility of generating incrementally evolving models. Integrating crowd-sourced and authoritative data using statistical learning methodologies can reduce model uncertainties and also provide additional drive and motivation for volunteered geoinformation collection. A case study in Central Asia is presented and discussed.

  5. Analysis of a Rocket Based Combined Cycle Engine during Rocket Only Operation

    NASA Technical Reports Server (NTRS)

    Smith, T. D.; Steffen, C. J., Jr.; Yungster, S.; Keller, D. J.

    1998-01-01

    The all rocket mode of operation is a critical factor in the overall performance of a rocket based combined cycle (RBCC) vehicle. However, outside of performing experiments or a full three dimensional analysis, there are no first order parametric models to estimate performance. As a result, an axisymmetric RBCC engine was used to analytically determine specific impulse efficiency values based upon both full flow and gas generator configurations. Design of experiments methodology was used to construct a test matrix and statistical regression analysis was used to build parametric models. The main parameters investigated in this study were: rocket chamber pressure, rocket exit area ratio, percent of injected secondary flow, mixer-ejector inlet area, mixer-ejector area ratio, and mixer-ejector length-to-inject diameter ratio. A perfect gas computational fluid dynamics analysis was performed to obtain values of vacuum specific impulse. Statistical regression analysis was performed based on both full flow and gas generator engine cycles. Results were also found to be dependent upon the entire cycle assumptions. The statistical regression analysis determined that there were five significant linear effects, six interactions, and one second-order effect. Two parametric models were created to provide performance assessments of an RBCC engine in the all rocket mode of operation.
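
    The regression details are not reproduced in the abstract; fitting a parametric response-surface model to a coded design-of-experiments matrix by least squares could look roughly like this sketch, where the three factors, the interaction and quadratic terms, and the specific-impulse values are illustrative stand-ins for the six factors and responses used in the study.

    import numpy as np

    # Hypothetical coded DOE factors (-1..+1): chamber pressure, area ratio,
    # secondary flow fraction -- stand-ins for the six factors in the study.
    X = np.array([[-1, -1, -1],
                  [ 1, -1, -1],
                  [-1,  1, -1],
                  [ 1,  1, -1],
                  [-1, -1,  1],
                  [ 1, -1,  1],
                  [-1,  1,  1],
                  [ 1,  1,  1],
                  [ 0,  0,  0]], dtype=float)
    isp = np.array([310, 335, 318, 349, 305, 331, 312, 346, 327], dtype=float)  # fake responses

    # Design matrix: intercept, main effects, one two-way interaction, one quadratic term.
    A = np.column_stack([np.ones(len(X)), X, X[:, 0] * X[:, 1], X[:, 0] ** 2])
    coef, *_ = np.linalg.lstsq(A, isp, rcond=None)
    print(dict(zip(["b0", "p_c", "eps", "flow", "p_c*eps", "p_c^2"], coef.round(2))))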

  6. Enhancement of CFD validation exercise along the roof profile of a low-rise building

    NASA Astrophysics Data System (ADS)

    Deraman, S. N. C.; Majid, T. A.; Zaini, S. S.; Yahya, W. N. W.; Abdullah, J.; Ismail, M. A.

    2018-04-01

    The aim of this study is to enhance the validation of a CFD exercise along the roof profile of a low-rise building. An isolated gabled-roof house with a 26.6° roof pitch was simulated to obtain the pressure coefficients around the house. Validation of a CFD analysis against experimental data requires many input parameters. This study performed the CFD simulation based on data from a previous study; where the input parameters were not clearly stated, new input parameters were established from the open literature. The numerical simulations were performed in FLUENT 14.0 by applying the Computational Fluid Dynamics (CFD) approach based on the steady RANS equations together with the RNG k-ɛ model. The CFD results were then analysed using quantitative tests (statistical analysis) and compared with the CFD results of the previous study. The statistical analysis, comprising an ANOVA test and error measures, showed that the CFD results from the current study produced good agreement and exhibited the smallest error compared to the previous study. The input data used in this study can be extended to other types of CFD simulation involving wind flow over an isolated single-storey house.

  7. Scalable Deployment of Advanced Building Energy Management Systems

    DTIC Science & Technology

    2013-06-01

    Building Automation and Control Network BDAS Building Data Acquisition System BEM building energy model BIM building information modeling BMS...A prototype toolkit to seamlessly and automatically transfer a Building Information Model (BIM) to a Building Energy Model (BEM) has been...circumvent the need to manually construct and maintain a detailed building energy simulation model. This detailed

  8. Application of the GEM Inventory Data Capture Tools for Dynamic Vulnerability Assessment and Recovery Modelling

    NASA Astrophysics Data System (ADS)

    Verrucci, Enrica; Bevington, John; Vicini, Alessandro

    2014-05-01

    A set of open-source tools to create building exposure datasets for seismic risk assessment was developed from 2010-13 by the Inventory Data Capture Tools (IDCT) Risk Global Component of the Global Earthquake Model (GEM). The tools were designed to integrate data derived from remotely-sensed imagery with statistically-sampled in-situ field data of buildings to generate per-building and regional exposure data. A number of software tools were created to aid the development of these data, including mobile data capture tools for in-field structural assessment, and the Spatial Inventory Data Developer (SIDD) for creating "mapping schemes", statistically-inferred distributions of building stock applied to areas of homogeneous urban land use. These tools were made publicly available in January 2014. Exemplar implementations in Europe and Central Asia during the IDCT project highlighted several potential application areas beyond the original scope of the project. These are investigated here. We describe and demonstrate how the GEM-IDCT suite can be used extensively within the framework proposed by the EC-FP7 project SENSUM (Framework to integrate Space-based and in-situ sENSing for dynamic vUlnerability and recovery Monitoring). Specifically, applications in the areas of 1) dynamic vulnerability assessment (pre-event), and 2) recovery monitoring and evaluation (post-event) are discussed, along with strategies for using the IDC Tools for these purposes. The results demonstrate the benefits of using advanced technology tools for data capture, especially in a systematic fashion using the taxonomic standards set by GEM. Originally designed for seismic risk assessment, the IDCT tools clearly have relevance for multi-hazard risk assessment. When combined with a suitable sampling framework and applied to multi-temporal recovery monitoring, data generated from the tools can reveal spatio-temporal patterns in the quality of recovery activities, and resilience trends can be inferred. Lastly, this work draws attention to the use of the IDCT suite as an education resource for inspiring and training new students and engineers in the field of disaster risk reduction.

  9. Commercial Building Tenant Energy Usage Data Aggregation and Privacy: Technical Appendix

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Livingston, Olga V.; Pulsipher, Trenton C.; Anderson, David M.

    2014-11-12

    This technical appendix accompanies report PNNL–23786 “Commercial Building Tenant Energy Usage Data Aggregation and Privacy”. The objective is to provide background information on the methods utilized in the statistical analysis of the aggregation thresholds.

  10. Commercial Building Tenant Energy Usage Aggregation and Privacy

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Livingston, Olga V.; Pulsipher, Trenton C.; Anderson, David M.

    A growing number of building owners are benchmarking their building energy use. This requires the building owner to acquire monthly whole-building energy usage information, which can be challenging for buildings in which individual tenants have their own utility meters and accounts with the utility. Some utilities and utility regulators have turned to aggregation of customer energy use data (CEUD) as a way to give building owners whole-building energy usage data while protecting customer privacy. Meter profile aggregation adds a layer of protection that decreases the risk of revealing CEUD as the number of meters aggregated increases. The report statistically characterizes the similarity between individual energy usage patterns and whole-building totals at various levels of meter aggregation.

  11. An electronic health record based model predicts statin adherence, LDL cholesterol, and cardiovascular disease in the United States Military Health System

    PubMed Central

    Lucas, Joseph E.; Bazemore, Taylor C.; Alo, Celan; Monahan, Patrick B.

    2017-01-01

    HMG-CoA reductase inhibitors (or “statins”) are important and commonly used medications to lower cholesterol and prevent cardiovascular disease. Nearly half of patients stop taking statin medications one year after they are prescribed leading to higher cholesterol, increased cardiovascular risk, and costs due to excess hospitalizations. Identifying which patients are at highest risk for not adhering to long-term statin therapy is an important step towards individualizing interventions to improve adherence. Electronic health records (EHR) are an increasingly common source of data that are challenging to analyze but have potential for generating more accurate predictions of disease risk. The aim of this study was to build an EHR based model for statin adherence and link this model to biologic and clinical outcomes in patients receiving statin therapy. We gathered EHR data from the Military Health System which maintains administrative data for active duty, retirees, and dependents of the United States armed forces military that receive health care benefits. Data were gathered from patients prescribed their first statin prescription in 2005 and 2006. Baseline billing, laboratory, and pharmacy claims data were collected from the two years leading up to the first statin prescription and summarized using non-negative matrix factorization. Follow up statin prescription refill data was used to define the adherence outcome (> 80 percent days covered). The subsequent factors to emerge from this model were then used to build cross-validated, predictive models of 1) overall disease risk using coalescent regression and 2) statin adherence (using random forest regression). The predicted statin adherence for each patient was subsequently used to correlate with cholesterol lowering and hospitalizations for cardiovascular disease during the 5 year follow up period using Cox regression. The analytical dataset included 138 731 individuals and 1840 potential baseline predictors that were reduced to 30 independent EHR “factors”. A random forest predictive model taking patient, statin prescription, predicted disease risk, and the EHR factors as potential inputs produced a cross-validated c-statistic of 0.736 for classifying statin non-adherence. The addition of the first refill to the model increased the c-statistic to 0.81. The predicted statin adherence was independently associated with greater cholesterol lowering (correlation = 0.14, p < 1e-20) and lower hospitalization for myocardial infarction, coronary artery disease, and stroke (hazard ratio = 0.84, p = 1.87E-06). Electronic health records data can be used to build a predictive model of statin adherence that also correlates with statins’ cardiovascular benefits. PMID:29155848
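
    A minimal sketch of the adherence classifier described above, using a random forest and a cross-validated c-statistic (area under the ROC curve), is shown below; the feature matrix and outcome are synthetic stand-ins for the EHR factors and refill-based adherence label used in the study.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_predict
    from sklearn.metrics import roc_auc_score

    # Hypothetical design matrix: columns stand in for the EHR "factors", predicted
    # disease risk, and patient/prescription covariates described in the abstract.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(2000, 32))
    adherent = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=2000)) > 0  # fake outcome

    rf = RandomForestClassifier(n_estimators=300, random_state=0)
    prob = cross_val_predict(rf, X, adherent, cv=5, method="predict_proba")[:, 1]
    print("cross-validated c-statistic:", round(roc_auc_score(adherent, prob), 3))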

  12. Wolves at the Schoolhouse Door: An Investigation of the Condition of Public School Buildings. A Report of the Education Writers Association.

    ERIC Educational Resources Information Center

    Lewis, Anne; And Others

    If the American schoolhouse symbolizes public concern for children, millions of today's youngsters are receiving a negative message. Based on available statistics and information, and using a representative sample of one-half of the country's public school buildings, this investigation found that 25 percent of the nation's school buildings are…

  13. Bayesian depth estimation from monocular natural images.

    PubMed

    Su, Che-Chun; Cormack, Lawrence K; Bovik, Alan C

    2017-05-01

    Estimating an accurate and naturalistic dense depth map from a single monocular photographic image is a difficult problem. Nevertheless, human observers have little difficulty understanding the depth structure implied by photographs. Two-dimensional (2D) images of the real-world environment contain significant statistical information regarding the three-dimensional (3D) structure of the world that the vision system likely exploits to compute perceived depth, monocularly as well as binocularly. Toward understanding how this might be accomplished, we propose a Bayesian model of monocular depth computation that recovers detailed 3D scene structures by extracting reliable, robust, depth-sensitive statistical features from single natural images. These features are derived using well-accepted univariate natural scene statistics (NSS) models and recent bivariate/correlation NSS models that describe the relationships between 2D photographic images and their associated depth maps. This is accomplished by building a dictionary of canonical local depth patterns from which NSS features are extracted as prior information. The dictionary is used to create a multivariate Gaussian mixture (MGM) likelihood model that associates local image features with depth patterns. A simple Bayesian predictor is then used to form spatial depth estimates. The depth results produced by the model, despite its simplicity, correlate well with ground-truth depths measured by a current-generation terrestrial light detection and ranging (LIDAR) scanner. Such a strong form of statistical depth information could be used by the visual system when creating overall estimated depth maps incorporating stereopsis, accommodation, and other conditions. Indeed, even in isolation, the Bayesian predictor delivers depth estimates that are competitive with state-of-the-art "computer vision" methods that utilize highly engineered image features and sophisticated machine learning algorithms.

  14. Building Capacity in a Rural North Carolina Community to Address Prostate Health Using a Lay Health Advisor Model

    PubMed Central

    Vines, Anissa I.; Hunter, Jaimie C.; White, Brandolyn S.; Richmond, Alan N.

    2018-01-01

    Background: Prostate cancer is a critical concern for African Americans in North Carolina (NC), and innovative strategies are needed to help rural African American men maximize their prostate health. Engaging the community in research affords opportunities to build capacity for teaching and raising awareness. Approach and Strategies: A community steering committee of academicians, community partners, religious leaders, and other stakeholders modified a curriculum on prostate health and screening to include interactive knowledge- and skill-building activities. This curriculum was then used to train 15 African American lay health advisors, dubbed Prostate Cancer Ambassadors, in a rural NC community. Over the 2-day training, Ambassadors achieved statistically significant improvements in knowledge of prostate health and maintained confidence in teaching. The Ambassadors, in turn, used their personal networks to share their knowledge with over 1,000 individuals in their community. Finally, the Ambassadors became researchers, implementing a prostate health survey in local churches. Discussion and Conclusions: It is feasible to use community engagement models for raising awareness of prostate health in NC African American communities. Mobilizing community coalitions to develop curricula ensures that the curricula meet the communities’ needs, and training lay health advisors to deliver curricula helps secure community buy-in for the information. PMID:26232777

  15. Building block extraction and classification by means of aerial images fused with super-resolution reconstructed elevation data

    NASA Astrophysics Data System (ADS)

    Panagiotopoulou, Antigoni; Bratsolis, Emmanuel; Charou, Eleni; Perantonis, Stavros

    2017-10-01

    The detailed three-dimensional modeling of buildings utilizing elevation data, such as those provided by light detection and ranging (LiDAR) airborne scanners, is increasingly demanded today. There are certain application requirements and available datasets to which any research effort has to be adapted. Our dataset includes aerial orthophotos, with a spatial resolution of 20 cm, and a digital surface model generated from LiDAR, with a spatial resolution of 1 m and an elevation resolution of 20 cm, from an area of Athens, Greece. The aerial images are fused with LiDAR, and we classify these data with a multilayer feedforward neural network for building block extraction. The innovation of our approach lies in the preprocessing step in which the original LiDAR data are super-resolution (SR) reconstructed by means of a stochastic regularized technique before their fusion with the aerial images takes place. The Lorentzian estimator combined with the bilateral total variation regularization performs the SR reconstruction. We evaluate the performance of our approach against that of fusing unprocessed LiDAR data with aerial images. We present the classified images and the statistical measures (confusion matrix, kappa coefficient, and overall accuracy). The results demonstrate that our approach outperforms fusing unprocessed LiDAR data with aerial images.

  16. Genetic and Psychosocial Predictors of Aggression: Variable Selection and Model Building With Component-Wise Gradient Boosting.

    PubMed

    Suchting, Robert; Gowin, Joshua L; Green, Charles E; Walss-Bass, Consuelo; Lane, Scott D

    2018-01-01

    Rationale: Given datasets with a large or diverse set of predictors of aggression, machine learning (ML) provides efficient tools for identifying the most salient variables and building a parsimonious statistical model. ML techniques permit efficient exploration of data, have not been widely used in aggression research, and may have utility for those seeking prediction of aggressive behavior. Objectives: The present study examined predictors of aggression and constructed an optimized model using ML techniques. Predictors were derived from a dataset that included demographic, psychometric and genetic predictors, specifically FK506 binding protein 5 (FKBP5) polymorphisms, which have been shown to alter response to threatening stimuli, but have not been tested as predictors of aggressive behavior in adults. Methods: The data analysis approach utilized component-wise gradient boosting and model reduction via backward elimination to: (a) select variables from an initial set of 20 to build a model of trait aggression; and then (b) reduce that model to maximize parsimony and generalizability. Results: From a dataset of N = 47 participants, component-wise gradient boosting selected 8 of 20 possible predictors to model Buss-Perry Aggression Questionnaire (BPAQ) total score, with R2 = 0.66. This model was simplified using backward elimination, retaining six predictors: smoking status, psychopathy (interpersonal manipulation and callous affect), childhood trauma (physical abuse and neglect), and the FKBP5_13 gene (rs1360780). The six-factor model approximated the initial eight-factor model at 99.4% of R2. Conclusions: Using an inductive data science approach, the gradient boosting model identified predictors consistent with previous experimental work in aggression; specifically psychopathy and trauma exposure. Additionally, allelic variants in FKBP5 were identified for the first time, but the relatively small sample size limits generality of results and calls for replication. This approach provides utility for the prediction of aggression behavior, particularly in the context of large multivariate datasets.
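
    A minimal illustration of component-wise L2 boosting for variable selection, the general idea behind the approach described above rather than the exact configuration used in the study, is sketched below: each step fits every single predictor to the current residuals and takes a small step only on the best one, so unselected predictors keep zero coefficients.

    import numpy as np

    def componentwise_l2_boost(X, y, n_steps=200, nu=0.1):
        """Component-wise L2 boosting: at each step fit every single predictor to
        the current residuals and update only the best one, giving implicit
        variable selection (rarely selected predictors stay at zero)."""
        X = (X - X.mean(0)) / X.std(0)
        y = y - y.mean()
        beta = np.zeros(X.shape[1])
        resid = y.copy()
        for _ in range(n_steps):
            # univariate least-squares fit of each column to the residuals
            b = X.T @ resid / (X ** 2).sum(0)
            losses = ((resid[:, None] - X * b) ** 2).sum(0)
            j = losses.argmin()
            beta[j] += nu * b[j]
            resid -= nu * b[j] * X[:, j]
        return beta

    rng = np.random.default_rng(0)
    X = rng.normal(size=(47, 20))
    y = 2 * X[:, 3] - 1.5 * X[:, 7] + rng.normal(size=47)
    print(np.nonzero(componentwise_l2_boost(X, y))[0])  # indices of selected predictors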

  17. Causal modelling applied to the risk assessment of a wastewater discharge.

    PubMed

    Paul, Warren L; Rokahr, Pat A; Webb, Jeff M; Rees, Gavin N; Clune, Tim S

    2016-03-01

    Bayesian networks (BNs), or causal Bayesian networks, have become quite popular in ecological risk assessment and natural resource management because of their utility as a communication and decision-support tool. Since their development in the field of artificial intelligence in the 1980s, however, Bayesian networks have evolved and merged with structural equation modelling (SEM). Unlike BNs, which are constrained to encode causal knowledge in conditional probability tables, SEMs encode this knowledge in structural equations, which is thought to be a more natural language for expressing causal information. This merger has clarified the causal content of SEMs and generalised the method such that it can now be performed using standard statistical techniques. As it was with BNs, the utility of this new generation of SEM in ecological risk assessment will need to be demonstrated with examples to foster an understanding and acceptance of the method. Here, we applied SEM to the risk assessment of a wastewater discharge to a stream, with a particular focus on the process of translating a causal diagram (conceptual model) into a statistical model which might then be used in the decision-making and evaluation stages of the risk assessment. The process of building and testing a spatial causal model is demonstrated using data from a spatial sampling design, and the implications of the resulting model are discussed in terms of the risk assessment. It is argued that a spatiotemporal causal model would have greater external validity than the spatial model, enabling broader generalisations to be made regarding the impact of a discharge, and greater value as a tool for evaluating the effects of potential treatment plant upgrades. Suggestions are made on how the causal model could be augmented to include temporal as well as spatial information, including suggestions for appropriate statistical models and analyses.

  18. Downscaling modelling system for multi-scale air quality forecasting

    NASA Astrophysics Data System (ADS)

    Nuterman, R.; Baklanov, A.; Mahura, A.; Amstrup, B.; Weismann, J.

    2010-09-01

    Urban modelling for real meteorological situations, in general, considers only a small part of the urban area in a micro-meteorological model, and urban heterogeneities outside the modelling domain affect micro-scale processes. Therefore, it is important to build a chain of models of different scales, with nesting of higher resolution models into larger scale, lower resolution models. Usually, the up-scaled city- or meso-scale models consider parameterisations of urban effects or statistical descriptions of the urban morphology, whereas the micro-scale (street canyon) models are obstacle-resolved and consider a detailed geometry of the buildings and the urban canopy. The developed system consists of meso-, urban- and street-scale models. The first is the Numerical Weather Prediction model (HIgh Resolution Limited Area Model) combined with an Atmospheric Chemistry Transport model (the Comprehensive Air quality Model with extensions). Several levels of urban parameterisation are considered; they are chosen depending on the selected scales and resolutions. For the regional scale, the urban parameterisation is based on the roughness and flux corrections approach; for the urban scale, on a building effects parameterisation. Modern methods of computational fluid dynamics allow solving environmental problems connected with atmospheric transport of pollutants within the urban canopy in the presence of penetrable (vegetation) and impenetrable (buildings) obstacles. For local- and micro-scale nesting the Micro-scale Model for Urban Environment is applied. This is a comprehensive obstacle-resolved urban wind-flow and dispersion model based on the Reynolds-averaged Navier-Stokes approach and several turbulence closures, i.e. the k-ɛ linear eddy-viscosity model, the k-ɛ non-linear eddy-viscosity model and the Reynolds stress model. Boundary and initial conditions for the micro-scale model are taken from the up-scaled models with corresponding mass-conserving interpolation. For the boundaries a kind of Dirichlet condition is chosen to provide the values based on interpolation from the coarse to the fine grid. When the roughness approach is changed to the obstacle-resolved one in the nested model, the interpolation procedure will increase the computational time (due to additional iterations) for meteorological/chemical fields inside the urban sub-layer. In such situations, as a possible alternative, the perturbation approach can be applied. Here, the effects of the main meteorological variables and chemical species are considered as a sum of two components: background (large-scale) values, described by the coarse-resolution model, and perturbation (micro-scale) features, obtained from the nested fine-resolution model.

  19. Forward Modeling of Large-scale Structure: An Open-source Approach with Halotools

    NASA Astrophysics Data System (ADS)

    Hearin, Andrew P.; Campbell, Duncan; Tollerud, Erik; Behroozi, Peter; Diemer, Benedikt; Goldbaum, Nathan J.; Jennings, Elise; Leauthaud, Alexie; Mao, Yao-Yuan; More, Surhud; Parejko, John; Sinha, Manodeep; Sipöcz, Brigitta; Zentner, Andrew

    2017-11-01

    We present the first stable release of Halotools (v0.2), a community-driven Python package designed to build and test models of the galaxy-halo connection. Halotools provides a modular platform for creating mock universes of galaxies starting from a catalog of dark matter halos obtained from a cosmological simulation. The package supports many of the common forms used to describe galaxy-halo models: the halo occupation distribution, the conditional luminosity function, abundance matching, and alternatives to these models that include effects such as environmental quenching or variable galaxy assembly bias. Satellite galaxies can be modeled to live in subhalos or to follow custom number density profiles within their halos, including spatial and/or velocity bias with respect to the dark matter profile. The package has an optimized toolkit to make mock observations on a synthetic galaxy population—including galaxy clustering, galaxy-galaxy lensing, galaxy group identification, RSD multipoles, void statistics, pairwise velocities and others—allowing direct comparison to observations. Halotools is object-oriented, enabling complex models to be built from a set of simple, interchangeable components, including those of your own creation. Halotools has an automated testing suite and is exhaustively documented on http://halotools.readthedocs.io, which includes quickstart guides, source code notes and a large collection of tutorials. The documentation is effectively an online textbook on how to build and study empirical models of galaxy formation with Python.
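
    As a concept illustration (this is a generic halo occupation distribution in the standard Zheng et al. 2007 form, not the Halotools API), a mock galaxy population can be drawn from a halo catalog by sampling Bernoulli centrals and Poisson satellites from mass-dependent mean occupations; the parameter values and the fake halo masses below are assumptions.

    import numpy as np
    from scipy.special import erf

    def mean_ncen(log_m, log_mmin=12.0, sigma=0.25):
        """Mean central occupation in the standard HOD form."""
        return 0.5 * (1.0 + erf((log_m - log_mmin) / sigma))

    def mean_nsat(log_m, log_m0=11.5, log_m1=13.0, alpha=1.0):
        """Mean satellite occupation, truncated below M0 and modulated by centrals."""
        m, m0, m1 = 10.0 ** log_m, 10.0 ** log_m0, 10.0 ** log_m1
        return (np.clip(m - m0, 0.0, None) / m1) ** alpha * mean_ncen(log_m)

    # Populate a toy halo catalog: Bernoulli centrals, Poisson satellites.
    rng = np.random.default_rng(0)
    log_mass = rng.uniform(11.0, 15.0, 10000)          # fake halo masses [log10 Msun/h]
    n_cen = rng.random(10000) < mean_ncen(log_mass)
    n_sat = rng.poisson(mean_nsat(log_mass))
    print("mock galaxies:", n_cen.sum() + n_sat.sum())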

  20. Signatures of criticality arise from random subsampling in simple population models.

    PubMed

    Nonnenmacher, Marcel; Behrens, Christian; Berens, Philipp; Bethge, Matthias; Macke, Jakob H

    2017-10-01

    The rise of large-scale recordings of neuronal activity has fueled the hope to gain new insights into the collective activity of neural ensembles. How can one link the statistics of neural population activity to underlying principles and theories? One attempt to interpret such data builds upon analogies to the behaviour of collective systems in statistical physics. Divergence of the specific heat, a measure of population statistics derived from thermodynamics, has been used to suggest that neural populations are optimized to operate at a "critical point". However, these findings have been challenged by theoretical studies which have shown that common inputs can lead to diverging specific heat. Here, we connect "signatures of criticality", and in particular the divergence of specific heat, back to statistics of neural population activity commonly studied in neural coding: firing rates and pairwise correlations. We show that the specific heat diverges whenever the average correlation strength does not depend on population size. This is necessarily true when data with correlations are randomly subsampled during the analysis process, irrespective of the detailed structure or origin of correlations. We also show how the characteristic shape of specific heat capacity curves depends on firing rates and correlations, using both analytically tractable models and numerical simulations of a canonical feed-forward population model. To analyze these simulations, we develop efficient methods for characterizing large-scale neural population activity with maximum entropy models. We find that, consistent with experimental findings, increases in firing rates and correlation directly lead to more pronounced signatures. Thus, previous reports of thermodynamical criticality in neural populations based on the analysis of specific heat can be explained by average firing rates and correlations, and are not indicative of an optimized coding strategy. We conclude that a reliable interpretation of statistical tests for theories of neural coding is possible only in reference to relevant ground-truth models.

  1. Building generic anatomical models using virtual model cutting and iterative registration.

    PubMed

    Xiao, Mei; Soh, Jung; Meruvia-Pastor, Oscar; Schmidt, Eric; Hallgrímsson, Benedikt; Sensen, Christoph W

    2010-02-08

    Using 3D generic models to statistically analyze trends in biological structure changes is an important tool in morphometrics research. Therefore, 3D generic models built for a range of populations are in high demand. However, due to the complexity of biological structures and the limited views of them that medical images can offer, it is still an exceptionally difficult task to quickly and accurately create 3D generic models (a model is a 3D graphical representation of a biological structure) based on medical image stacks (a stack is an ordered collection of 2D images). We show that the creation of a generic model that captures spatial information exploitable in statistical analyses is facilitated by coupling our generalized segmentation method to existing automatic image registration algorithms. The method of creating generic 3D models consists of the following processing steps: (i) scanning subjects to obtain image stacks; (ii) creating individual 3D models from the stacks; (iii) interactively extracting sub-volume by cutting each model to generate the sub-model of interest; (iv) creating image stacks that contain only the information pertaining to the sub-models; (v) iteratively registering the corresponding new 2D image stacks; (vi) averaging the newly created sub-models based on intensity to produce the generic model from all the individual sub-models. After several registration procedures are applied to the image stacks, we can create averaged image stacks with sharp boundaries. The averaged 3D model created from those image stacks is very close to the average representation of the population. The image registration time varies depending on the image size and the desired accuracy of the registration. Both volumetric data and surface model for the generic 3D model are created at the final step. Our method is very flexible and easy to use such that anyone can use image stacks to create models and retrieve a sub-region from it at their ease. Java-based implementation allows our method to be used on various visualization systems including personal computers, workstations, computers equipped with stereo displays, and even virtual reality rooms such as the CAVE Automated Virtual Environment. The technique allows biologists to build generic 3D models of their interest quickly and accurately.

  2. Statistical Analysis of the Indus Script Using n-Grams

    PubMed Central

    Yadav, Nisha; Joglekar, Hrishikesh; Rao, Rajesh P. N.; Vahia, Mayank N.; Adhikari, Ronojoy; Mahadevan, Iravatham

    2010-01-01

    The Indus script is one of the major undeciphered scripts of the ancient world. The small size of the corpus, the absence of bilingual texts, and the lack of definite knowledge of the underlying language has frustrated efforts at decipherment since the discovery of the remains of the Indus civilization. Building on previous statistical approaches, we apply the tools of statistical language processing, specifically n-gram Markov chains, to analyze the syntax of the Indus script. We find that unigrams follow a Zipf-Mandelbrot distribution. Text beginner and ender distributions are unequal, providing internal evidence for syntax. We see clear evidence of strong bigram correlations and extract significant pairs and triplets using a log-likelihood measure of association. Highly frequent pairs and triplets are not always highly significant. The model performance is evaluated using information-theoretic measures and cross-validation. The model can restore doubtfully read texts with an accuracy of about 75%. We find that a quadrigram Markov chain saturates information theoretic measures against a held-out corpus. Our work forms the basis for the development of a stochastic grammar which may be used to explore the syntax of the Indus script in greater detail. PMID:20333254
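
    The significance measure for pairs can be implemented with Dunning's log-likelihood ratio (G²), a standard association statistic for bigrams; the sketch below (not code from the study) computes it from a 2x2 contingency table of sign-pair counts, with toy counts for illustration.

    import math

    def g2_bigram(k11, k12, k21, k22):
        """Dunning's log-likelihood ratio (G^2) for a bigram contingency table:
        k11 = count(a followed by b), k12 = count(a followed by not-b),
        k21 = count(not-a followed by b), k22 = count(not-a followed by not-b)."""
        def h(*ks):  # sum of k*log(k/N) over the supplied counts
            n = sum(ks)
            return sum(k * math.log(k / n) for k in ks if k > 0)
        return 2.0 * (h(k11, k12, k21, k22)
                      - h(k11 + k12, k21 + k22)
                      - h(k11 + k21, k12 + k22))

    # Toy counts for a hypothetical sign pair in a small corpus.
    print(round(g2_bigram(30, 70, 50, 850), 1))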

  3. Building Energy Monitoring and Analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hong, Tianzhen; Feng, Wei; Lu, Alison

    This project aimed to develop a standard methodology for building energy data definition, collection, presentation, and analysis; apply the developed methods to a standardized energy monitoring platform, including hardware and software, to collect and analyze building energy use data; and compile offline statistical data and online real-time data in both countries for fully understanding the current status of building energy use. This helps decode the driving forces behind the discrepancy of building energy use between the two countries; identify gaps and deficiencies of current building energy monitoring, data collection, and analysis; and create knowledge and tools to collect and analyze good building energy data to provide valuable and actionable information for key stakeholders.

  4. Analysis of foliage effects on mobile propagation in dense urban environments

    NASA Astrophysics Data System (ADS)

    Bronshtein, Alexander; Mazar, Reuven; Lu, I.-Tai

    2000-07-01

    Attempts to reduce the interference level and to increase the spectral efficiency of cellular radio communication systems operating in dense urban and suburban areas lead to the microcellular approach with a consequent requirement to lower antenna heights. In large metropolitan areas having high buildings this requirement causes a situation where the transmitting and receiving antennas are both located below the rooftops, and the city street acts as a type of a waveguiding channel for the propagating signal. In this work, the city street is modeled as a random multislit waveguide with randomly distributed regions of foliage parallel to the building boundaries. The statistical propagation characteristics are expressed in terms of multiple ray-fields approaching the observer. Algorithms for predicting the path-loss along the waveguide and for computing the transverse field structure are presented.

  5. Strategies Used by Students to Compare Two Data Sets

    ERIC Educational Resources Information Center

    Reaburn, Robyn

    2012-01-01

    One of the common tasks of inferential statistics is to compare two data sets. Long before formal statistical procedures, however, students can be encouraged to make comparisons between data sets and therefore build up intuitive statistical reasoning. Such tasks also give meaning to the data collection students may do. This study describes the…

  6. Micronucleus Assay in Exfoliated Buccal Epithelial Cells Using Liquid Based Cytology Preparations in Building Construction Workers.

    PubMed

    Arul, P; Smitha, Shetty; Masilamani, Suresh; Akshatha, C

    2018-01-01

    Cytogenetic damage in exfoliated buccal epithelial cells due to environmental and occupational exposure is often monitored by micronucleus (MN) assay using liquid based cytology (LBC) preparations. This study was performed to evaluate MN in exfoliated buccal epithelial cells of building construction workers using LBC preparations. LBC preparations of exfoliated buccal epithelial cells from 100 subjects [50 building construction workers (cases) and 50 administrative staff (controls)] were evaluated by May-Grunwald Giemsa, Hematoxylin and Eosin and Papanicolaou stains. Student's t test was used for statistical analysis, and a P value of <0.05 was considered statistically significant. The mean frequencies of MN for cases were significantly higher than for controls regardless of the staining method used. There were statistically significant differences between smokers and non-smokers among the controls, as well as by duration of working exposure (<5 and >5 years) and between smokers and non-smokers among the cases (P=0.001). There were also meaningful differences in mean MN frequencies between smokers and non-smokers, and between those with and without alcohol consumption, in both cases and controls across the various stains (P=0.001). There was an increased risk of cytogenetic damage in building construction workers, and evaluation of MN in exfoliated buccal epithelial cells can serve as a minimally invasive biomarker for this damage. LBC preparations can be applied for the MN assay as they improve the quality of smears and cell morphology, decrease confounding factors and reduce false positive results.

  7. FastSim: A Fast Simulation for the SuperB Detector

    NASA Astrophysics Data System (ADS)

    Andreassen, R.; Arnaud, N.; Brown, D. N.; Burmistrov, L.; Carlson, J.; Cheng, C.-h.; Di Simone, A.; Gaponenko, I.; Manoni, E.; Perez, A.; Rama, M.; Roberts, D.; Rotondo, M.; Simi, G.; Sokoloff, M.; Suzuki, A.; Walsh, J.

    2011-12-01

    We have developed a parameterized (fast) simulation for detector optimization and physics reach studies of the proposed SuperB Flavor Factory in Italy. Detector components are modeled as thin sections of planes, cylinders, disks or cones. Particle-material interactions are modeled using simplified cross-sections and formulas. Active detectors are modeled using parameterized response functions. Geometry and response parameters are configured using xml files with a custom-designed schema. Reconstruction algorithms adapted from BaBar are used to build tracks and clusters. Multiple sources of background signals can be merged with primary signals. Pattern recognition errors are modeled statistically by randomly misassigning nearby tracking hits. Standard BaBar analysis tuples are used as an event output. Hadronic B meson pair events can be simulated at roughly 10 Hz.

  8. Psychological resilience, pain catastrophizing, and positive emotions: perspectives on comprehensive modeling of individual pain adaptation.

    PubMed

    Sturgeon, John A; Zautra, Alex J

    2013-03-01

    Pain is a complex construct that contributes to profound physical and psychological dysfunction, particularly in individuals coping with chronic pain. The current paper builds upon previous research, describes a balanced conceptual model that integrates aspects of both psychological vulnerability and resilience to pain, and reviews protective and exacerbating psychosocial factors to the process of adaptation to chronic pain, including pain catastrophizing, pain acceptance, and positive psychological resources predictive of enhanced pain coping. The current paper identifies future directions for research that will further enrich the understanding of pain adaptation and espouses an approach that will enhance the ecological validity of psychological pain coping models, including introduction of advanced statistical and conceptual models that integrate behavioral, cognitive, information processing, motivational and affective theories of pain.

  9. Radar cross section models for limited aspect angle windows

    NASA Astrophysics Data System (ADS)

    Robinson, Mark C.

    1992-12-01

    This thesis presents a method for building Radar Cross Section (RCS) models of aircraft based on static data taken from limited aspect angle windows. These models statistically characterize static RCS. This is done to show that a limited number of samples can be used to effectively characterize static aircraft RCS. The optimum models are determined by performing both a Kolmogorov and a Chi-Square goodness-of-fit test comparing the static RCS data with a variety of probability density functions (pdf) that are known to be effective at approximating the static RCS of aircraft. The optimum parameter estimator is also determined by the goodness-of-fit tests when the pdf parameters obtained by the Maximum Likelihood Estimator (MLE) and the Method of Moments (MoM) estimators differ.
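
    The candidate pdfs and data are not reproduced here; the following assumed example fits a lognormal to stand-in RCS samples by maximum likelihood (via scipy) and by the method of moments, then compares the two fits with a Kolmogorov-Smirnov test, mirroring the estimator-comparison logic described above.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    rcs = rng.lognormal(mean=1.0, sigma=0.6, size=400)   # stand-in static RCS samples

    # Maximum likelihood fit (location fixed at zero).
    shape_mle, loc_mle, scale_mle = stats.lognorm.fit(rcs, floc=0)

    # Method-of-moments fit: match sample mean and variance of a lognormal.
    m, v = rcs.mean(), rcs.var()
    sigma2 = np.log(1.0 + v / m**2)
    shape_mom, scale_mom = np.sqrt(sigma2), m * np.exp(-0.5 * sigma2)

    for name, args in [("MLE", (shape_mle, 0.0, scale_mle)),
                       ("MoM", (shape_mom, 0.0, scale_mom))]:
        d, p = stats.kstest(rcs, "lognorm", args=args)
        print(f"{name}: KS statistic = {d:.3f}, p = {p:.3f}")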

  10. Inferring and Calibrating Triadic Closure in a Dynamic Network

    NASA Astrophysics Data System (ADS)

    Mantzaris, Alexander V.; Higham, Desmond J.

    In the social sciences, the hypothesis of triadic closure contends that new links in a social contact network arise preferentially between those who currently share neighbours. Here, in a proof-of-principle study, we show how to calibrate a recently proposed evolving network model to time-dependent connectivity data. The probabilistic edge birth rate in the model contains a triadic closure term, so we are also able to assess statistically the evidence for this effect. The approach is shown to work on data generated synthetically from the model. We then apply this methodology to some real, large-scale data that records the build up of connections in a business-related social networking site, and find evidence for triadic closure.

  11. Two populations and models of gamma ray bursts

    NASA Technical Reports Server (NTRS)

    Katz, J. I.

    1993-01-01

    Gamma-ray burst statistics are best explained by a source population at cosmological distances, while spectroscopy and intensity histories of some individual bursts imply an origin on Galactic neutron stars. To resolve this inconsistency I suggest the presence of two populations, one at cosmological distances and the other Galactic. I build on ideas of Shemi and Piran (1990) and of Rees and Meszaros (1992) involving the interaction of fireball debris with surrounding clouds to explain the observed intensity histories in bursts at cosmological distances. The distances to the Galactic population are undetermined because they are too few to affect the statistics of intensity and direction; I explain them as resulting from magnetic reconnection in neutron star magnetospheres. An appendix describes the late evolution of the debris as a relativistic blast wave.

  12. Protein profiles of nasal lavage fluid from individuals with work-related upper airway symptoms associated with moldy and damp buildings.

    PubMed

    Wåhlén, K; Fornander, L; Olausson, P; Ydreborg, K; Flodin, U; Graff, P; Lindahl, M; Ghafouri, B

    2016-10-01

    Upper airway irritation is common among individuals working in moldy and damp buildings. The aim of this study was to investigate effects on the protein composition of the nasal lining fluid. The prevalence of symptoms in relation to work environment was examined in 37 individuals working in two damp buildings. Microbial growth was confirmed in one of the buildings. Nasal lavage fluid was collected from 29 of the exposed subjects and 13 controls, not working in a damp building. Protein profiles were investigated with a proteomic approach and evaluated by multivariate statistical models. Subjects from both workplaces reported upper airway and ocular symptoms. Based on protein profiles, symptomatic subjects in the two workplaces were discriminated from each other and separated from healthy controls. The groups differed in proteins involved in inflammation and host defense. Measurements of innate immunity proteins showed a significant increase in protein S100-A8 and decrease in SPLUNC1 in subjects from one workplace, while alpha-1-antitrypsin was elevated in subjects from the other workplace, compared with healthy controls. The results show that protein profiles in nasal lavage fluid can be used to monitor airway mucosal effects in personnel working in damp buildings and indicate that the profile may be separated when the dampness is associated with the presence of molds. © 2015 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  13. Use of linear regression models to determine influence factors on the concentration levels of radon in occupied houses

    NASA Astrophysics Data System (ADS)

    Buermeyer, Jonas; Gundlach, Matthias; Grund, Anna-Lisa; Grimm, Volker; Spizyn, Alexander; Breckow, Joachim

    2016-09-01

    This work is part of the analysis of the effects of constructional energy-saving measures on radon concentration levels in dwellings performed on behalf of the German Federal Office for Radiation Protection. In parallel to radon measurements for five buildings, both meteorological data outside the buildings and the indoor climate factors were recorded. In order to assess effects of inhabited buildings, the amount of carbon dioxide (CO2) was measured. For a statistical linear regression model, the data of one object were chosen as an example. Three dummy variables were extracted from the course of the CO2 concentration to provide information on the usage and ventilation of the room. The analysis revealed a highly autoregressive model for the radon concentration with additional influence by the natural environmental factors. The autoregression implies a strong dependency on a radon source since it reflects a backward dependency in time. At this point of the investigation, it cannot be determined whether the influence of outside factors affects the source of radon or the inhabitants' ventilation behavior, resulting in variation of the occurring concentration levels. In any case, the regression analysis might provide further information that would help to distinguish these effects. In the next step, the influence factors will be weighted according to their impact on the concentration levels. This might lead to a model that enables the prediction of radon concentration levels based on the measurement of CO2 in combination with environmental parameters, as well as the development of advice on ventilation.
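
    A minimal sketch of the kind of regression described, with hypothetical variable names and synthetic data: the radon series is regressed on its own lag (the autoregressive term), an outdoor temperature covariate, and dummy variables derived from the CO2 record.

```python
# Autoregressive linear regression of radon on its one-step lag, outdoor temperature,
# and CO2-derived occupancy/ventilation dummies (all data synthetic, names hypothetical).
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 500
df = pd.DataFrame({
    "temp_out": 10 + 8 * np.sin(np.arange(n) * 2 * np.pi / 24) + rng.normal(0, 1, n),
    "occupied": rng.integers(0, 2, n),     # dummy: rising CO2 -> room occupied
    "ventilated": rng.integers(0, 2, n),   # dummy: rapidly falling CO2 -> window opened
})
radon = np.zeros(n)
for t in range(1, n):                      # synthetic radon series with AR(1) behaviour
    radon[t] = (0.85 * radon[t - 1] + 30 - 0.5 * df.temp_out[t]
                - 15 * df.ventilated[t] + rng.normal(0, 5))
df["radon"] = radon
df["radon_lag1"] = df["radon"].shift(1)

model = sm.OLS(df["radon"].iloc[1:],
               sm.add_constant(df[["radon_lag1", "temp_out", "occupied",
                                   "ventilated"]].iloc[1:])).fit()
print(model.summary().tables[1])   # the lag coefficient captures the autoregression
```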

  14. Data analytics for simplifying thermal efficiency planning in cities

    PubMed Central

    Abdolhosseini Qomi, Mohammad Javad; Noshadravan, Arash; Sobstyl, Jake M.; Toole, Jameson; Ferreira, Joseph; Pellenq, Roland J.-M.; Ulm, Franz-Josef; Gonzalez, Marta C.

    2016-01-01

    More than 44% of building energy consumption in the USA is used for space heating and cooling, and this accounts for 20% of national CO2 emissions. This prompts the need to identify among the 130 million households in the USA those with the greatest energy-saving potential and the associated costs of the path to reach that goal. Whereas current solutions address this problem by analysing each building in detail, we herein reduce the dimensionality of the problem by simplifying the calculations of energy losses in buildings. We present a novel inference method that can be used via a ranking algorithm that allows us to estimate the potential energy saving for heating purposes. To that end, we only need consumption from records of gas bills integrated with a building's footprint. The method entails a statistical screening of the intricate interplay between weather, infrastructural and residents' choice variables to determine building gas consumption and potential savings at a city scale. We derive a general statistical pattern of consumption in an urban settlement, reducing it to a set of the most influential buildings' parameters that operate locally. By way of example, the implications are explored using records of a set of (N = 6200) buildings in Cambridge, MA, USA, which indicate that retrofitting only 16% of buildings entails a 40% reduction in gas consumption of the whole building stock. We find that the inferred heat loss rate of buildings exhibits a power-law data distribution akin to Zipf's law, which provides a means to map an optimum path for gas savings per retrofit at a city scale. These findings have implications for improving the thermal efficiency of cities' building stock, as outlined by current policy efforts seeking to reduce home heating and cooling energy consumption and lower associated greenhouse gas emissions. PMID:27097652
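
    The retrofit-ordering logic can be illustrated with a few lines of code (synthetic numbers only): when heat-loss rates are heavy-tailed, sorting buildings by inferred loss rate and retrofitting from the top concentrates most of the achievable savings in a small fraction of the stock.

```python
# Rank buildings by a Zipf-like (heavy-tailed) heat-loss rate and compute the
# cumulative share of total loss addressed when retrofitting the worst first.
import numpy as np

rng = np.random.default_rng(2)
n_buildings = 6200
heat_loss = (rng.pareto(a=1.5, size=n_buildings) + 1) * 100.0  # synthetic kWh per degree-day

ranked = np.sort(heat_loss)[::-1]               # retrofit worst buildings first
cum_savings = np.cumsum(ranked) / ranked.sum()  # fraction of total loss addressed

share_retrofitted = 0.16
k = int(share_retrofitted * n_buildings)
print(f"Retrofitting the top {share_retrofitted:.0%} of buildings covers "
      f"{cum_savings[k - 1]:.0%} of the stock's total heat loss.")
```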

  15. Extracting multistage screening rules from online dating activity data.

    PubMed

    Bruch, Elizabeth; Feinberg, Fred; Lee, Kee Yeun

    2016-09-20

    This paper presents a statistical framework for harnessing online activity data to better understand how people make decisions. Building on insights from cognitive science and decision theory, we develop a discrete choice model that allows for exploratory behavior and multiple stages of decision making, with different rules enacted at each stage. Critically, the approach can identify if and when people invoke noncompensatory screeners that eliminate large swaths of alternatives from detailed consideration. The model is estimated using deidentified activity data on 1.1 million browsing and writing decisions observed on an online dating site. We find that mate seekers enact screeners ("deal breakers") that encode acceptability cutoffs. A nonparametric account of heterogeneity reveals that, even after controlling for a host of observable attributes, mate evaluation differs across decision stages as well as across identified groupings of men and women. Our statistical framework can be widely applied in analyzing large-scale data on multistage choices, which typify searches for "big ticket" items.

  16. Extracting multistage screening rules from online dating activity data

    PubMed Central

    Bruch, Elizabeth; Feinberg, Fred; Lee, Kee Yeun

    2016-01-01

    This paper presents a statistical framework for harnessing online activity data to better understand how people make decisions. Building on insights from cognitive science and decision theory, we develop a discrete choice model that allows for exploratory behavior and multiple stages of decision making, with different rules enacted at each stage. Critically, the approach can identify if and when people invoke noncompensatory screeners that eliminate large swaths of alternatives from detailed consideration. The model is estimated using deidentified activity data on 1.1 million browsing and writing decisions observed on an online dating site. We find that mate seekers enact screeners (“deal breakers”) that encode acceptability cutoffs. A nonparametric account of heterogeneity reveals that, even after controlling for a host of observable attributes, mate evaluation differs across decision stages as well as across identified groupings of men and women. Our statistical framework can be widely applied in analyzing large-scale data on multistage choices, which typify searches for “big ticket” items. PMID:27578870

  17. Raman spectroscopy coupled with advanced statistics for differentiating menstrual and peripheral blood.

    PubMed

    Sikirzhytskaya, Aliaksandra; Sikirzhytski, Vitali; Lednev, Igor K

    2014-01-01

    Body fluids are a common and important type of forensic evidence. In particular, the identification of menstrual blood stains is often a key step during the investigation of rape cases. Here, we report on the application of near-infrared Raman microspectroscopy for differentiating menstrual blood from peripheral blood. We observed that the menstrual and peripheral blood samples have similar but distinct Raman spectra. Advanced statistical analysis of the multiple Raman spectra that were automatically (Raman mapping) acquired from the 40 dried blood stains (20 donors for each group) allowed us to build a classification model with maximum (100%) sensitivity and specificity. We also demonstrated that despite certain common constituents, menstrual blood can be readily distinguished from vaginal fluid. All of the classification models were verified using cross-validation methods. The proposed method overcomes the problems associated with currently used biochemical methods, which are destructive, time consuming and expensive. Copyright © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  18. Inter-speaker speech variability assessment using statistical deformable models from 3.0 tesla magnetic resonance images.

    PubMed

    Vasconcelos, Maria J M; Ventura, Sandra M R; Freitas, Diamantino R S; Tavares, João Manuel R S

    2012-03-01

    The morphological and dynamic characterisation of the vocal tract during speech production has been gaining greater attention, motivated by the latest improvements in magnetic resonance (MR) imaging, namely the use of higher magnetic fields such as 3.0 Tesla. In this work, the automatic study of the vocal tract from 3.0 Tesla MR images was assessed through the application of statistical deformable models. The primary goal was the analysis of the shape of the vocal tract during the articulation of European Portuguese sounds, followed by the evaluation of the results concerning automatic segmentation, i.e. identification of the vocal tract in new MR images. With respect to speech production, this is the first attempt to automatically characterise and reconstruct the vocal tract shape from 3.0 Tesla MR images using deformable models, in particular active shape and appearance models. The results clearly evidence the adequacy and advantage of these deformable models for the automatic analysis of 3.0 Tesla MR images in order to extract the vocal tract shape and assess the involved articulatory movements. Such capabilities are needed, for example, for a better understanding of speech production, particularly in patients suffering from articulatory disorders, and for building enhanced speech synthesizer models.

  19. Building integral projection models: a user's guide

    PubMed Central

    Rees, Mark; Childs, Dylan Z; Ellner, Stephen P; Coulson, Tim

    2014-01-01

    In order to understand how changes in individual performance (growth, survival or reproduction) influence population dynamics and evolution, ecologists are increasingly using parameterized mathematical models. For continuously structured populations, where some continuous measure of individual state influences growth, survival or reproduction, integral projection models (IPMs) are commonly used. We provide a detailed description of the steps involved in constructing an IPM, explaining how to: (i) translate your study system into an IPM; (ii) implement your IPM; and (iii) diagnose potential problems with your IPM. We emphasize how the study organism's life cycle, and the timing of censuses, together determine the structure of the IPM kernel and important aspects of the statistical analysis used to parameterize an IPM using data on marked individuals. An IPM based on population studies of Soay sheep is used to illustrate the complete process of constructing, implementing and evaluating an IPM fitted to sample data. We then look at very general approaches to parameterizing an IPM, using a wide range of statistical techniques (e.g. maximum likelihood methods, generalized additive models, nonparametric kernel density estimators). Methods for selecting models for parameterizing IPMs are briefly discussed. We conclude with key recommendations and a brief overview of applications that extend the basic model. The online Supporting Information provides commented R code for all our analyses. PMID:24219157
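
    The guide's worked examples are provided as R code in the Supporting Information; as a language-agnostic illustration of the central construction step, the sketch below discretizes a toy IPM kernel K(z', z) = s(z) g(z', z) + f(z', z) on a size mesh with the midpoint rule, using made-up vital-rate functions in place of regressions fitted to data on marked individuals.

```python
# Midpoint-rule discretization of a toy integral projection model (IPM) kernel.
# The vital-rate functions below are invented placeholders, not fitted regressions.
import numpy as np
from scipy.stats import norm

def survival(z):                 # logistic survival vs. size
    return 1.0 / (1.0 + np.exp(-(-2.0 + 0.8 * z)))

def growth(z_new, z):            # size next year ~ Normal(a + b*z, sigma)
    return norm.pdf(z_new, loc=0.5 + 0.9 * z, scale=0.4)

def fecundity(z_new, z):         # offspring number times offspring-size distribution
    return 0.3 * np.exp(0.2 * z) * norm.pdf(z_new, loc=1.0, scale=0.3)

# midpoint rule on a size mesh
n, lower, upper = 100, 0.0, 8.0
h = (upper - lower) / n
mesh = lower + (np.arange(n) + 0.5) * h
Zp, Z = np.meshgrid(mesh, mesh, indexing="ij")   # rows: size next year, cols: size now

K = h * (survival(Z) * growth(Zp, Z) + fecundity(Zp, Z))
lam = np.max(np.real(np.linalg.eigvals(K)))      # dominant eigenvalue = asymptotic growth rate
print(f"asymptotic growth rate lambda = {lam:.3f}")
```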

  20. A building extraction approach for Airborne Laser Scanner data utilizing the Object Based Image Analysis paradigm

    NASA Astrophysics Data System (ADS)

    Tomljenovic, Ivan; Tiede, Dirk; Blaschke, Thomas

    2016-10-01

    In the past two decades Object-Based Image Analysis (OBIA) established itself as an efficient approach for the classification and extraction of information from remote sensing imagery and, increasingly, from non-image based sources such as Airborne Laser Scanner (ALS) point clouds. ALS data is represented in the form of a point cloud with recorded multiple returns and intensities. In our work, we combined OBIA with ALS point cloud data in order to identify and extract buildings as 2D polygons representing roof outlines in a top down mapping approach. We performed rasterization of the ALS data into a height raster for the purpose of the generation of a Digital Surface Model (DSM) and a derived Digital Elevation Model (DEM). Further objects were generated in conjunction with point statistics from the linked point cloud. With the use of class modelling methods, we generated the final target class of objects representing buildings. The approach was developed for a test area in Biberach an der Riß (Germany). In order to point out the possibilities of the adaptation-free transferability to another data set, the algorithm has been applied "as is" to the ISPRS Benchmarking data set of Toronto (Canada). The obtained results show high accuracies for the initial study area (thematic accuracies of around 98%, geometric accuracy of above 80%). The very high performance within the ISPRS Benchmark without any modification of the algorithm and without any adaptation of parameters is particularly noteworthy.
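
    The rasterization step described above can be sketched as follows (synthetic points, and a deliberate simplification: the DEM is taken as the per-cell minimum return rather than interpolated from classified ground points).

```python
# Bin ALS-style returns into a grid, take the maximum height per cell as a DSM and
# the minimum as a crude DEM, then threshold the normalized height raster to get
# candidate building cells. All points are synthetic.
import numpy as np

rng = np.random.default_rng(12)
n_pts = 200_000
x, y = rng.uniform(0, 100, n_pts), rng.uniform(0, 100, n_pts)
ground = 0.02 * x + rng.normal(0, 0.05, n_pts)           # gently sloping terrain
on_roof = (x > 30) & (x < 60) & (y > 40) & (y < 70) & (rng.random(n_pts) < 0.8)
z = ground + np.where(on_roof, 8.0, 0.0)                 # flat 8 m roof in a 30 x 30 m block

ix, iy = (x // 1.0).astype(int), (y // 1.0).astype(int)  # 1 m cells
dsm = np.full((100, 100), -np.inf)
dem = np.full((100, 100), np.inf)
np.maximum.at(dsm, (ix, iy), z)                          # highest return per cell
np.minimum.at(dem, (ix, iy), z)                          # lowest return per cell

ndsm = dsm - dem                                         # normalized surface heights
building_cells = ndsm > 2.5                              # simple height threshold
print("candidate building cells:", int(building_cells.sum()))
```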

  1. Generative Topographic Mapping of Conformational Space.

    PubMed

    Horvath, Dragos; Baskin, Igor; Marcou, Gilles; Varnek, Alexandre

    2017-10-01

    Herein, Generative Topographic Mapping (GTM) was challenged to produce planar projections of the high-dimensional conformational space of complex molecules (the 1LE1 peptide). GTM is a probability-based mapping strategy, and its capacity to support property prediction models serves to objectively assess map quality (in terms of regression statistics). The properties to predict were total, non-bonded and contact energies, surface area and fingerprint darkness. Map building and selection were controlled by a previously introduced evolutionary strategy, which was allowed to choose the best-suited conformational descriptors, with options including classical terms and novel atom-centric autocorrelograms. The latter condense interatomic distance patterns into descriptors of rather low dimensionality, yet precise enough to differentiate between close favorable contacts and atom clashes. A subset of 20 K conformers of the 1LE1 peptide, randomly selected from a pool of 2 M geometries (generated by the S4MPLE tool), was employed for map building and cross-validation of property regression models. The GTM build-up challenge reached robust three-fold cross-validated determination coefficients of Q² = 0.7-0.8 for all modeled properties. Mapping of the full 2 M conformer set produced intuitive and information-rich property landscapes. Functional and folding subspaces appear as well-separated zones, even though RMSD with respect to the PDB structure was never used as a selection criterion for the maps. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.

  2. Improving phylogenetic analyses by incorporating additional information from genetic sequence databases.

    PubMed

    Liang, Li-Jung; Weiss, Robert E; Redelings, Benjamin; Suchard, Marc A

    2009-10-01

    Statistical analyses of phylogenetic data culminate in uncertain estimates of underlying model parameters. Lack of additional data hinders the ability to reduce this uncertainty, as the original phylogenetic dataset is often complete, containing the entire gene or genome information available for the given set of taxa. Informative priors in a Bayesian analysis can reduce posterior uncertainty; however, publicly available phylogenetic software specifies vague priors for model parameters by default. We build objective and informative priors using hierarchical random effect models that combine additional datasets whose parameters are not of direct interest but are similar to the analysis of interest. We propose principled statistical methods that permit more precise parameter estimates in phylogenetic analyses by creating informative priors for parameters of interest. Using additional sequence datasets from our lab or public databases, we construct a fully Bayesian semiparametric hierarchical model to combine datasets. A dynamic iteratively reweighted Markov chain Monte Carlo algorithm conveniently recycles posterior samples from the individual analyses. We demonstrate the value of our approach by examining the insertion-deletion (indel) process in the enolase gene across the Tree of Life using the phylogenetic software BALI-PHY; we incorporate prior information about indels from 82 curated alignments downloaded from the BAliBASE database.

  3. A (210)Pb-based chronological model for recent sediments with random entries of mass and activities: Model development.

    PubMed

    Abril Hernández, José-María

    2016-01-01

    Unsupported (210)Pb ((210)Pbexc) vs. mass depth profiles do not contain enough information to extract a unique chronology when both (210)Pbexc fluxes and mass sediment accumulation rates (SAR) vary independently with time. Restrictive assumptions are needed to develop a suitable dating tool. A statistical correlation between fluxes and SAR seems to be a quite general rule. This paper builds up a new (210)Pb-based dating tool by using such a statistical correlation. It operates with SAR and initial activities that closely follow normal distributions, which leads to the expected correlation between fluxes and SAR. An intelligent algorithm finds their best arrangement downcore to fit the experimental (210)Pbexc vs. mass depth profile, then generates solutions for the chronological line and for the histories of SAR and fluxes. Parametric maps of a χ-function serve to find the solution and to support error estimates. Optionally, the model's answers can be better constrained through the use of time markers. The performance of the model is illustrated with a synthetic core, and with real cases using published data for varved sediment cores. Copyright © 2015 Elsevier Ltd. All rights reserved.

  4. Statistical Analysis of Large Simulated Yield Datasets for Studying Climate Effects

    NASA Technical Reports Server (NTRS)

    Makowski, David; Asseng, Senthold; Ewert, Frank; Bassu, Simona; Durand, Jean-Louis; Martre, Pierre; Adam, Myriam; Aggarwal, Pramod K.; Angulo, Carlos; Baron, Christian

    2015-01-01

    Many studies have been carried out during the last decade to study the effect of climate change on crop yields and other key crop characteristics. In these studies, one or several crop models were used to simulate crop growth and development for different climate scenarios that correspond to different projections of atmospheric CO2 concentration, temperature, and rainfall changes (Semenov et al., 1996; Tubiello and Ewert, 2002; White et al., 2011). The Agricultural Model Intercomparison and Improvement Project (AgMIP; Rosenzweig et al., 2013) builds on these studies with the goal of using an ensemble of multiple crop models in order to assess effects of climate change scenarios for several crops in contrasting environments. These studies generate large datasets, including thousands of simulated crop yield data. They include series of yield values obtained by combining several crop models with different climate scenarios that are defined by several climatic variables (temperature, CO2, rainfall, etc.). Such datasets potentially provide useful information on the possible effects of different climate change scenarios on crop yields. However, it is sometimes difficult to analyze these datasets and to summarize them in a useful way due to their structural complexity; simulated yield data can differ among contrasting climate scenarios, sites, and crop models. Another issue is that it is not straightforward to extrapolate the results obtained for the scenarios to alternative climate change scenarios not initially included in the simulation protocols. Additional dynamic crop model simulations for new climate change scenarios are an option but this approach is costly, especially when a large number of crop models are used to generate the simulated data, as in AgMIP. Statistical models have been used to analyze responses of measured yield data to climate variables in past studies (Lobell et al., 2011), but the use of a statistical model to analyze yields simulated by complex process-based crop models is a rather new idea. We demonstrate herewith that statistical methods can play an important role in analyzing simulated yield data sets obtained from the ensembles of process-based crop models. Formal statistical analysis is helpful to estimate the effects of different climatic variables on yield, and to describe the between-model variability of these effects.
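
    As a rough sketch of such an analysis (synthetic data, hypothetical variable names), simulated yields can be regressed on the climatic variables with a crop-model factor, so that average climate effects and between-model variability are summarized in a single fit.

```python
# Regress (synthetic) simulated yields on temperature change, CO2 and rainfall change,
# with a crop-model factor, to summarize climate effects and between-model variability.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
models, scenarios = [f"crop_model_{i}" for i in range(5)], range(40)
rows = []
for m in models:
    slope_t = rng.normal(-0.4, 0.1)            # model-specific temperature response
    for s in scenarios:
        dT, co2, dP = rng.uniform(0, 5), rng.uniform(360, 700), rng.uniform(-20, 20)
        yld = 8 + slope_t * dT + 0.004 * (co2 - 360) + 0.02 * dP + rng.normal(0, 0.3)
        rows.append(dict(model=m, dT=dT, co2=co2, dP=dP, yield_t_ha=yld))
df = pd.DataFrame(rows)

fit = smf.ols("yield_t_ha ~ dT + co2 + dP + C(model)", data=df).fit()
print(fit.params.filter(like="dT"))            # mean temperature effect across models
print(df.groupby("model").yield_t_ha.mean())   # between-model spread in mean yield
```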

  5. Structural equation modeling: building and evaluating causal models: Chapter 8

    USGS Publications Warehouse

    Grace, James B.; Scheiner, Samuel M.; Schoolmaster, Donald R.

    2015-01-01

    Scientists frequently wish to study hypotheses about causal relationships, rather than just statistical associations. This chapter addresses the question of how scientists might approach this ambitious task. Here we describe structural equation modeling (SEM), a general modeling framework for the study of causal hypotheses. Our goals are to (a) concisely describe the methodology, (b) illustrate its utility for investigating ecological systems, and (c) provide guidance for its application. Throughout our presentation, we rely on a study of the effects of human activities on wetland ecosystems to make our description of methodology more tangible. We begin by presenting the fundamental principles of SEM, including both its distinguishing characteristics and the requirements for modeling hypotheses about causal networks. We then illustrate SEM procedures and offer guidelines for conducting SEM analyses. Our focus in this presentation is on basic modeling objectives and core techniques. Pointers to additional modeling options are also given.

  6. Fitting mechanistic epidemic models to data: A comparison of simple Markov chain Monte Carlo approaches.

    PubMed

    Li, Michael; Dushoff, Jonathan; Bolker, Benjamin M

    2018-07-01

    Simple mechanistic epidemic models are widely used for forecasting and parameter estimation of infectious diseases based on noisy case reporting data. Despite the widespread application of models to emerging infectious diseases, we know little about the comparative performance of standard computational-statistical frameworks in these contexts. Here we build a simple stochastic, discrete-time, discrete-state epidemic model with both process and observation error and use it to characterize the effectiveness of different flavours of Bayesian Markov chain Monte Carlo (MCMC) techniques. We use fits to simulated data, where parameters (and future behaviour) are known, to explore the limitations of different platforms and quantify parameter estimation accuracy, forecasting accuracy, and computational efficiency across combinations of modeling decisions (e.g. discrete vs. continuous latent states, levels of stochasticity) and computational platforms (JAGS, NIMBLE, Stan).
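
    The generative side of such a model is easy to sketch; the snippet below (not the authors' code) simulates a discrete-time, discrete-state SIR process with binomial process error and binomial under-reporting as observation error. Fitting it by MCMC in JAGS, NIMBLE or Stan is the part the paper actually compares and is omitted here.

```python
# Discrete-time, discrete-state stochastic SIR with process error (binomial transitions)
# and observation error (binomial under-reporting of new cases).
import numpy as np

rng = np.random.default_rng(4)

def simulate(beta=0.5, gamma=0.2, report_prob=0.6, N=10_000, I0=10, steps=60):
    S, I = N - I0, I0
    observed = []
    for _ in range(steps):
        p_inf = 1.0 - np.exp(-beta * I / N)                   # per-susceptible infection prob
        new_inf = rng.binomial(S, p_inf)                      # process error
        new_rec = rng.binomial(I, 1.0 - np.exp(-gamma))
        S, I = S - new_inf, I + new_inf - new_rec
        observed.append(rng.binomial(new_inf, report_prob))   # observation error
    return np.array(observed)

cases = simulate()
print(cases[:10], "... peak reported cases:", cases.max())
```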

  7. Good initialization model with constrained body structure for scene text recognition

    NASA Astrophysics Data System (ADS)

    Zhu, Anna; Wang, Guoyou; Dong, Yangbo

    2016-09-01

    Scene text recognition has gained significant attention in the computer vision community. Character detection and recognition are the promise of text recognition and affect the overall performance to a large extent. We proposed a good initialization model for scene character recognition from cropped text regions. We use constrained character's body structures with deformable part-based models to detect and recognize characters in various backgrounds. The character's body structures are achieved by an unsupervised discriminative clustering approach followed by a statistical model and a self-build minimum spanning tree model. Our method utilizes part appearance and location information, and combines character detection and recognition in cropped text region together. The evaluation results on the benchmark datasets demonstrate that our proposed scheme outperforms the state-of-the-art methods both on scene character recognition and word recognition aspects.

  8. A sigmoidal model for biosorption of heavy metal cations from aqueous media.

    PubMed

    Özen, Rümeysa; Sayar, Nihat Alpagu; Durmaz-Sam, Selcen; Sayar, Ahmet Alp

    2015-07-01

    A novel multi-input single-output (MISO) black-box sigmoid model is developed to simulate the biosorption of heavy metal cations by the fission yeast from aqueous medium. Validation and verification of the model are done through statistical chi-squared hypothesis tests, and the model is evaluated by uncertainty and sensitivity analyses. The simulated results are in agreement with the data of the studied system, in which Schizosaccharomyces pombe biosorbs Ni(II) cations at various process conditions. Experimental data were obtained originally for this work using dead cells of an adapted variant of S. pombe and are represented by Freundlich isotherms. A process optimization scheme based on the present model is proposed, building a novel application of a cost-merit objective function that would be useful for predicting optimal operating conditions. Copyright © 2015. Published by Elsevier Inc.
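
    As a generic illustration only (the study's multi-input model and experimental data are not reproduced), a single-input sigmoid uptake curve can be fitted and checked with a chi-squared statistic as follows.

```python
# Fit a three-parameter sigmoid q(C) = qmax / (1 + exp(-k (C - C0))) to synthetic
# biosorption data and check the fit with a chi-squared statistic.
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import chi2

def sigmoid(C, qmax, k, C0):
    return qmax / (1.0 + np.exp(-k * (C - C0)))

rng = np.random.default_rng(5)
conc = np.linspace(5, 200, 15)                       # initial metal concentration, mg/L
uptake = sigmoid(conc, 40.0, 0.05, 80.0) + rng.normal(0, 1.0, conc.size)

popt, pcov = curve_fit(sigmoid, conc, uptake, p0=[30.0, 0.1, 50.0])
residuals = uptake - sigmoid(conc, *popt)
chi2_stat = np.sum((residuals / 1.0) ** 2)           # measurement sd assumed to be 1 mg/g
dof = conc.size - len(popt)
print(f"qmax={popt[0]:.1f}, k={popt[1]:.3f}, C0={popt[2]:.1f}, "
      f"chi2 p-value={1 - chi2.cdf(chi2_stat, dof):.2f}")
```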

  9. A female black bear denning habitat model using a geographic information system

    USGS Publications Warehouse

    Clark, J.D.; Hayes, S.G.; Pledger, J.M.

    1998-01-01

    We used the Mahalanobis distance statistic and a raster geographic information system (GIS) to model potential black bear (Ursus americanus) denning habitat in the Ouachita Mountains of Arkansas. The Mahalanobis distance statistic was used to represent the standard squared distance between sample variates in the GIS database (forest cover type, elevation, slope, aspect, distance to streams, distance to roads, and forest cover richness) and variates at known bear dens. Two models were developed: a generalized model for all den locations and another specific to dens in rock cavities. Differences between habitat at den sites and habitat across the study area were represented in 2 new GIS themes as Mahalanobis distance values. Cells similar to the mean vector derived from the known dens had low Mahalanobis distance values, and dissimilar cells had high values. The reliability of the predictive model was tested by overlaying den locations collected subsequent to original model development on the resultant den habitat themes. Although the generalized model demonstrated poor reliability, the model specific to rock dens had good reliability. Bears were more likely to choose rock den locations with low Mahalanobis distance values and less likely to choose those with high values. The model can be used to plan the timing and extent of management actions (e.g., road building, prescribed fire, timber harvest) most appropriate for those sites with high or low denning potential. 
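
    The core computation is compact: estimate the mean vector and covariance of the habitat variables at known den sites, then score every raster cell by its squared Mahalanobis distance to that mean. The sketch below uses hypothetical stand-ins for the GIS layers.

```python
# Mahalanobis-distance habitat scoring: cells similar to known den sites get low values.
import numpy as np

rng = np.random.default_rng(6)
dens = rng.normal(loc=[450.0, 20.0, 300.0], scale=[50.0, 5.0, 80.0], size=(40, 3))
# columns: elevation (m), slope (deg), distance to road (m) at known den sites

mu = dens.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(dens, rowvar=False))

cells = rng.normal(loc=[400.0, 15.0, 250.0], scale=[120.0, 10.0, 150.0], size=(10_000, 3))
diff = cells - mu
d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)   # squared Mahalanobis distance per cell

print("share of cells within the den-like zone (d2 < 7.81, chi2 0.95 quantile, 3 df):",
      np.mean(d2 < 7.81))
```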

  10. Uncertainty analysis of depth predictions from seismic reflection data using Bayesian statistics

    NASA Astrophysics Data System (ADS)

    Michelioudakis, Dimitrios G.; Hobbs, Richard W.; Caiado, Camila C. S.

    2018-03-01

    Estimating the depths of target horizons from seismic reflection data is an important task in exploration geophysics. To constrain these depths we need a reliable and accurate velocity model. Here, we build an optimum 2D seismic reflection data processing flow focused on pre-stack deghosting filters and velocity model building and apply Bayesian methods, including Gaussian process emulation and Bayesian History Matching (BHM), to estimate the uncertainties of the depths of key horizons near the borehole DSDP-258 located in the Mentelle Basin, southwest of Australia, and compare the results with the drilled core from that well. Following this strategy, the tie between the modelled and observed depths from the DSDP-258 core was in accordance with the ±2σ posterior credibility intervals, and predictions for depths to key horizons were made for the two new drill sites, adjacent to the existing borehole of the area. The probabilistic analysis allowed us to generate multiple realizations of pre-stack depth migrated images, which can be directly used to better constrain interpretation and identify potential risk at drill sites. The method will be applied to constrain the drilling targets for the upcoming International Ocean Discovery Program (IODP), leg 369.

  11. Uncertainty analysis of depth predictions from seismic reflection data using Bayesian statistics

    NASA Astrophysics Data System (ADS)

    Michelioudakis, Dimitrios G.; Hobbs, Richard W.; Caiado, Camila C. S.

    2018-06-01

    Estimating the depths of target horizons from seismic reflection data is an important task in exploration geophysics. To constrain these depths we need a reliable and accurate velocity model. Here, we build an optimum 2-D seismic reflection data processing flow focused on pre-stack deghosting filters and velocity model building and apply Bayesian methods, including Gaussian process emulation and Bayesian History Matching, to estimate the uncertainties of the depths of key horizons near the Deep Sea Drilling Project (DSDP) borehole 258 (DSDP-258) located in the Mentelle Basin, southwest of Australia, and compare the results with the drilled core from that well. Following this strategy, the tie between the modelled and observed depths from the DSDP-258 core was in accordance with the ±2σ posterior credibility intervals, and predictions for depths to key horizons were made for the two new drill sites, adjacent to the existing borehole of the area. The probabilistic analysis allowed us to generate multiple realizations of pre-stack depth migrated images, which can be directly used to better constrain interpretation and identify potential risk at drill sites. The method will be applied to constrain the drilling targets for the upcoming International Ocean Discovery Program, leg 369.
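
    The general idea of Gaussian process emulation can be sketched independently of the authors' processing and history-matching workflow: train a GP on a handful of expensive forward-model runs (here a toy velocity-to-depth conversion) and read depth uncertainty off the emulator's predictive spread.

```python
# Gaussian-process emulation of a toy forward model; the emulator's predictive
# standard deviation serves as a depth uncertainty. Not the authors' workflow.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

def forward_model(v):                 # stand-in for an expensive depth conversion
    return 0.5 * v * 1.2 + 0.01 * v**2

rng = np.random.default_rng(7)
v_train = rng.uniform(1500, 3500, size=12)[:, None]      # sparse design points (m/s)
d_train = forward_model(v_train[:, 0]) + rng.normal(0, 5.0, 12)

gp = GaussianProcessRegressor(ConstantKernel(1.0) * RBF(length_scale=500.0),
                              alpha=25.0, normalize_y=True).fit(v_train, d_train)

v_test = np.linspace(1500, 3500, 5)[:, None]
mean, sd = gp.predict(v_test, return_std=True)
for v, m, s in zip(v_test[:, 0], mean, sd):
    print(f"velocity {v:6.0f} m/s -> depth {m:8.1f} +/- {2 * s:5.1f} m (2 sigma)")
```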

  12. Demolition waste generation for development of a regional management chain model.

    PubMed

    Bernardo, Miguel; Gomes, Marta Castilho; de Brito, Jorge

    2016-03-01

    Even though construction and demolition waste (CDW) is the bulkiest waste stream, estimating its quantity and composition in specific regions still faces major difficulties. Therefore, new methods are required, especially for making predictions limited to small areas such as counties. This paper proposes one such method, which makes use of data collected from real demolition works and statistical information on the geographical area under study. Based on a correlation analysis between the demolition waste estimates and indicators such as population density, buildings ageing index, buildings density and land occupation type, relationships are established that can be used to determine demolition waste outputs in a given area. The derived models are presented and explained. This methodology is independent of the specific region with which it is exemplified (the Lisbon Metropolitan Area) and can therefore be applied to any region of the world, from the country to the county level. Generation of demolition waste data at the county level is the basis of the design of a systemic model for CDW management in a region. Future developments proposed include a mixed-integer linear programming formulation of such a recycling network. Copyright © 2016 Elsevier Ltd. All rights reserved.

  13. The Anatomy of American Football: Evidence from 7 Years of NFL Game Data

    PubMed Central

    Papalexakis, Evangelos

    2016-01-01

    How much does a fumble affect the probability of winning an American football game? How balanced should your offense be in order to increase the probability of winning by 10%? These are questions for which the coaching staff of National Football League teams have a clear qualitative answer. Turnovers are costly; turn the ball over several times and you will certainly lose. Nevertheless, what does “several” mean? How “certain” is certainly? In this study, we collected play-by-play data from the past 7 NFL seasons, i.e., 2009–2015, and we build a descriptive model for the probability of winning a game. Despite the fact that our model incorporates simple box score statistics, such as total offensive yards, number of turnovers etc., its overall cross-validation accuracy is 84%. Furthermore, we combine this descriptive model with a statistical bootstrap module to build FPM (short for Football Prediction Matchup) for predicting future match-ups. The contribution of FPM is pertinent to its simplicity and transparency, which however does not sacrifice the system’s performance. In particular, our evaluations indicate that our prediction engine performs on par with the current state-of-the-art systems (e.g., ESPN’s FPI and Microsoft’s Cortana). The latter are typically proprietary but based on their components described publicly they are significantly more complicated than FPM. Moreover, their proprietary nature does not allow for a head-to-head comparison in terms of the core elements of the systems but it should be evident that the features incorporated in FPM are able to capture a large percentage of the observed variance in NFL games. PMID:28005971
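
    A minimal sketch of the two ingredients described, using synthetic data rather than NFL play-by-play: a logistic model of win probability from box-score differentials, combined with a bootstrap over each team's historical stat lines to simulate a future match-up.

```python
# Logistic win-probability model on box-score differentials plus a bootstrap
# over historical stat lines to simulate a future match-up (all data synthetic).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(8)
n = 1800                                                # roughly seven seasons of games
X = np.column_stack([
    rng.normal(0, 80, n),                               # offensive-yard differential
    rng.integers(-3, 4, n),                             # turnover differential (committed - forced)
])
logit = 0.012 * X[:, 0] - 0.9 * X[:, 1]
wins = rng.binomial(1, 1 / (1 + np.exp(-logit)))

model = LogisticRegression().fit(X, wins)

# bootstrap: resample each team's historical stat lines to simulate a match-up
team_a = X[rng.choice(n, 200)]                          # stand-in for team A's games
team_b = X[rng.choice(n, 200)]                          # stand-in for team B's games
sims = 5000
idx_a, idx_b = rng.integers(0, 200, sims), rng.integers(0, 200, sims)
diff = team_a[idx_a] - team_b[idx_b]                    # simulated game differentials
p_win = model.predict_proba(diff)[:, 1].mean()
print(f"bootstrap estimate of P(team A beats team B) = {p_win:.2f}")
```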

  14. Using open source computational tools for predicting human metabolic stability and additional absorption, distribution, metabolism, excretion, and toxicity properties.

    PubMed

    Gupta, Rishi R; Gifford, Eric M; Liston, Ted; Waller, Chris L; Hohman, Moses; Bunin, Barry A; Ekins, Sean

    2010-11-01

    Ligand-based computational models could be more readily shared between researchers and organizations if they were generated with open source molecular descriptors [e.g., chemistry development kit (CDK)] and modeling algorithms, because this would negate the requirement for proprietary commercial software. We initially evaluated open source descriptors and model building algorithms using a training set of approximately 50,000 molecules and a test set of approximately 25,000 molecules with human liver microsomal metabolic stability data. A C5.0 decision tree model demonstrated that CDK descriptors together with a set of Smiles Arbitrary Target Specification (SMARTS) keys had good statistics [κ = 0.43, sensitivity = 0.57, specificity = 0.91, and positive predicted value (PPV) = 0.64], equivalent to those of models built with commercial Molecular Operating Environment 2D (MOE2D) and the same set of SMARTS keys (κ = 0.43, sensitivity = 0.58, specificity = 0.91, and PPV = 0.63). Extending the dataset to ∼193,000 molecules and generating a continuous model using Cubist with a combination of CDK and SMARTS keys or MOE2D and SMARTS keys confirmed this observation. When the continuous predictions and actual values were binned to get a categorical score we observed a similar κ statistic (0.42). The same combination of descriptor set and modeling method was applied to passive permeability and P-glycoprotein efflux data with similar model testing statistics. In summary, open source tools demonstrated predictive results comparable to those of commercial software with attendant cost savings. We discuss the advantages and disadvantages of open source descriptors and the opportunity for their use as a tool for organizations to share data precompetitively, avoiding repetition and assisting drug discovery.
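
    The modelling step (not the descriptor generation, which relies on CDK or MOE outside of this sketch) can be illustrated with a decision-tree classifier on a stand-in binary descriptor matrix, reporting the same categorical statistics quoted above.

```python
# Decision-tree classification on a stand-in binary descriptor matrix, reporting
# kappa, sensitivity, specificity and PPV. The matrix is random placeholder data,
# not microsomal stability measurements or real CDK/SMARTS descriptors.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import cohen_kappa_score, confusion_matrix

rng = np.random.default_rng(9)
X = rng.integers(0, 2, size=(5000, 300)).astype(float)   # binary fingerprint-like keys
y = (X[:, :10].sum(axis=1) + rng.normal(0, 1, 5000) > 5).astype(int)  # stable / unstable

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.33, random_state=0)
clf = DecisionTreeClassifier(max_depth=8, min_samples_leaf=20).fit(X_tr, y_tr)

pred = clf.predict(X_te)
tn, fp, fn, tp = confusion_matrix(y_te, pred).ravel()
print(f"kappa={cohen_kappa_score(y_te, pred):.2f}  "
      f"sensitivity={tp / (tp + fn):.2f}  specificity={tn / (tn + fp):.2f}  "
      f"PPV={tp / (tp + fp):.2f}")
```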

  15. Hydrological analysis in R: Topmodel and beyond

    NASA Astrophysics Data System (ADS)

    Buytaert, W.; Reusser, D.

    2011-12-01

    R is quickly gaining popularity in the hydrological sciences community. The wide range of statistical and mathematical functionality makes it an excellent tool for data analysis, modelling and uncertainty analysis. Topmodel was one of the first hydrological models to be implemented as an R package and distributed through R's own distribution network CRAN. This facilitated pre- and postprocessing of data such as parameter sampling, calculation of prediction bounds, and advanced visualisation. However, apart from these basic functionalities, the package did not use many of the more advanced features of the R environment, especially R's object-oriented functionality. With R's increasing expansion in arenas such as high performance computing, big data analysis, and cloud services, we revisit the topmodel package, and use it as an example of how to build and deploy the next generation of hydrological models. R provides a convenient environment and attractive features to build and couple hydrological - and, by extension, other environmental - models, to develop flexible and effective data assimilation strategies, and to take the model beyond the individual computer by linking into cloud services for both data provision and computing. However, in order to maximise the benefit of these approaches, it will be necessary to adopt standards and ontologies for model interaction and information exchange. Some of those are currently being developed, such as the OGC web processing standards, while others will need to be developed.

  16. Cost-effectiveness Analysis in R Using a Multi-state Modeling Survival Analysis Framework: A Tutorial.

    PubMed

    Williams, Claire; Lewsey, James D; Briggs, Andrew H; Mackay, Daniel F

    2017-05-01

    This tutorial provides a step-by-step guide to performing cost-effectiveness analysis using a multi-state modeling approach. Alongside the tutorial, we provide easy-to-use functions in the statistics package R. We argue that this multi-state modeling approach using a package such as R has advantages over approaches where models are built in a spreadsheet package. In particular, using a syntax-based approach means there is a written record of what was done and the calculations are transparent. Reproducing the analysis is straightforward as the syntax just needs to be run again. The approach can be thought of as an alternative way to build a Markov decision-analytic model, which also has the option to use a state-arrival extended approach. In the state-arrival extended multi-state model, a covariate that represents patients' history is included, allowing the Markov property to be tested. We illustrate the building of multi-state survival models, making predictions from the models and assessing fits. We then proceed to perform a cost-effectiveness analysis, including deterministic and probabilistic sensitivity analyses. Finally, we show how to create 2 common methods of visualizing the results-namely, cost-effectiveness planes and cost-effectiveness acceptability curves. The analysis is implemented entirely within R. It is based on adaptions to functions in the existing R package mstate to accommodate parametric multi-state modeling that facilitates extrapolation of survival curves.

  17. Building and verifying a severity prediction model of acute pancreatitis (AP) based on BISAP, MEWS and routine test indexes.

    PubMed

    Ye, Jiang-Feng; Zhao, Yu-Xin; Ju, Jian; Wang, Wei

    2017-10-01

    To assess the value of the Bedside Index for Severity in Acute Pancreatitis (BISAP), the Modified Early Warning Score (MEWS), serum Ca2+ and red cell distribution width (RDW) for predicting the severity grade of acute pancreatitis (AP), and to develop and verify a more accurate scoring system to predict AP severity. In 302 patients with AP, we calculated BISAP and MEWS scores and used single-factor logistic regression to analyse the relationships of BISAP, RDW, MEWS and serum Ca2+ with AP severity. The variables that were statistically significant in the single-factor logistic regression were entered into a multi-factor logistic regression model; forward stepwise regression was used to screen variables and build a multi-factor prediction model. A receiver operating characteristic (ROC) curve was constructed, and the performance of the multi- and single-factor prediction models in predicting AP severity was evaluated using the area under the ROC curve (AUC). The internal validity of the model was verified through bootstrapping. Among the 302 patients with AP, 209 had mild acute pancreatitis (MAP) and 93 had severe acute pancreatitis (SAP). Single-factor logistic regression showed that BISAP, MEWS and serum Ca2+ are predictors of AP severity (P<0.001), whereas RDW is not (P>0.05). Multi-factor logistic regression showed that BISAP and serum Ca2+ are independent predictors of AP severity (P<0.001), whereas MEWS is not (P>0.05); BISAP is negatively correlated with serum Ca2+ (r=-0.330, P<0.001). The constructed model is: ln(p/(1-p)) = 7.306 + 1.151*BISAP - 4.516*serum Ca2+, where p is the predicted probability of SAP. The predictive ability for SAP ranks as follows: combined BISAP and serum Ca2+ model > serum Ca2+ > BISAP. The difference in predictive ability between BISAP and serum Ca2+ alone is not statistically significant (P>0.05), whereas the newly built model is significantly better than either BISAP or serum Ca2+ individually (P<0.01). Verification of the internal validity of the models by bootstrapping is favorable. BISAP and serum Ca2+ have high predictive value for AP severity, but the model combining BISAP and serum Ca2+ is markedly superior to either alone. Furthermore, this model is simple, practical and appropriate for clinical use. Copyright © 2016. Published by Elsevier Masson SAS.
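
    Because the reconstructed equation is a logistic model, a predicted probability of SAP follows directly from the published coefficients; units for serum Ca2+ are assumed here to be mmol/L, which the abstract does not state.

```python
# Predicted probability of severe acute pancreatitis from the published logistic
# coefficients. Ca2+ units (mmol/L) are an assumption, not stated in the abstract.
import math

def p_sap(bisap_score: int, serum_ca: float) -> float:
    logit = 7.306 + 1.151 * bisap_score - 4.516 * serum_ca
    return 1.0 / (1.0 + math.exp(-logit))

print(f"BISAP 3, Ca 2.0 mmol/L -> P(SAP) = {p_sap(3, 2.0):.2f}")
print(f"BISAP 1, Ca 2.4 mmol/L -> P(SAP) = {p_sap(1, 2.4):.2f}")
```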

  18. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jones, D.W.

    In previous reports, we have identified two potentially important issues, solutions to which would increase the attractiveness of DOE-developed technologies in commercial buildings energy systems. One issue concerns the fact that in addition to saving energy, many new technologies offer non-energy benefits that contribute to building productivity (firm profitability). The second issue is that new technologies are typically unproven in the eyes of decision makers and must bear risk premiums that offset cost advantages resulting from laboratory calculations. Even though a compelling case can be made for the importance of these issues, for building decision makers to incorporate them in business decisions and for DOE to use them in R&D program planning there must be robust empirical evidence of their existence and size. This paper investigates how such measurements could be made and offers recommendations as to preferred options. There is currently little systematic information on either of these concepts in the literature. Of the two there is somewhat more information on non-energy benefits, but little as regards office buildings. Office building productivity impacts can be observed casually, but must be estimated statistically, because buildings have many interacting attributes and observations based on direct behavior can easily confuse the process of attribution. For example, absenteeism can be easily observed. However, absenteeism may be down because a more healthy space conditioning system was put into place, because the weather was milder, or because firm policy regarding sick days had changed. There is also a general dearth of appropriate information for purposes of estimation. To overcome these difficulties, we propose developing a new data base and applying the technique of hedonic price analysis. This technique has been used extensively in the analysis of residential dwellings. There is also a literature on its application to commercial and industrial buildings. Commercially available data bases exist that, if supplemented with engineering surveys of equipment and materials use, could be analyzed statistically with a hedonic price model for the valuation of both the energy-saving and productivity effects of building technologies. Uncertainties about technology performance can cause investors to delay deploying new technologies. This behavior is explained by the ''investment under uncertainty'' literature. This literature suggests that under conditions of irrecoverable (''sunk'') costs, uncertain outcomes, and the ability to defer deployment, decision makers focus on potential losses and demand risk premiums, consistent with the so-called ''bad news principle'' of focusing on losses. We describe a series of approaches to isolating buyer perceptions of uncertainty and means for reducing uncertainty.

  19. Statistical emulation of landslide-induced tsunamis at the Rockall Bank, NE Atlantic

    PubMed Central

    Guillas, S.; Georgiopoulou, A.; Dias, F.

    2017-01-01

    Statistical methods constitute a useful approach to understand and quantify the uncertainty that governs complex tsunami mechanisms. Numerical experiments may often have a high computational cost. This forms a limiting factor for performing uncertainty and sensitivity analyses, where numerous simulations are required. Statistical emulators, as surrogates of these simulators, can provide predictions of the physical process in a much faster and computationally inexpensive way. They can form a prominent solution to explore thousands of scenarios that would be otherwise numerically expensive and difficult to achieve. In this work, we build a statistical emulator of the deterministic codes used to simulate submarine sliding and tsunami generation at the Rockall Bank, NE Atlantic Ocean, in two stages. First we calibrate, against observations of the landslide deposits, the parameters used in the landslide simulations. This calibration is performed under a Bayesian framework using Gaussian Process (GP) emulators to approximate the landslide model, and the discrepancy function between model and observations. Distributions of the calibrated input parameters are obtained as a result of the calibration. In a second step, a GP emulator is built to mimic the coupled landslide-tsunami numerical process. The emulator propagates the uncertainties in the distributions of the calibrated input parameters inferred from the first step to the outputs. As a result, a quantification of the uncertainty of the maximum free surface elevation at specified locations is obtained. PMID:28484339
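
    The two-stage structure described here (calibrate the landslide parameters against deposit observations, then emulate the coupled landslide-tsunami code) is the same Gaussian-process emulation pattern sketched earlier for the seismic depth example; the calibration step additionally places a prior over the landslide parameters and updates it with the discrepancy between emulator output and observations.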

  20. Statistical emulation of landslide-induced tsunamis at the Rockall Bank, NE Atlantic.

    PubMed

    Salmanidou, D M; Guillas, S; Georgiopoulou, A; Dias, F

    2017-04-01

    Statistical methods constitute a useful approach to understand and quantify the uncertainty that governs complex tsunami mechanisms. Numerical experiments may often have a high computational cost. This forms a limiting factor for performing uncertainty and sensitivity analyses, where numerous simulations are required. Statistical emulators, as surrogates of these simulators, can provide predictions of the physical process in a much faster and computationally inexpensive way. They can form a prominent solution to explore thousands of scenarios that would be otherwise numerically expensive and difficult to achieve. In this work, we build a statistical emulator of the deterministic codes used to simulate submarine sliding and tsunami generation at the Rockall Bank, NE Atlantic Ocean, in two stages. First we calibrate, against observations of the landslide deposits, the parameters used in the landslide simulations. This calibration is performed under a Bayesian framework using Gaussian Process (GP) emulators to approximate the landslide model, and the discrepancy function between model and observations. Distributions of the calibrated input parameters are obtained as a result of the calibration. In a second step, a GP emulator is built to mimic the coupled landslide-tsunami numerical process. The emulator propagates the uncertainties in the distributions of the calibrated input parameters inferred from the first step to the outputs. As a result, a quantification of the uncertainty of the maximum free surface elevation at specified locations is obtained.

  1. Granular support vector machines with association rules mining for protein homology prediction.

    PubMed

    Tang, Yuchun; Jin, Bo; Zhang, Yan-Qing

    2005-01-01

    Protein homology prediction between protein sequences is one of the critical problems in computational biology. Such a complex classification problem is common in medical or biological information processing applications. How to build a model with superior generalization capability from training samples is an essential issue for mining knowledge to accurately predict/classify unseen new samples and to effectively support human experts in making correct decisions. A new learning model called granular support vector machines (GSVM) is proposed based on our previous work. GSVM systematically and formally combines the principles from statistical learning theory and granular computing theory and thus provides an interesting new mechanism to address complex classification problems. It works by building a sequence of information granules and then building support vector machines (SVM) in some of these information granules on demand. A good granulation method to find suitable granules is crucial for modeling a GSVM with good performance. In this paper, we also propose an association rules-based granulation method. Granules induced by association rules with high enough confidence and significant support are left as they are, because of their high "purity" and significant effect on simplifying the classification task. For every other granule, an SVM is modeled to discriminate the corresponding data. In this way, a complex classification problem is divided into multiple smaller problems so that the learning task is simplified. The proposed algorithm, here named GSVM-AR, is compared with SVM on the KDDCUP04 protein homology prediction data. The experimental results show that finding the splitting hyperplane is not a trivial task (the association rules must be selected carefully to avoid overfitting) and that GSVM-AR shows significant improvement compared to building one single SVM in the whole feature space. Another advantage is that GSVM-AR is easy to implement. More importantly, GSVM provides a new mechanism to address complex classification problems.
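
    The granulation idea can be sketched in a few lines (synthetic data; the actual GSVM-AR rule mining on the KDDCUP04 data is not reproduced): a high-confidence association rule carves off a "pure" granule that is labelled directly, and an SVM is trained only on the remaining granule.

```python
# Rule-plus-SVM granulation sketch: a high-confidence rule decides one granule
# directly; an SVM handles the rest. Data and the rule itself are synthetic.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(10)
X = rng.normal(size=(3000, 5))
y = ((X[:, 0] > 1.2) | (X[:, 1] + X[:, 2] > 1.5)).astype(int)

# granule 1: records satisfying a mined rule "feature_0 > 1.2 => positive"
in_rule = X[:, 0] > 1.2
print("rule confidence:", y[in_rule].mean())          # high purity -> label directly

# granule 2: everything else gets its own SVM
svm = SVC(kernel="rbf", C=1.0).fit(X[~in_rule], y[~in_rule])

def predict(X_new):
    out = np.empty(len(X_new), dtype=int)
    rule_mask = X_new[:, 0] > 1.2
    out[rule_mask] = 1                                 # decided by the rule granule
    out[~rule_mask] = svm.predict(X_new[~rule_mask])   # decided by the SVM granule
    return out

X_test = rng.normal(size=(500, 5))
y_test = ((X_test[:, 0] > 1.2) | (X_test[:, 1] + X_test[:, 2] > 1.5)).astype(int)
print("accuracy:", (predict(X_test) == y_test).mean())
```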

  2. Building a statistical emulator for prediction of crop yield response to climate change: a global gridded panel data set approach

    NASA Astrophysics Data System (ADS)

    Mistry, Malcolm; De Cian, Enrica; Wing, Ian Sue

    2015-04-01

    There is widespread concern that trends and variability in weather induced by climate change will detrimentally affect global agricultural productivity and food supplies. Reliable quantification of the risks of negative impacts at regional and global scales is a critical research need, which has so far been met by forcing state-of-the-art global gridded crop models with outputs of global climate model (GCM) simulations in exercises such as the Inter-Sectoral Impact Model Intercomparison Project (ISIMIP)-Fastrack. Notwithstanding such progress, it remains challenging to use these simulation-based projections to assess agricultural risk because their gridded fields of crop yields are fundamentally denominated as discrete combinations of warming scenarios, GCMs and crop models, and not as model-specific or model-averaged yield response functions of meteorological shifts, which may have their own independent probability of occurrence. By contrast, the empirical climate economics literature has adeptly represented agricultural responses to meteorological variables as reduced-form statistical response surfaces which identify the crop productivity impacts of additional exposure to different intervals of temperature and precipitation [cf Schlenker and Roberts, 2009]. This raises several important questions: (1) what do the equivalent reduced-form statistical response surfaces look like for crop model outputs, (2) do they exhibit systematic variation over space (e.g., crop suitability zones) or across crop models with different characteristics, (3) how do they compare to estimates based on historical observations, and (4) what are the implications for the characterization of climate risks? We address these questions by estimating statistical yield response functions for four major crops (maize, rice, wheat and soybeans) over the historical period (1971-2004) as well as future climate change scenarios (2005-2099) using ISIMIP-Fastrack data for five GCMs and seven crop models under rain-fed and irrigated management regimes. Our approach, which is patterned after Lobell and Burke [2010], is a novel application of cross-section/time-series statistical techniques from the climate economics literature to large, high-dimension, multi-model datasets, and holds considerable promise as a diagnostic methodology to elucidate uncertainties in the processes simulated by crop models, and to support the development of climate impact intercomparison exercises.
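
    In the spirit of the reduced-form response surfaces cited above, the sketch below regresses log yield on the growing season's exposure time in discrete temperature bins with grid-cell fixed effects; the panel is synthetic, not ISIMIP output.

```python
# Reduced-form response surface: log yield regressed on days spent in temperature
# bins, with grid-cell fixed effects (synthetic panel data).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(11)
cells, years = 200, 30
rows = []
for c in range(cells):
    cell_effect = rng.normal(0, 0.1)
    for t in range(years):
        days = int(rng.integers(100, 140))             # growing-season length varies
        d_cool, d_warm, d_hot = rng.multinomial(days, [0.4, 0.45, 0.15])
        log_yield = (2.0 + cell_effect + 0.002 * d_cool + 0.001 * d_warm
                     - 0.006 * d_hot + rng.normal(0, 0.05))
        rows.append(dict(cell=c, d_cool=d_cool, d_warm=d_warm, d_hot=d_hot,
                         log_yield=log_yield))
panel = pd.DataFrame(rows)

fit = smf.ols("log_yield ~ d_cool + d_warm + d_hot + C(cell)", data=panel).fit()
print(fit.params[["d_cool", "d_warm", "d_hot"]])   # per-day effect of each exposure bin
```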

  3. Computational Process Modeling for Additive Manufacturing

    NASA Technical Reports Server (NTRS)

    Bagg, Stacey; Zhang, Wei

    2014-01-01

    Computational Process and Material Modeling of Powder Bed additive manufacturing of IN 718. Optimize material build parameters with reduced time and cost through modeling. Increase understanding of build properties. Increase reliability of builds. Decrease time to adoption of process for critical hardware. Potential to decrease post-build heat treatments. Conduct single-track and coupon builds at various build parameters. Record build parameter information and QM Meltpool data. Refine Applied Optimization powder bed AM process model using data. Report thermal modeling results. Conduct metallography of build samples. Calibrate STK models using metallography findings. Run STK models using AO thermal profiles and report STK modeling results. Validate modeling with additional build. Photodiode Intensity measurements highly linear with power input. Melt Pool Intensity highly correlated to Melt Pool Size. Melt Pool size and intensity increase with power. Applied Optimization will use data to develop powder bed additive manufacturing process model.

  4. A Description of the Building Materials Data Base for Portland, Maine.

    DTIC Science & Technology

    1986-06-01

    Keywords: acid precipitation, data bases, damage assessment, environmental protection, damage from acid deposition, Portland, Maine, damage to buildings, statistical analysis. Abstract (fragment): ... types and amounts of building surface materials exposed to acid deposition. The stratified, systematic, unaligned random sampling approach was used ...

  5. Body build classes as a method for systematization of age-related anthropometric changes in girls aged 7-8 and 17-18 years.

    PubMed

    Kasmel, Jaan; Kaarma, Helje; Koskel, Säde; Tiit, Ene-Margit

    2004-03-01

    A total of 462 schoolgirls aged 7-8 and 17-18 years were examined anthropometrically (45 body measurements and 10 skinfolds) in a cross-sectional study. The data were processed in two age groups: 7-8-year-olds (n = 205) and 17-18-year-olds (n = 257). Relying on average height and weight in the groups, both groups were divided into five body build classes: small, medium, large, pyknomorphous and leptomorphous. In these classes, the differences in all other body measurements were compared, and in both age groups, analogous systematic differences were found in length, width and depth measurements and circumferences. This enabled us to compare proportional changes in body measurements over ten years, using ratios of the averages of basic measurements and measurement groups within the same body build classes. Statistical analysis by the sign test revealed statistically significant differences between the body build classes in the growth of the averages. Girls belonging to the small class differed from girls of the large class by a substantially greater increase in their measurements. Our results suggest that the growth rate of body measurements of girls with different body builds can be studied with the help of body build classification.

  6. Gaussian process-based surrogate modeling framework for process planning in laser powder-bed fusion additive manufacturing of 316L stainless steel

    DOE PAGES

    Tapia, Gustavo; Khairallah, Saad A.; Matthews, Manyalibo J.; ...

    2017-09-22

    Here, Laser Powder-Bed Fusion (L-PBF) metal-based additive manufacturing (AM) is complex and not fully understood. Successful processing for one material might not necessarily apply to a different material. This paper describes a workflow process that aims at creating a material data sheet standard that describes regimes where the process can be expected to be robust. The procedure consists of building a Gaussian process-based surrogate model of the L-PBF process that predicts melt pool depth in single-track experiments given a laser power, scan speed, and laser beam size combination. The predictions are then mapped onto a power versus scan speed diagram delimiting the conduction-controlled from the keyhole melting-controlled regimes. This statistical framework is shown to be robust even for cases where experimental training data might be suboptimal in quality, if appropriate physics-based filters are applied. Additionally, it is demonstrated that a high-fidelity simulation model of L-PBF can equally be used successfully for building a surrogate model, which is beneficial since simulations are becoming more efficient and are more practical for studying the response of different materials than re-tooling an AM machine for a new material powder.
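    A minimal sketch of a Gaussian-process surrogate of the kind described, assuming melt-pool depth is predicted from laser power, scan speed, and beam size; the training values below are invented placeholders, not the paper's single-track data.

        # Illustrative Gaussian-process surrogate: melt-pool depth predicted from
        # (laser power, scan speed, beam size). Training values are placeholders.
        import numpy as np
        from sklearn.gaussian_process import GaussianProcessRegressor
        from sklearn.gaussian_process.kernels import RBF, WhiteKernel

        X_train = np.array([[150., 1000., 50.],    # power (W), speed (mm/s), spot size (um)
                            [200., 1500., 50.],
                            [300., 1000., 70.],
                            [250.,  800., 60.]])
        y_train = np.array([45., 38., 95., 80.])   # melt-pool depth (um), invented numbers

        kernel = RBF(length_scale=[50., 300., 10.]) + WhiteKernel(noise_level=4.0)
        gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X_train, y_train)

        # Predictive mean and uncertainty can then be mapped over a power-speed grid
        # to delimit conduction-controlled from keyhole-controlled regimes.
        mean, std = gp.predict(np.array([[220., 1200., 55.]]), return_std=True)
        print(mean, std)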

  7. Gaussian process-based surrogate modeling framework for process planning in laser powder-bed fusion additive manufacturing of 316L stainless steel

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tapia, Gustavo; Khairallah, Saad A.; Matthews, Manyalibo J.

    Here, Laser Powder-Bed Fusion (L-PBF) metal-based additive manufacturing (AM) is complex and not fully understood. Successful processing for one material might not necessarily apply to a different material. This paper describes a workflow process that aims at creating a material data sheet standard that describes regimes where the process can be expected to be robust. The procedure consists of building a Gaussian process-based surrogate model of the L-PBF process that predicts melt pool depth in single-track experiments given a laser power, scan speed, and laser beam size combination. The predictions are then mapped onto a power versus scan speed diagram delimiting the conduction-controlled from the keyhole melting-controlled regimes. This statistical framework is shown to be robust even for cases where experimental training data might be suboptimal in quality, if appropriate physics-based filters are applied. Additionally, it is demonstrated that a high-fidelity simulation model of L-PBF can equally be used successfully for building a surrogate model, which is beneficial since simulations are becoming more efficient and are more practical for studying the response of different materials than re-tooling an AM machine for a new material powder.

  8. Sampling design for the 1980 commercial and multifamily residential building survey

    NASA Astrophysics Data System (ADS)

    Bowen, W. M.; Olsen, A. R.; Nieves, A. L.

    1981-06-01

    The extent to which new building design practices comply with the proposed 1980 energy budget levels for commercial and multifamily residential building designs (DEB-80) can be assessed by: (1) identifying a small number of building types which account for the majority of commercial buildings constructed in the U.S.A.; (2) conducting a separate survey for each building type; and (3) including only buildings designed during 1980. For each building, the design energy consumption (DEC-80) will be determined by the DOE2.1 computer program, and the quantity X = (DEC-80 - DEB-80) will be computed. These X quantities can then be used to compute sample statistics. Inferences about nationwide compliance with DEB-80 may then be made for each building type. Details of the population, sampling frame, stratification, sample size, and implementation of the sampling plan are provided.
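    A small, hypothetical illustration of the compliance statistic described above: X = DEC-80 - DEB-80 is computed for each sampled building and summarized with simple sample statistics. The numeric values are invented for demonstration.

        # Invented numbers illustrating the compliance statistic X = DEC-80 - DEB-80
        # and the sample statistics computed from it for one building type.
        import numpy as np

        dec_80 = np.array([118.0, 95.0, 130.0, 102.0])   # design energy consumption
        deb_80 = np.array([110.0, 110.0, 125.0, 105.0])  # proposed design energy budget
        x = dec_80 - deb_80                              # positive values exceed the budget

        print("mean X:", x.mean(),
              "std:", x.std(ddof=1),
              "share of designs over budget:", (x > 0).mean())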

  9. Predictor characteristics necessary for building a clinically useful risk prediction model: a simulation study.

    PubMed

    Schummers, Laura; Himes, Katherine P; Bodnar, Lisa M; Hutcheon, Jennifer A

    2016-09-21

    Compelled by the intuitive appeal of predicting each individual patient's risk of an outcome, there is a growing interest in risk prediction models. While the statistical methods used to build prediction models are increasingly well understood, the literature offers little insight to researchers seeking to gauge a priori whether a prediction model is likely to perform well for their particular research question. The objective of this study was to inform the development of new risk prediction models by evaluating model performance under a wide range of predictor characteristics. Data from all births to overweight or obese women in British Columbia, Canada from 2004 to 2012 (n = 75,225) were used to build a risk prediction model for preeclampsia. The data were then augmented with simulated predictors of the outcome with pre-set prevalence values and univariable odds ratios. We built 120 risk prediction models that included known demographic and clinical predictors, and one, three, or five of the simulated variables. Finally, we evaluated standard model performance criteria (discrimination, risk stratification capacity, calibration, and Nagelkerke's R2) for each model. Findings from our models built with simulated predictors demonstrated the predictor characteristics required for a risk prediction model to adequately discriminate cases from non-cases and to adequately classify patients into clinically distinct risk groups. Several predictor characteristics can yield well performing risk prediction models; however, these characteristics are not typical of predictor-outcome relationships in many population-based or clinical data sets. Novel predictors must be both strongly associated with the outcome and prevalent in the population to be useful for clinical prediction modeling (e.g., one predictor with prevalence ≥20% and odds ratio ≥8, or 3 predictors with prevalence ≥10% and odds ratios ≥4). Area under the receiver operating characteristic curve values of >0.8 were necessary to achieve reasonable risk stratification capacity. Our findings provide a guide for researchers to estimate the expected performance of a prediction model before a model has been built based on the characteristics of available predictors.
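    The sketch below illustrates, under stated assumptions, the style of simulation described: a synthetic binary predictor with a chosen prevalence and univariable odds ratio is added to baseline covariates, a logistic model is refit, and discrimination is checked via the AUC. The names, the prevalence/odds-ratio calculation, and the toy data are illustrative, not the study's cohort or code.

        # Illustrative simulation: add a synthetic binary predictor with a chosen
        # prevalence and univariable odds ratio to baseline covariates, refit a
        # logistic model, and check discrimination (AUC). Toy data only.
        import numpy as np
        from sklearn.linear_model import LogisticRegression
        from sklearn.metrics import roc_auc_score

        rng = np.random.default_rng(0)

        def add_simulated_predictor(y, prevalence, odds_ratio):
            p0 = prevalence                        # approximate prevalence among controls
            odds1 = odds_ratio * p0 / (1 - p0)
            p1 = odds1 / (1 + odds1)               # implied prevalence among cases
            return np.where(y == 1,
                            rng.random(len(y)) < p1,
                            rng.random(len(y)) < p0).astype(int)

        n = 5000
        X_base = rng.normal(size=(n, 3))           # stand-ins for clinical predictors
        y = (rng.random(n) < 0.05).astype(int)     # roughly 5% outcome prevalence

        z = add_simulated_predictor(y, prevalence=0.20, odds_ratio=8.0)
        X = np.column_stack([X_base, z])
        model = LogisticRegression(max_iter=1000).fit(X, y)
        print("AUC:", roc_auc_score(y, model.predict_proba(X)[:, 1]))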

  10. A Seasonal Time-Series Model Based on Gene Expression Programming for Predicting Financial Distress

    PubMed Central

    2018-01-01

    Financial distress prediction is an important and challenging research topic in the financial field. Many methods currently exist for predicting firm bankruptcy and financial crisis, including artificial intelligence and traditional statistical methods, and past studies have shown that artificial intelligence methods outperform traditional statistical methods. Financial statements are quarterly reports; hence, corporate financial crises form seasonal time-series data, and the attribute data affecting the financial distress of companies are nonlinear and nonstationary time series with fluctuations. Therefore, this study employed a nonlinear attribute selection method to build a nonlinear financial distress prediction model: that is, this paper proposes a novel seasonal time-series gene expression programming model for predicting the financial distress of companies. The proposed model has several advantages, including the following: (i) unlike previous models, the proposed model incorporates the concept of time series; (ii) the proposed integrated attribute selection method can find the core attributes and reduce high-dimensional data; and (iii) the proposed model can generate rules and mathematical formulas of financial distress, providing references to investors and decision makers. The results show that the proposed method outperforms the listed classifiers under three criteria; hence, the proposed model has competitive advantages in predicting the financial distress of companies. PMID:29765399

  11. A Seasonal Time-Series Model Based on Gene Expression Programming for Predicting Financial Distress.

    PubMed

    Cheng, Ching-Hsue; Chan, Chia-Pang; Yang, Jun-He

    2018-01-01

    Financial distress prediction is an important and challenging research topic in the financial field. Many methods currently exist for predicting firm bankruptcy and financial crisis, including artificial intelligence and traditional statistical methods, and past studies have shown that artificial intelligence methods outperform traditional statistical methods. Financial statements are quarterly reports; hence, corporate financial crises form seasonal time-series data, and the attribute data affecting the financial distress of companies are nonlinear and nonstationary time series with fluctuations. Therefore, this study employed a nonlinear attribute selection method to build a nonlinear financial distress prediction model: that is, this paper proposes a novel seasonal time-series gene expression programming model for predicting the financial distress of companies. The proposed model has several advantages, including the following: (i) unlike previous models, the proposed model incorporates the concept of time series; (ii) the proposed integrated attribute selection method can find the core attributes and reduce high-dimensional data; and (iii) the proposed model can generate rules and mathematical formulas of financial distress, providing references to investors and decision makers. The results show that the proposed method outperforms the listed classifiers under three criteria; hence, the proposed model has competitive advantages in predicting the financial distress of companies.

  12. Final Status Survey Report for Corrective Action Unit 117 - Pluto Disassembly Facility, Building 2201, Nevada National Security Site, Nevada

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jeremy Gwin and Douglas Frenette

    This document contains the process knowledge, radiological data, and subsequent statistical methodology and analysis to support approval for the radiological release of Corrective Action Unit (CAU) 117 – Pluto Disassembly Facility, Building 2201, located in Area 26 of the Nevada National Security Site (NNSS). Preparations for release of the building began in 2009 and followed the methodology described in the Multi-Agency Radiation Survey and Site Investigation Manual (MARSSIM). MARSSIM is the DOE approved process for release of Real Property (buildings and landmasses) to a set of established criteria or authorized limits. The pre-approved authorized limits for surface contamination values and corresponding assumptions were established by DOE O 5400.5. The release criteria coincide with the acceptance criteria of the U10C landfill permit. The U10C landfill is the proposed location to dispose of the radiologically non-impacted, or “clean,” building rubble following demolition. However, other disposition options that include the building and/or waste remaining at the NNSS may be considered, provided that the same release limits apply. The Final Status Survey was designed following MARSSIM guidance by reviewing historical documentation and radiological survey data. Following this review, a formal radiological characterization survey was performed in two phases. The characterization revealed multiple areas of residual radioactivity above the release criteria. These locations were remediated (decontaminated), and the surface activity was then verified to be less than the release criteria. Once remediation efforts had been successfully completed, a Final Status Survey Plan (10-015, “Final Status Survey Plan for Corrective Action Unit 117 – Pluto Disassembly Facility, Building 2201”) was developed and implemented to complete the final step in the MARSSIM process, the Final Status Survey. The Final Status Survey Plan consisted of categorizing each individual room into one of three categories: Class 1, Class 2, or Class 3 (a fourth category is a “Non-Impacted Class,” which in the case of Building 2201 only pertained to exterior surfaces of the building). The majority of the rooms were determined to fall in the less restrictive Class 3 category; however, Rooms 102, 104, 106, and 107 were identified as containing Class 1 and Class 2 areas. Building 2201 was divided into “survey units” and surveyed following the requirements of the Final Status Survey Plan for each particular class. As each survey unit was completed and documented, the survey results were evaluated. Each sample (static measurement) with units of counts per minute (cpm) was corrected for the appropriate background and converted to a value with units of dpm/100 cm2. With a surface contamination value in the appropriate units, it was compared to the surface contamination limits, or in this case the derived concentration guideline level (DCGLw). The appropriate statistical test (sign test) was then performed. If the survey unit was statistically determined to be below the DCGLw, then the survey unit passed and the null hypothesis (that the survey unit is above limits) was rejected. If the test statistic was equal to or below the critical value in the sign test, the null hypothesis was not rejected. This process was performed for all survey units within Building 2201. A total of thirty-three “Class 1,” four “Class 2,” and one “Class 3” survey units were developed, surveyed, and evaluated. All survey units successfully passed the statistical test. Building 2201 meets the release criteria commensurate with the Waste Acceptance Criteria (for radiological purposes) of the U10C landfill permit residing within NNSS boundaries. Based on the thorough statistical sampling and scanning of the building's interior, Building 2201 may be considered radiologically “clean,” or free of contamination.
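    A minimal, hypothetical illustration of the sign test used in this kind of MARSSIM evaluation: the null hypothesis is that the survey-unit median exceeds the DCGLw, and it is rejected when sufficiently many measurements fall below the DCGLw. The limit and measurement values below are placeholders, not survey data from Building 2201.

        # Placeholder numbers illustrating the MARSSIM-style sign test: the null
        # hypothesis is that the survey-unit median exceeds the DCGLw; it is
        # rejected when enough measurements fall below the DCGLw.
        import numpy as np
        from scipy.stats import binomtest

        dcgl_w = 5000.0                                         # dpm/100 cm2, example limit
        measurements = np.array([1200., 800., 2400., 600., 3100., 900., 1500.])

        n_below = int((measurements < dcgl_w).sum())
        result = binomtest(n_below, len(measurements), p=0.5, alternative="greater")
        print("p-value:", result.pvalue)                        # small p-value => unit passes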

  13. Damage to urban buildings in zones of intensities VIII and VII during the Wenchuan earthquake and discussion on some typical damages

    NASA Astrophysics Data System (ADS)

    Sun, Jingjiang; Tang, Yuhong; Zheng, Chao; Shi, Hongbin; Lin, Lin; Sun, Zhongxian

    2009-04-01

    The outline and typical characteristics of damage to buildings in Jiangyou city and Anxian county (intensity VIII) and in Mianyang city and Deyang city (intensity VII) are introduced in this paper. The damage ratios, based on sample statistics of multi-story brick buildings together with multi-story brick buildings with an RC frame at the first story (BBF), are presented. Some typical damage patterns are then discussed, such as horizontal cracks in brick masonry buildings, X-shaped cracks on the walls under windows, the damage to columns, beams and infill walls of frame buildings, and the damage to half circle-shaped masonry walls.

  14. Dynamics of embryonic stem cell differentiation inferred from single-cell transcriptomics show a series of transitions through discrete cell states

    PubMed Central

    Jang, Sumin; Choubey, Sandeep; Furchtgott, Leon; Zou, Ling-Nan; Doyle, Adele; Menon, Vilas; Loew, Ethan B; Krostag, Anne-Rachel; Martinez, Refugio A; Madisen, Linda; Levi, Boaz P; Ramanathan, Sharad

    2017-01-01

    The complexity of gene regulatory networks that lead multipotent cells to acquire different cell fates makes a quantitative understanding of differentiation challenging. Using a statistical framework to analyze single-cell transcriptomics data, we infer the gene expression dynamics of early mouse embryonic stem (mES) cell differentiation, uncovering discrete transitions across nine cell states. We validate the predicted transitions across discrete states using flow cytometry. Moreover, using live-cell microscopy, we show that individual cells undergo abrupt transitions from a naïve to primed pluripotent state. Using the inferred discrete cell states to build a probabilistic model for the underlying gene regulatory network, we further predict and experimentally verify that these states have unique response to perturbations, thus defining them functionally. Our study provides a framework to infer the dynamics of differentiation from single cell transcriptomics data and to build predictive models of the gene regulatory networks that drive the sequence of cell fate decisions during development. DOI: http://dx.doi.org/10.7554/eLife.20487.001 PMID:28296635

  15. Feasibility and effectiveness of a brief, intensive phylogenetics workshop in a middle-income country.

    PubMed

    Pollett, S; Leguia, M; Nelson, M I; Maljkovic Berry, I; Rutherford, G; Bausch, D G; Kasper, M; Jarman, R; Melendrez, M

    2016-01-01

    There is an increasing role for bioinformatic and phylogenetic analysis in tropical medicine research. However, scientists working in low- and middle-income regions may lack access to training opportunities in these methods. To help address this gap, a 5-day intensive bioinformatics workshop was offered in Lima, Peru. The syllabus is presented here for others who want to develop similar programs. To assess knowledge gained, a 20-point knowledge questionnaire was administered to participants (n = 21) before and after the workshop, covering topics on sequence quality control, alignment/formatting, database retrieval, models of evolution, sequence statistics, tree building, and results interpretation. Evolution/tree-building methods represented the lowest-scoring domain both at baseline and after the workshop. There was a considerable median gain in total knowledge scores (an increase of 30%, p<0.001), with gains as high as 55%. A 5-day workshop model was effective in improving the pathogen-applied bioinformatics knowledge of scientists working in a middle-income country setting. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.

  16. Processing system of jaws tomograms for pathology identification and surgical guide modeling

    NASA Astrophysics Data System (ADS)

    Putrik, M. B.; Lavrentyeva, Yu. E.; Ivanov, V. Yu.

    2015-11-01

    The aim of the study is to create an image processing system that allows dentists to find pathological resorption and to build the surgical guide surface automatically. X-ray images of jaws from cone beam tomography or spiral computed tomography are the initial data for processing. A single patient examination can include up to 600 images (tomograms), which is why a processing system for fast automated search for pathologies is necessary. X-ray images are useful not only for diagnosis but also for treatment planning. We studied the case of dental implantation, in which surgical guides are used for successful surgical manipulation. We created a processing system that automatically builds jaw and teeth boundaries on the x-ray image. The obtained teeth boundaries are then used for surgical guide surface modeling, and the jaw boundaries limit the area for the further pathology search. The criterion for the presence of pathological resorption zones inside the limited area is based on a statistical investigation. After these steps, it is possible to manufacture the surgical guide using a 3D printer and apply it in the surgical operation.

  17. Statistical Signal Processing and the Motor Cortex

    PubMed Central

    Brockwell, A.E.; Kass, R.E.; Schwartz, A.B.

    2011-01-01

    Over the past few decades, developments in technology have significantly improved the ability to measure activity in the brain. This has spurred a great deal of research into brain function and its relation to external stimuli, and has important implications in medicine and other fields. As a result of improved understanding of brain function, it is now possible to build devices that provide direct interfaces between the brain and the external world. We describe some of the current understanding of function of the motor cortex region. We then discuss a typical likelihood-based state-space model and filtering based approach to address the problems associated with building a motor cortical-controlled cursor or robotic prosthetic device. As a variation on previous work using this approach, we introduce the idea of using Markov chain Monte Carlo methods for parameter estimation in this context. By doing this instead of performing maximum likelihood estimation, it is possible to expand the range of possible models that can be explored, at a cost in terms of computational load. We demonstrate results obtained applying this methodology to experimental data gathered from a monkey. PMID:21765538
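    The sketch below gives a rough sense, under simplifying assumptions, of using random-walk Metropolis MCMC rather than maximum likelihood to estimate a state-space parameter: a toy scalar linear-Gaussian model is used, with the likelihood computed by a Kalman filter. It is not the decoding model of the paper, and all names and constants are illustrative.

        # Toy example: random-walk Metropolis sampling of the state-transition
        # coefficient "a" in a scalar linear-Gaussian state-space model, with the
        # likelihood evaluated by a Kalman filter. All constants are illustrative.
        import numpy as np

        def kalman_loglik(y, a, q=0.1, r=0.5):
            """Log-likelihood of y for x_t = a*x_{t-1} + N(0,q), y_t = x_t + N(0,r)."""
            m, p, ll = 0.0, 1.0, 0.0
            for obs in y:
                m_pred, p_pred = a * m, a * a * p + q          # predict
                s = p_pred + r                                 # innovation variance
                ll += -0.5 * (np.log(2 * np.pi * s) + (obs - m_pred) ** 2 / s)
                k = p_pred / s                                 # Kalman gain
                m, p = m_pred + k * (obs - m_pred), (1 - k) * p_pred
            return ll

        def metropolis(y, n_iter=5000, step=0.05, seed=1):
            rng = np.random.default_rng(seed)
            a, ll = 0.5, kalman_loglik(y, 0.5)
            draws = []
            for _ in range(n_iter):
                a_new = a + step * rng.normal()
                if -1.0 < a_new < 1.0:                         # flat prior on (-1, 1)
                    ll_new = kalman_loglik(y, a_new)
                    if np.log(rng.random()) < ll_new - ll:     # accept / reject
                        a, ll = a_new, ll_new
                draws.append(a)
            return np.array(draws)

        # simulate observations from a = 0.8 and recover it from the posterior draws
        rng0, x, y_obs = np.random.default_rng(0), 0.0, []
        for _ in range(200):
            x = 0.8 * x + np.sqrt(0.1) * rng0.normal()
            y_obs.append(x + np.sqrt(0.5) * rng0.normal())
        print("posterior mean of a:", metropolis(np.array(y_obs))[1000:].mean())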

  18. Demonstration of reduced-order urban scale building energy models

    DOE PAGES

    Heidarinejad, Mohammad; Mattise, Nicholas; Dahlhausen, Matthew; ...

    2017-09-08

    The aim of this study is to demonstrate a developed framework to rapidly create urban scale reduced-order building energy models using a systematic summary of the simplifications required for the representation of building exteriors and thermal zones. These urban scale reduced-order models rely on the contribution of influential variables to the internal, external, and system thermal loads. The OpenStudio Application Programming Interface (API) serves as a tool to automate the process of model creation and demonstrate the developed framework. The results of this study show that the accuracy of the developed reduced-order building energy models varies only up to 10% with the selection of different thermal zones. In addition, to assess the complexity of the developed reduced-order building energy models, this study develops a novel framework to quantify the complexity of building energy models. Consequently, this study empowers building energy modelers to quantify their building energy models systematically in order to report the model complexity alongside the building energy model accuracy. An exhaustive analysis of four university campuses suggests that urban neighborhood buildings lend themselves to simplified typical shapes. Specifically, building energy modelers can utilize the developed typical shapes to represent more than 80% of the U.S. buildings documented in the CBECS database. One main benefit of the developed framework is the opportunity for different models, including airflow and solar radiation models, to share the same exterior representation, allowing a unified data exchange. Altogether, the results of this study have implications for large-scale modeling of buildings in support of urban energy consumption analyses or the assessment of a large number of alternative solutions in support of retrofit decision-making in the building industry.

  19. Demonstration of reduced-order urban scale building energy models

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Heidarinejad, Mohammad; Mattise, Nicholas; Dahlhausen, Matthew

    The aim of this study is to demonstrate a developed framework to rapidly create urban scale reduced-order building energy models using a systematic summary of the simplifications required for the representation of building exteriors and thermal zones. These urban scale reduced-order models rely on the contribution of influential variables to the internal, external, and system thermal loads. The OpenStudio Application Programming Interface (API) serves as a tool to automate the process of model creation and demonstrate the developed framework. The results of this study show that the accuracy of the developed reduced-order building energy models varies only up to 10% with the selection of different thermal zones. In addition, to assess the complexity of the developed reduced-order building energy models, this study develops a novel framework to quantify the complexity of building energy models. Consequently, this study empowers building energy modelers to quantify their building energy models systematically in order to report the model complexity alongside the building energy model accuracy. An exhaustive analysis of four university campuses suggests that urban neighborhood buildings lend themselves to simplified typical shapes. Specifically, building energy modelers can utilize the developed typical shapes to represent more than 80% of the U.S. buildings documented in the CBECS database. One main benefit of the developed framework is the opportunity for different models, including airflow and solar radiation models, to share the same exterior representation, allowing a unified data exchange. Altogether, the results of this study have implications for large-scale modeling of buildings in support of urban energy consumption analyses or the assessment of a large number of alternative solutions in support of retrofit decision-making in the building industry.

  20. Protein structure modeling and refinement by global optimization in CASP12.

    PubMed

    Hong, Seung Hwan; Joung, InSuk; Flores-Canales, Jose C; Manavalan, Balachandran; Cheng, Qianyi; Heo, Seungryong; Kim, Jong Yun; Lee, Sun Young; Nam, Mikyung; Joo, Keehyoung; Lee, In-Ho; Lee, Sung Jong; Lee, Jooyoung

    2018-03-01

    For protein structure modeling in the CASP12 experiment, we have developed a new protocol based on our previous CASP11 approach. The global optimization method of conformational space annealing (CSA) was applied to 3 stages of modeling: multiple sequence-structure alignment, three-dimensional (3D) chain building, and side-chain re-modeling. For better template selection and model selection, we updated our model quality assessment (QA) method with the newly developed SVMQA (support vector machine for quality assessment). For 3D chain building, we updated our energy function by including restraints generated from predicted residue-residue contacts. New energy terms for the predicted secondary structure and predicted solvent accessible surface area were also introduced. For difficult targets, we proposed a new method, LEEab, where the template term played a less significant role than it did in LEE, complemented by increased contributions from other terms such as the predicted contact term. For TBM (template-based modeling) targets, LEE performed better than LEEab, but for FM targets, LEEab was better. For model refinement, we modified our CASP11 molecular dynamics (MD) based protocol by using explicit solvents and tuning down restraint weights. Refinement results from MD simulations that used a new augmented statistical energy term in the force field were quite promising. Finally, when using inaccurate information (such as the predicted contacts), it was important to use the Lorentzian function for which the maximal penalty arising from wrong information is always bounded. © 2017 Wiley Periodicals, Inc.
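    As an illustration of the bounded-penalty point made above, the sketch contrasts a harmonic restraint, whose penalty grows without bound, with a Lorentzian-style restraint that saturates, so that a wrong predicted contact can contribute only a limited amount. The functional form and constants are illustrative and are not the authors' exact energy terms.

        # Illustrative contrast between an unbounded harmonic restraint and a
        # Lorentzian-style restraint whose penalty saturates, so a wrong predicted
        # contact contributes only a bounded amount. Constants are illustrative.
        import numpy as np

        def harmonic_penalty(d, d0, k=1.0):
            return k * (d - d0) ** 2                 # grows without bound with the error

        def lorentzian_penalty(d, d0, k=1.0, w=2.0):
            dev2 = (d - d0) ** 2
            return k * dev2 / (dev2 + w ** 2)        # saturates at k for large errors

        d = np.linspace(0.0, 30.0, 7)                # distances between restrained residue pairs
        print(harmonic_penalty(d, d0=8.0))
        print(lorentzian_penalty(d, d0=8.0))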

  1. Next Day Building Load Predictions based on Limited Input Features Using an On-Line Laterally Primed Adaptive Resonance Theory Artificial Neural Network.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jones, Christian Birk; Robinson, Matt; Yasaei, Yasser

    Optimal integration of thermal energy storage within commercial building applications requires accurate load predictions. Several methods exist that provide an estimate of a building's future needs, including component-based models and data-driven algorithms. This work implemented a previously untested algorithm for this application, called a Laterally Primed Adaptive Resonance Theory (LAPART) artificial neural network (ANN). The LAPART algorithm provided accurate results over a two-month period in which minimal historical data and a small number of input types were available. These results are significant, because common practice has often overlooked the implementation of an ANN; ANNs have often been perceived as too complex and as requiring large amounts of data to provide accurate results. The LAPART neural network was implemented in an on-line learning manner. On-line learning refers to the continuous updating of training data as time passes. For this experiment, training began with a single day and grew to two months of data. This approach provides a platform for immediate implementation that requires minimal time and effort. The results from the LAPART algorithm were compared with statistical regression and a component-based model. The comparison was based on the predictions' linear relationship with the measured data, mean squared error, mean bias error, and the cost savings achieved by the respective prediction techniques. The results show that the LAPART algorithm provided a reliable and cost-effective means to predict the building load for the next day.

  2. Library Research and Statistics. Research on Libraries and Librarianship in 2002; Number of Libraries in the United States and Canada; Highlights of NCES Surveys; Library Acquisition Expenditures, 2001-2002: U.S. Public, Academic, Special, and Government Libraries; LJ Budget Report: A Precarious Holding Pattern; Price Indexes for Public and Academic Libraries; Library Buildings 2002: The Building Buck Doesn't Stop Here.

    ERIC Educational Resources Information Center

    Lynch, Mary Jo; Oder, Norman; Halstead, Kent; Fox, Bette-Lee

    2003-01-01

    Includes seven reports that discuss research on libraries and librarianship, including academic, public, and school libraries; awards and grants; number of libraries in the United States and Canada; National Center for Education Statistics results; library expenditures for public, academic, special, and government libraries; library budgets; price…

  3. Statistical and Conceptual Model Testing Geomorphic Principles through Quantification in the Middle Rio Grande River, NM.

    NASA Astrophysics Data System (ADS)

    Posner, A. J.

    2017-12-01

    The Middle Rio Grande River (MRG) traverses New Mexico from Cochiti to Elephant Butte reservoirs. Since the 1100s, cultivating and inhabiting the valley of this alluvial river has required various river training works. The mid-20th century saw a concerted effort to tame the river through channelization, Jetty Jacks, and dam construction. A challenge for river managers is to better understand the interactions between river training works, dam construction, and the geomorphic adjustments of a desert river driven by spring snowmelt and summer thunderstorms carrying water and large sediment inputs from upstream and ephemeral tributaries. Due to its importance to the region, a vast wealth of data exists on conditions along the MRG. The investigation presented herein builds upon previous efforts by combining hydraulic model results, digitized planforms, and stream gage records in various statistical and conceptual models in order to test our understanding of this complex system. Spatially continuous variables were clipped by a set of river cross-section data that has been collected at decadal intervals since the early 1960s, creating a spatially homogeneous database upon which various statistical tests were implemented. Conceptual models relate forcing variables and response variables to estimate river planform changes. The developed database represents a unique opportunity to quantify and test geomorphic conceptual models under the unique characteristics of the MRG. The results of this investigation provide a spatially distributed characterization of planform variable changes, permitting managers to predict planform at a much higher resolution than previously available, and a better understanding of the relationship between flow regime and planform changes such as changes to longitudinal slope, sinuosity, and width. Lastly, data analysis and model interpretation led to the development of a new conceptual model for the impact of ephemeral tributaries in alluvial rivers.

  4. Statistical analysis of whole-body absorption depending on anatomical human characteristics at a frequency of 2.1 GHz.

    PubMed

    Habachi, A El; Conil, E; Hadjem, A; Vazquez, E; Wong, M F; Gati, A; Fleury, G; Wiart, J

    2010-04-07

    In this paper, we propose identification of the morphological factors that may impact the whole-body averaged specific absorption rate (WBSAR). This study is conducted for the case of exposure to a front plane wave at a 2100 MHz frequency carrier. This study is based on the development of different regression models for estimating the WBSAR as a function of morphological factors. For this purpose, a database of 12 anatomical human models (phantoms) has been considered. Also, 18 supplementary phantoms obtained using the morphing technique were generated to build the required relation. This paper presents three models based on external morphological factors such as the body surface area, the body mass index or the body mass. These models show good results in estimating the WBSAR (<10%) for families obtained by the morphing technique, but these are still less accurate (30%) when applied to different original phantoms. This study stresses the importance of the internal morphological factors such as muscle and fat proportions in characterization of the WBSAR. The regression models are then improved using internal morphological factors with an estimation error of approximately 10% on the WBSAR. Finally, this study is suitable for establishing the statistical distribution of the WBSAR for a given population characterized by its morphology.

  5. Statistical analysis of whole-body absorption depending on anatomical human characteristics at a frequency of 2.1 GHz

    NASA Astrophysics Data System (ADS)

    El Habachi, A.; Conil, E.; Hadjem, A.; Vazquez, E.; Wong, M. F.; Gati, A.; Fleury, G.; Wiart, J.

    2010-04-01

    In this paper, we propose identification of the morphological factors that may impact the whole-body averaged specific absorption rate (WBSAR). This study is conducted for the case of exposure to a front plane wave at a 2100 MHz frequency carrier. This study is based on the development of different regression models for estimating the WBSAR as a function of morphological factors. For this purpose, a database of 12 anatomical human models (phantoms) has been considered. Also, 18 supplementary phantoms obtained using the morphing technique were generated to build the required relation. This paper presents three models based on external morphological factors such as the body surface area, the body mass index or the body mass. These models show good results in estimating the WBSAR (<10%) for families obtained by the morphing technique, but these are still less accurate (30%) when applied to different original phantoms. This study stresses the importance of the internal morphological factors such as muscle and fat proportions in characterization of the WBSAR. The regression models are then improved using internal morphological factors with an estimation error of approximately 10% on the WBSAR. Finally, this study is suitable for establishing the statistical distribution of the WBSAR for a given population characterized by its morphology.
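    A minimal sketch of the type of regression described in the two records above, assuming WBSAR is fitted against a single external morphological factor (body mass index) by least squares; the numbers are invented placeholders, not values from the phantom database.

        # Invented placeholder values illustrating a simple external-factor regression:
        # WBSAR modeled as a linear function of body mass index, fitted by least squares.
        import numpy as np

        bmi   = np.array([18.5, 21.0, 24.0, 27.5, 31.0])            # kg/m2
        wbsar = np.array([0.0062, 0.0055, 0.0049, 0.0044, 0.0040])  # W/kg, illustrative units

        b1, b0 = np.polyfit(bmi, wbsar, deg=1)        # slope, intercept
        relative_error = np.abs((b0 + b1 * bmi) - wbsar) / wbsar
        print("intercept:", b0, "slope:", b1, "max relative error:", relative_error.max())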

  6. Fold assessment for comparative protein structure modeling.

    PubMed

    Melo, Francisco; Sali, Andrej

    2007-11-01

    Accurate and automated assessment of both geometrical errors and incompleteness of comparative protein structure models is necessary for an adequate use of the models. Here, we describe a composite score for discriminating between models with the correct and incorrect fold. To find an accurate composite score, we designed and applied a genetic algorithm method that searched for a most informative subset of 21 input model features as well as their optimized nonlinear transformation into the composite score. The 21 input features included various statistical potential scores, stereochemistry quality descriptors, sequence alignment scores, geometrical descriptors, and measures of protein packing. The optimized composite score was found to depend on (1) a statistical potential z-score for residue accessibilities and distances, (2) model compactness, and (3) percentage sequence identity of the alignment used to build the model. The accuracy of the composite score was compared with the accuracy of assessment by single and combined features as well as by other commonly used assessment methods. The testing set was representative of models produced by automated comparative modeling on a genomic scale. The composite score performed better than any other tested score in terms of the maximum correct classification rate (i.e., 3.3% false positives and 2.5% false negatives) as well as the sensitivity and specificity across the whole range of thresholds. The composite score was implemented in our program MODELLER-8 and was used to assess models in the MODBASE database that contains comparative models for domains in approximately 1.3 million protein sequences.

  7. The second phase in creating the cardiac center for the next generation: beyond structure to process improvement.

    PubMed

    Woods, J

    2001-01-01

    The third generation cardiac institute will build on the successes of the past in structuring the service line, re-organizing to assimilate specialist interests, and re-positioning to expand cardiac services into cardiovascular services. To meet the challenges of an increasingly competitive marketplace and complex delivery system, the focus for this new model will shift away from improved structures, and toward improved processes. This shift will require a sound methodology for statistically measuring and sustaining process changes related to the optimization of cardiovascular care. In recent years, GE Medical Systems has successfully applied Six Sigma methodologies to enable cardiac centers to control key clinical and market development processes through its DMADV, DMAIC and Change Acceleration processes. Data indicates Six Sigma is having a positive impact within organizations across the United States, and when appropriately implemented, this approach can serve as a solid foundation for building the next generation cardiac institute.

  8. Technology Solutions Case Study: Predicting Envelope Leakage in Attached Dwellings

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    None

    2013-11-01

    The most common method of measuring air leakage is to perform a single (or solo) blower door pressurization and/or depressurization test. In detached housing, the single blower door test measures leakage to the outside. In attached housing, however, this “solo” test method measures both air leakage to the outside and air leakage between adjacent units through common surfaces. In an attempt to create a simplified tool for predicting leakage to the outside, Building America team Consortium for Advanced Residential Buildings (CARB) performed a preliminary statistical analysis on blower door test results from 112 attached dwelling units in four apartment complexes. Although the subject data set is limited in size and variety, the preliminary analyses suggest significant predictors are present and support the development of a predictive model. Further data collection is underway to create a more robust prediction tool for use across different construction types, climate zones, and unit configurations.

  9. Brace for the Next Threat to a Safe School Environment.

    ERIC Educational Resources Information Center

    Reecer, Marcia

    1988-01-01

    Outlines the environmental health hazard caused by radon gas percolating into buildings. Statistical data on school buildings tested for radon have shown the levels to be low. Discusses how to test for the gas and effective ways to prevent buildup of the gas. Includes a sidebar describing radon. (MD)

  10. Schoolhouse Systems Project: SSP. 3rd Report.

    ERIC Educational Resources Information Center

    Florida State Dept. of Education, Tallahassee.

    This brochure provides a statistical bid breakdown for Programs 1A and 2 of the Florida Schoolhouse Systems Project. Tabular information is provided on bidders, compatible building subsystems, bid tabulation by compatibility, "per school" building subsystems, nominated bidders and lump sums, and a comparison of Programs 1A and 2 bids. Data…

  11. Enhancing the Teaching of Statistics: Portfolio Theory, an Application of Statistics in Finance

    ERIC Educational Resources Information Center

    Christou, Nicolas

    2008-01-01

    In this paper we present an application of statistics using real stock market data. Most, if not all, students have some familiarity with the stock market (or at least they have heard about it) and therefore can understand the problem easily. It is the real data analysis that students find interesting. Here we explore the building of efficient…

  12. Building Damage Extraction Triggered by Earthquake Using the Uav Imagery

    NASA Astrophysics Data System (ADS)

    Li, S.; Tang, H.

    2018-04-01

    When extracting building damage information from post-earthquake satellite images alone, we can only determine whether a building has collapsed. Even when the satellite images have sub-meter resolution, the identification of slightly damaged buildings is still a challenge. As complementary data to satellite images, UAV images have unique advantages, such as greater flexibility and higher resolution. In this paper, according to the spectral features of the UAV images and the morphological features of the reconstructed point clouds, building damage was classified into four levels: basically intact buildings, slightly damaged buildings, partially collapsed buildings, and totally collapsed buildings, and rules for the damage grades are given. In particular, slightly damaged buildings are identified using detected roof holes. To verify the approach, we conducted experimental simulations for the cases of the Wenchuan and Ya'an earthquakes. By analyzing the post-earthquake UAV images of the two earthquakes, the building damage was classified into four levels, and quantitative statistics of the damaged buildings are given in the experiments.

  13. Null-space and statistical significance of first-arrival traveltime inversion

    NASA Astrophysics Data System (ADS)

    Morozov, Igor B.

    2004-03-01

    The strong uncertainty inherent in the traveltime inversion of first arrivals from surface sources is usually removed by using a priori constraints or regularization. This leads to the null-space (data-independent model variability) being inadequately sampled, and consequently, model uncertainties may be underestimated in traditional (such as checkerboard) resolution tests. To measure the full null-space model uncertainties, we use unconstrained Monte Carlo inversion and examine the statistics of the resulting model ensembles. In an application to 1-D first-arrival traveltime inversion, the τ-p method is used to build a set of models that are equivalent to the IASP91 model within small, ~0.02 per cent, time deviations. The resulting velocity variances are much larger, ~2-3 per cent within the regions above the mantle discontinuities, and are interpreted as being due to the null-space. Depth-variant depth averaging is required for constraining the velocities within meaningful bounds, and the averaging scalelength could also be used as a measure of depth resolution. Velocity variances show structure-dependent, negative correlation with the depth-averaging scalelength. Neither the smoothest (Herglotz-Wiechert) nor the mean velocity-depth functions reproduce the discontinuities in the IASP91 model; however, the discontinuities can be identified by the increased null-space velocity (co-)variances. Although derived for a 1-D case, the above conclusions also relate to higher dimensions.

  14. The Influence of Spatial Configuration of Residential Area and Vector Populations on Dengue Incidence Patterns in an Individual-Level Transmission Model.

    PubMed

    Kang, Jeon-Young; Aldstadt, Jared

    2017-07-15

    Dengue is a mosquito-borne infectious disease that is endemic in tropical and subtropical countries. Many individual-level simulation models have been developed to test hypotheses about dengue virus transmission. Often these efforts assume that human host and mosquito vector populations are randomly or uniformly distributed in the environment. Although the movement of mosquitoes is affected by the spatial configuration of buildings and mosquito populations are highly clustered in key buildings, little research has focused on the influence of the local built environment in dengue transmission models. We developed an agent-based model of dengue transmission in a village setting to test the importance of using realistic environments in individual-level models of dengue transmission. The results from one-way ANOVA analysis of the simulations indicated that the differences between scenarios in terms of infection rates as well as serotype-specific dominance are statistically significant. Specifically, the infection rates in scenarios with a realistic environment are more variable than those with a synthetic spatial configuration. With respect to dengue serotype-specific cases, we found that a single dengue serotype is more often dominant in realistic environments than in synthetic environments. An agent-based approach allows a fine-scaled analysis of simulated dengue incidence patterns. The results provide a better understanding of the influence of spatial heterogeneity on dengue transmission at a local scale.

  15. Assessing the optimality of ASHRAE climate zones using high resolution meteorological data sets

    NASA Astrophysics Data System (ADS)

    Fils, P. D.; Kumar, J.; Collier, N.; Hoffman, F. M.; Xu, M.; Forbes, W.

    2017-12-01

    Energy consumed by built infrastructure constitutes a significant fraction of the nation's energy budget. According to a 2015 US Energy Information Administration report, 41% of the energy used in the US went to residential and commercial buildings. Additional research has shown that 32% of commercial building energy goes into heating and cooling the building. The American National Standards Institute and American Society of Heating, Refrigerating and Air-Conditioning Engineers Standard 90.1 provides climate zones for the current state of practice, since heating and cooling demands are strongly influenced by spatio-temporal weather variations. For this reason, we have been assessing the optimality of the climate zones using high resolution daily climate data from NASA's DAYMET database. We analyzed time series of meteorological data sets for all ASHRAE climate zones from 1980 to 2016 inclusive. We computed the mean, standard deviation, and other statistics for a set of meteorological variables (solar radiation, maximum and minimum temperature) within each zone. By plotting all the zonal statistics, we analyzed patterns and trends in those data over the past 36 years. We compared the mean of each zone to its standard deviation to determine the range of spatial variability that exists within each zone. If the band around the mean is too large, it indicates that regions in the zone experience a wide range of weather conditions, and perhaps a common set of building design guidelines would lead to a non-optimal energy consumption scenario. In this study we have observed strong variation across the different climate zones. Some have shown consistent patterns over the past 36 years, indicating that the zone was well constructed, while others have deviated greatly from their mean, indicating that the zone needs to be reconstructed. We also looked at redesigning the climate zones based on high resolution climate data. We are using building simulation models such as EnergyPlus to develop optimal energy guidelines for each climate zone and to quantify the potential energy savings that could be realized by redesigning climate zones using state-of-the-art data sets.

  16. The Creation of Space Vector Models of Buildings From RPAS Photogrammetry Data

    NASA Astrophysics Data System (ADS)

    Trhan, Ondrej

    2017-06-01

    The results of Remote Piloted Aircraft System (RPAS) photogrammetry are digital surface models and orthophotos. The main problem of the digital surface models obtained is that buildings are not perpendicular and the shape of roofs is deformed. The task of this paper is to obtain a more accurate digital surface model using building reconstructions. The paper discusses the problem of obtaining and approximating building footprints, reconstructing the final spatial vector digital building model, and modifying the buildings on the digital surface model.

  17. Learning physical descriptors for materials science by compressed sensing

    NASA Astrophysics Data System (ADS)

    Ghiringhelli, Luca M.; Vybiral, Jan; Ahmetcik, Emre; Ouyang, Runhai; Levchenko, Sergey V.; Draxl, Claudia; Scheffler, Matthias

    2017-02-01

    The availability of big data in materials science offers new routes for analyzing materials properties and functions and achieving scientific understanding. Finding structure in these data that is not directly visible by standard tools and exploitation of the scientific information requires new and dedicated methodology based on approaches from statistical learning, compressed sensing, and other recent methods from applied mathematics, computer science, statistics, signal processing, and information science. In this paper, we explain and demonstrate a compressed-sensing based methodology for feature selection, specifically for discovering physical descriptors, i.e., physical parameters that describe the material and its properties of interest, and associated equations that explicitly and quantitatively describe those relevant properties. As showcase application and proof of concept, we describe how to build a physical model for the quantitative prediction of the crystal structure of binary compound semiconductors.
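    A hedged, generic sketch of compressed-sensing-style descriptor selection of the kind discussed above: an L1 (LASSO) penalty drives most candidate-feature coefficients to zero, leaving a small descriptor set. It uses a toy random dataset and a plain LASSO fit, not the materials data or the authors' specific selection procedure.

        # Generic toy example of L1-based (LASSO) feature selection: most candidate
        # descriptor coefficients are driven to zero, leaving a small selected set.
        import numpy as np
        from sklearn.linear_model import Lasso
        from sklearn.preprocessing import StandardScaler

        rng = np.random.default_rng(0)
        n_samples, n_features = 80, 200
        X = rng.normal(size=(n_samples, n_features))          # candidate descriptors
        y = 2.0 * X[:, 3] - 1.5 * X[:, 17] + 0.1 * rng.normal(size=n_samples)

        X_std = StandardScaler().fit_transform(X)
        model = Lasso(alpha=0.1).fit(X_std, y)
        print("selected descriptor indices:", np.flatnonzero(model.coef_))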

  18. Watershed Regressions for Pesticides (WARP) for Predicting Annual Maximum and Annual Maximum Moving-Average Concentrations of Atrazine in Streams

    USGS Publications Warehouse

    Stone, Wesley W.; Gilliom, Robert J.; Crawford, Charles G.

    2008-01-01

    Regression models were developed for predicting annual maximum and selected annual maximum moving-average concentrations of atrazine in streams using the Watershed Regressions for Pesticides (WARP) methodology developed by the National Water-Quality Assessment Program (NAWQA) of the U.S. Geological Survey (USGS). The current effort builds on the original WARP models, which were based on the annual mean and selected percentiles of the annual frequency distribution of atrazine concentrations. Estimates of annual maximum and annual maximum moving-average concentrations for selected durations are needed to characterize the levels of atrazine and other pesticides for comparison to specific water-quality benchmarks for evaluation of potential concerns regarding human health or aquatic life. Separate regression models were derived for the annual maximum and annual maximum 21-day, 60-day, and 90-day moving-average concentrations. Development of the regression models used the same explanatory variables, transformations, model development data, model validation data, and regression methods as those used in the original development of WARP. The models accounted for 72 to 75 percent of the variability in the concentration statistics among the 112 sampling sites used for model development. Predicted concentration statistics from the four models were within a factor of 10 of the observed concentration statistics for most of the model development and validation sites. Overall, performance of the models for the development and validation sites supports the application of the WARP models for predicting annual maximum and selected annual maximum moving-average atrazine concentration in streams and provides a framework to interpret the predictions in terms of uncertainty. For streams with inadequate direct measurements of atrazine concentrations, the WARP model predictions for the annual maximum and the annual maximum moving-average atrazine concentrations can be used to characterize the probable levels of atrazine for comparison to specific water-quality benchmarks. Sites with a high probability of exceeding a benchmark for human health or aquatic life can be prioritized for monitoring.

  19. Predictive Modeling of Risk Associated with Temperature Extremes over Continental US

    NASA Astrophysics Data System (ADS)

    Kravtsov, S.; Roebber, P.; Brazauskas, V.

    2016-12-01

    We build an extremely statistically accurate, essentially bias-free empirical emulator of atmospheric surface temperature and apply it for meteorological risk assessment over the domain of continental US. The resulting prediction scheme achieves an order-of-magnitude or larger gain of numerical efficiency compared with the schemes based on high-resolution dynamical atmospheric models, leading to unprecedented accuracy of the estimated risk distributions. The empirical model construction methodology is based on our earlier work, but is further modified to account for the influence of large-scale, global climate change on regional US weather and climate. The resulting estimates of the time-dependent, spatially extended probability of temperature extremes over the simulation period can be used as a risk management tool by insurance companies and regulatory governmental agencies.

  20. The interplay between cooperativity and diversity in model threshold ensembles

    PubMed Central

    Cervera, Javier; Manzanares, José A.; Mafe, Salvador

    2014-01-01

    The interplay between cooperativity and diversity is crucial for biological ensembles because single molecule experiments show a significant degree of heterogeneity and also for artificial nanostructures because of the high individual variability characteristic of nanoscale units. We study the cross-effects between cooperativity and diversity in model threshold ensembles composed of individually different units that show a cooperative behaviour. The units are modelled as statistical distributions of parameters (the individual threshold potentials here) characterized by central and width distribution values. The simulations show that the interplay between cooperativity and diversity results in ensemble-averaged responses of interest for the understanding of electrical transduction in cell membranes, the experimental characterization of heterogeneous groups of biomolecules and the development of biologically inspired engineering designs with individually different building blocks. PMID:25142516
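    A minimal sketch of an ensemble-averaged response for threshold units whose threshold potentials are drawn from a statistical distribution (diversity), with cooperativity crudely folded into the steepness of each unit's response. The sigmoidal form and parameter values are illustrative assumptions, not the paper's model.

```python
# Hedged sketch: ensemble-averaged activation of diverse threshold units.
import numpy as np

def unit_response(V, V_th, n_coop, kT=25.7):   # kT ~ 25.7 mV at room temperature
    # Cooperativity n_coop steepens the effective response of each unit.
    return 1.0 / (1.0 + np.exp(-n_coop * (V - V_th) / kT))

rng = np.random.default_rng(1)
V = np.linspace(-100, 50, 301)                  # applied potential, mV
V_th = rng.normal(-40.0, 10.0, size=1000)       # diverse threshold potentials, mV
for n_coop in (1, 2, 4):                        # increasing cooperativity
    avg = unit_response(V[:, None], V_th[None, :], n_coop).mean(axis=1)
    print(n_coop, round(avg[150], 3))           # ensemble-averaged response at V[150]
```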

  1. Severity-Based Adaptation with Limited Data for ASR to Aid Dysarthric Speakers

    PubMed Central

    Mustafa, Mumtaz Begum; Salim, Siti Salwah; Mohamed, Noraini; Al-Qatab, Bassam; Siong, Chng Eng

    2014-01-01

    Automatic speech recognition (ASR) is currently used in many assistive technologies, such as helping individuals with speech impairment in their communication ability. One challenge in ASR for speech-impaired individuals is the difficulty in obtaining a good speech database of impaired speakers for building an effective speech acoustic model. Because there are very few existing databases of impaired speech, which are also limited in size, the obvious solution to build a speech acoustic model of impaired speech is by employing adaptation techniques. However, issues that have not been addressed in existing studies in the area of adaptation for speech impairment are as follows: (1) identifying the most effective adaptation technique for impaired speech; and (2) the use of suitable source models to build an effective impaired-speech acoustic model. This research investigates the above-mentioned two issues on dysarthria, a type of speech impairment affecting millions of people. We applied both unimpaired and impaired speech as the source model with well-known adaptation techniques like the maximum likelihood linear regression (MLLR) and the constrained-MLLR(C-MLLR). The recognition accuracy of each impaired speech acoustic model is measured in terms of word error rate (WER), with further assessments, including phoneme insertion, substitution and deletion rates. Unimpaired speech when combined with limited high-quality speech-impaired data improves performance of ASR systems in recognising severely impaired dysarthric speech. The C-MLLR adaptation technique was also found to be better than MLLR in recognising mildly and moderately impaired speech based on the statistical analysis of the WER. It was found that phoneme substitution was the biggest contributing factor in WER in dysarthric speech for all levels of severity. The results show that the speech acoustic models derived from suitable adaptation techniques improve the performance of ASR systems in recognising impaired speech with limited adaptation data. PMID:24466004
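    A minimal sketch of the word error rate (WER) breakdown into substitutions, insertions and deletions used to compare the adapted acoustic models: a standard edit-distance computation over toy reference and hypothesis sentences.

```python
# Hedged sketch: WER with separate substitution, insertion and deletion counts.
def wer_counts(ref, hyp):
    ref, hyp = ref.split(), hyp.split()
    # dp[i][j] = (edits, subs, ins, dels) for ref[:i] vs hyp[:j]
    dp = [[(j, 0, j, 0) for j in range(len(hyp) + 1)]]
    for i in range(1, len(ref) + 1):
        dp.append([(i, 0, 0, i)] + [None] * len(hyp))
        for j in range(1, len(hyp) + 1):
            if ref[i - 1] == hyp[j - 1]:
                dp[i][j] = dp[i - 1][j - 1]
            else:
                s, a, d = dp[i - 1][j - 1], dp[i][j - 1], dp[i - 1][j]
                cand = [
                    (s[0] + 1, s[1] + 1, s[2], s[3]),   # substitution
                    (a[0] + 1, a[1], a[2] + 1, a[3]),   # insertion
                    (d[0] + 1, d[1], d[2], d[3] + 1),   # deletion
                ]
                dp[i][j] = min(cand, key=lambda t: t[0])
    edits, subs, ins, dels = dp[-1][-1]
    return edits / max(len(ref), 1), subs, ins, dels

print(wer_counts("the quick brown fox", "the quick crown fox jumps"))  # (0.5, 1, 1, 0)
```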

  2. Systems biology by the rules: hybrid intelligent systems for pathway modeling and discovery.

    PubMed

    Bosl, William J

    2007-02-15

    Expert knowledge in journal articles is an important source of data for reconstructing biological pathways and creating new hypotheses. An important need for medical research is to integrate this data with high throughput sources to build useful models that span several scales. Researchers traditionally use mental models of pathways to integrate information and develop new hypotheses. Unfortunately, the amount of information is often overwhelming and these mental models are inadequate for predicting the dynamic response of complex pathways. Hierarchical computational models that allow exploration of semi-quantitative dynamics are useful systems biology tools for theoreticians, experimentalists and clinicians and may provide a means for cross-communication. A novel approach for biological pathway modeling based on hybrid intelligent systems or soft computing technologies is presented here. Intelligent hybrid systems, which refer to several related computing methods such as fuzzy logic, neural nets, genetic algorithms, and statistical analysis, have become ubiquitous in engineering applications for complex control system modeling and design. Biological pathways may be considered to be complex control systems, which medicine tries to manipulate to achieve desired results. Thus, hybrid intelligent systems may provide a useful tool for modeling biological system dynamics and computational exploration of new drug targets. A new modeling approach based on these methods is presented in the context of hedgehog regulation of the cell cycle in granule cells. Code and input files can be found at the Bionet website: www.chip.ord/~wbosl/Software/Bionet. This paper presents the algorithmic methods needed for modeling complicated biochemical dynamics using rule-based models to represent expert knowledge in the context of cell cycle regulation and tumor growth. A notable feature of this modeling approach is that it allows biologists to build complex models from their knowledge base without the need to translate that knowledge into mathematical form. Dynamics on several levels, from molecular pathways to tissue growth, are seamlessly integrated. A number of common network motifs are examined and used to build a model of hedgehog regulation of the cell cycle in cerebellar neurons, which is believed to play a key role in the etiology of medulloblastoma, a devastating childhood brain cancer.

  3. Are building-level characteristics associated with indoor allergens in the household?

    PubMed

    Rosenfeld, Lindsay; Chew, Ginger L; Rudd, Rima; Emmons, Karen; Acosta, Luis; Perzanowski, Matt; Acevedo-García, Dolores

    2011-02-01

    Building-level characteristics are structural factors largely beyond the control of those who live in them. We explored whether building-level characteristics and indoor allergens in the household are related. We examined the relationship between building-level characteristics and indoor allergens: dust mite, cat, cockroach, and mouse. Building-level characteristics measured were presence of pests (seeing cockroaches and rodents), building type (public housing, buildings zoned commercially and residentially, and building size), and building condition (building age and violations). Allergen cutpoints were used for categorical analyses and defined as follows: dust mite: >0.25 μg/g; cat: >1 μg/g; cockroach: >1 U/g; mouse: >1.6 μg/g. In fully adjusted linear analyses, neither dust mite nor cat allergen were statistically significantly associated with any building-level characteristics. Cockroach allergen was associated with the presence of cockroaches (2.07; 95% CI, 1.23, 3.49) and living in public housing (2.14; 95% CI, 1.07, 4.31). Mouse allergen was associated with the presence of rodents (1.70; 95% CI, 1.29, 2.23), and building size: living in a low-rise (<8 floors; 0.60; 95% CI, 0.42, 0.87) or high-rise (8 + floors; 0.50; 95% CI, 0.29, 0.88; compared with house/duplex). In fully adjusted logistic analyses, cat allergen was statistically significantly associated with living in a high-rise (6.29; 95% CI, 1.51, 26.21; compared with a house/duplex). Mouse allergen was associated with living in public housing (6.20; 95% CI, 1.01, 37.95) and building size: living in a low-rise (0.16; 95% CI, 0.05, 0.52) or high-rise (0.06; 95% CI, 0.01, 0.50; compared with a house/duplex). Issues concerning building size and public housing may be particularly critical factors in reducing asthma morbidity. We suggest that future research explore the possible improvement of these factors through changes to building code and violations adherence, design standards, and incentives for landlords.

  4. Ontology for Life-Cycle Modeling of Electrical Distribution Systems: Model View Definition

    DTIC Science & Technology

    2013-06-01

    building information models (BIM) at the coordinated design stage of building construction. 1.3 Approach To... standard for exchanging Building Information Modeling (BIM) data, which defines hundreds of classes for common use in software, currently supported by... specifications, Construction Operations Building information exchange (COBie), Building Information Modeling (BIM)

  5. Forward Modeling of Large-scale Structure: An Open-source Approach with Halotools

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hearin, Andrew P.; Campbell, Duncan; Tollerud, Erik

    We present the first stable release of Halotools (v0.2), a community-driven Python package designed to build and test models of the galaxy-halo connection. Halotools provides a modular platform for creating mock universes of galaxies starting from a catalog of dark matter halos obtained from a cosmological simulation. The package supports many of the common forms used to describe galaxy-halo models: the halo occupation distribution, the conditional luminosity function, abundance matching, and alternatives to these models that include effects such as environmental quenching or variable galaxy assembly bias. Satellite galaxies can be modeled to live in subhalos or to follow custom number density profiles within their halos, including spatial and/or velocity bias with respect to the dark matter profile. The package has an optimized toolkit to make mock observations on a synthetic galaxy population—including galaxy clustering, galaxy–galaxy lensing, galaxy group identification, RSD multipoles, void statistics, pairwise velocities and others—allowing direct comparison to observations. Halotools is object-oriented, enabling complex models to be built from a set of simple, interchangeable components, including those of your own creation. Halotools has an automated testing suite and is exhaustively documented on http://halotools.readthedocs.io, which includes quickstart guides, source code notes and a large collection of tutorials. The documentation is effectively an online textbook on how to build and study empirical models of galaxy formation with Python.
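    A minimal sketch of a halo occupation distribution (HOD), one of the model forms Halotools supports: mean central and satellite occupation as a function of halo mass in the Zheng et al. (2007)-style parameterization. The parameter values are illustrative, and this sketch deliberately does not use the Halotools API itself.

```python
# Hedged sketch: Zheng et al. (2007)-style HOD mean occupation functions.
import numpy as np
from scipy.special import erf

def mean_ncen(log_mhalo, log_mmin=12.0, sigma_logm=0.25):
    # Mean number of central galaxies in halos of mass 10**log_mhalo.
    return 0.5 * (1.0 + erf((log_mhalo - log_mmin) / sigma_logm))

def mean_nsat(log_mhalo, log_m0=12.2, log_m1=13.3, alpha=1.0):
    # Mean number of satellites, modulated by the central occupation.
    m = 10.0 ** log_mhalo
    nsat = ((m - 10.0 ** log_m0) / 10.0 ** log_m1) ** alpha
    return np.where(m > 10.0 ** log_m0, nsat, 0.0) * mean_ncen(log_mhalo)

log_m = np.linspace(11.0, 15.0, 9)
print(np.round(mean_ncen(log_m), 3))
print(np.round(mean_nsat(log_m), 3))
```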

  6. Uncertainty quantification in structural health monitoring: Applications on cultural heritage buildings

    NASA Astrophysics Data System (ADS)

    Lorenzoni, Filippo; Casarin, Filippo; Caldon, Mauro; Islami, Kleidi; Modena, Claudio

    2016-01-01

    In the last decades the need for an effective seismic protection and vulnerability reduction of cultural heritage buildings and sites determined a growing interest in structural health monitoring (SHM) as a knowledge-based assessment tool to quantify and reduce uncertainties regarding their structural performance. Monitoring can be successfully implemented in some cases as an alternative to interventions or to control the medium- and long-term effectiveness of already applied strengthening solutions. The research group at the University of Padua, in collaboration with public administrations, has recently installed several SHM systems on heritage structures. The paper reports the application of monitoring strategies implemented to avoid (or at least minimize) the execution of strengthening interventions/repairs and control the response as long as a clear worsening or damaging process is detected. Two emblematic case studies are presented and discussed: the Roman Amphitheatre (Arena) of Verona and the Conegliano Cathedral. Both are excellent examples of on-going monitoring activities, performed through static and dynamic approaches in combination with automated procedures to extract meaningful structural features from collected data. In parallel to the application of innovative monitoring techniques, statistical models and data processing algorithms have been developed and applied in order to reduce uncertainties and exploit monitoring results for an effective assessment and protection of historical constructions. Processing software for SHM was implemented to perform the continuous real time treatment of static data and the identification of modal parameters based on the structural response to ambient vibrations. Statistical models were also developed to filter out the environmental effects and thermal cycles from the extracted features.

  7. Geostatistics - a tool applied to the distribution of Legionella pneumophila in a hospital water system.

    PubMed

    Laganà, Pasqualina; Moscato, Umberto; Poscia, Andrea; La Milia, Daniele Ignazio; Boccia, Stefania; Avventuroso, Emanuela; Delia, Santi

    2015-01-01

    Legionnaires' disease is normally acquired by inhalation of legionellae from a contaminated environmental source. Water systems of large buildings, such as hospitals, are often contaminated with legionellae and therefore represent a potential risk for the hospital population. The aim of this study was to evaluate the potential contamination of Legionella pneumophila (LP) in a large hospital in Italy through georeferential statistical analysis to assess the possible sources of dispersion and, consequently, the risk of exposure for both health care staff and patients. The distribution of LP serogroups 1 and 2-14 was considered in the wards housed on two consecutive floors of the hospital building. On the basis of information provided by 53 bacteriological analyses, a 'random' grid of points was chosen and spatial geostatistics or FAIk Kriging was applied and compared with the results of classical statistical analysis. Over 50% of the examined samples were positive for Legionella pneumophila. LP 1 was isolated in 69% of samples from the ground floor and in 60% of samples from the first floor; LP 2-14 in 36% of samples from the ground floor and 24% from the first. The iso-estimation maps clearly show the most contaminated pipe and the difference in the diffusion of the different L. pneumophila serogroups. Experimental work has demonstrated that geostatistical methods applied to the microbiological analysis of water matrices allow better modelling of the phenomenon under study, greater potential for risk management and a greater choice of methods of prevention and environmental recovery to be put in place, compared with classical statistical analysis.

  8. Application of meandering centreline migration modelling and object-based approach of Long Nab member

    NASA Astrophysics Data System (ADS)

    Saadi, Saad

    2017-04-01

    Characterizing the complexity and heterogeneity of the geometries and deposits in meandering river systems is an important concern for the reservoir modelling of fluvial environments. Re-examination of the Long Nab member in the Scalby formation of the Ravenscar Group (Yorkshire, UK), integrating digital outcrop data and forward modelling approaches, will lead to a geologically realistic numerical model of the meandering river geometry. The methodology is based on extracting geostatistics from modern analogues: meandering rivers that exemplify both the confined and non-confined meandering point-bar deposits and morphodynamics of the Long Nab member. The parameters derived from the modern systems (i.e. channel width, amplitude, radius of curvature, sinuosity, wavelength, channel length and migration rate) are used as a statistical control for the forward simulation and the resulting object-oriented channel models. The statistical data derived from the modern analogues are multi-dimensional in nature, making analysis difficult. We apply data mining techniques such as parallel coordinates to investigate and identify the important relationships within the modern analogue data, which can then be used to drive the development of, and serve as input to, the forward model. This work will increase our understanding of meandering river morphodynamics, planform architecture and the stratigraphic signature of various fluvial deposits and features. We will then use these forward-modelled channel objects to build reservoir models, and compare the behaviour of the forward-modelled channels with traditional object modelling in hydrocarbon flow simulations.

  9. Hybrid Modeling Based on Scsg-Br and Orthophoto

    NASA Astrophysics Data System (ADS)

    Zhou, G.; Huang, Y.; Yue, T.; Li, X.; Huang, W.; He, C.; Wu, Z.

    2018-05-01

    With the development of the digital city, digital applications are increasingly widespread, while urban buildings are increasingly complex. Therefore, establishing an effective data model is the key to expressing urban building models accurately. In addition, combining 3D building models with remote sensing data has become a trend in building the digital city, but it produces a large amount of data and results in data redundancy. In order to overcome the limitations of single modelling with constructive solid geometry (CSG), this paper presents a mixed modelling method based on SCSG-BR for urban building representation. On one hand, the improved CSG method, called the "Spatial CSG (SCSG)" representation method, is used to represent the exterior shape of urban buildings. On the other hand, the boundary representation (BR) method represents the topological relationships between the geometric elements of urban buildings, in which textures are treated as attribute data of the walls and roofs. Furthermore, a method combining a file database and a relational database is used to manage the data of the three-dimensional building model, which simplifies texture mapping. During data processing, a constrained least-squares algorithm is used to orthogonalize the building polygons and adjust their topology to ensure the accuracy of the modelling data. Finally, the urban building model is matched with the corresponding orthophoto. Data from Denver, Colorado, USA are used to establish a realistic urban building model. The results show that the SCSG-BR method can represent the topological relations of buildings more precisely. The organization and management of the urban building model data reduce data redundancy and improve modelling speed. The combination of the orthophoto and the urban building model further strengthens applications in view analysis and spatial query, which broadens the scope of digital city applications.

  10. A SIGNIFICANCE TEST FOR THE LASSO

    PubMed Central

    Lockhart, Richard; Taylor, Jonathan; Tibshirani, Ryan J.; Tibshirani, Robert

    2014-01-01

    In the sparse linear regression setting, we consider testing the significance of the predictor variable that enters the current lasso model, in the sequence of models visited along the lasso solution path. We propose a simple test statistic based on lasso fitted values, called the covariance test statistic, and show that when the true model is linear, this statistic has an Exp(1) asymptotic distribution under the null hypothesis (the null being that all truly active variables are contained in the current lasso model). Our proof of this result for the special case of the first predictor to enter the model (i.e., testing for a single significant predictor variable against the global null) requires only weak assumptions on the predictor matrix X. On the other hand, our proof for a general step in the lasso path places further technical assumptions on X and the generative model, but still allows for the important high-dimensional case p > n, and does not necessarily require that the current lasso model achieves perfect recovery of the truly active variables. Of course, for testing the significance of an additional variable between two nested linear models, one typically uses the chi-squared test, comparing the drop in residual sum of squares (RSS) to a χ₁² distribution. But when this additional variable is not fixed, and has been chosen adaptively or greedily, this test is no longer appropriate: adaptivity makes the drop in RSS stochastically much larger than χ₁² under the null hypothesis. Our analysis explicitly accounts for adaptivity, as it must, since the lasso builds an adaptive sequence of linear models as the tuning parameter λ decreases. In this analysis, shrinkage plays a key role: though additional variables are chosen adaptively, the coefficients of lasso active variables are shrunken due to the ℓ1 penalty. Therefore, the test statistic (which is based on lasso fitted values) is in a sense balanced by these two opposing properties—adaptivity and shrinkage—and its null distribution is tractable and asymptotically Exp(1). PMID:25574062
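    A minimal simulation illustrating the problem the covariance test addresses: under the global null, the drop in RSS from an adaptively chosen "best" predictor is stochastically much larger than a χ₁² variable. The setup is illustrative and is not the paper's test statistic itself.

```python
# Hedged sketch: adaptivity inflates the RSS drop relative to chi-squared_1.
import numpy as np

rng = np.random.default_rng(0)
n, p, reps = 100, 50, 2000
drops = np.empty(reps)
for r in range(reps):
    X = rng.standard_normal((n, p))
    X /= np.linalg.norm(X, axis=0)          # unit-norm columns
    y = rng.standard_normal(n)              # global null: no true signal
    drops[r] = np.max((X.T @ y) ** 2)       # RSS drop for the best single predictor
print("mean adaptive RSS drop:", round(drops.mean(), 2))   # far above E[chi2_1] = 1
```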

  11. Ontology for Life-Cycle Modeling of Electrical Distribution Systems: Application of Model View Definition Attributes

    DTIC Science & Technology

    2013-06-01

    Building information exchange (COBie), Building Information Modeling (BIM)... to develop a life-cycle building model have resulted in the definition of a "core" building information model that contains general information de... develop an information-exchange Model View Definition (MVD) for building electrical systems. The objective of the current work was to document the

  12. Statistical mechanics of unsupervised feature learning in a restricted Boltzmann machine with binary synapses

    NASA Astrophysics Data System (ADS)

    Huang, Haiping

    2017-05-01

    Revealing hidden features in unlabeled data is called unsupervised feature learning, which plays an important role in pretraining a deep neural network. Here we provide a statistical mechanics analysis of the unsupervised learning in a restricted Boltzmann machine with binary synapses. A message passing equation to infer the hidden feature is derived, and furthermore, variants of this equation are analyzed. A statistical analysis by replica theory describes the thermodynamic properties of the model. Our analysis confirms an entropy crisis preceding the non-convergence of the message passing equation, suggesting a discontinuous phase transition as a key characteristic of the restricted Boltzmann machine. A continuous phase transition is also confirmed, depending on the embedded feature strength in the data. The mean-field result under the replica symmetric assumption agrees with that obtained by running message passing algorithms on single instances of finite sizes. Interestingly, in an approximate Hopfield model, the entropy crisis is absent, and a continuous phase transition is observed instead. We also develop an iterative equation to infer the hyper-parameter (temperature) hidden in the data, which in physics corresponds to iteratively imposing the Nishimori condition. Our study provides insights towards understanding the thermodynamic properties of restricted Boltzmann machine learning and, moreover, an important theoretical basis for building simplified deep networks.
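    A minimal sketch of a restricted Boltzmann machine with ±1 units and binary (±1) synapses, sampled by alternating conditional updates of the hidden and visible layers. The sizes, 1/√N scaling and Gibbs sampling scheme are illustrative assumptions; the paper's analysis proceeds via message passing and replica theory rather than sampling.

```python
# Hedged sketch: block Gibbs sampling in a small RBM with binary (+/-1) synapses.
import numpy as np

rng = np.random.default_rng(42)
n_vis, n_hid, beta = 20, 10, 1.0
W = rng.choice([-1.0, 1.0], size=(n_vis, n_hid))    # binary synapses
v = rng.choice([-1.0, 1.0], size=n_vis)

def sample_pm1(field):
    # P(s = +1 | field) for an Ising spin in a local field at inverse temperature beta.
    p_plus = 1.0 / (1.0 + np.exp(-2.0 * beta * field))
    return np.where(rng.random(field.shape) < p_plus, 1.0, -1.0)

for _ in range(200):                               # alternate conditional updates
    h = sample_pm1(v @ W / np.sqrt(n_vis))         # hidden given visible
    v = sample_pm1(W @ h / np.sqrt(n_vis))         # visible given hidden
print("sample magnetisations:", v.mean(), h.mean())
```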

  13. Identifying Galactic Cosmic Ray Origins With Super-TIGER

    NASA Technical Reports Server (NTRS)

    deNolfo, Georgia; Binns, W. R.; Israel, M. H.; Christian, E. R.; Mitchell, J. W.; Hams, T.; Link, J. T.; Sasaki, M.; Labrador, A. W.; Mewaldt, R. A.; hide

    2009-01-01

    Super-TIGER (Super Trans-Iron Galactic Element Recorder) is a new long-duration balloon-borne instrument designed to test and clarify an emerging model of cosmic-ray origins and models for the atomic processes by which nuclei are selected for acceleration. A sensitive test of the origin of cosmic rays is the measurement of ultra-heavy elemental abundances (Z ≥ 30). Super-TIGER is a large-area (5 sq m) instrument designed to measure the elements in the interval 30 ≤ Z ≤ 42 with individual-element resolution and high statistical precision, and to make exploratory measurements through Z = 60. It will also measure with high statistical accuracy the energy spectra of the more abundant elements in the interval 14 ≤ Z ≤ 30 at energies 0.8 ≤ E ≤ 10 GeV/nucleon. These spectra will give a sensitive test of the hypothesis that microquasars or other sources could superpose spectral features on the otherwise smooth energy spectra previously measured with less statistical accuracy. Super-TIGER builds on the heritage of the smaller TIGER, which produced the first well-resolved measurements of the elemental abundances of gallium (Z = 31), germanium (Z = 32), and selenium (Z = 34). We present the Super-TIGER design, schedule, and progress to date, and discuss the relevance of ultra-heavy (UH) measurements to cosmic-ray origins.

  14. On the Statistical Errors of RADAR Location Sensor Networks with Built-In Wi-Fi Gaussian Linear Fingerprints

    PubMed Central

    Zhou, Mu; Xu, Yu Bin; Ma, Lin; Tian, Shuo

    2012-01-01

    The expected errors of RADAR sensor networks with linear probabilistic location fingerprints inside buildings with varying Wi-Fi Gaussian strength are discussed. As far as we know, the statistical errors of equal and unequal-weighted RADAR networks have been suggested as a better way to evaluate the behavior of different system parameters and the deployment of reference points (RPs). However, up to now, there is still not enough related work on the relations between the statistical errors, system parameters, number and interval of the RPs, let alone calculating the correlated analytical expressions of concern. Therefore, in response to this compelling problem, under a simple linear distribution model, much attention will be paid to the mathematical relations of the linear expected errors, number of neighbors, number and interval of RPs, parameters in logarithmic attenuation model and variations of radio signal strength (RSS) at the test point (TP) with the purpose of constructing more practical and reliable RADAR location sensor networks (RLSNs) and also guaranteeing the accuracy requirements for the location based services in future ubiquitous context-awareness environments. Moreover, the numerical results and some real experimental evaluations of the error theories addressed in this paper will also be presented for our future extended analysis. PMID:22737027
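    A minimal sketch of RADAR-style fingerprint localization: an offline radio map of mean RSS values at reference points (RPs) generated from a log-distance path-loss model, a Gaussian-noisy online reading at a test point, and an unequal-weighted k-nearest-neighbour estimate of position. The geometry, path-loss parameters and weighting are illustrative assumptions, not the analytical error expressions derived in the paper.

```python
# Hedged sketch: Wi-Fi RSS fingerprinting with weighted k-NN position estimation.
import numpy as np

rng = np.random.default_rng(3)
rp_xy = np.array([[x, y] for x in range(0, 20, 4) for y in range(0, 20, 4)], float)
ap_xy = np.array([[0, 0], [20, 0], [0, 20], [20, 20]], float)   # access points

def mean_rss(p):
    # Log-distance path-loss model (illustrative parameters: -40 dBm at 1 m, exponent 3).
    d = np.linalg.norm(ap_xy - p, axis=1) + 1e-9
    return -40.0 - 10.0 * 3.0 * np.log10(d)

fingerprints = np.array([mean_rss(p) for p in rp_xy])            # offline radio map
true_xy = np.array([7.0, 11.0])
observed = mean_rss(true_xy) + rng.normal(0, 2.0, size=4)        # noisy online reading

dist = np.linalg.norm(fingerprints - observed, axis=1)
nearest = np.argsort(dist)[:4]                                   # k = 4 neighbours
w = 1.0 / (dist[nearest] + 1e-9)                                 # unequal weights
estimate = (rp_xy[nearest] * w[:, None]).sum(0) / w.sum()
print("estimated position:", estimate, "error:", np.linalg.norm(estimate - true_xy))
```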

  15. On the statistical errors of RADAR location sensor networks with built-in Wi-Fi Gaussian linear fingerprints.

    PubMed

    Zhou, Mu; Xu, Yu Bin; Ma, Lin; Tian, Shuo

    2012-01-01

    The expected errors of RADAR sensor networks with linear probabilistic location fingerprints inside buildings with varying Wi-Fi Gaussian strength are discussed. As far as we know, the statistical errors of equal and unequal-weighted RADAR networks have been suggested as a better way to evaluate the behavior of different system parameters and the deployment of reference points (RPs). However, up to now, there is still not enough related work on the relations between the statistical errors, system parameters, number and interval of the RPs, let alone calculating the correlated analytical expressions of concern. Therefore, in response to this compelling problem, under a simple linear distribution model, much attention will be paid to the mathematical relations of the linear expected errors, number of neighbors, number and interval of RPs, parameters in logarithmic attenuation model and variations of radio signal strength (RSS) at the test point (TP) with the purpose of constructing more practical and reliable RADAR location sensor networks (RLSNs) and also guaranteeing the accuracy requirements for the location based services in future ubiquitous context-awareness environments. Moreover, the numerical results and some real experimental evaluations of the error theories addressed in this paper will also be presented for our future extended analysis.

  16. Ensemble Statistical Post-Processing of the National Air Quality Forecast Capability: Enhancing Ozone Forecasts in Baltimore, Maryland

    NASA Technical Reports Server (NTRS)

    Garner, Gregory G.; Thompson, Anne M.

    2013-01-01

    An ensemble statistical post-processor (ESP) is developed for the National Air Quality Forecast Capability (NAQFC) to address the unique challenges of forecasting surface ozone in Baltimore, MD. Air quality and meteorological data were collected from the eight monitors that constitute the Baltimore forecast region. These data were used to build the ESP using a moving-block bootstrap, regression tree models, and extreme-value theory. The ESP was evaluated using a 10-fold cross-validation to avoid evaluation with the same data used in the development process. Results indicate that the ESP is conditionally biased, likely due to slight overfitting while training the regression tree models. When viewed from the perspective of a decision-maker, the ESP provides a wealth of additional information previously not available through the NAQFC alone. The user is provided the freedom to tailor the forecast to the decision at hand by using decision-specific probability thresholds that define a forecast for an ozone exceedance. Taking advantage of the ESP, the user not only receives an increase in value over the NAQFC, but also receives value for…
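    A minimal sketch of the moving-block bootstrap step used in building such an ensemble: resample overlapping blocks of a daily series to preserve short-range temporal dependence. The block length and the synthetic ozone series are illustrative assumptions.

```python
# Hedged sketch: moving-block bootstrap of a daily ozone-like series.
import numpy as np

def moving_block_bootstrap(series, block_len, rng):
    n = len(series)
    blocks = [series[i:i + block_len] for i in range(n - block_len + 1)]
    out = []
    while len(out) < n:                       # concatenate randomly drawn blocks
        out.extend(blocks[rng.integers(len(blocks))])
    return np.array(out[:n])

rng = np.random.default_rng(7)
ozone = 40 + 10 * np.sin(np.arange(365) * 2 * np.pi / 365) + rng.normal(0, 5, 365)
resamples = np.array([moving_block_bootstrap(ozone, 15, rng) for _ in range(200)])
print("spread of resampled annual maxima:", round(resamples.max(axis=1).std(), 2))
```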

  17. Data analytics for simplifying thermal efficiency planning in cities.

    PubMed

    Abdolhosseini Qomi, Mohammad Javad; Noshadravan, Arash; Sobstyl, Jake M; Toole, Jameson; Ferreira, Joseph; Pellenq, Roland J-M; Ulm, Franz-Josef; Gonzalez, Marta C

    2016-04-01

    More than 44% of building energy consumption in the USA is used for space heating and cooling, and this accounts for 20% of national CO2 emissions. This prompts the need to identify among the 130 million households in the USA those with the greatest energy-saving potential and the associated costs of the path to reach that goal. Whereas current solutions address this problem by analysing each building in detail, we herein reduce the dimensionality of the problem by simplifying the calculations of energy losses in buildings. We present a novel inference method that can be used via a ranking algorithm that allows us to estimate the potential energy saving for heating purposes. To that end, we only need consumption from records of gas bills integrated with a building's footprint. The method entails a statistical screening of the intricate interplay between weather, infrastructural and residents' choice variables to determine building gas consumption and potential savings at a city scale. We derive a general statistical pattern of consumption in an urban settlement, reducing it to a set of the most influential buildings' parameters that operate locally. By way of example, the implications are explored using records of a set of (N = 6200) buildings in Cambridge, MA, USA, which indicate that retrofitting only 16% of buildings entails a 40% reduction in gas consumption of the whole building stock. We find that the inferred heat loss rate of buildings exhibits a power-law data distribution akin to Zipf's law, which provides a means to map an optimum path for gas savings per retrofit at a city scale. These findings have implications for improving the thermal efficiency of cities' building stock, as outlined by current policy efforts seeking to reduce home heating and cooling energy consumption and lower associated greenhouse gas emissions. © 2016 The Author(s).
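    A minimal sketch of the ranking idea: order buildings by an inferred heat-loss rate and read off how small a retrofitted fraction covers a target share of total consumption. The heavy-tailed synthetic distribution stands in for the reported Zipf-like pattern; it is not the Cambridge data, so the numbers will differ from the 16%/40% figure above.

```python
# Hedged sketch: rank buildings by inferred heat loss and trace cumulative savings.
import numpy as np

rng = np.random.default_rng(0)
n_buildings = 6200
heat_loss = (rng.pareto(a=1.5, size=n_buildings) + 1.0) * 10.0   # illustrative heavy tail
order = np.argsort(heat_loss)[::-1]                               # worst buildings first
cum_share = np.cumsum(heat_loss[order]) / heat_loss.sum()
frac_retrofitted = np.searchsorted(cum_share, 0.40) / n_buildings
print(f"retrofitting {frac_retrofitted:.1%} of buildings covers 40% of total loss")
```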

  18. Kernel methods and flexible inference for complex stochastic dynamics

    NASA Astrophysics Data System (ADS)

    Capobianco, Enrico

    2008-07-01

    Approximation theory suggests that series expansions and projections represent standard tools for random process applications from both numerical and statistical standpoints. Such instruments emphasize the role of both sparsity and smoothness for compression purposes, the decorrelation power achieved in the expansion coefficients space compared to the signal space, and the reproducing kernel property when some special conditions are met. We consider these three aspects central to the discussion in this paper, and attempt to analyze the characteristics of some known approximation instruments employed in a complex application domain such as financial market time series. Volatility models are often built ad hoc, parametrically and through very sophisticated methodologies. But they can hardly deal with stochastic processes with regard to non-Gaussianity, covariance non-stationarity or complex dependence without paying a big price in terms of either model mis-specification or computational efficiency. It is thus a good idea to look at other more flexible inference tools; hence the strategy of combining greedy approximation and space dimensionality reduction techniques, which are less dependent on distributional assumptions and more targeted to achieve computationally efficient performances. Advantages and limitations of their use will be evaluated by looking at algorithmic and model building strategies, and by reporting statistical diagnostics.

  19. A data storage, retrieval and analysis system for endocrine research. [for Skylab

    NASA Technical Reports Server (NTRS)

    Newton, L. E.; Johnston, D. A.

    1975-01-01

    This retrieval system builds, updates, retrieves, and performs basic statistical analyses on blood, urine, and diet parameters for the M071 and M073 Skylab and Apollo experiments. This system permits data entry from cards to build an indexed sequential file. Programs are easily modified for specialized analyses.

  20. Ascertainment-adjusted parameter estimation approach to improve robustness against misspecification of health monitoring methods

    NASA Astrophysics Data System (ADS)

    Juesas, P.; Ramasso, E.

    2016-12-01

    Condition monitoring aims at ensuring system safety, which is a fundamental requirement for industrial applications and has become an inescapable social demand. This objective is attained by instrumenting the system and developing data analytics methods, such as statistical models, able to turn data into relevant knowledge. One difficulty is correctly estimating the parameters of those methods from time-series data. This paper suggests the use of the Weighted Distribution Theory together with the Expectation-Maximization algorithm to improve parameter estimation in statistical models with latent variables, with an application to health monitoring under uncertainty. The improvement of estimates is made possible by incorporating uncertain and possibly noisy prior knowledge on latent variables in a sound manner. The latent variables are exploited to build a degradation model of a dynamical system represented as a sequence of discrete states. Examples on Gaussian Mixture Models and Hidden Markov Models (HMM) with discrete and continuous outputs are presented on both simulated data and benchmarks using the turbofan engine datasets. A focus on the application of a discrete HMM to health monitoring under uncertainty emphasizes the interest of the proposed approach in the presence of different operating conditions and fault modes. It is shown that the proposed model exhibits high robustness in the presence of noisy and uncertain priors.
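    A minimal sketch of one common way to fold uncertain prior knowledge about latent states into EM: weighting the E-step responsibilities of a Gaussian mixture by per-observation prior beliefs. This is only a schematic analogue of the ascertainment-adjusted estimation described above, not the paper's exact formulation; the data, prior values and two-state setup are illustrative.

```python
# Hedged sketch: EM for a two-component Gaussian mixture with noisy prior beliefs
# about each observation's latent state weighting the E-step responsibilities.
import numpy as np

rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(0, 1, 200), rng.normal(3, 1, 200)])
prior_belief = np.concatenate([np.full(200, 0.8), np.full(200, 0.3)])  # P(state 0), noisy

mu, sigma, pi = np.array([-1.0, 4.0]), np.array([1.0, 1.0]), np.array([0.5, 0.5])
for _ in range(50):
    # E-step: likelihood-based responsibilities, reweighted by the prior beliefs
    lik = pi * np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2) / sigma
    prior = np.column_stack([prior_belief, 1.0 - prior_belief])
    resp = lik * prior
    resp /= resp.sum(axis=1, keepdims=True)
    # M-step: weighted updates of means, standard deviations and mixing weights
    nk = resp.sum(axis=0)
    mu = (resp * x[:, None]).sum(axis=0) / nk
    sigma = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk)
    pi = nk / len(x)
print("means:", np.round(mu, 2), "weights:", np.round(pi, 2))
```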

  1. Geometry of behavioral spaces: A computational approach to analysis and understanding of agent based models and agent behaviors

    NASA Astrophysics Data System (ADS)

    Cenek, Martin; Dahl, Spencer K.

    2016-11-01

    Systems with non-linear dynamics frequently exhibit emergent system behavior, which is important to find and specify rigorously to understand the nature of the modeled phenomena. Through this analysis, it is possible to characterize phenomena such as how systems assemble or dissipate and what behaviors lead to specific final system configurations. Agent Based Modeling (ABM) is one of the modeling techniques used to study the interaction dynamics between a system's agents and its environment. Although the methodology of ABM construction is well understood and practiced, there are no computational, statistically rigorous, comprehensive tools to evaluate an ABM's execution. Often, a human has to observe an ABM's execution in order to analyze how the ABM functions, identify the emergent processes in the agent's behavior, or study a parameter's effect on the system-wide behavior. This paper introduces a new statistically based framework to automatically analyze agents' behavior, identify common system-wide patterns, and record the probability of agents changing their behavior from one pattern of behavior to another. We use network based techniques to analyze the landscape of common behaviors in an ABM's execution. Finally, we test the proposed framework with a series of experiments featuring increasingly emergent behavior. The proposed framework will allow computational comparison of ABM executions, exploration of a model's parameter configuration space, and identification of the behavioral building blocks in a model's dynamics.
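    A minimal sketch of the behaviour-space idea: cluster per-agent behaviour vectors into a small set of common patterns and estimate the probability of agents switching from one pattern to another between time steps. The feature construction and the choice of k-means are illustrative assumptions, not the paper's exact framework.

```python
# Hedged sketch: cluster agent behaviours and estimate a pattern-transition matrix.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(5)
n_agents, n_steps, n_feat, n_patterns = 50, 40, 4, 3
behaviour = rng.random((n_steps, n_agents, n_feat))       # per-step behaviour features

labels = KMeans(n_clusters=n_patterns, n_init=10, random_state=0).fit_predict(
    behaviour.reshape(-1, n_feat)).reshape(n_steps, n_agents)

transitions = np.zeros((n_patterns, n_patterns))
for t in range(n_steps - 1):
    for a in range(n_agents):
        transitions[labels[t, a], labels[t + 1, a]] += 1
transitions /= transitions.sum(axis=1, keepdims=True)      # row-stochastic matrix
print(np.round(transitions, 2))
```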

  2. Geometry of behavioral spaces: A computational approach to analysis and understanding of agent based models and agent behaviors.

    PubMed

    Cenek, Martin; Dahl, Spencer K

    2016-11-01

    Systems with non-linear dynamics frequently exhibit emergent system behavior, which is important to find and specify rigorously to understand the nature of the modeled phenomena. Through this analysis, it is possible to characterize phenomena such as how systems assemble or dissipate and what behaviors lead to specific final system configurations. Agent Based Modeling (ABM) is one of the modeling techniques used to study the interaction dynamics between a system's agents and its environment. Although the methodology of ABM construction is well understood and practiced, there are no computational, statistically rigorous, comprehensive tools to evaluate an ABM's execution. Often, a human has to observe an ABM's execution in order to analyze how the ABM functions, identify the emergent processes in the agent's behavior, or study a parameter's effect on the system-wide behavior. This paper introduces a new statistically based framework to automatically analyze agents' behavior, identify common system-wide patterns, and record the probability of agents changing their behavior from one pattern of behavior to another. We use network based techniques to analyze the landscape of common behaviors in an ABM's execution. Finally, we test the proposed framework with a series of experiments featuring increasingly emergent behavior. The proposed framework will allow computational comparison of ABM executions, exploration of a model's parameter configuration space, and identification of the behavioral building blocks in a model's dynamics.

  3. Building integral projection models: a user's guide.

    PubMed

    Rees, Mark; Childs, Dylan Z; Ellner, Stephen P

    2014-05-01

    In order to understand how changes in individual performance (growth, survival or reproduction) influence population dynamics and evolution, ecologists are increasingly using parameterized mathematical models. For continuously structured populations, where some continuous measure of individual state influences growth, survival or reproduction, integral projection models (IPMs) are commonly used. We provide a detailed description of the steps involved in constructing an IPM, explaining how to: (i) translate your study system into an IPM; (ii) implement your IPM; and (iii) diagnose potential problems with your IPM. We emphasize how the study organism's life cycle, and the timing of censuses, together determine the structure of the IPM kernel and important aspects of the statistical analysis used to parameterize an IPM using data on marked individuals. An IPM based on population studies of Soay sheep is used to illustrate the complete process of constructing, implementing and evaluating an IPM fitted to sample data. We then look at very general approaches to parameterizing an IPM, using a wide range of statistical techniques (e.g. maximum likelihood methods, generalized additive models, nonparametric kernel density estimators). Methods for selecting models for parameterizing IPMs are briefly discussed. We conclude with key recommendations and a brief overview of applications that extend the basic model. The online Supporting Information provides commented R code for all our analyses. © 2014 The Authors. Journal of Animal Ecology published by John Wiley & Sons Ltd on behalf of British Ecological Society.
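    A minimal sketch of the midpoint-rule ("big matrix") discretisation of an IPM kernel and its asymptotic growth rate. The survival, growth and fecundity functions below are illustrative stand-ins for regressions fitted to data on marked individuals; the sketch is in Python for consistency with the other examples here, whereas the paper's Supporting Information provides R code.

```python
# Hedged sketch: discretise K(z', z) = s(z) G(z'|z) + F(z'|z) on a midpoint mesh.
import numpy as np
from scipy.stats import norm

m, z_lo, z_hi = 100, 0.0, 10.0                    # mesh size and size range
h = (z_hi - z_lo) / m
z = z_lo + h * (np.arange(m) + 0.5)               # midpoints of the size classes

surv = 1 / (1 + np.exp(-(-2.0 + 0.5 * z)))                              # logistic survival
growth = norm.pdf(z[:, None], loc=1.0 + 0.9 * z[None, :], scale=0.8)     # G(z'|z)
fec = np.exp(-1.0 + 0.3 * z) * norm.pdf(z[:, None], loc=1.5, scale=0.5)  # F(z'|z)

K = h * (growth * surv[None, :] + fec)            # discretised kernel (columns = current size)
lam = np.max(np.abs(np.linalg.eigvals(K)))        # asymptotic population growth rate
print("lambda:", round(lam, 3))
```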

  4. A Binomial Test of Group Differences with Correlated Outcome Measures

    ERIC Educational Resources Information Center

    Onwuegbuzie, Anthony J.; Levin, Joel R.; Ferron, John M.

    2011-01-01

    Building on previous arguments for why educational researchers should not provide effect-size estimates in the face of statistically nonsignificant outcomes (Robinson & Levin, 1997), Onwuegbuzie and Levin (2005) proposed a 3-step statistical approach for assessing group differences when multiple outcome measures are individually analyzed…

  5. 77 FR 15365 - Agency Information Collection Extension

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-03-15

    ... you anticipate that you will be submitting comments, but find it difficult to do so within the period.... The mailing address is Office of Survey Development and Statistical Integration, (EI-21), Forrestal... Statistical Integration, (EI-21), Forrestal Building, U.S. Department of Energy, 1000 Independence Ave. SW...

  6. 75 FR 5565 - Submission for OMB Review; Comment Request

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-02-03

    ... and dynamic construction industry. Given the importance of this industry, several of the statistical... Authorized by Building Permits, (2) Housing Starts, and (3) New One-Family Houses Sold. These statistics help... economic indicators, to estimate the number of housing units started, completed, and sold (single-family...

  7. Sodium Hydroxide Pinpoint Pressing Permeation Method for the Animal Modeling of Sick Sinus Syndrome.

    PubMed

    Geng, Naizhi; Jiang, Ning; Peng, Cailiang; Wang, Huaiping; Zhang, Shuoxin; Chen, Tianyu; Liu, Lixia; Wu, Yaping; Liu, Dandan

    2015-01-01

    Sodium hydroxide pinpoint pressing permeation (SHPPP) was investigated as a way to build a rat model of sick sinus syndrome (SSS) that is easy to perform, allows the degree of damage to be controlled, causes fewer complications, and is applicable to both large and small animals. Thirty healthy Wistar rats (15 males and 15 females, weighing 250-350 g) were randomly divided into 3 groups, namely a formaldehyde thoracotomy wet compressing (FTWC) group, a formaldehyde pinpoint pressing permeation (FPPP) group, and an SHPPP group. The number of surviving rats, heart rate (HR), sinoatrial node recovery time (SNRT), corrected SNRT (CSNRT), and sinoatrial conduction time (SACT) were recorded 3 days, one week, and two weeks after modeling. The achievement ratio of modeling was 10% in the FTWC group, 40% in the FPPP group, and 70% in the SHPPP group, and the differences were statistically significant (χ² = 7.250, P = 0.007). Meanwhile, the HR was reduced by about 37% in all 3 groups 3 days after modeling, while the reduction was maintained only in the SHPPP group (P > 0.05); the HR was re-elevated in the FTWC and FPPP groups 2 weeks after modeling (P < 0.05). Additionally, the SNRT, CSNRT, and SACT were significantly prolonged compared with pre-modeling values in all 3 groups (P < 0.01). SHPPP was the best method with which to build an SSS model with a stable and lasting low HR and a high success rate of modeling, which might be helpful for further studies on SSS mechanisms and drugs.

  8. A Computing Infrastructure for Supporting Climate Studies

    NASA Astrophysics Data System (ADS)

    Yang, C.; Bambacus, M.; Freeman, S. M.; Huang, Q.; Li, J.; Sun, M.; Xu, C.; Wojcik, G. S.; Cahalan, R. F.; NASA Climate @ Home Project Team

    2011-12-01

    Climate change is one of the major challenges facing the planet in the 21st century. Scientists build many models to simulate the past and predict climate change for the next decades or century. Most of the models run at low resolution, with some targeting high resolution in support of practical climate change preparedness. To calibrate and validate the models, millions of model runs are needed to find the best simulation and configuration. This paper introduces the NASA Climate@Home project effort to build a supercomputer based on advanced computing technologies, such as cloud computing, grid computing, and others. The Climate@Home computing infrastructure includes several aspects: 1) a cloud computing platform is utilized to manage potential spikes in access to the centralized components, such as the grid computing server for dispatching model runs and collecting results; 2) a grid computing engine based on MapReduce is developed to dispatch models and model configurations, and to collect simulation results and contribution statistics; 3) a portal serves as the entry point to the project, providing management, sharing, and data exploration for end users; 4) scientists can access customized tools to configure model runs and visualize model results; 5) the public can follow the project on Twitter and Facebook for the latest news. This paper will introduce the latest progress of the project and demonstrate the operational system during the AGU Fall Meeting. It will also discuss how this technology can become a trailblazer for other climate studies and relevant sciences, and share how the challenges in computation and software integration were solved.

  9. Complementarity of Historic Building Information Modelling and Geographic Information Systems

    NASA Astrophysics Data System (ADS)

    Yang, X.; Koehl, M.; Grussenmeyer, P.; Macher, H.

    2016-06-01

    In this paper, we discuss the potential of integrating both semantically rich models from Building Information Modelling (BIM) and Geographical Information Systems (GIS) to build the detailed 3D historic model. BIM contributes to the creation of a digital representation having all physical and functional building characteristics in several dimensions, as e.g. XYZ (3D), time and non-architectural information that are necessary for construction and management of buildings. GIS has potential in handling and managing spatial data especially exploring spatial relationships and is widely used in urban modelling. However, when considering heritage modelling, the specificity of irregular historical components makes it problematic to create the enriched model according to its complex architectural elements obtained from point clouds. Therefore, some open issues limiting the historic building 3D modelling will be discussed in this paper: how to deal with the complex elements composing historic buildings in BIM and GIS environment, how to build the enriched historic model, and why to construct different levels of details? By solving these problems, conceptualization, documentation and analysis of enriched Historic Building Information Modelling are developed and compared to traditional 3D models aimed primarily for visualization.

  10. 7 CFR Exhibit E to Subpart A of... - Voluntary National Model Building Codes

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... 7 Agriculture 12 2013-01-01 2013-01-01 false Voluntary National Model Building Codes E Exhibit E... National Model Building Codes The following documents address the health and safety aspects of buildings and related structures and are voluntary national model building codes as defined in § 1924.4(h)(2) of...

  11. 7 CFR Exhibit E to Subpart A of... - Voluntary National Model Building Codes

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... 7 Agriculture 12 2014-01-01 2013-01-01 true Voluntary National Model Building Codes E Exhibit E to... Model Building Codes The following documents address the health and safety aspects of buildings and related structures and are voluntary national model building codes as defined in § 1924.4(h)(2) of this...

  12. 7 CFR Exhibit E to Subpart A of... - Voluntary National Model Building Codes

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... 7 Agriculture 12 2012-01-01 2012-01-01 false Voluntary National Model Building Codes E Exhibit E... National Model Building Codes The following documents address the health and safety aspects of buildings and related structures and are voluntary national model building codes as defined in § 1924.4(h)(2) of...

  13. Multivariate statistical analysis of a high rate biofilm process treating kraft mill bleach plant effluent.

    PubMed

    Goode, C; LeRoy, J; Allen, D G

    2007-01-01

    This study reports on a multivariate analysis of the moving bed biofilm reactor (MBBR) wastewater treatment system at a Canadian pulp mill. The modelling approach involved a data overview by principal component analysis (PCA) followed by partial least squares (PLS) modelling with the objective of explaining and predicting changes in the BOD output of the reactor. Over two years of data with 87 process measurements were used to build the models. Variables were collected from the MBBR control scheme as well as upstream in the bleach plant and in digestion. To account for process dynamics, a variable lagging approach was used for variables with significant temporal correlations. It was found that wood type pulped at the mill was a significant variable governing reactor performance. Other important variables included flow parameters, faults in the temperature or pH control of the reactor, and some potential indirect indicators of biomass activity (residual nitrogen and pH out). The most predictive model was found to have an RMSEP value of 606 kgBOD/d, representing a 14.5% average error. This was a good fit, given the measurement error of the BOD test. Overall, the statistical approach was effective in describing and predicting MBBR treatment performance.
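    A minimal sketch of the PLS step: regress an effluent quality target on process measurements augmented with lagged copies of selected variables to capture dynamics, and evaluate by RMSEP on held-out data. The synthetic data, lag choices and component count are illustrative assumptions, not the mill's variables.

```python
# Hedged sketch: PLS regression with lagged predictors, evaluated by RMSEP.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2)
n, p = 700, 20
X = rng.standard_normal((n, p))
y = X[:, 0] + 0.5 * np.roll(X[:, 1], 2) + rng.normal(0, 0.3, n)   # depends on a lagged input

def add_lags(X, cols, lags):
    # Append lagged copies of selected columns as extra predictors.
    lagged = [np.roll(X[:, c], l) for c in cols for l in lags]
    return np.column_stack([X] + lagged)

X_aug = add_lags(X, cols=[0, 1, 2], lags=[1, 2])[5:]   # drop rows affected by roll wrap-around
y = y[5:]
X_tr, X_te, y_tr, y_te = train_test_split(X_aug, y, test_size=0.3, random_state=0)
pls = PLSRegression(n_components=5).fit(X_tr, y_tr)
rmsep = np.sqrt(np.mean((pls.predict(X_te).ravel() - y_te) ** 2))
print("RMSEP:", round(rmsep, 3))
```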

  14. Model-Driven Energy Intelligence

    DTIC Science & Technology

    2015-03-01

    building information model (BIM) for operations... estimate of the potential impact on energy performance at Fort Jackson. 15. SUBJECT TERMS: Building Information Modeling (BIM), Energy, ECMs, monitoring... dimensional; AHU: Air Handling Unit; API: Application Programming Interface; BIM: building information model; BLCC: Building Life Cycle Cost

  15. Statistical Learning Theory for High Dimensional Prediction: Application to Criterion-Keyed Scale Development

    PubMed Central

    Chapman, Benjamin P.; Weiss, Alexander; Duberstein, Paul

    2016-01-01

    Statistical learning theory (SLT) is the statistical formulation of machine learning theory, a body of analytic methods common in “big data” problems. Regression-based SLT algorithms seek to maximize predictive accuracy for some outcome, given a large pool of potential predictors, without overfitting the sample. Research goals in psychology may sometimes call for high dimensional regression. One example is criterion-keyed scale construction, where a scale with maximal predictive validity must be built from a large item pool. Using this as a working example, we first introduce a core principle of SLT methods: minimization of expected prediction error (EPE). Minimizing EPE is fundamentally different than maximizing the within-sample likelihood, and hinges on building a predictive model of sufficient complexity to predict the outcome well, without undue complexity leading to overfitting. We describe how such models are built and refined via cross-validation. We then illustrate how three common SLT algorithms–Supervised Principal Components, Regularization, and Boosting—can be used to construct a criterion-keyed scale predicting all-cause mortality, using a large personality item pool within a population cohort. Each algorithm illustrates a different approach to minimizing EPE. Finally, we consider broader applications of SLT predictive algorithms, both as supportive analytic tools for conventional methods, and as primary analytic tools in discovery phase research. We conclude that despite their differences from the classic null-hypothesis testing approach—or perhaps because of them–SLT methods may hold value as a statistically rigorous approach to exploratory regression. PMID:27454257
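    A minimal sketch of one of the SLT-style approaches mentioned above (regularization): cross-validated L1-penalised logistic regression that selects items from a large pool while controlling expected prediction error through cross-validation. The simulated item pool and binary outcome are illustrative, not the personality cohort data.

```python
# Hedged sketch: criterion-keyed item selection via cross-validated L1 logistic regression.
import numpy as np
from sklearn.linear_model import LogisticRegressionCV

rng = np.random.default_rng(4)
n, n_items = 1000, 300
items = rng.integers(1, 6, size=(n, n_items)).astype(float)    # Likert-type item responses
true_beta = np.zeros(n_items); true_beta[:10] = 0.4             # only 10 informative items
logit = (items - 3.0) @ true_beta - 1.0
outcome = rng.random(n) < 1 / (1 + np.exp(-logit))               # simulated binary criterion

clf = LogisticRegressionCV(Cs=10, cv=5, penalty="l1", solver="saga",
                           max_iter=5000).fit(items, outcome)
selected = np.flatnonzero(clf.coef_[0])                          # items retained for the scale
print(len(selected), "items retained; first few:", selected[:10])
```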

  16. Statistical Analyses for Probabilistic Assessments of the Reactor Pressure Vessel Structural Integrity: Building a Master Curve on an Extract of the 'Euro' Fracture Toughness Dataset, Controlling Statistical Uncertainty for Both Mono-Temperature and multi-temperature tests

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Josse, Florent; Lefebvre, Yannick; Todeschini, Patrick

    2006-07-01

    Assessing the structural integrity of a nuclear Reactor Pressure Vessel (RPV) subjected to pressurized-thermal-shock (PTS) transients is extremely important to safety. In addition to conventional deterministic calculations to confirm RPV integrity, Electricite de France (EDF) carries out probabilistic analyses. Probabilistic analyses are interesting because some key variables, albeit conventionally taken at conservative values, can be modeled more accurately through statistical variability. One variable which significantly affects RPV structural integrity assessment is cleavage fracture initiation toughness. The reference fracture toughness method currently in use at EDF is the RCCM and ASME Code lower-bound K_Ic based on the indexing parameter RT_NDT. However, in order to quantify the toughness scatter for probabilistic analyses, the master curve method is being analyzed at present. Furthermore, the master curve method is a direct means of evaluating fracture toughness based on K_Jc data. In the framework of the master curve investigation undertaken by EDF, this article deals with the following two statistical items: building a master curve from an extract of a fracture toughness dataset (from the European project 'Unified Reference Fracture Toughness Design curves for RPV Steels') and controlling statistical uncertainty for both mono-temperature and multi-temperature tests. Concerning the first point, master curve temperature dependence is empirical in nature. To determine the 'original' master curve, Wallin postulated that a unified description of fracture toughness temperature dependence for ferritic steels is possible, and used a large number of data corresponding to nuclear-grade pressure vessel steels and welds. Our working hypothesis is that some ferritic steels may behave in slightly different ways. Therefore we focused exclusively on the basic French reactor vessel metal of types A508 Class 3 and A533 grade B Class 1, taking the sampling level and direction into account as well as the test specimen type. As for the second point, the emphasis is placed on the uncertainties in applying the master curve approach. For a toughness dataset based on different specimens of a single product, application of the master curve methodology requires the statistical estimation of one parameter: the reference temperature T_0. Because of the limited number of specimens, estimation of this temperature is uncertain. The ASTM standard provides a rough evaluation of this statistical uncertainty through an approximate confidence interval. In this paper, a thorough study is carried out to build more meaningful confidence intervals (for both mono-temperature and multi-temperature tests). These results ensure better control over uncertainty, and allow rigorous analysis of the impact of its influencing factors: the number of specimens and the temperatures at which they have been tested. (authors)
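    A minimal sketch of the master curve median toughness relation that underlies the approach discussed above, evaluated for a candidate reference temperature. The formula K_Jc(med) = 30 + 70·exp[0.019(T - T0)] (MPa·sqrt(m)) is the standard Wallin/ASTM E1921 median curve to the best of our understanding; the statistical estimation of T_0 and the confidence intervals discussed in the paper are not reproduced here.

```python
# Hedged sketch: Wallin master curve median fracture toughness vs temperature.
import numpy as np

def kjc_median(T, T0):
    # Median K_Jc in MPa*sqrt(m) for a candidate reference temperature T0 (deg C).
    return 30.0 + 70.0 * np.exp(0.019 * (T - T0))

T = np.array([-150, -100, -50, 0])      # test temperatures, deg C
print(np.round(kjc_median(T, T0=-80.0), 1))
```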

  17. Exploring complex dynamics in multi agent-based intelligent systems: Theoretical and experimental approaches using the Multi Agent-based Behavioral Economic Landscape (MABEL) model

    NASA Astrophysics Data System (ADS)

    Alexandridis, Konstantinos T.

    This dissertation adopts a holistic and detailed approach to modeling spatially explicit agent-based artificial intelligent systems, using the Multi Agent-based Behavioral Economic Landscape (MABEL) model. The research questions it addresses stem from the need to understand and analyze the real-world patterns and dynamics of land use change from a coupled human-environmental systems perspective. It describes the systemic, mathematical, statistical, socio-economic and spatial dynamics of the MABEL modeling framework, and provides a wide array of cross-disciplinary modeling applications within the research, decision-making and policy domains. It establishes the symbolic properties of the MABEL model as a Markov decision process, analyzes the decision-theoretic utility and optimization attributes of agents towards comprising statistically and spatially optimal policies and actions, and explores the probabilogic character of the agents' decision-making and inference mechanisms via the use of Bayesian belief and decision networks. It develops and describes a Monte Carlo methodology for experimental replications of agents' decisions regarding complex spatial parcel acquisition and learning. It recognizes the gap in spatially-explicit accuracy assessment techniques for complex spatial models, and proposes an ensemble of statistical tools designed to address this problem. Advanced information assessment techniques such as the Receiver Operating Characteristic curve, the impurity entropy and Gini functions, and the Bayesian classification functions are proposed. The theoretical foundation for modular Bayesian inference in spatially-explicit multi-agent artificial intelligent systems, and the ensembles of cognitive and scenario assessment modular tools built for the MABEL model, are provided. It emphasizes modularity and robustness as valuable qualitative modeling attributes, and examines the role of robust intelligent modeling as a tool for improving policy decisions related to land use change. Finally, the major contributions to science are presented along with valuable directions for future research.

  18. Inlining 3d Reconstruction, Multi-Source Texture Mapping and Semantic Analysis Using Oblique Aerial Imagery

    NASA Astrophysics Data System (ADS)

    Frommholz, D.; Linkiewicz, M.; Poznanska, A. M.

    2016-06-01

    This paper proposes an in-line method for the simplified reconstruction of city buildings from nadir and oblique aerial images that at the same time are being used for multi-source texture mapping with minimal resampling. Further, the resulting unrectified texture atlases are analyzed for façade elements like windows to be reintegrated into the original 3D models. Tests on real-world data of Heligoland, Germany, comprising more than 800 buildings exposed a median positional deviation of 0.31 m at the façades compared to the cadastral map, a correctness of 67% for the detected windows and good visual quality when being rendered with GPU-based perspective correction. As part of the process, building reconstruction takes the oriented input images and transforms them into dense point clouds by semi-global matching (SGM). The point sets undergo local RANSAC-based regression and topology analysis to detect adjacent planar surfaces and determine their semantics. Based on this information, the roof, wall and ground surfaces found get intersected and limited in their extension to form a closed 3D building hull. For texture mapping, the hull polygons are projected into each possible input bitmap to find suitable color sources regarding the coverage and resolution. Occlusions are detected by ray-casting a full-scale digital surface model (DSM) of the scene and stored in pixel-precise visibility maps. These maps are used to derive overlap statistics and radiometric adjustment coefficients to be applied when the visible image parts for each building polygon are being copied into a compact texture atlas without resampling whenever possible. The atlas bitmap is passed to a commercial object-based image analysis (OBIA) tool running a custom rule set to identify windows on the contained façade patches. Following multi-resolution segmentation and classification based on brightness and contrast differences, potential window objects are evaluated against geometric constraints and conditionally grown, fused and filtered morphologically. The output polygons are vectorized and reintegrated into the previously reconstructed buildings by sparsely ray-tracing their vertices. Finally, the enhanced 3D models get stored as textured geometry for visualization and semantically annotated "LOD-2.5" CityGML objects for GIS applications.

  19. Molecular Dynamic Simulation of Water Vapor and Determination of Diffusion Characteristics in the Pore

    NASA Astrophysics Data System (ADS)

    Nikonov, Eduard G.; Pavluš, Miron; Popovičová, Mária

    2018-02-01

    One variety of pore often found in natural or artificial building materials is the so-called blind pore of dead-end or saccate type. A three-dimensional model of this kind of pore has been developed in this work. The model has been used to simulate the interaction of water vapor with an individual pore by molecular dynamics in combination with the diffusion equation method. Special investigations have been carried out to find dependencies between thermostat implementations and the conservation of thermodynamic and statistical properties of the water vapor - pore system. Two types of evolution of the water - pore system have been investigated: drying and wetting of the pore. A full study of the diffusion coefficient, diffusion velocity, and other diffusion parameters has been carried out.

  20. Implicit Wiener series analysis of epileptic seizure recordings.

    PubMed

    Barbero, Alvaro; Franz, Matthias; van Drongelen, Wim; Dorronsoro, José R; Schölkopf, Bernhard; Grosse-Wentrup, Moritz

    2009-01-01

    Implicit Wiener series are a powerful tool to build Volterra representations of time series with any degree of non-linearity. A natural question is then whether higher order representations yield more useful models. In this work we shall study this question for ECoG data channel relationships in epileptic seizure recordings, considering whether quadratic representations yield more accurate classifiers than linear ones. To do so we first show how to derive statistical information on the Volterra coefficient distribution and how to construct seizure classification patterns over that information. As our results illustrate, a quadratic model seems to provide no advantages over a linear one. Nevertheless, we shall also show that the interpretability of the implicit Wiener series provides insights into the inter-channel relationships of the recordings.

  1. Probabilistic, sediment-geochemical parameterisation of the groundwater compartment of the Netherlands for spatially distributed, reactive transport modelling

    NASA Astrophysics Data System (ADS)

    Janssen, Gijs; Gunnink, Jan; van Vliet, Marielle; Goldberg, Tanya; Griffioen, Jasper

    2017-04-01

    Pollution of groundwater aquifers with contaminants such as nitrate is a common problem. Reactive transport models are useful to predict the fate of such contaminants and to characterise the efficiency of mitigating or preventive measures. Parameterisation of a groundwater transport model on reaction capacity is a necessary step in building the model. Two Dutch, national programs are combined to establish a methodology for building a probabilistic model on reaction capacity of the groundwater compartment at the national scale: the Geological Survey program and the NHI Netherlands Hydrological Instrument program. Reaction capacity is considered as a series of geochemical characteristics that control acid/base condition, redox condition and sorption capacity. Five primary reaction capacity variables are characterised: 1. pyrite, 2. non-pyrite, reactive iron (oxides, siderite and glauconite), 3. clay fraction, 4. organic matter and 5. Ca-carbonate. Important reaction capacity variables that are determined by more than one solid compound are also deduced: 1. potential reduction capacity (PRC) by pyrite and organic matter, 2. cation-exchange capacity (CEC) by organic matter and clay content, 3. carbonate buffering upon pyrite oxidation (CPBO) by carbonate and pyrite. Statistical properties of these variables are established based on c. 16,000 sediment geochemical analyses. The first tens of meters are characterised based on 25 regions using combinations of lithological class and geological formation as strata. Because of both less data and more geochemical uniformity, the deeper subsurface is characterised in a similar way based on 3 regions. The statistical data is used as input in an algorithm that probabilistically calculates the reaction capacity per grid cell. First, the cumulative frequency distribution (cfd) functions are calculated from the statistical data for the geochemical strata. Second, all voxel cells are classified into the geochemical strata. Third, the cfd functions are used to assign random reaction capacity values to the hydrological voxel model. Here, the distribution can be conditioned on two variables. Two important variables are clay content and depth. The first is valid because denser data are available for clay content than for geochemical variables such as pyrite, and probabilistic lithological models are also built at TNO Geological Survey. The second is important to account for locally different depths at which the redox cline between NO3-rich and Fe(II)-rich groundwater occurs within the first tens of meters of the subsurface. An extensive data-set of groundwater quality analyses is used to derive criteria for depth variability of the redox cline. The result is a unique algorithm for obtaining heterogeneous geochemical reaction capacity models of the entire groundwater compartment of the Netherlands.
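
    A minimal sketch of the probabilistic assignment step described above (per-stratum empirical CDFs sampled by inverse transform) is given below; the strata names, measurements, and voxel labels are hypothetical, and this is not the TNO algorithm itself.

```python
# Minimal sketch: inverse-transform sampling of a reaction-capacity variable
# from per-stratum empirical CDFs. Strata, measurements, and voxels are invented.
import numpy as np

def empirical_cdf_sampler(measurements):
    """Return a function that draws values from the empirical distribution."""
    sorted_vals = np.sort(np.asarray(measurements, dtype=float))
    def draw(rng, n=1):
        u = rng.uniform(0.0, 1.0, size=n)       # uniform quantiles
        return np.quantile(sorted_vals, u)       # inverse empirical CDF
    return draw

# Hypothetical per-stratum pyrite contents (wt%), e.g. from sediment analyses.
strata_data = {"clay_formationA": [0.1, 0.3, 0.5, 1.2, 2.0],
               "sand_formationB": [0.0, 0.0, 0.1, 0.2, 0.4]}
samplers = {k: empirical_cdf_sampler(v) for k, v in strata_data.items()}

rng = np.random.default_rng(42)
voxel_strata = ["clay_formationA", "sand_formationB", "clay_formationA"]
pyrite = np.array([samplers[s](rng, 1)[0] for s in voxel_strata])
print(pyrite)  # one probabilistic realisation of pyrite content per voxel
```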

  2. Contam airflow models of three large buildings: Model descriptions and validation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Black, Douglas R.; Price, Phillip N.

    2009-09-30

    Airflow and pollutant transport models are useful for several reasons, including protection from or response to biological terrorism. In recent years they have been used for deciding how many biological agent samplers are needed in a given building to detect the release of an agent; to figure out where those samplers should be located; to predict the number of people at risk in the event of a release of a given size and location; to devise response strategies in the event of a release; to determine optimal trade-offs between sampler characteristics (such as detection limit and response time); and so on. For some of these purposes it is necessary to model a specific building of interest: if you are trying to determine optimal sampling locations, you must have a model of your building and not some different building. But for many purposes generic or 'prototypical' building models would suffice. For example, for determining trade-offs between sampler characteristics, results from one building will carry over to other, similar buildings. Prototypical building models are also useful for comparing or testing different algorithms or computational approaches: different researchers can use the same models, thus allowing direct comparison of results in a way that is not otherwise possible. This document discusses prototypical building models developed by the Airflow and Pollutant Transport Group at Lawrence Berkeley National Laboratory. The models are implemented in the Contam v2.4c modeling program, available from the National Institute of Standards and Technology. We present Contam airflow models of three virtual buildings: a convention center, an airport terminal, and a multi-story office building. All of the models are based to some extent on specific real buildings. Our goal is to produce models that are realistic, in terms of approximate magnitudes, directions, and speeds of airflow and pollutant transport. The three models vary substantially in detail. The airport model is the simplest; the convention center model is more detailed; and the large office building model is quite complicated. We give several simplified floor plans in this document, to explain basic features of the buildings. The actual models are somewhat more complicated; for instance, spaces that are represented as rectangles in this document sometimes have more complicated shapes in the models. (However, note that the shape of a zone is irrelevant in Contam). Consult the Contam models themselves for detailed floor plans. Each building model is provided with three ventilation conditions, representing mechanical systems in which 20%, 50%, or 80% of the building air is recirculated and the rest is provided from outdoors. Please see the section on 'Use of the models' for important information about issues to consider if you wish to modify the models to provide no mechanical ventilation or eliminate provision of outdoor air.

  3. Estimation of the amount of asbestos-cement roofing in Poland.

    PubMed

    Wilk, Ewa; Krówczyńska, Małgorzata; Pabjanek, Piotr; Mędrzycki, Piotr

    2017-05-01

    A unique set of physical and chemical properties has led to many industrial applications of asbestos worldwide; one of them was roof covering. Asbestos is harmful to human health, and therefore its use was legally forbidden. Since there are no adequate data in Poland on the amount of asbestos-cement roofing, the objective of this study was to estimate its quantity on the basis of physical inventory taking with the use of aerial imagery, and the application of selected statistical features. Data pre-processing and analysis were executed in the R Statistical Environment v. 3.1.0. The best random forest models were computed; the model explaining 72.9% of the variance was subsequently used to prepare the prediction map of the amount of asbestos-cement roofing in Poland. Variables defining the number of farms, the number and age of buildings, and regional differences were crucial for the analysis. The total amount of asbestos roofing in Poland was estimated at 738,068,000 m2 (8.2 million tonnes). This estimate is crucial for the landfill development programme, financial resources distribution, and application of monitoring policies.
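
    The study's workflow (random forest regression with variance explained reported for the best model) can be sketched roughly as follows; the original analysis was done in R, so this Python/scikit-learn version with invented predictors is only an illustrative analogue.

```python
# Hedged sketch: random forest regression with out-of-bag "variance explained",
# analogous to the workflow described above. Predictors and data are invented.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n = 500
X = np.column_stack([rng.poisson(30, n),              # e.g., number of farms
                     rng.poisson(200, n),             # number of buildings
                     rng.integers(1900, 2010, n)])    # proxy for building age
y = 50 * X[:, 0] + 5 * X[:, 1] + rng.normal(0, 200, n)  # roofing area (m2)

rf = RandomForestRegressor(n_estimators=500, oob_score=True, random_state=0)
rf.fit(X, y)
# The out-of-bag R^2 plays the role of "variance explained" in the abstract.
print("OOB variance explained:", round(rf.oob_score_, 3))
print("feature importances:", rf.feature_importances_)
```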

  4. A Predictive Approach to Network Reverse-Engineering

    NASA Astrophysics Data System (ADS)

    Wiggins, Chris

    2005-03-01

    A central challenge of systems biology is the "reverse engineering" of transcriptional networks: inferring which genes exert regulatory control over which other genes. Attempting such inference at the genomic scale has only recently become feasible, via data-intensive biological innovations such as DNA microarrays ("DNA chips") and the sequencing of whole genomes. In this talk we present a predictive approach to network reverse-engineering, in which we integrate DNA chip data and sequence data to build a model of the transcriptional network of the yeast S. cerevisiae capable of predicting the response of genes in unseen experiments. The technique can also be used to extract "motifs," sequence elements which act as binding sites for regulatory proteins. We validate by a number of approaches and present a comparison of theoretical prediction vs. experimental data, along with biological interpretations of the resulting model. En route, we will illustrate some basic notions in statistical learning theory (fitting vs. over-fitting; cross-validation; assessing statistical significance), highlighting ways in which physicists can make a unique contribution in data-driven approaches to reverse engineering.

  5. Little Green Lies: Dissecting the Hype of Renewables

    DTIC Science & Technology

    2011-05-11

    [Slide excerpt; figure residue] Sources: 2009 BP Statistical Energy Analysis; US Energy Information Administration. Per capita energy use (kg oil equivalent): World 1,819; USA 7,766. Energy trends (source: 2006 BP Statistical Energy Analysis): Oil 37%, Coal 25%, Gas 23%, Nuclear 6%, Biomass 4%, Hydro 3%, Wind (value truncated in the original record).

  6. Probing the Statistical Validity of the Ductile-to-Brittle Transition in Metallic Nanowires Using GPU Computing.

    PubMed

    French, William R; Pervaje, Amulya K; Santos, Andrew P; Iacovella, Christopher R; Cummings, Peter T

    2013-12-10

    We perform a large-scale statistical analysis (>2000 independent simulations) of the elongation and rupture of gold nanowires, probing the validity and scope of the recently proposed ductile-to-brittle transition that occurs with increasing nanowire length [Wu et al. Nano Lett. 2012, 12, 910-914]. To facilitate a high-throughput simulation approach, we implement the second-moment approximation to the tight-binding (TB-SMA) potential within HOOMD-Blue, a molecular dynamics package which runs on massively parallel graphics processing units (GPUs). In a statistical sense, we find that the nanowires obey the ductile-to-brittle model quite well; however, we observe several unexpected features from the simulations that build on our understanding of the ductile-to-brittle transition. First, occasional failure behavior is observed that qualitatively differs from the model's prediction; this is attributed to stochastic thermal motion of the Au atoms and occurs at temperatures as low as 10 K. In addition, we also find that the ductile-to-brittle model, which was developed using classical dislocation theory, holds for nanowires as small as 3 nm in diameter. Finally, we demonstrate that the nanowire critical length is higher at 298 K relative to 10 K, a result that is not predicted by the ductile-to-brittle model. These results offer practical design strategies for adjusting nanowire failure and structure and also demonstrate that GPU computing is an excellent tool for studies requiring a large number of independent trajectories in order to fully characterize a system's behavior.

  7. Analyzing Data for Systems Biology: Working at the Intersection of Thermodynamics and Data Analytics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Cannon, William R.; Baxter, Douglas J.

    2012-08-15

    Many challenges in systems biology have to do with analyzing data within the framework of molecular phenomena and cellular pathways. How does this relate to thermodynamics that we know govern the behavior of molecules? Making progress in relating data analysis to thermodynamics is essential in systems biology if we are to build predictive models that enable the field of synthetic biology. This report discusses work at the crossroads of thermodynamics and data analysis, and demonstrates that statistical mechanical free energy is a multinomial log likelihood. Applications to systems biology are presented.

  8. A Systems Approach to High Performance Buildings: A Computational Systems Engineering R&D Program to Increase DoD Energy Efficiency

    DTIC Science & Technology

    2012-02-01

    [Table-of-contents excerpt; full abstract not available in the record] Reduced-Order Modeling and Control Design for Low Energy Building Ventilation and Space Conditioning Systems; Building Energy Models; Appendix D: Reduced-Order Modeling and Control Design for Low Energy Building Systems. The section focuses on the modeling and control of airflow in buildings.

  9. Application of pedagogy reflective in statistical methods course and practicum statistical methods

    NASA Astrophysics Data System (ADS)

    Julie, Hongki

    2017-08-01

    The subjects Elementary Statistics, Statistical Methods, and Statistical Methods Practicum aim to equip Mathematics Education students with descriptive and inferential statistics. Understanding descriptive and inferential statistics is important for students in the Mathematics Education Department, especially for those whose final projects involve quantitative research. In quantitative research, students are required to present and describe quantitative data in an appropriate manner, to draw conclusions from those data, and to relate the independent and dependent variables defined in their research. In practice, when students carried out final projects involving quantitative research, it was still not rare to find them making mistakes in the steps of drawing conclusions and errors in choosing the hypothesis-testing procedure; as a result, they reached incorrect conclusions, a serious mistake for anyone doing quantitative research. Several outcomes were gained from implementing reflective pedagogy in the teaching and learning process of the Statistical Methods and Statistical Methods Practicum courses, namely: 1. Twenty-two students passed the course and one student did not. 2. The highest grade was an A, achieved by 18 students. 3. According to all students, they were able to develop their critical stance and build care for each other through the learning process in this course. 4. All students agreed that, through the learning process they underwent in the course, they could build care for each other.

  10. Building Hybrid Rover Models for NASA: Lessons Learned

    NASA Technical Reports Server (NTRS)

    Willeke, Thomas; Dearden, Richard

    2004-01-01

    Particle filters have recently become popular for diagnosis and monitoring of hybrid systems. In this paper we describe our experiences using particle filters on a real diagnosis problem, the NASA Ames Research Center's K-9 rover. As well as the challenge of modelling the dynamics of the system, there are two major issues in applying a particle filter to such a model. The first is the asynchronous nature of the system: observations from different subsystems arrive at different rates, and occasionally out of order, leading to large amounts of uncertainty in the state of the system. The second issue is data interpretation. The particle filter produces a probability distribution over the state of the system, from which summary statistics that can be used for control or higher-level diagnosis must be extracted. We describe our approaches to both these problems, as well as other modelling issues that arose in this domain.
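
    A minimal bootstrap particle filter, the general technique referred to above, is sketched below for a one-dimensional toy state; the dynamics, noise levels, and observation model are invented and do not represent the K-9 rover model.

```python
# Minimal bootstrap particle filter sketch for a 1-D toy state. All dynamics
# and noise values are invented for illustration.
import numpy as np

def particle_filter(observations, n_particles=1000, seed=0):
    rng = np.random.default_rng(seed)
    particles = rng.normal(0.0, 1.0, n_particles)    # initial state belief
    means = []
    for z in observations:
        # propagate: simple random-walk process model
        particles = particles + rng.normal(0.0, 0.1, n_particles)
        # weight by observation likelihood (Gaussian sensor model)
        w = np.exp(-0.5 * ((z - particles) / 0.5) ** 2)
        w /= w.sum()
        # resample (multinomial resampling, for brevity)
        idx = rng.choice(n_particles, size=n_particles, p=w)
        particles = particles[idx]
        means.append(particles.mean())               # summary statistic for control
    return np.array(means)

obs = np.linspace(0, 1, 20) + np.random.default_rng(1).normal(0, 0.5, 20)
print(particle_filter(obs)[-1])
```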

  11. Hidden Markov models incorporating fuzzy measures and integrals for protein sequence identification and alignment.

    PubMed

    Bidargaddi, Niranjan P; Chetty, Madhu; Kamruzzaman, Joarder

    2008-06-01

    Profile hidden Markov models (HMMs) based on classical HMMs have been widely applied for protein sequence identification. The formulation of the forward and backward variables in profile HMMs is made under the statistical independence assumption of probability theory. We propose a fuzzy profile HMM to overcome the limitations of that assumption and to achieve an improved alignment for protein sequences belonging to a given family. The proposed model fuzzifies the forward and backward variables by incorporating Sugeno fuzzy measures and Choquet integrals, thus further extending the generalized HMM. Based on the fuzzified forward and backward variables, we propose a fuzzy Baum-Welch parameter estimation algorithm for profiles. The strong correlations and the sequence preference involved in the protein structures make this fuzzy-architecture-based model a suitable candidate for building profiles of a given family, since the fuzzy set can handle uncertainties better than classical methods.
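
    For reference, the classical (non-fuzzy) forward-variable recursion that the fuzzy profile HMM generalizes can be sketched as follows, using a toy two-state model rather than a real profile HMM.

```python
# Sketch of the classical forward algorithm; the 2-state model and emission
# probabilities are toy values, not a real profile HMM.
import numpy as np

def forward(obs, pi, A, B):
    """Return P(observations) via the forward algorithm.
    pi: initial state probs (S,), A: transitions (S,S), B: emissions (S,K)."""
    alpha = pi * B[:, obs[0]]                 # initialisation
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]         # induction step
    return alpha.sum()                        # termination

pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3],
              [0.4, 0.6]])
B = np.array([[0.5, 0.5],                     # state 0 emission probs
              [0.1, 0.9]])                    # state 1 emission probs
print(forward([0, 1, 1, 0], pi, A, B))
```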

  12. Probabilistic Survivability Versus Time Modeling

    NASA Technical Reports Server (NTRS)

    Joyner, James J., Sr.

    2016-01-01

    This presentation documents Kennedy Space Center's Independent Assessment work completed on three assessments for the Ground Systems Development and Operations (GSDO) Program to assist the Chief Safety and Mission Assurance Officer during key programmatic reviews and provided the GSDO Program with analyses of how egress time affects the likelihood of astronaut and ground worker survival during an emergency. For each assessment, a team developed probability distributions for hazard scenarios to address statistical uncertainty, resulting in survivability plots over time. The first assessment developed a mathematical model of probabilistic survivability versus time to reach a safe location using an ideal Emergency Egress System at Launch Complex 39B (LC-39B); the second used the first model to evaluate and compare various egress systems under consideration at LC-39B. The third used a modified LC-39B model to determine if a specific hazard decreased survivability more rapidly than other events during flight hardware processing in Kennedy's Vehicle Assembly Building.

  13. Microscale Obstacle Resolving Air Quality Model Evaluation with the Michelstadt Case

    PubMed Central

    Rakai, Anikó; Kristóf, Gergely

    2013-01-01

    Modelling pollutant dispersion in cities is challenging for air quality models as the urban obstacles have an important effect on the flow field and thus the dispersion. Computational Fluid Dynamics (CFD) models with an additional scalar dispersion transport equation are a possible way to resolve the flowfield in the urban canopy and model dispersion taking into consideration the effect of the buildings explicitly. These models need detailed evaluation with the method of verification and validation to gain confidence in their reliability and use them as a regulatory purpose tool in complex urban geometries. This paper shows the performance of an open source general purpose CFD code, OpenFOAM for a complex urban geometry, Michelstadt, which has both flow field and dispersion measurement data. Continuous release dispersion results are discussed to show the strengths and weaknesses of the modelling approach, focusing on the value of the turbulent Schmidt number, which was found to give best statistical metric results with a value of 0.7. PMID:24027450
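
    For context, a common Reynolds-averaged form of the scalar dispersion transport equation behind such CFD set-ups, written here in a generic textbook form that may differ in detail from the equation solved in the paper, is

    \[
    \frac{\partial C}{\partial t} + \nabla\cdot(\mathbf{u}\,C)
      = \nabla\cdot\!\left[\left(\frac{\nu}{Sc} + \frac{\nu_t}{Sc_t}\right)\nabla C\right] + S_C,
    \qquad Sc_t \approx 0.7,
    \]

    where C is the mean concentration, u the mean velocity, ν and ν_t the molecular and turbulent viscosities, Sc and Sc_t the molecular and turbulent Schmidt numbers, and S_C a source term.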

  14. Microscale obstacle resolving air quality model evaluation with the Michelstadt case.

    PubMed

    Rakai, Anikó; Kristóf, Gergely

    2013-01-01

    Modelling pollutant dispersion in cities is challenging for air quality models as the urban obstacles have an important effect on the flow field and thus the dispersion. Computational Fluid Dynamics (CFD) models with an additional scalar dispersion transport equation are a possible way to resolve the flowfield in the urban canopy and model dispersion taking into consideration the effect of the buildings explicitly. These models need detailed evaluation with the method of verification and validation to gain confidence in their reliability and use them as a regulatory purpose tool in complex urban geometries. This paper shows the performance of an open source general purpose CFD code, OpenFOAM for a complex urban geometry, Michelstadt, which has both flow field and dispersion measurement data. Continuous release dispersion results are discussed to show the strengths and weaknesses of the modelling approach, focusing on the value of the turbulent Schmidt number, which was found to give best statistical metric results with a value of 0.7.

  15. The Shock and Vibration Digest, Volume 17, Number 8

    DTIC Science & Technology

    1985-08-01

    [Fragmented digest excerpt] Procedures are based on acoustic power flow, statistical energy analysis (SEA), and modal methods. Entry 85-1642: Statistical Energy Analysis, Structural Resonances, and Beam Networks; L.J. Lee, Heriot-Watt Univ., Chambers St., Edinburgh EH1 1HX, Scotland. Keywords: building acoustics, statistical energy methods, structural resonance.

  16. voomDDA: discovery of diagnostic biomarkers and classification of RNA-seq data.

    PubMed

    Zararsiz, Gokmen; Goksuluk, Dincer; Klaus, Bernd; Korkmaz, Selcuk; Eldem, Vahap; Karabulut, Erdem; Ozturk, Ahmet

    2017-01-01

    RNA-Seq is a recent and efficient technique that uses the capabilities of next-generation sequencing technology for characterizing and quantifying transcriptomes. One important task using gene-expression data is to identify a small subset of genes that can be used to build diagnostic classifiers particularly for cancer diseases. Microarray based classifiers are not directly applicable to RNA-Seq data due to its discrete nature. Overdispersion is another problem that requires careful modeling of mean and variance relationship of the RNA-Seq data. In this study, we present voomDDA classifiers: variance modeling at the observational level (voom) extensions of the nearest shrunken centroids (NSC) and the diagonal discriminant classifiers. VoomNSC is one of these classifiers and brings voom and NSC approaches together for the purpose of gene-expression based classification. For this purpose, we propose weighted statistics and put these weighted statistics into the NSC algorithm. The VoomNSC is a sparse classifier that models the mean-variance relationship using the voom method and incorporates voom's precision weights into the NSC classifier via weighted statistics. A comprehensive simulation study was designed and four real datasets are used for performance assessment. The overall results indicate that voomNSC performs as the sparsest classifier. It also provides the most accurate results together with power-transformed Poisson linear discriminant analysis, rlog transformed support vector machines and random forests algorithms. In addition to prediction purposes, the voomNSC classifier can be used to identify the potential diagnostic biomarkers for a condition of interest. Through this work, statistical learning methods proposed for microarrays can be reused for RNA-Seq data. An interactive web application is freely available at http://www.biosoft.hacettepe.edu.tr/voomDDA/.

  17. On statistical inference in time series analysis of the evolution of road safety.

    PubMed

    Commandeur, Jacques J F; Bijleveld, Frits D; Bergel-Hayat, Ruth; Antoniou, Constantinos; Yannis, George; Papadimitriou, Eleonora

    2013-11-01

    Data collected for building a road safety observatory usually include observations made sequentially through time. Examples of such data, called time series data, include the annual (or monthly) number of road traffic accidents, traffic fatalities or vehicle kilometers driven in a country, as well as the corresponding values of safety performance indicators (e.g., data on speeding, seat belt use, alcohol use, etc.). Some commonly used statistical techniques imply assumptions that are often violated by the special properties of time series data, namely serial dependency among disturbances associated with the observations. The first objective of this paper is to demonstrate the impact of such violations on the applicability of standard methods of statistical inference, which leads to an under- or overestimation of the standard error and consequently may produce erroneous inferences. Moreover, having established the adverse consequences of ignoring serial dependency issues, the paper aims to describe rigorous statistical techniques used to overcome them. In particular, appropriate time series analysis techniques of varying complexity are employed to describe the development over time, relating the accident occurrences to explanatory factors such as exposure measures or safety performance indicators, and forecasting the development into the near future. Traditional regression models (whether they are linear, generalized linear or nonlinear) are shown not to naturally capture the inherent dependencies in time series data. Dedicated time series analysis techniques, such as the ARMA-type and DRAG approaches, are discussed next, followed by structural time series models, which are a subclass of state space methods. The paper concludes with general recommendations and practice guidelines for the use of time series models in road safety research. Copyright © 2012 Elsevier Ltd. All rights reserved.
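
    The abstract's first point, that serial dependence invalidates the usual iid standard errors, can be illustrated with a small simulation; the AR(1) process and parameter values below are arbitrary and purely for demonstration.

```python
# Toy demonstration (not from the paper): with positively autocorrelated AR(1)
# data, the usual iid standard error of the mean understates the true
# sampling variability.
import numpy as np

def ar1_series(n, phi, rng):
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.normal()
    return x

rng = np.random.default_rng(0)
n, phi, reps = 200, 0.8, 2000
means = np.array([ar1_series(n, phi, rng).mean() for _ in range(reps)])
iid_se = ar1_series(n, phi, rng).std(ddof=1) / np.sqrt(n)   # naive iid formula
print("empirical SD of the mean:", means.std().round(3))
print("naive iid standard error:", iid_se.round(3))          # noticeably smaller
```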

  18. Nesting behavior of house mice (Mus domesticus) selected for increased wheel-running activity.

    PubMed

    Carter, P A; Swallow, J G; Davis, S J; Garland, T

    2000-03-01

    Nest building was measured in "active" (housed with access to running wheels) and "sedentary" (without wheel access) mice (Mus domesticus) from four replicate lines selected for 10 generations for high voluntary wheel-running behavior, and from four randombred control lines. Based on previous studies of mice bidirectionally selected for thermoregulatory nest building, it was hypothesized that nest building would show a negative correlated response to selection on wheel-running. Such a response could constrain the evolution of high voluntary activity because nesting has also been shown to be positively genetically correlated with successful production of weaned pups. With wheel access, selected mice of both sexes built significantly smaller nests than did control mice. Without wheel access, selected females also built significantly smaller nests than did control females, but only when body mass was excluded from the statistical model, suggesting that body mass mediated this correlated response to selection. Total distance run and mean running speed on wheels was significantly higher in selected mice than in controls, but no differences in amount of time spent running were measured, indicating a complex cause of the response of nesting to selection for voluntary wheel running.

  19. Temporal Characteristics of Electron Flux Events at Geosynchronous Orbit

    NASA Astrophysics Data System (ADS)

    Olson, D. K.; Larsen, B.; Henderson, M. G.

    2017-12-01

    Geosynchronous satellites such as the LANL-GEO fleet are exposed to hazardous conditions when they encounter regions of hot, intense plasma such as that from the plasma sheet. These conditions can lead to the build-up of charge on the surface of a spacecraft, with undesired, and often dangerous, side effects. Observation of electron flux levels at geosynchronous orbit (GEO) with multiple satellites provides a unique view of plasma sheet access to that region. Flux "events", or periods when fluxes are elevated continuously above the LANL-GEO spacecraft charging threshold, can be characterized by duration in two dimensions: a spatial dimension of local time, describing the duration of an event from the perspective of a single spacecraft, and a temporal dimension describing the duration in time in which high energy plasma sheet particles have access to geosynchronous orbit. We examine the statistical properties of the temporal duration of 8 keV electron flux events at geosynchronous orbit over a twelve-year period. These results, coupled with the spatial duration characteristics, provide the key information needed to formulate a statistical model for forecasting the electron flux conditions at GEO that are correlated with LANL-GEO surface charging. Forecasting models are an essential component of understanding space weather and mitigating the dangers of surface charging on our satellites. We also examine the correlation of flux event durations with solar wind parameters and geomagnetic indices, identifying the data needed to improve upon a statistical forecasting model.

  20. 4D-Fingerprint Categorical QSAR Models for Skin Sensitization Based on Classification Local Lymph Node Assay Measures

    PubMed Central

    Li, Yi; Tseng, Yufeng J.; Pan, Dahua; Liu, Jianzhong; Kern, Petra S.; Gerberick, G. Frank; Hopfinger, Anton J.

    2008-01-01

    Currently, the only validated methods to identify skin sensitization effects are in vivo models, such as the Local Lymph Node Assay (LLNA) and guinea pig studies. There is a tremendous need, in particular due to novel legislation, to develop animal alternatives, e.g., Quantitative Structure-Activity Relationship (QSAR) models. Here, QSAR models for skin sensitization using LLNA data have been constructed. The descriptors used to generate these models are derived from the 4D-molecular similarity paradigm and are referred to as universal 4D-fingerprints. A training set of 132 structurally diverse compounds and a test set of 15 structurally diverse compounds were used in this study. The statistical methodologies used to build the models are logistic regression (LR) and partial least squares coupled logistic regression (PLS-LR), which prove to be effective tools for studying skin sensitization measures expressed in the two categorical terms of sensitizer and non-sensitizer. QSAR models with low values of the Hosmer-Lemeshow goodness-of-fit statistic, χ²_HL, are significant and predictive. For the training set, the cross-validated prediction accuracy of the logistic regression models ranges from 77.3% to 78.0%, while that of PLS-logistic regression models ranges from 87.1% to 89.4%. For the test set, the prediction accuracy of logistic regression models ranges from 80.0% to 86.7%, while that of PLS-logistic regression models ranges from 73.3% to 80.0%. The QSAR models are made up of 4D-fingerprints related to aromatic atoms, hydrogen bond acceptors and negatively partially charged atoms. PMID:17226934
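
    As a rough illustration of the statistical machinery named above (logistic classification plus a Hosmer-Lemeshow goodness-of-fit check), the sketch below uses simulated descriptors; it is not the paper's 4D-fingerprint pipeline.

```python
# Hedged sketch: logistic classifier for binary sensitizer/non-sensitizer labels
# plus a hand-rolled Hosmer-Lemeshow statistic. Descriptors are simulated.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(132, 10))                    # stand-in for 4D-fingerprints
p_true = 1 / (1 + np.exp(-(X[:, 0] + 0.5 * X[:, 1])))
y = rng.binomial(1, p_true)

model = LogisticRegression(max_iter=1000).fit(X, y)
p_hat = model.predict_proba(X)[:, 1]

def hosmer_lemeshow(y, p, groups=10):
    """Chi-square goodness-of-fit statistic over predicted-probability deciles."""
    order = np.argsort(p)
    chi2 = 0.0
    for g in np.array_split(order, groups):
        obs, exp = y[g].sum(), p[g].sum()
        n_g, pbar = len(g), p[g].mean()
        chi2 += (obs - exp) ** 2 / (n_g * pbar * (1 - pbar) + 1e-12)
    return chi2

print("HL chi-square:", round(hosmer_lemeshow(y, p_hat), 2))  # low => good fit
```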

  1. DECIDE: a software for computer-assisted evaluation of diagnostic test performance.

    PubMed

    Chiecchio, A; Bo, A; Manzone, P; Giglioli, F

    1993-05-01

    The evaluation of the performance of clinical tests is a complex problem involving different steps and many statistical tools, not always structured in an organic and rational system. This paper presents software that provides an organic system of statistical tools to help evaluate clinical test performance. The program allows (a) the building and organization of a working database, (b) the selection of the minimal set of tests with the maximum information content, (c) the search for the model best fitting the distribution of the test values, (d) the selection of the optimal diagnostic cut-off value of the test for every positive/negative situation, and (e) the evaluation of the performance of combinations of correlated and uncorrelated tests. The uncertainty associated with all the variables involved is evaluated. The program works in an MS-DOS environment with an EGA or higher-performance graphics card.

  2. MOLA-Based Landing Site Characterization

    NASA Technical Reports Server (NTRS)

    Duxbury, T. C.; Ivanov, A. B.

    2001-01-01

    The Mars Global Surveyor (MGS) Mars Orbiter Laser Altimeter (MOLA) data provide the basis for site characterization and selection never before possible. The basic MOLA information includes absolute radii, elevation and 1 micrometer albedo, with derived datasets including digital image models (DIMs: illuminated elevation data), slope maps and slope statistics, and small-scale surface roughness maps and statistics. These quantities are useful in downsizing potential sites from descent engineering constraints and landing/roving hazard and mobility assessments. Slope baselines at the few hundred meter level and surface roughness at the 10 meter level are possible. Additionally, the MOLA-derived Mars surface offers the possibility to precisely register and map project other instrument datasets (images, ultraviolet, infrared, radar, etc.) taken at different resolution, viewing and lighting geometry, building multiple layers of an information cube for site characterization and selection. Examples of direct MOLA data, data derived from MOLA, and other instruments' data registered to MOLA are given for the Hematite area.

  3. Effects of sampling interval on spatial patterns and statistics of watershed nitrogen concentration

    USGS Publications Warehouse

    Wu, S.-S.D.; Usery, E.L.; Finn, M.P.; Bosch, D.D.

    2009-01-01

    This study investigates how spatial patterns and statistics of a 30 m resolution, model-simulated, watershed nitrogen concentration surface change with sampling intervals from 30 m to 600 m in 30 m increments for the Little River Watershed (Georgia, USA). The results indicate that the mean, standard deviation, and variogram sills do not have consistent trends with increasing sampling intervals, whereas the variogram ranges remain constant. A sampling interval smaller than or equal to 90 m is necessary to build a representative variogram. The interpolation accuracy, clustering level, and total hot spot areas show decreasing trends approximating a logarithmic function. The trends correspond to the nitrogen variogram and start to level at a sampling interval of 360 m, which is therefore regarded as a critical spatial scale of the Little River Watershed. Copyright © 2009 by Bellwether Publishing, Ltd. All rights reserved.

  4. A statistical approach based on accumulated degree-days to predict decomposition-related processes in forensic studies.

    PubMed

    Michaud, Jean-Philippe; Moreau, Gaétan

    2011-01-01

    Using pig carcasses exposed over 3 years in rural fields during spring, summer, and fall, we studied the relationship between decomposition stages and degree-day accumulation (i) to verify the predictability of the decomposition stages used in forensic entomology to document carcass decomposition and (ii) to build a degree-day accumulation model applicable to various decomposition-related processes. Results indicate that the decomposition stages can be predicted with accuracy from temperature records and that a reliable degree-day index can be developed to study decomposition-related processes. The development of degree-day indices opens new doors for researchers and allows for the application of inferential tools unaffected by climatic variability, as well as for the inclusion of statistics in a science that is primarily descriptive and in need of validation methods in courtroom proceedings. © 2010 American Academy of Forensic Sciences.
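
    The accumulated degree-day (ADD) index underlying such models can be computed very simply; the base temperature and daily mean temperatures below are illustrative assumptions only.

```python
# Simple sketch of accumulated degree-days (ADD) from daily mean temperatures,
# the kind of thermal index related to decomposition stage. A base temperature
# of 0 C is assumed here purely for illustration.
def accumulated_degree_days(daily_mean_temps, base=0.0):
    """Sum of daily thermal units above a base temperature."""
    return sum(max(0.0, t - base) for t in daily_mean_temps)

# Example: ten spring days of carcass exposure.
temps = [4.5, 7.0, 9.5, 12.0, 10.5, 8.0, 11.0, 13.5, 15.0, 14.0]
print(accumulated_degree_days(temps))  # 105.0 ADD
```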

  5. Hybrid LCA model for assessing the embodied environmental impacts of buildings in South Korea

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jang, Minho, E-mail: minmin40@hanmail.net; Hong, Taehoon, E-mail: hong7@yonsei.ac.kr; Ji, Changyoon, E-mail: chnagyoon@yonsei.ac.kr

    2015-01-15

    The assessment of the embodied environmental impacts of buildings can help decision-makers plan environment-friendly buildings and reduce environmental impacts. For a more comprehensive assessment of the embodied environmental impacts of buildings, a hybrid life cycle assessment model was developed in this study. The developed model can assess the embodied environmental impacts (global warming, ozone layer depletion, acidification, eutrophication, photochemical ozone creation, abiotic depletion, and human toxicity) generated directly and indirectly in the material manufacturing, transportation, and construction phases. To demonstrate the application and validity of the developed model, the environmental impacts of an elementary school building were assessed using the developed model and compared with the results of a previous model used in a case study. The embodied environmental impacts from the previous model were lower than those from the developed model by 4.6–25.2%. In particular, the human toxicity potential calculated by the previous model (13 kg C6H6 eq.) was much lower than that calculated by the developed model (1965 kg C6H6 eq.). The results indicated that the developed model can quantify the embodied environmental impacts of buildings more comprehensively, and can be used by decision-makers as a tool for selecting environment-friendly buildings. - Highlights: • The model was developed to assess the embodied environmental impacts of buildings. • The model evaluates GWP, ODP, AP, EP, POCP, ADP, and HTP as environmental impacts. • The model presents more comprehensive results than the previous model by 4.6–100%. • The model can present the HTP of buildings, which the previous models cannot do. • Decision-makers can use the model for selecting environment-friendly buildings.

  6. Time as a Basic Concept for Theory Building in Social Gerontology.

    ERIC Educational Resources Information Center

    Pastorello, Thomas

    A typology of time-related concepts is put forth as a step toward the building of comprehensive theory in aging. The concepts derive from statistics (age, cohort, period effects), the theoretical writings of Sorokin (life course role sequences, durations and rates), the writings of Riley (on the synchronization of life course socialization and…

  7. 78 FR 63500 - Bureau of Labor Statistics Technical Advisory Committee; Notice of Meeting and Agenda

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-10-24

    ... meet on Friday, November 8, 2013. The meeting will be held in the Postal Square Building, 2... held in rooms 1 and 2 of the Postal Square Building Conference Center. The schedule and agenda for the.... Individuals who require special accommodations should contact Ms. Fieldhouse at least two days prior to the...

  8. Utilizing Building Usage Assessment: Determining Deployment of Student Workers in an Academic Library

    ERIC Educational Resources Information Center

    Buller, Ryan F.

    2014-01-01

    Generating, collecting, and analyzing building usage statistics can greatly increase the ability of an access services unit to meet the changing dynamic of patron needs in an academic library. By analyzing three different data points, the Access Services Unit in Malpass Library at Western Illinois University was able to determine the most…

  9. Turbulence and Air Exchange in a Two-Dimensional Urban Street Canyon Between Gable Roof Buildings

    NASA Astrophysics Data System (ADS)

    Garau, Michela; Badas, Maria Grazia; Ferrari, Simone; Seoni, Alessandro; Querzoli, Giorgio

    2018-04-01

    We experimentally investigate the effect of a typical building covering, the gable roof, on the flow and air exchange in urban canyons. In general, the morphology of the urban canopy is very varied and complex, depending on a large number of factors, such as building arrangement or the morphology of the terrain. Therefore we focus on a simple, prototypal shape, the two-dimensional canyon, with the aim of elucidating some fundamental phenomena driving street-canyon ventilation. Experiments are performed in a water channel, over an array of identical prismatic obstacles representing an idealized urban canopy. The aspect ratio, i.e. the canyon-width to building-height ratio, ranges from 1 to 6. Gable roof buildings with 1:1 pitch are compared with flat-roofed buildings. Velocity is measured using a particle-image-velocimetry technique, with flow dynamics discussed in terms of mean flow and second- and third-order statistical moments of the velocity. The ventilation is interpreted by means of a simple well-mixed box model, and the outflow rate and mean residence time are computed. Results show that gable roofs tend to delay the transition from the skimming-flow to the wake-interference regime and promote the development of a deeper and more turbulent roughness layer. The presence of a gable roof significantly increases the momentum flux, especially for high packing density. The air exchange is improved compared to the flat roof buildings, and the beneficial effect is more significant for narrow canyons. Accordingly, for unit aspect ratio, gable roofs reduce the mean residence time by a factor of 0.37 compared to flat roofs, whereas the decrease is only by a factor of 0.9 at the largest aspect ratio. Data analysis indicates that, for flat roof buildings, the mean residence time increases by 30% when the aspect ratio is decreased from 6 to 2, whereas this parameter is only weakly dependent on aspect ratio in the case of gable roofs.
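
    The well-mixed box interpretation mentioned above can be written down compactly: for a canyon of volume V ventilated at exchange flow rate Q, the mean residence time is τ = V/Q and a uniform concentration decays exponentially. The sketch below uses made-up values, not the experimental ones.

```python
# Sketch of the well-mixed box model: V dC/dt = -Q C, so C(t) = C0 exp(-t/tau)
# with mean residence time tau = V / Q. The numbers are illustrative only.
import numpy as np

def residence_time(volume, exchange_rate):
    return volume / exchange_rate            # tau = V / Q

def concentration(t, c0, tau):
    return c0 * np.exp(-t / tau)             # solution of V dC/dt = -Q C

V, Q = 120.0, 8.0                            # m^3 and m^3/s (made-up canyon values)
tau = residence_time(V, Q)
print("mean residence time:", tau, "s")
print("concentration after 30 s:", concentration(30.0, 1.0, tau))
```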

  10. Optimizing lighting, thermal performance, and energy production of building facades by using automated blinds and PV cells

    NASA Astrophysics Data System (ADS)

    Alzoubi, Hussain Hendi

    Energy consumption in buildings has recently become a major concern for environmental designers. Within this field, daylighting and solar energy design are attractive strategies for saving energy. This study seeks the integrity and the optimality of building envelopes' performance. It focuses on the transparent parts of building facades, specifically, the windows and their shading devices. It suggests a new automated method of utilizing solar energy while keeping optimal solutions for indoor daylighting. The method utilizes a statistical approach to produce mathematical equations based on physical experimentation. A full-scale mock-up representing an actual office was built. Heat gain and lighting levels were measured empirically and correlated with blind angles. Computational methods were used to estimate the power production from photovoltaic cells. Mathematical formulas were derived from the results of the experiments; these formulas were utilized to construct curves as well as mathematical equations for the purpose of optimization. The mathematical equations resulting from the optimization process were coded using Java programming language to enable future users to deal with generic locations of buildings with a broader context of various climatic conditions. For the purpose of optimization by automation under different climatic conditions, a blind control system was developed based on the findings of this study. This system calibrates the blind angles instantaneously based upon the sun position, the indoor daylight, and the power production from the photovoltaic cells. The functions of this system guarantee full control of the projected solar energy on buildings' facades for indoor lighting and heat gain. In winter, the system automatically blows heat into the space, whereas it expels heat from the space during the summer season. The study showed that the optimality of building facades' performance is achievable for integrated thermal, energy, and lighting models in buildings. There are blind angles that produce maximum energy from the photovoltaic cells while keeping indoor light within the acceptable limits that prevent undesired heat gain in summer.

  11. Development of Automated Procedures to Generate Reference Building Models for ASHRAE Standard 90.1 and India’s Building Energy Code and Implementation in OpenStudio

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Parker, Andrew; Haves, Philip; Jegi, Subhash

    This paper describes a software system for automatically generating a reference (baseline) building energy model from the proposed (as-designed) building energy model. This system is built using the OpenStudio Software Development Kit (SDK) and is designed to operate on building energy models in the OpenStudio file format.

  12. Building flexible real-time systems using the Flex language

    NASA Technical Reports Server (NTRS)

    Kenny, Kevin B.; Lin, Kwei-Jay

    1991-01-01

    The design and implementation of a real-time programming language called Flex, which is a derivative of C++, are presented. It is shown how different types of timing requirements might be expressed and enforced in Flex, how they might be fulfilled in a flexible way using different program models, and how the programming environment can help in making binding and scheduling decisions. The timing constraint primitives in Flex are easy to use yet powerful enough to define both independent and relative timing constraints. Program models like imprecise computation and performance polymorphism can carry out flexible real-time programs. In addition, programmers can use a performance measurement tool that produces statistically correct timing models to predict the expected execution time of a program and to help make binding decisions. A real-time programming environment is also presented.

  13. The interplay between cooperativity and diversity in model threshold ensembles.

    PubMed

    Cervera, Javier; Manzanares, José A; Mafe, Salvador

    2014-10-06

    The interplay between cooperativity and diversity is crucial for biological ensembles because single molecule experiments show a significant degree of heterogeneity and also for artificial nanostructures because of the high individual variability characteristic of nanoscale units. We study the cross-effects between cooperativity and diversity in model threshold ensembles composed of individually different units that show a cooperative behaviour. The units are modelled as statistical distributions of parameters (the individual threshold potentials here) characterized by central and width distribution values. The simulations show that the interplay between cooperativity and diversity results in ensemble-averaged responses of interest for the understanding of electrical transduction in cell membranes, the experimental characterization of heterogeneous groups of biomolecules and the development of biologically inspired engineering designs with individually different building blocks. © 2014 The Author(s) Published by the Royal Society. All rights reserved.

  14. Bootstrapping in a language of thought: a formal model of numerical concept learning.

    PubMed

    Piantadosi, Steven T; Tenenbaum, Joshua B; Goodman, Noah D

    2012-05-01

    In acquiring number words, children exhibit a qualitative leap in which they transition from understanding a few number words, to possessing a rich system of interrelated numerical concepts. We present a computational framework for understanding this inductive leap as the consequence of statistical inference over a sufficiently powerful representational system. We provide an implemented model that is powerful enough to learn number word meanings and other related conceptual systems from naturalistic data. The model shows that bootstrapping can be made computationally and philosophically well-founded as a theory of number learning. Our approach demonstrates how learners may combine core cognitive operations to build sophisticated representations during the course of development, and how this process explains observed developmental patterns in number word learning. Copyright © 2011 Elsevier B.V. All rights reserved.

  15. Microsimulation Modeling for Health Decision Sciences Using R: A Tutorial.

    PubMed

    Krijkamp, Eline M; Alarid-Escudero, Fernando; Enns, Eva A; Jalal, Hawre J; Hunink, M G Myriam; Pechlivanoglou, Petros

    2018-04-01

    Microsimulation models are becoming increasingly common in the field of decision modeling for health. Because microsimulation models are computationally more demanding than traditional Markov cohort models, the use of computer programming languages in their development has become more common. R is a programming language that has gained recognition within the field of decision modeling. It has the capacity to perform microsimulation models more efficiently than software commonly used for decision modeling, incorporate statistical analyses within decision models, and produce more transparent models and reproducible results. However, no clear guidance for the implementation of microsimulation models in R exists. In this tutorial, we provide a step-by-step guide to build microsimulation models in R and illustrate the use of this guide on a simple, but transferable, hypothetical decision problem. We guide the reader through the necessary steps and provide generic R code that is flexible and can be adapted for other models. We also show how this code can be extended to address more complex model structures and provide an efficient microsimulation approach that relies on vectorization solutions.
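
    A bare-bones state-transition microsimulation of the kind the tutorial builds (the tutorial itself uses R; this Python sketch with invented states and transition probabilities is only an analogue) might look like this:

```python
# Illustrative microsimulation sketch: individuals move through Healthy / Sick /
# Dead states with fixed per-cycle transition probabilities. All values invented.
import numpy as np

states = ["Healthy", "Sick", "Dead"]
P = np.array([[0.85, 0.10, 0.05],    # from Healthy
              [0.00, 0.80, 0.20],    # from Sick
              [0.00, 0.00, 1.00]])   # Dead is absorbing

def microsimulate(n_individuals=10000, n_cycles=20, seed=0):
    rng = np.random.default_rng(seed)
    state = np.zeros(n_individuals, dtype=int)       # everyone starts Healthy
    alive_cycles = np.zeros(n_individuals)
    for _ in range(n_cycles):
        alive_cycles += (state != 2)                  # accrue a cycle if not Dead
        # draw each individual's next state from the row of P for its current state
        u = rng.random(n_individuals)
        cum = P[state].cumsum(axis=1)
        state = (u[:, None] > cum).sum(axis=1)
    return alive_cycles.mean()

print("mean cycles alive:", round(microsimulate(), 2))
```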

  16. Energy Efficiency Potential in the U.S. Single-Family Housing Stock

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wilson, Eric J.; Christensen, Craig B.; Horowitz, Scott G.

    Typical approaches for assessing energy efficiency potential in buildings use a limited number of prototypes, and therefore suffer from inadequate resolution when pass-fail cost-effectiveness tests are applied, which can significantly underestimate or overestimate the economic potential of energy efficiency technologies. This analysis applies a new approach to large-scale residential energy analysis, combining the use of large public and private data sources, statistical sampling, detailed building simulations, and high-performance computing to achieve unprecedented granularity - and therefore accuracy - in modeling the diversity of the single-family housing stock. The result is a comprehensive set of maps, tables, and figures showing the technical and economic potential of 50 plus residential energy efficiency upgrades and packages for each state. Policymakers, program designers, and manufacturers can use these results to identify upgrades with the highest potential for cost-effective savings in a particular state or region, as well as help identify customer segments for targeted marketing and deployment. The primary finding of this analysis is that there is significant technical and economic potential to save electricity and on-site fuel use in the single-family housing stock. However, the economic potential is very sensitive to the cost-effectiveness criteria used for analysis. Additionally, the savings of particular energy efficiency upgrades is situation-specific within the housing stock (depending on climate, building vintage, heating fuel type, building physical characteristics, etc.).

  17. The prediction of engineering cost for green buildings based on information entropy

    NASA Astrophysics Data System (ADS)

    Liang, Guoqiang; Huang, Jinglian

    2018-03-01

    Green building is the developing trend in the world building industry, and construction costs are an essential consideration in building construction. Therefore, it is necessary to investigate the problem of cost prediction in green building. On the basis of analyzing the cost of green building, this paper proposes a method for forecasting the actual cost of green building based on information entropy and provides the forecasting working procedure. Using probability densities obtained from statistical data, such as the labor costs, material costs, machinery costs, administration costs, profits, and risk costs of a unit project quotation, situations that lead to variations between budgeted cost and actual cost in construction can be predicted by estimating the information entropy of the budgeted and actual costs. The research results have practical significance for cost control of green building. Additionally, the proposed method can be generalized and applied to a variety of other aspects of building management.
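
    A minimal sketch of the entropy comparison, under the assumption that cost components are converted into share distributions and their Shannon entropy is compared between budgeted and actual figures; the component names and numbers below are invented, and the paper's exact procedure may differ.

    ```python
    # Hypothetical sketch: estimate Shannon entropy of budgeted vs. actual
    # cost shares across cost components. Figures are invented; the paper's
    # exact forecasting procedure may differ.
    import numpy as np

    def shannon_entropy(values):
        """Entropy (in bits) of the share distribution implied by `values`."""
        p = np.asarray(values, dtype=float)
        p = p / p.sum()
        p = p[p > 0]                       # ignore zero shares
        return float(-(p * np.log2(p)).sum())

    components = ["labor", "material", "machinery", "administration", "profit", "risk"]
    budgeted = [120, 340, 80, 40, 50, 20]  # invented unit-project quotation
    actual   = [150, 360, 70, 55, 45, 35]  # invented realized costs

    H_budget = shannon_entropy(budgeted)
    H_actual = shannon_entropy(actual)
    print(f"H(budgeted) = {H_budget:.3f} bits, H(actual) = {H_actual:.3f} bits, "
          f"difference = {abs(H_actual - H_budget):.3f}")
    ```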

  18. Building a database for statistical characterization of ELMs on DIII-D

    NASA Astrophysics Data System (ADS)

    Fritch, B. J.; Marinoni, A.; Bortolon, A.

    2017-10-01

    Edge localized modes (ELMs) are bursty instabilities which occur in the edge region of H-mode plasmas and have the potential to damage in-vessel components of future fusion machines by exposing the divertor region to large energy and particle fluxes during each ELM event. While most ELM studies focus on average quantities (e.g. energy loss per ELM), this work investigates the statistical distributions of ELM characteristics as a function of plasma parameters. A semi-automatic algorithm is being used to create a database documenting the trigger times of tens of thousands of ELMs in DIII-D discharges in scenarios relevant to ITER, thus allowing statistically significant analysis. Probability distributions of inter-ELM periods and energy losses will be determined and related to relevant plasma parameters such as density, stored energy, and current in order to constrain models and improve estimates of the expected inter-ELM periods and sizes, both of which must be controlled in future reactors. Work supported in part by US DoE under the Science Undergraduate Laboratory Internships (SULI) program, DE-FC02-04ER54698 and DE-FG02-94ER54235.
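
    A rough sketch of the kind of threshold-based trigger-time detection and inter-ELM period statistics described above, applied to a synthetic D-alpha-like trace; the sampling rate, threshold, hold-off, and signal itself are invented for illustration and are not the DIII-D algorithm.

    ```python
    # Hypothetical sketch of ELM trigger-time detection and inter-ELM period
    # statistics on a synthetic D-alpha-like trace; all parameters and the
    # signal itself are invented for illustration.
    import numpy as np

    rng = np.random.default_rng(1)
    fs = 10_000.0                                    # sample rate [Hz], assumed
    t = np.arange(0, 2.0, 1 / fs)                    # 2 s window
    signal = 0.05 * rng.standard_normal(t.size)      # baseline noise

    # Inject synthetic ELM bursts at quasi-regular times
    elm_times = np.cumsum(rng.normal(0.012, 0.003, size=150))
    elm_times = elm_times[elm_times < t[-1]]
    for t0 in elm_times:
        signal += np.exp(-((t - t0) / 2e-4) ** 2)    # short Gaussian spike

    # Simple threshold-crossing detector with a refractory hold-off
    threshold, holdoff = 0.5, int(2e-3 * fs)
    above = signal > threshold
    rising = np.flatnonzero(above[1:] & ~above[:-1]) + 1
    triggers = []
    for i in rising:
        if not triggers or i - triggers[-1] > holdoff:
            triggers.append(i)
    triggers = np.array(triggers) / fs               # trigger times [s]

    periods = np.diff(triggers) * 1e3                # inter-ELM periods [ms]
    print(f"{triggers.size} ELMs detected, "
          f"mean period {periods.mean():.1f} ms, std {periods.std():.1f} ms")
    ```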

  19. Empirical evidence for acceleration-dependent amplification factors

    USGS Publications Warehouse

    Borcherdt, R.D.

    2002-01-01

    Site-specific amplification factors, Fa and Fv, used in current U.S. building codes decrease with increasing base acceleration level as implied by the Loma Prieta earthquake at 0.1g and extrapolated using numerical models and laboratory results. The Northridge earthquake recordings of 17 January 1994 and subsequent geotechnical data permit empirical estimates of amplification at base acceleration levels up to 0.5g. Distance measures and normalization procedures used to infer amplification ratios from soil-rock pairs in predetermined azimuth-distance bins significantly influence the dependence of amplification estimates on base acceleration. Factors inferred using a hypocentral distance norm do not show a statistically significant dependence on base acceleration. Factors inferred using norms implied by the attenuation functions of Abrahamson and Silva show a statistically significant decrease with increasing base acceleration. The decrease is statistically more significant for stiff clay and sandy soil (site class D) sites than for stiffer sites underlain by gravelly soils and soft rock (site class C). The decrease in amplification with increasing base acceleration is more pronounced for the short-period amplification factor, Fa, than for the midperiod factor, Fv.
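
    As a rough illustration of the amplification analysis, the sketch below regresses the log amplification of synthetic soil-rock recording pairs on log base acceleration and reports whether the decreasing trend is statistically significant; the data are simulated, not Northridge recordings.

    ```python
    # Hypothetical sketch of testing whether an amplification factor decreases
    # with base (rock) acceleration: regress log-amplification of soil-rock
    # station pairs on log base acceleration. All data below are synthetic.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(2)
    n_pairs = 200

    rock_pga = 10 ** rng.uniform(-1.3, -0.3, n_pairs)   # base acceleration [g]
    # Synthetic nonlinear site response: amplification shrinks as shaking grows
    true_amp = 2.5 * (rock_pga / 0.1) ** -0.3
    soil_pga = rock_pga * true_amp * 10 ** rng.normal(0, 0.1, n_pairs)

    amp = soil_pga / rock_pga
    fit = stats.linregress(np.log10(rock_pga), np.log10(amp))
    print(f"slope = {fit.slope:.2f} (negative => amplification decreases), "
          f"p-value = {fit.pvalue:.1e}")
    ```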

  20. Recognizing stationary and locomotion activities using combinational of spectral analysis with statistical descriptors features

    NASA Astrophysics Data System (ADS)

    Zainudin, M. N. Shah; Sulaiman, Md Nasir; Mustapha, Norwati; Perumal, Thinagaran

    2017-10-01

    Pervasive computing has recently garnered a lot of attention due to high demand in various application domains. Human activity recognition (HAR) is one of its most widely explored applications, providing valuable information about people's behavior. Accelerometer-based approaches are commonly used in HAR research because the sensors are small and already built into most smartphones. However, high inter-class similarity tends to degrade recognition performance. Hence, this work presents an activity recognition method using proposed features that combine spectral analysis with statistical descriptors, which is able to tackle the problem of differentiating stationary and locomotion activities. The noisy signal is filtered using the Fourier transform before two groups of features are extracted: spectral frequency features and statistical descriptors. The extracted features are then classified using a random forest ensemble classifier. The recognition results show good accuracy for stationary and locomotion activities on the USC-HAD dataset.
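
    A minimal sketch of this kind of feature pipeline: per-window statistical descriptors and spectral features from a tri-axial accelerometer signal, classified with a random forest. The synthetic windows stand in for real data and are not the USC-HAD processing chain.

    ```python
    # Hypothetical sketch of the feature pipeline: per-window spectral features
    # plus statistical descriptors from a tri-axial accelerometer, fed to a
    # random forest. The data here are synthetic stand-ins, not USC-HAD.
    import numpy as np
    from scipy import stats
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(3)
    fs, win = 100, 200                         # 100 Hz sampling, 2 s windows

    def window_features(w):
        """Statistical descriptors + spectral features for one (win, 3) window."""
        feats = []
        for axis in w.T:
            spectrum = np.abs(np.fft.rfft(axis))
            freqs = np.fft.rfftfreq(axis.size, 1 / fs)
            feats += [axis.mean(), axis.std(), stats.skew(axis), stats.kurtosis(axis),
                      spectrum[1:].max(),                  # dominant spectral peak
                      freqs[1:][spectrum[1:].argmax()],    # its frequency
                      (spectrum ** 2).sum() / axis.size]   # spectral energy
        return feats

    def make_window(locomotion):
        """Synthetic window: low-variance noise, plus a ~2 Hz gait component."""
        t = np.arange(win) / fs
        base = 0.05 * rng.standard_normal((win, 3))
        if locomotion:
            base += np.sin(2 * np.pi * 2.0 * t)[:, None]
        return base

    X = np.array([window_features(make_window(i % 2)) for i in range(400)])
    y = np.array([i % 2 for i in range(400)])              # 0=stationary, 1=locomotion

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
    clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
    print(f"hold-out accuracy: {clf.score(X_te, y_te):.2f}")
    ```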
