Sample records for parameter subset selection

  1. A Parameter Subset Selection Algorithm for Mixed-Effects Models

    DOE PAGES

    Schmidt, Kathleen L.; Smith, Ralph C.

    2016-01-01

    Mixed-effects models are commonly used to statistically model phenomena that include attributes associated with a population or general underlying mechanism as well as effects specific to individuals or components of the general mechanism. This can include individual effects associated with data from multiple experiments. However, the parameterizations used to incorporate the population and individual effects are often unidentifiable in the sense that parameters are not uniquely specified by the data. As a result, the current literature focuses on model selection, by which insensitive parameters are fixed or removed from the model. Model selection methods that employ information criteria are applicable to both linear and nonlinear mixed-effects models, but such techniques are limited in that they are computationally prohibitive for large problems due to the number of possible models that must be tested. To limit the scope of possible models for model selection via information criteria, we introduce a parameter subset selection (PSS) algorithm for mixed-effects models, which orders the parameters by their significance. In conclusion, we provide examples to verify the effectiveness of the PSS algorithm and to test the performance of mixed-effects model selection that makes use of parameter subset selection.

  2. On the reliable and flexible solution of practical subset regression problems

    NASA Technical Reports Server (NTRS)

    Verhaegen, M. H.

    1987-01-01

    A new algorithm for solving subset regression problems is described. The algorithm performs a QR decomposition with a new column-pivoting strategy, which permits subset selection directly from the originally defined regression parameters. This, in combination with a number of extensions of the new technique, makes the method a very flexible tool for analyzing subset regression problems in which the parameters have a physical meaning.
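
    A minimal sketch of the core idea in Python, using SciPy's pivoted QR: the column ordering produced by pivoting serves as a ranking of the original regression parameters, from which a subset is taken. The pivoting strategy here is SciPy's default, not the paper's modified strategy.

      import numpy as np
      from scipy.linalg import qr

      rng = np.random.default_rng(0)
      X = rng.standard_normal((100, 6))   # regressor matrix; columns = physical parameters
      y = X[:, 0] + 0.5 * X[:, 2] + 0.01 * rng.standard_normal(100)

      # QR with column pivoting reorders columns so the most dominant,
      # most linearly independent regressors come first.
      Q, R, perm = qr(X, pivoting=True)

      k = 2                               # desired subset size
      subset = perm[:k]                   # indices of the selected regressors
      beta, *_ = np.linalg.lstsq(X[:, subset], y, rcond=None)
      print("selected columns:", subset, "coefficients:", beta)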

  3. Algorithm For Solution Of Subset-Regression Problems

    NASA Technical Reports Server (NTRS)

    Verhaegen, Michel

    1991-01-01

    Reliable and flexible algorithm for solution of subset-regression problem performs QR decomposition with new column-pivoting strategy, enables selection of subset directly from originally defined regression parameters. This feature, in combination with number of extensions, makes algorithm very flexible for use in analysis of subset-regression problems in which parameters have physical meanings. Also extended to enable joint processing of columns contaminated by noise with those free of noise, without using scaling techniques.

  4. Which products are available for subsetting?

    Atmospheric Science Data Center

    2014-12-08

    ... users to create smaller files (subsets) of the original data by selecting desired parameters, parameter criterion, or latitude and ... fluxes, where the net flux is constrained to the global heat storage in netCDF format. Single Scanner Footprint TOA/Surface Fluxes ...

  5. A non-linear data mining parameter selection algorithm for continuous variables

    PubMed Central

    Razavi, Marianne; Brady, Sean

    2017-01-01

    In this article, we propose a new data mining algorithm, by which one can both capture the non-linearity in data and also find the best subset model. To produce an enhanced subset of the original variables, a preferred selection method should have the potential of adding a supplementary level of regression analysis that would capture complex relationships in the data via mathematical transformation of the predictors and exploration of synergistic effects of combined variables. The method that we present here has the potential to produce an optimal subset of variables, rendering the overall process of model selection more efficient. This algorithm introduces interpretable parameters by transforming the original inputs and also provides a faithful fit to the data. The core objective of this paper is to introduce a new estimation technique for the classical least square regression framework. This new automatic variable transformation and model selection method could offer an optimal and stable model that minimizes the mean square error and variability, while combining all-possible-subset selection methodology with the inclusion of variable transformations and interactions. Moreover, this method controls multicollinearity, leading to an optimal set of explanatory variables. PMID:29131829
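
    A toy sketch of the general recipe (not the authors' estimator): each predictor is offered a small menu of transformations, the transform most correlated with the response is kept per variable, and an exhaustive subset search over the transformed predictors is scored by adjusted R-squared. The transform menu and scoring rule are illustrative assumptions.

      import itertools
      import numpy as np

      def adj_r2(X, y):
          n = len(y)
          A = np.column_stack([np.ones(n), X])
          beta, *_ = np.linalg.lstsq(A, y, rcond=None)
          resid = y - A @ beta
          r2 = 1 - resid @ resid / np.sum((y - y.mean()) ** 2)
          return 1 - (1 - r2) * (n - 1) / (n - A.shape[1])

      rng = np.random.default_rng(1)
      X = rng.uniform(0.1, 3.0, size=(200, 4))
      y = np.log(X[:, 0]) + X[:, 1] ** 2 + 0.1 * rng.standard_normal(200)

      transforms = {"id": lambda v: v, "log": np.log, "sqrt": np.sqrt, "sq": np.square}

      # Step 1: per predictor, keep the transform most correlated with y.
      cols, names = [], []
      for j in range(X.shape[1]):
          best = max(transforms,
                     key=lambda t: abs(np.corrcoef(transforms[t](X[:, j]), y)[0, 1]))
          cols.append(transforms[best](X[:, j]))
          names.append(f"x{j}:{best}")
      Z = np.column_stack(cols)

      # Step 2: exhaustive best-subset search on the transformed predictors.
      best_score, best_subset = -np.inf, None
      for k in range(1, Z.shape[1] + 1):
          for idx in itertools.combinations(range(Z.shape[1]), k):
              score = adj_r2(Z[:, list(idx)], y)
              if score > best_score:
                  best_score, best_subset = score, idx
      print([names[i] for i in best_subset], round(best_score, 3))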

  6. Systematic wavelength selection for improved multivariate spectral analysis

    DOEpatents

    Thomas, Edward V.; Robinson, Mark R.; Haaland, David M.

    1995-01-01

    Methods and apparatus for determining in a biological material one or more unknown values of at least one known characteristic (e.g. the concentration of an analyte such as glucose in blood or the concentration of one or more blood gas parameters) with a model based on a set of samples with known values of the known characteristics and a multivariate algorithm using several wavelength subsets. The method includes selecting multiple wavelength subsets, from the electromagnetic spectral region appropriate for determining the known characteristic, for use by an algorithm wherein the selection of wavelength subsets improves the model's fitness of the determination for the unknown values of the known characteristic. The selection process utilizes multivariate search methods that select both predictive and synergistic wavelengths within the range of wavelengths utilized. The fitness of the wavelength subsets is determined by the fitness function F = f(cost, performance). The method includes the steps of: (1) using one or more applications of a genetic algorithm to produce one or more count spectra, with multiple count spectra then combined to produce a combined count spectrum; (2) smoothing the count spectrum; (3) selecting a threshold count from a count spectrum to select those wavelength subsets which optimize the fitness function; and (4) eliminating a portion of the selected wavelength subsets. The determination of the unknown values can be made: (1) noninvasively and in vivo; (2) invasively and in vivo; or (3) in vitro.
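
    A compact genetic-algorithm sketch for wavelength-subset selection with a fitness of the form F = f(cost, performance). Here performance is the cross-validated R-squared of a linear model on the chosen wavelengths and cost is the subset size; both the fitness weights and the GA settings are illustrative stand-ins for the patent's procedure.

      import numpy as np
      from sklearn.linear_model import LinearRegression
      from sklearn.model_selection import cross_val_score

      rng = np.random.default_rng(2)
      n_waves = 40
      X = rng.standard_normal((120, n_waves))   # spectra: rows = samples, cols = wavelengths
      y = X[:, 5] - 2 * X[:, 17] + 0.1 * rng.standard_normal(120)

      def fitness(mask):
          if not mask.any():
              return -np.inf
          performance = cross_val_score(LinearRegression(), X[:, mask], y, cv=5).mean()
          cost = mask.sum() / n_waves           # penalise large subsets
          return performance - 0.5 * cost       # F = f(cost, performance), weights assumed

      pop = rng.random((30, n_waves)) < 0.2     # population of boolean wavelength masks
      for gen in range(25):
          scores = np.array([fitness(m) for m in pop])
          parents = pop[np.argsort(scores)[::-1][:10]]           # truncation selection
          children = []
          while len(children) < len(pop):
              a, b = parents[rng.integers(10, size=2)]
              child = np.where(rng.random(n_waves) < 0.5, a, b)  # uniform crossover
              child ^= rng.random(n_waves) < 0.02                # bit-flip mutation
              children.append(child)
          pop = np.array(children)

      best = max(pop, key=fitness)
      print("selected wavelengths:", np.flatnonzero(best))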

  7. Feature selection with harmony search.

    PubMed

    Diao, Ren; Shen, Qiang

    2012-12-01

    Many search strategies have been exploited for the task of feature selection (FS), in an effort to identify more compact and better quality subsets. Such work typically involves the use of greedy hill climbing (HC), or nature-inspired heuristics, in order to discover the optimal solution without going through exhaustive search. In this paper, a novel FS approach based on harmony search (HS) is presented. It is a general approach that can be used in conjunction with many subset evaluation techniques. The simplicity of HS is exploited to reduce the overall complexity of the search process. The proposed approach is able to escape from local solutions and identify multiple solutions owing to the stochastic nature of HS. Additional parameter control schemes are introduced to reduce the effort and impact of parameter configuration. These can be further combined with the iterative refinement strategy, tailored to enforce the discovery of quality subsets. The resulting approach is compared with those that rely on HC, genetic algorithms, and particle swarm optimization, accompanied by in-depth studies of the suggested improvements.
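
    A minimal binary harmony-search sketch for feature selection, assuming the standard HS ingredients (harmony memory, memory-considering rate HMCR, pitch-adjusting rate PAR) and a wrapper fitness based on cross-validated accuracy; the paper's parameter-control schemes and iterative refinement are omitted, and the dataset is a placeholder.

      import numpy as np
      from sklearn.datasets import load_breast_cancer
      from sklearn.model_selection import cross_val_score
      from sklearn.naive_bayes import GaussianNB

      X, y = load_breast_cancer(return_X_y=True)
      rng = np.random.default_rng(3)
      n_feat, hms, hmcr, par = X.shape[1], 12, 0.9, 0.3

      def quality(mask):
          return cross_val_score(GaussianNB(), X[:, mask], y, cv=3).mean() if mask.any() else 0.0

      memory = rng.random((hms, n_feat)) < 0.5     # harmony memory of feature masks
      scores = np.array([quality(m) for m in memory])

      for it in range(100):
          new = np.empty(n_feat, dtype=bool)
          for j in range(n_feat):
              if rng.random() < hmcr:              # take the value from memory ...
                  new[j] = memory[rng.integers(hms), j]
                  if rng.random() < par:           # ... with occasional pitch adjustment
                      new[j] = ~new[j]
              else:                                # or improvise randomly
                  new[j] = rng.random() < 0.5
          s = quality(new)
          worst = scores.argmin()
          if s > scores[worst]:                    # replace the worst harmony
              memory[worst], scores[worst] = new, s

      best = memory[scores.argmax()]
      print(f"{best.sum()} features selected, accuracy {scores.max():.3f}")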

  8. A Systematic Approach to Sensor Selection for Aircraft Engine Health Estimation

    NASA Technical Reports Server (NTRS)

    Simon, Donald L.; Garg, Sanjay

    2009-01-01

    A systematic approach for selecting an optimal suite of sensors for on-board aircraft gas turbine engine health estimation is presented. The methodology optimally chooses the engine sensor suite and the model tuning parameter vector to minimize the Kalman filter mean squared estimation error in the engine's health parameters or other unmeasured engine outputs. This technique specifically addresses the underdetermined estimation problem where there are more unknown system health parameters representing degradation than available sensor measurements. This paper presents the theoretical estimation error equations, and describes the optimization approach that is applied to select the sensors and model tuning parameters to minimize these errors. Two different model tuning parameter vector selection approaches are evaluated: the conventional approach of selecting a subset of health parameters to serve as the tuning parameters, and an alternative approach that selects tuning parameters as a linear combination of all health parameters. Results from the application of the technique to an aircraft engine simulation are presented, and compared to those from an alternative sensor selection strategy.

  9. Selection of noisy measurement locations for error reduction in static parameter identification

    NASA Astrophysics Data System (ADS)

    Sanayei, Masoud; Onipede, Oladipo; Babu, Suresh R.

    1992-09-01

    An incomplete set of noisy static force and displacement measurements is used for parameter identification of structures at the element level. Measurement location and the level of accuracy in the measured data can drastically affect the accuracy of the identified parameters. A heuristic method is presented to select a limited number of degrees of freedom (DOF) to perform a successful parameter identification and to reduce the impact of measurement errors on the identified parameters. This pretest simulation uses an error sensitivity analysis to determine the effect of measurement errors on the parameter estimates. The selected DOF can be used for nondestructive testing and health monitoring of structures. Two numerical examples, one for a truss and one for a frame, are presented to demonstrate that using the measurements at the selected subset of DOF can limit the error in the parameter estimates.

  10. A Genetic Algorithm Based Support Vector Machine Model for Blood-Brain Barrier Penetration Prediction

    PubMed Central

    Zhang, Daqing; Xiao, Jianfeng; Zhou, Nannan; Luo, Xiaomin; Jiang, Hualiang; Chen, Kaixian

    2015-01-01

    Blood-brain barrier (BBB) is a highly complex physical barrier determining what substances are allowed to enter the brain. Support vector machine (SVM) is a kernel-based machine learning method that is widely used in QSAR study. For a successful SVM model, the kernel parameters for SVM and feature subset selection are the most important factors affecting prediction accuracy. In most studies, they are treated as two independent problems, but it has been proven that they could affect each other. We designed and implemented a genetic algorithm (GA) to optimize kernel parameters and feature subset selection for SVM regression and applied it to the BBB penetration prediction. The results show that our GA/SVM model is more accurate than other currently available log BB models. Therefore, optimizing both SVM parameters and the feature subset simultaneously with a genetic algorithm is a better approach than other methods that treat the two problems separately. Analysis of our log BB model suggests that carboxylic acid group, polar surface area (PSA)/hydrogen-bonding ability, lipophilicity, and molecular charge play important roles in BBB penetration. Among those properties relevant to BBB penetration, lipophilicity could enhance BBB penetration, while all the others are negatively correlated with BBB penetration. PMID:26504797

  11. On Some Multiple Decision Problems

    DTIC Science & Technology

    1976-08-01

    parameter space. Some recent results in the area of subset selection formulation are Gnanadesikan and Gupta [28], Gupta and Studden [43], Gupta and...York, pp. 363-376. [27] Gnanadesikan, M. (1966). Some Selection and Ranking Procedures for Multivariate Normal Populations. Ph.D. Thesis. Dept. of...Statist., Purdue Univ., West Lafayette, Indiana 47907. [28] Gnanadesikan, M. and Gupta, S. S. (1970). Selection procedures for multivariate normal

  12. The DREO Elint Browser Utility (DEBU) reference manual

    NASA Astrophysics Data System (ADS)

    Ford, Barbara; Jones, David

    1992-04-01

    An electronic intelligence (ELINT) database browsing tool called DEBU has been developed that allows databases such as ELP, Kilting, EWIR, and AFEWC to be reviewed and analyzed from a user-friendly environment on a personal computer. DEBU's basic function is to allow users to examine the contents of user-selected subfiles of user-selected emitters of user-selected databases. DEBU augments this functionality with support for selecting (filtering) and combining subsets of emitters by user-selected attributes such as name, parameter type, or parameter value. DEBU provides facilities for examining histograms and x-y plots of selected parameters, for doing ambiguity analysis and mode level analysis, and for generating and printing a variety of reports. A manual is provided for users of DEBU, including descriptions and illustrations of menus and windows.

  13. Water quality parameter measurement using spectral signatures

    NASA Technical Reports Server (NTRS)

    White, P. E.

    1973-01-01

    Regression analysis is applied to the problem of measuring water quality parameters from remote sensing spectral signature data. The equations necessary to perform regression analysis are presented and methods of testing the strength and reliability of a regression are described. An efficient algorithm for selecting an optimal subset of the independent variables available for a regression is also presented.

  14. A hybrid genetic algorithm-extreme learning machine approach for accurate significant wave height reconstruction

    NASA Astrophysics Data System (ADS)

    Alexandre, E.; Cuadra, L.; Nieto-Borge, J. C.; Candil-García, G.; del Pino, M.; Salcedo-Sanz, S.

    2015-08-01

    Wave parameters computed from time series measured by buoys (significant wave height Hs, mean wave period, etc.) play a key role in coastal engineering and in the design and operation of wave energy converters. Storms or navigation accidents can make measuring buoys break down, leading to missing data gaps. In this paper we tackle the problem of locally reconstructing Hs at out-of-operation buoys by using wave parameters from nearby buoys, based on the spatial correlation among values at neighboring buoy locations. The novelty of our approach for its potential application to problems in coastal engineering is twofold. On one hand, we propose a genetic algorithm hybridized with an extreme learning machine that selects, among the available wave parameters from the nearby buoys, a subset FnSP with nSP parameters that minimizes the Hs reconstruction error. On the other hand, we evaluate to what extent the selected parameters in subset FnSP are good enough in assisting other machine learning (ML) regressors (extreme learning machines, support vector machines and Gaussian process regression) to reconstruct Hs. The results show that all the ML methods explored achieve a good Hs reconstruction in the two different locations studied (Caribbean Sea and West Atlantic).

  15. Feature selection for wearable smartphone-based human activity recognition with able bodied, elderly, and stroke patients.

    PubMed

    Capela, Nicole A; Lemaire, Edward D; Baddour, Natalie

    2015-01-01

    Human activity recognition (HAR), using wearable sensors, is a growing area with the potential to provide valuable information on patient mobility to rehabilitation specialists. Smartphones with accelerometer and gyroscope sensors are a convenient, minimally invasive, and low cost approach for mobility monitoring. HAR systems typically pre-process raw signals, segment the signals, and then extract features to be used in a classifier. Feature selection is a crucial step in the process to reduce potentially large data dimensionality and provide viable parameters to enable activity classification. Most HAR systems are customized to an individual research group, including a unique data set, classes, algorithms, and signal features. These data sets are obtained predominantly from able-bodied participants. In this paper, smartphone accelerometer and gyroscope sensor data were collected from populations that can benefit from human activity recognition: able-bodied, elderly, and stroke patients. Data from a consecutive sequence of 41 mobility tasks (18 different tasks) were collected for a total of 44 participants. Seventy-six signal features were calculated and subsets of these features were selected using three filter-based, classifier-independent, feature selection methods (Relief-F, Correlation-based Feature Selection, Fast Correlation Based Filter). The feature subsets were then evaluated using three generic classifiers (Naïve Bayes, Support Vector Machine, j48 Decision Tree). Common features were identified for all three populations, although the stroke population subset had some differences from both able-bodied and elderly sets. Evaluation with the three classifiers showed that the feature subsets produced similar or better accuracies than classification with the entire feature set. Therefore, since these feature subsets are classifier-independent, they should be useful for developing and improving HAR systems across and within populations.

  16. Feature Selection for Wearable Smartphone-Based Human Activity Recognition with Able bodied, Elderly, and Stroke Patients

    PubMed Central

    2015-01-01

    Human activity recognition (HAR), using wearable sensors, is a growing area with the potential to provide valuable information on patient mobility to rehabilitation specialists. Smartphones with accelerometer and gyroscope sensors are a convenient, minimally invasive, and low cost approach for mobility monitoring. HAR systems typically pre-process raw signals, segment the signals, and then extract features to be used in a classifier. Feature selection is a crucial step in the process to reduce potentially large data dimensionality and provide viable parameters to enable activity classification. Most HAR systems are customized to an individual research group, including a unique data set, classes, algorithms, and signal features. These data sets are obtained predominantly from able-bodied participants. In this paper, smartphone accelerometer and gyroscope sensor data were collected from populations that can benefit from human activity recognition: able-bodied, elderly, and stroke patients. Data from a consecutive sequence of 41 mobility tasks (18 different tasks) were collected for a total of 44 participants. Seventy-six signal features were calculated and subsets of these features were selected using three filter-based, classifier-independent, feature selection methods (Relief-F, Correlation-based Feature Selection, Fast Correlation Based Filter). The feature subsets were then evaluated using three generic classifiers (Naïve Bayes, Support Vector Machine, j48 Decision Tree). Common features were identified for all three populations, although the stroke population subset had some differences from both able-bodied and elderly sets. Evaluation with the three classifiers showed that the feature subsets produced similar or better accuracies than classification with the entire feature set. Therefore, since these feature subsets are classifier-independent, they should be useful for developing and improving HAR systems across and within populations. PMID:25885272
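
    Relief-F, CFS, and FCBF are not shipped with scikit-learn, so the sketch below substitutes a mutual-information filter for the ranking step and then evaluates nested top-k subsets with one generic classifier, mirroring the filter-then-classify protocol of the two records above on a placeholder dataset.

      import numpy as np
      from sklearn.datasets import load_breast_cancer
      from sklearn.feature_selection import mutual_info_classif
      from sklearn.model_selection import cross_val_score
      from sklearn.naive_bayes import GaussianNB

      X, y = load_breast_cancer(return_X_y=True)

      # Classifier-independent filter: rank features by mutual information with the class.
      mi = mutual_info_classif(X, y, random_state=0)
      ranking = np.argsort(mi)[::-1]

      # Evaluate nested top-k feature subsets with a generic classifier.
      for k in (5, 10, X.shape[1]):
          acc = cross_val_score(GaussianNB(), X[:, ranking[:k]], y, cv=5).mean()
          print(f"top {k:2d} features: accuracy {acc:.3f}")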

  17. Articular cartilage degeneration classification by means of high-frequency ultrasound.

    PubMed

    Männicke, N; Schöne, M; Oelze, M; Raum, K

    2014-10-01

    To date, only single ultrasound parameters have been considered in statistical analyses to characterize osteoarthritic changes in articular cartilage, and the potential benefit of using parameter combinations for characterization remains unclear. Therefore, the aim of this work was to utilize feature selection and classification of a Mankin subset score (i.e., cartilage surface and cell sub-scores) using ultrasound-based parameter pairs and investigate both classification accuracy and the sensitivity towards different degeneration stages. 40 punch biopsies of human cartilage were previously scanned ex vivo with a 40-MHz transducer. Ultrasound-based surface parameters, as well as backscatter and envelope statistics parameters, were available. Logistic regression was performed with each unique US parameter pair as predictor and different degeneration stages as response variables. The best ultrasound-based parameter pair for each Mankin subset score value was assessed by highest classification accuracy and utilized in receiver operating characteristics (ROC) analysis. The classifications discriminating between early degenerations yielded area under the ROC curve (AUC) values of 0.94-0.99 (mean ± SD: 0.97 ± 0.03). In contrast, classifications among higher Mankin subset scores resulted in lower AUC values: 0.75-0.91 (mean ± SD: 0.84 ± 0.08). Variable sensitivities of the different ultrasound features were observed with respect to different degeneration stages. Our results strongly suggest that combinations of high-frequency ultrasound-based parameters exhibit potential to characterize different, particularly very early, degeneration stages of hyaline cartilage. Variable sensitivities towards different degeneration stages suggest that a concurrent estimation of multiple ultrasound-based parameters is diagnostically valuable. In-vivo application of the present findings is conceivable in both minimally invasive arthroscopic ultrasound and high-frequency transcutaneous ultrasound.

  18. Optimal Tuner Selection for Kalman-Filter-Based Aircraft Engine Performance Estimation

    NASA Technical Reports Server (NTRS)

    Simon, Donald L.; Garg, Sanjay

    2011-01-01

    An emerging approach in the field of aircraft engine controls and system health management is the inclusion of real-time, onboard models for the inflight estimation of engine performance variations. This technology, typically based on Kalman-filter concepts, enables the estimation of unmeasured engine performance parameters that can be directly utilized by controls, prognostics, and health-management applications. A challenge that complicates this practice is the fact that an aircraft engine's performance is affected by its level of degradation, generally described in terms of unmeasurable health parameters such as efficiencies and flow capacities related to each major engine module. Through Kalman-filter-based estimation techniques, the level of engine performance degradation can be estimated, given that there are at least as many sensors as health parameters to be estimated. However, in an aircraft engine, the number of sensors available is typically less than the number of health parameters, presenting an under-determined estimation problem. A common approach to address this shortcoming is to estimate a subset of the health parameters, referred to as model tuning parameters. The problem/objective is to optimally select the model tuning parameters to minimize Kalman-filter-based estimation error. A tuner selection technique has been developed that specifically addresses the under-determined estimation problem, where there are more unknown parameters than available sensor measurements. A systematic approach is applied to produce a model tuning parameter vector of appropriate dimension to enable estimation by a Kalman filter, while minimizing the estimation error in the parameters of interest. Tuning parameter selection is performed using a multi-variable iterative search routine that seeks to minimize the theoretical mean-squared estimation error of the Kalman filter. This approach can significantly reduce the error in onboard aircraft engine parameter estimation applications such as model-based diagnostics, controls, and life usage calculations. The advantage of the innovation is the significant reduction in estimation errors that it can provide relative to the conventional approach of selecting a subset of health parameters to serve as the model tuning parameter vector. Because this technique needs only to be performed during the system design process, it places no additional computation burden on the onboard Kalman filter implementation. The technique has been developed for aircraft engine onboard estimation applications, as this application typically presents an under-determined estimation problem. However, this generic technique could be applied to other industries using gas turbine engine technology.
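
    A highly simplified numpy sketch of the underdetermined setting: with more health parameters than sensors, a reduced tuning vector is formed as a linear combination of all health parameters. Here the combination matrix is taken from a truncated SVD of an assumed sensitivity matrix, a crude stand-in for the iterative search routine described above.

      import numpy as np

      rng = np.random.default_rng(4)
      m_sensors, p_health = 6, 10        # fewer sensors than health parameters
      H = rng.standard_normal((m_sensors, p_health))  # assumed sensor-to-health sensitivity matrix

      # Truncated SVD: keep as many tuner directions as there are sensors.
      U, s, Vt = np.linalg.svd(H, full_matrices=False)
      V_k = Vt[:m_sensors].T             # p_health x m_sensors combination matrix

      h_true = rng.standard_normal(p_health)  # hypothetical health-parameter deltas
      q = V_k.T @ h_true                 # reduced tuning parameter vector (dim = sensors)
      h_approx = V_k @ q                 # health deltas implied by the tuners
      print("reconstruction error:", np.linalg.norm(h_true - h_approx))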

  19. Selecting Sensitive Parameter Subsets in Dynamical Models With Application to Biomechanical System Identification.

    PubMed

    Ramadan, Ahmed; Boss, Connor; Choi, Jongeun; Peter Reeves, N; Cholewicki, Jacek; Popovich, John M; Radcliffe, Clark J

    2018-07-01

    Estimating many parameters of biomechanical systems with limited data may achieve good fit but may also increase 95% confidence intervals in parameter estimates. This results in poor identifiability in the estimation problem. Therefore, we propose a novel method to select sensitive biomechanical model parameters that should be estimated, while fixing the remaining parameters to values obtained from preliminary estimation. Our method relies on identifying the parameters to which the measurement output is most sensitive. The proposed method is based on the Fisher information matrix (FIM). It was compared against the nonlinear least absolute shrinkage and selection operator (LASSO) method to guide modelers on the pros and cons of our FIM method. We present an application identifying a biomechanical parametric model of a head position-tracking task for ten human subjects. Using measured data, our method (1) reduced model complexity by only requiring five out of twelve parameters to be estimated, (2) significantly reduced parameter 95% confidence intervals by up to 89% of the original confidence interval, (3) maintained goodness of fit measured by variance accounted for (VAF) at 82%, (4) reduced computation time, where our FIM method was 164 times faster than the LASSO method, and (5) selected similar sensitive parameters to the LASSO method, where three out of five selected sensitive parameters were shared by FIM and LASSO methods.
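
    A small numpy sketch of FIM-based selection under assumed Gaussian measurement noise: the Jacobian of the model output with respect to the parameters is estimated by finite differences, the FIM is formed, and parameters are added greedily so as to maximise the determinant of the sub-FIM (a D-optimality rule used here as a plain stand-in for the paper's procedure; the model is a toy).

      import numpy as np

      def model(theta, t):
          # toy response with 5 parameters
          a, b, c, d, e = theta
          return a * np.exp(-b * t) + c * np.sin(d * t) + e

      t = np.linspace(0, 5, 60)
      theta0 = np.array([1.0, 0.8, 0.5, 2.0, 0.1])

      # Finite-difference Jacobian of the output w.r.t. the parameters.
      eps = 1e-6
      J = np.column_stack([
          (model(theta0 + eps * np.eye(5)[i], t) - model(theta0, t)) / eps
          for i in range(5)
      ])
      sigma2 = 0.01                       # assumed measurement noise variance
      FIM = J.T @ J / sigma2

      # Greedy D-optimal growth of the estimated-parameter subset.
      selected, remaining = [], list(range(5))
      for _ in range(3):                  # estimate 3 of 5 parameters, fix the rest
          best = max(remaining, key=lambda i: np.linalg.det(
              FIM[np.ix_(selected + [i], selected + [i])]))
          selected.append(best)
          remaining.remove(best)
      print("estimate:", selected, "fix:", remaining)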

  20. Classifier Subset Selection for the Stacked Generalization Method Applied to Emotion Recognition in Speech

    PubMed Central

    Álvarez, Aitor; Sierra, Basilio; Arruti, Andoni; López-Gil, Juan-Miguel; Garay-Vitoria, Nestor

    2015-01-01

    In this paper, a new supervised classification paradigm, called classifier subset selection for stacked generalization (CSS stacking), is presented to deal with speech emotion recognition. The new approach consists of an improvement of a bi-level multi-classifier system known as stacking generalization by means of an integration of an estimation of distribution algorithm (EDA) in the first layer to select the optimal subset from the standard base classifiers. The good performance of the proposed new paradigm was demonstrated over different configurations and datasets. First, several CSS stacking classifiers were constructed on the RekEmozio dataset, using some specific standard base classifiers and a total of 123 spectral, quality and prosodic features computed using in-house feature extraction algorithms. These initial CSS stacking classifiers were compared to other multi-classifier systems and the employed standard classifiers built on the same set of speech features. Then, new CSS stacking classifiers were built on RekEmozio using a different set of both acoustic parameters (extended version of the Geneva Minimalistic Acoustic Parameter Set (eGeMAPS)) and standard classifiers and employing the best meta-classifier of the initial experiments. The performance of these two CSS stacking classifiers was evaluated and compared. Finally, the new paradigm was tested on the well-known Berlin Emotional Speech database. We compared the performance of single, standard stacking and CSS stacking systems using the same parametrization of the second phase. All of the classifications were performed at the categorical level, including the six primary emotions plus the neutral one. PMID:26712757

  1. An analysis of the least-squares problem for the DSN systematic pointing error model

    NASA Technical Reports Server (NTRS)

    Alvarez, L. S.

    1991-01-01

    A systematic pointing error model is used to calibrate antennas in the Deep Space Network. The least squares problem is described and analyzed along with the solution methods used to determine the model's parameters. Specifically studied are the rank degeneracy problems resulting from beam pointing error measurement sets that incorporate inadequate sky coverage. A least squares parameter subset selection method is described and its applicability to the systematic error modeling process is demonstrated on a Voyager 2 measurement distribution.

  2. Biochemical Sensors Using Carbon Nanotube Arrays

    NASA Technical Reports Server (NTRS)

    Meyyappan, Meyya (Inventor); Cassell, Alan M. (Inventor); Li, Jun (Inventor)

    2011-01-01

    Method and system for detecting presence of biomolecules in a selected subset, or in each of several selected subsets, in a fluid. Each of an array of two or more carbon nanotubes ("CNTs") is connected at a first CNT end to one or more electronics devices, each of which senses a selected electrochemical signal that is generated when a target biomolecule in the selected subset becomes attached to a functionalized second end of the CNT, which is covalently bonded with a probe molecule. This approach indicates when target biomolecules in the selected subset are present and indicates presence or absence of target biomolecules in two or more selected subsets. Alternatively, presence or absence of an analyte can be detected.

  3. Linear retrieval and global measurements of wind speed from the Seasat SMMR

    NASA Technical Reports Server (NTRS)

    Pandey, P. C.

    1983-01-01

    Retrievals of wind speed (WS) from the Seasat Scanning Multichannel Microwave Radiometer (SMMR) were performed using a two-step statistical technique. Nine subsets of two to five SMMR channels were examined for wind speed retrieval. These subsets were derived by applying a leaps-and-bounds procedure, based on the coefficient-of-determination selection criterion, to a statistical data base of brightness temperatures and geophysical parameters. Analysis of Monsoon Experiment and ocean station PAPA data showed a strong correlation between sea surface temperature and water vapor. This relation was used in generating the statistical data base. Global maps of WS were produced for one and three month periods.

  4. Safety monitoring and reactor transient interpreter

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hench, J. E.; Fukushima, T. Y.

    1983-12-20

    An apparatus which monitors a subset of control panel inputs in a nuclear reactor power plant, the subset being those indicators of plant status which are of a critical nature during an unusual event. A display (10) is provided for displaying primary information (14) as to whether the core is covered and likely to remain covered, including information as to the status of subsystems needed to cool the core and maintain core integrity. Secondary display information (18,20) is provided which can be viewed selectively for more detailed information when an abnormal condition occurs. The primary display information has messages (24) for prompting an operator as to which one of a number of pushbuttons (16) to press to bring up the appropriate secondary display (18,20). The apparatus utilizes a thermal-hydraulic analysis to more accurately determine key parameters (such as water level) from other measured parameters, such as power, pressure, and flow rate.

  5. Intelligent Weather Agent

    NASA Technical Reports Server (NTRS)

    Spirkovska, Liljana (Inventor)

    2006-01-01

    Method and system for automatically displaying, visually and/or audibly and/or by an audible alarm signal, relevant weather data for an identified aircraft pilot, when each of a selected subset of measured or estimated aviation situation parameters, corresponding to a given aviation situation, has a value lying in a selected range. Each range for a particular pilot may be a default range, may be entered by the pilot and/or may be automatically determined from experience and may be subsequently edited by the pilot to change a range and to add or delete parameters describing a situation for which a display should be provided. The pilot can also verbally activate an audible display or visual display of selected information by verbal entry of a first command or a second command, respectively, that specifies the information required.

  6. Data-driven confounder selection via Markov and Bayesian networks.

    PubMed

    Häggström, Jenny

    2018-06-01

    To unbiasedly estimate a causal effect on an outcome, unconfoundedness is often assumed. If there is sufficient knowledge on the underlying causal structure, then existing confounder selection criteria can be used to select subsets of the observed pretreatment covariates, X, sufficient for unconfoundedness, if such subsets exist. Here, estimation of these target subsets is considered when the underlying causal structure is unknown. The proposed method is to model the causal structure by a probabilistic graphical model, for example, a Markov or Bayesian network, estimate this graph from observed data and select the target subsets given the estimated graph. The approach is evaluated by simulation both in a high-dimensional setting where unconfoundedness holds given X and in a setting where unconfoundedness only holds given subsets of X. Several common target subsets are investigated and the selected subsets are compared with respect to accuracy in estimating the average causal effect. The proposed method is implemented with existing software that can easily handle high-dimensional data, in terms of large samples and large number of covariates. The results from the simulation study show that, if unconfoundedness holds given X, this approach is very successful in selecting the target subsets, outperforming alternative approaches based on random forests and LASSO, and that the subset estimating the target subset containing all causes of outcome yields the smallest MSE in the average causal effect estimation.

  7. Surveillance system and method having parameter estimation and operating mode partitioning

    NASA Technical Reports Server (NTRS)

    Bickford, Randall L. (Inventor)

    2003-01-01

    A system and method for monitoring an apparatus or process asset including partitioning an unpartitioned training data set into a plurality of training data subsets each having an operating mode associated thereto; creating a process model comprised of a plurality of process submodels each trained as a function of at least one of the training data subsets; acquiring a current set of observed signal data values from the asset; determining an operating mode of the asset for the current set of observed signal data values; selecting a process submodel from the process model as a function of the determined operating mode of the asset; calculating a current set of estimated signal data values from the selected process submodel for the determined operating mode; and outputting the calculated current set of estimated signal data values for providing asset surveillance and/or control.

  8. NLSCIDNT user's guide maximum likelihood parameter identification computer program with nonlinear rotorcraft model

    NASA Technical Reports Server (NTRS)

    1979-01-01

    A nonlinear, maximum likelihood, parameter identification computer program (NLSCIDNT) is described which evaluates rotorcraft stability and control coefficients from flight test data. The optimal estimates of the parameters (stability and control coefficients) are determined (identified) by minimizing the negative log likelihood cost function. The minimization technique is the Levenberg-Marquardt method, which behaves like the steepest descent method when it is far from the minimum and behaves like the modified Newton-Raphson method when it is nearer the minimum. Twenty-one states and 40 measurement variables are modeled, and any subset may be selected. States which are not integrated may be fixed at an input value, or time history data may be substituted for the state in the equations of motion. Any aerodynamic coefficient may be expressed as a nonlinear polynomial function of selected 'expansion variables'.

  9. A LEAST ABSOLUTE SHRINKAGE AND SELECTION OPERATOR (LASSO) FOR NONLINEAR SYSTEM IDENTIFICATION

    NASA Technical Reports Server (NTRS)

    Kukreja, Sunil L.; Lofberg, Johan; Brenner, Martin J.

    2006-01-01

    Identification of parametric nonlinear models involves estimating unknown parameters and detecting its underlying structure. Structure computation is concerned with selecting a subset of parameters to give a parsimonious description of the system which may afford greater insight into the functionality of the system or a simpler controller design. In this study, a least absolute shrinkage and selection operator (LASSO) technique is investigated for computing efficient model descriptions of nonlinear systems. The LASSO minimises the residual sum of squares by the addition of an ℓ1 penalty term on the parameter vector to the traditional ℓ2 minimisation problem. Its use for structure detection is a natural extension of this constrained minimisation approach to pseudolinear regression problems which produces some model parameters that are exactly zero and, therefore, yields a parsimonious system description. The performance of this LASSO structure detection method was evaluated by using it to estimate the structure of a nonlinear polynomial model. Applicability of the method to more complex systems such as those encountered in aerospace applications was shown by identifying a parsimonious system description of the F/A-18 Active Aeroelastic Wing using flight test data.
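
    A brief scikit-learn sketch of the idea: the ℓ1 penalty drives some coefficients of a pseudolinear regression exactly to zero, and the surviving terms define the detected structure. The polynomial system below is an illustrative stand-in for those in the paper.

      import numpy as np
      from sklearn.linear_model import Lasso
      from sklearn.preprocessing import PolynomialFeatures

      rng = np.random.default_rng(5)
      u = rng.uniform(-1, 1, size=(300, 2))
      # true sparse nonlinear system: y = 2*u0 - 1.5*u0*u1 + noise
      y = 2 * u[:, 0] - 1.5 * u[:, 0] * u[:, 1] + 0.05 * rng.standard_normal(300)

      # Candidate structure: all polynomial terms up to degree 3.
      poly = PolynomialFeatures(degree=3, include_bias=False)
      Phi = poly.fit_transform(u)

      lasso = Lasso(alpha=0.01).fit(Phi, y)
      for name, coef in zip(poly.get_feature_names_out(["u0", "u1"]), lasso.coef_):
          if abs(coef) > 1e-6:            # nonzero coefficients = detected structure
              print(f"{name}: {coef:+.3f}")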

  10. Transferability of optimally-selected climate models in the quantification of climate change impacts on hydrology

    NASA Astrophysics Data System (ADS)

    Chen, Jie; Brissette, François P.; Lucas-Picher, Philippe

    2016-11-01

    Given the ever increasing number of climate change simulations being carried out, it has become impractical to use all of them to cover the uncertainty of climate change impacts. Various methods have been proposed to optimally select subsets of a large ensemble of climate simulations for impact studies. However, the behaviour of optimally-selected subsets of climate simulations for climate change impacts is unknown, since the transfer process from climate projections to the impact study world is usually highly non-linear. Consequently, this study investigates the transferability of optimally-selected subsets of climate simulations in the case of hydrological impacts. Two different methods were used for the optimal selection of subsets of climate scenarios, and both were found to be capable of adequately representing the spread of selected climate model variables contained in the original large ensemble. However, in both cases, the optimal subsets had limited transferability to hydrological impacts. To capture a similar variability in the impact model world, many more simulations have to be used than those that are needed to simply cover variability from the climate model variables' perspective. Overall, both optimal subset selection methods were better than random selection when small subsets were selected from a large ensemble for impact studies. However, as the number of selected simulations increased, random selection often performed better than the two optimal methods. To ensure adequate uncertainty coverage, the results of this study imply that selecting as many climate change simulations as possible is the best avenue. Where this was not possible, the two optimal methods were found to perform adequately.
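
    A small numpy sketch of KKZ-style greedy selection: start from the simulation farthest from the ensemble centre in a standardised climate-variable space, then repeatedly add the simulation whose minimum distance to the already-selected set is largest, so the subset spans the ensemble spread. The ensemble matrix is a placeholder.

      import numpy as np

      rng = np.random.default_rng(6)
      # rows = climate simulations, cols = standardised climate-variable summaries
      ensemble = rng.standard_normal((25, 8))

      def kkz_select(points, k):
          z = (points - points.mean(0)) / points.std(0)   # comparable variables
          chosen = [int(np.argmax(np.linalg.norm(z, axis=1)))]
          while len(chosen) < k:
              d = np.min(np.linalg.norm(z[:, None] - z[chosen][None], axis=2), axis=1)
              d[chosen] = -np.inf          # never re-pick a selected simulation
              chosen.append(int(np.argmax(d)))
          return chosen

      print("selected simulations:", kkz_select(ensemble, 5))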

  11. How to Select a Good Training-data Subset for Transcription: Submodular Active Selection for Sequences

    DTIC Science & Technology

    2009-01-01

    selection and uncertainty sampling significantly. Index Terms: Transcription, labeling, submodularity, submodular selection, active learning, sequence...name of batch active learning, where a subset of data that is most informative and representative of the whole is selected for labeling. Often...representative subset. Note that our Fisher kernel is over an unsupervised generative model, which enables us to bootstrap our active learning approach

  12. Estimating skin blood saturation by selecting a subset of hyperspectral imaging data

    NASA Astrophysics Data System (ADS)

    Ewerlöf, Maria; Salerud, E. Göran; Strömberg, Tomas; Larsson, Marcus

    2015-03-01

    Skin blood haemoglobin saturation (s_b) can be estimated with hyperspectral imaging using the wavelength (λ) range of 450-700 nm where haemoglobin absorption displays distinct spectral characteristics. Depending on the image size and photon transport algorithm, computations may be demanding. Therefore, this work aims to evaluate subsets with a reduced number of wavelengths for s_b estimation. White Monte Carlo simulations are performed using a two-layered tissue model with discrete values for epidermal thickness (t_epi) and the reduced scattering coefficient (μ's), mimicking an imaging setup. A detected intensity look-up table is calculated for a range of model parameter values relevant to human skin, adding absorption effects in the post-processing. Skin model parameters, including absorbers, are: μ's(λ), t_epi, haemoglobin saturation (s_b), tissue fraction blood (f_b) and tissue fraction melanin (f_mel). The skin model paired with the look-up table allows spectra to be calculated swiftly. Three inverse models with varying numbers of free parameters are evaluated: A(s_b, f_b), B(s_b, f_b, f_mel) and C (all parameters free). Fourteen wavelength candidates are selected by analysing the maximal spectral sensitivity to s_b and minimizing the sensitivity to f_b. All possible combinations of these candidates with three, four and 14 wavelengths, as well as the full spectral range, are evaluated for estimating s_b for 1000 randomly generated evaluation spectra. The results show that the simplified models A and B estimated s_b accurately using four wavelengths (mean error 2.2% for model B). If the number of wavelengths was increased, the model complexity needed to be increased to avoid poor estimations.

  13. Decision Variants for the Automatic Determination of Optimal Feature Subset in RF-RFE.

    PubMed

    Chen, Qi; Meng, Zhaopeng; Liu, Xinyi; Jin, Qianguo; Su, Ran

    2018-06-15

    Feature selection, which identifies a set of the most informative features from the original feature space, has been widely used to simplify the predictor. Recursive feature elimination (RFE), one of the most popular feature selection approaches, is effective for reducing data dimensionality and increasing efficiency. A ranking of features, as well as candidate subsets with the corresponding accuracy, is produced through RFE. The subset with the highest accuracy (HA) or a preset number of features (PreNum) is often used as the final subset. However, this may lead to a large number of features being selected, or, if there is no prior knowledge about this preset number, the final subset selection is often ambiguous and subjective. A proper decision variant is needed to automatically determine the optimal subset. In this study, we conduct pioneering work to explore the decision variant after obtaining a list of candidate subsets from RFE. We provide a detailed analysis and comparison of several decision variants to automatically select the optimal feature subset. A random forest (RF)-recursive feature elimination (RF-RFE) algorithm and a voting strategy are introduced. We validated the variants on two totally different molecular biology datasets, one for a toxicogenomic study and the other for protein sequence analysis. The study provides an automated way to determine the optimal feature subset when using RF-RFE.
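
    A sketch of RF-RFE plus one plausible decision variant: evaluate candidate subset sizes by cross-validation and choose the smallest subset whose mean accuracy is within one standard error of the best. The one-standard-error rule is an illustrative choice, not necessarily the paper's selected variant, and the dataset is a placeholder.

      import numpy as np
      from sklearn.datasets import load_breast_cancer
      from sklearn.ensemble import RandomForestClassifier
      from sklearn.feature_selection import RFE
      from sklearn.model_selection import cross_val_score

      X, y = load_breast_cancer(return_X_y=True)
      rf = RandomForestClassifier(n_estimators=50, random_state=0)

      sizes = list(range(1, 16))
      means, ses = [], []
      for k in sizes:
          mask = RFE(rf, n_features_to_select=k).fit(X, y).support_
          scores = cross_val_score(rf, X[:, mask], y, cv=5)
          means.append(scores.mean())
          ses.append(scores.std() / np.sqrt(len(scores)))

      # Decision variant: smallest subset within one SE of the best mean accuracy.
      best = int(np.argmax(means))
      threshold = means[best] - ses[best]
      k_opt = next(k for k, m in zip(sizes, means) if m >= threshold)
      print(f"best k = {sizes[best]} ({means[best]:.3f}); chosen k = {k_opt}")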

  14. Adenovirus-specific T-cell Subsets in Human Peripheral Blood and After IFN-γ Immunomagnetic Selection.

    PubMed

    Qian, Chongsheng; Wang, Yingying; Cai, Huili; Laroye, Caroline; De Carvalho Bittencourt, Marcelo; Clement, Laurence; Stoltz, Jean-François; Decot, Véronique; Reppel, Loïc; Bensoussan, Danièle

    2016-01-01

    Adoptive antiviral cellular immunotherapy by infusion of virus-specific T cells (VSTs) is becoming an alternative treatment for viral infection after hematopoietic stem cell transplantation. The T memory stem cell (TSCM) subset was recently described as exhibiting self-renewal and multipotency properties which are required for sustained efficacy in vivo. We wondered if such a crucial subset for immunotherapy was present in VSTs. We identified, by flow cytometry, TSCM in adenovirus (ADV)-specific interferon (IFN)-γ+ T cells before and after IFN-γ-based immunomagnetic selection, and analyzed the distribution of the main T-cell subsets in VSTs: naive T cells (TN), TSCM, T central memory cells (TCM), T effector memory cells (TEM), and effector T cells (TEFF). In this study all of the different T-cell subsets were observed in the blood sample from healthy donor ADV-VSTs, both before and after IFN-γ-based immunomagnetic selection. As the IFN-γ-based immunomagnetic selection system sorts mainly the most differentiated T-cell subsets, we observed that TEM was always the major T-cell subset of ADV-specific T cells after immunomagnetic isolation and especially after expansion in vitro. Comparing T-cell subpopulation profiles before and after in vitro expansion, we observed that in vitro cell culture with interleukin-2 resulted in a significant expansion of TN-like, TCM, TEM, and TEFF subsets in CD4+ IFN-γ+ T cells and of TCM and TEM subsets only in CD8+ IFN-γ+ T cells. We demonstrated the presence of all T-cell subsets in IFN-γ+ VSTs including the TSCM subpopulation, although this was weakly selected by the IFN-γ-based immunomagnetic selection system.

  15. Variable Neighborhood Search Heuristics for Selecting a Subset of Variables in Principal Component Analysis

    ERIC Educational Resources Information Center

    Brusco, Michael J.; Singh, Renu; Steinley, Douglas

    2009-01-01

    The selection of a subset of variables from a pool of candidates is an important problem in several areas of multivariate statistics. Within the context of principal component analysis (PCA), a number of authors have argued that subset selection is crucial for identifying those variables that are required for correct interpretation of the…

  16. Creating a non-linear total sediment load formula using polynomial best subset regression model

    NASA Astrophysics Data System (ADS)

    Okcu, Davut; Pektas, Ali Osman; Uyumaz, Ali

    2016-08-01

    The aim of this study is to derive a new total sediment load formula that is more accurate and has fewer application constraints than the well-known formulae of the literature. The five best-known stream power concept sediment formulae approved by ASCE are used for benchmarking on a wide range of datasets that includes both field and flume (lab) observations. The dimensionless parameters of these widely used formulae are used as inputs in a new regression approach, called Polynomial Best Subset Regression (PBSR) analysis. The aim of the PBSR analysis is to fit and test all possible combinations of the input variables and select the best subset. All input variables, together with their second and third powers, are included in the regression to test possible relations between the explanatory variables and the dependent variable. While selecting the best subset, a multistep approach is used that depends on significance values and on the degree of multicollinearity among the inputs. The new formula is compared to the others on a holdout dataset, and detailed performance investigations are conducted for the field and lab subsets within this holdout data. Different goodness-of-fit statistics are used, as they represent different perspectives on model accuracy. After the detailed comparisons, the most accurate equation, which is also applicable to both flume and river data, was identified. In particular, on the field dataset the prediction performance of the proposed formula outperformed the benchmark formulations.
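
    A condensed sketch of the PBSR idea: candidate terms are the inputs together with their second and third powers, every subset up to a size cap is fitted, subsets with severe multicollinearity are screened out via the condition number, and the survivor with the best adjusted R-squared wins. The data and cutoff values are assumptions.

      import itertools
      import numpy as np

      rng = np.random.default_rng(7)
      X = rng.uniform(0.5, 2.0, size=(150, 3))   # dimensionless sediment predictors
      y = 3 * X[:, 0] ** 2 + X[:, 2] + 0.05 * rng.standard_normal(150)

      # Candidate terms: each input with powers 1, 2 and 3.
      terms = [(j, p) for j in range(3) for p in (1, 2, 3)]
      Z = np.column_stack([X[:, j] ** p for j, p in terms])

      def score(cols):
          A = np.column_stack([np.ones(len(y)), Z[:, list(cols)]])
          if np.linalg.cond(A) > 1e4:            # multicollinearity screen (assumed cutoff)
              return -np.inf
          beta, *_ = np.linalg.lstsq(A, y, rcond=None)
          r = y - A @ beta
          r2 = 1 - r @ r / np.sum((y - y.mean()) ** 2)
          n, p = A.shape
          return 1 - (1 - r2) * (n - 1) / (n - p)   # adjusted R-squared

      best = max((c for k in range(1, 4)
                  for c in itertools.combinations(range(len(terms)), k)), key=score)
      print("selected terms:", [f"x{j}^{p}" for j, p in (terms[i] for i in best)])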

  17. Combining markers with and without the limit of detection

    PubMed Central

    Dong, Ting; Liu, Catherine Chunling; Petricoin, Emanuel F.; Tang, Liansheng Larry

    2014-01-01

    In this paper, we consider the combination of markers with and without the limit of detection (LOD). LOD is often encountered when measuring proteomic markers. Because of the limited detecting ability of equipment or instruments, it is difficult to measure markers at relatively low levels. Suppose that after some monotonic transformation, the marker values approximately follow multivariate normal distributions. We propose to estimate distribution parameters while taking the LOD into account, and then combine markers using the results from the linear discriminant analysis. Our simulation results show that the ROC curve parameter estimates generated from the proposed method are much closer to the truth than those from simply using the linear discriminant analysis to combine markers without considering the LOD. In addition, we propose a procedure to select and combine a subset of markers when many candidate markers are available. The procedure based on the correlation among markers is different from a common understanding that a subset of the most accurate markers should be selected for the combination. The simulation studies show that the accuracy of a combined marker can be largely impacted by the correlation of marker measurements. Our methods are applied to a protein pathway dataset to combine proteomic biomarkers to distinguish cancer patients from non-cancer patients. PMID:24132938

  18. Selecting climate change scenarios for regional hydrologic impact studies based on climate extremes indices

    NASA Astrophysics Data System (ADS)

    Seo, Seung Beom; Kim, Young-Oh; Kim, Youngil; Eum, Hyung-Il

    2018-04-01

    When selecting a subset of climate change scenarios (GCM models), the priority is to ensure that the subset reflects the comprehensive range of possible model results for all variables concerned. Though many studies have attempted to improve the scenario selection, there is a lack of studies that discuss methods to ensure that the results from a subset of climate models contain the same range of uncertainty in hydrologic variables as when all models are considered. We applied the Katsavounidis-Kuo-Zhang (KKZ) algorithm to select a subset of climate change scenarios and demonstrated its ability to reduce the number of GCM models in an ensemble, while the ranges of multiple climate extremes indices were preserved. First, we analyzed the role of 27 ETCCDI climate extremes indices for scenario selection and selected the representative climate extreme indices. Before the selection of a subset, we excluded a few deficient GCM models that could not represent the observed climate regime. Subsequently, we discovered that a subset of GCM models selected by the KKZ algorithm with the representative climate extreme indices could not capture the full potential range of changes in hydrologic extremes (e.g., 3-day peak flow and 7-day low flow) in some regional case studies. However, the application of the KKZ algorithm with a different set of climate indices, which are correlated to the hydrologic extremes, enabled the overcoming of this limitation. Key climate indices, dependent on the hydrologic extremes to be projected, must therefore be determined prior to the selection of a subset of GCM models.

  19. Minimal ensemble based on subset selection using ECG to diagnose categories of CAN.

    PubMed

    Abawajy, Jemal; Kelarev, Andrei; Yi, Xun; Jelinek, Herbert F

    2018-07-01

    Early diagnosis of cardiac autonomic neuropathy (CAN) is critical for reversing or decreasing its progression and preventing complications. Diagnostic accuracy or precision is one of the core requirements of CAN detection. As the standard Ewing battery tests suffer from a number of shortcomings, research in automating and improving the early detection of CAN has recently received serious attention, in identifying additional clinical variables and designing advanced ensembles of classifiers to improve the accuracy or precision of CAN diagnostics. Although large ensembles are commonly proposed for the automated diagnosis of CAN, they are characterized by slow processing speed and computational complexity. This paper applies ECG features and proposes a new ensemble-based approach for diagnosis of CAN progression. We introduce a Minimal Ensemble Based On Subset Selection (MEBOSS) for the diagnosis of all categories of CAN, including early, definite and atypical CAN. MEBOSS is based on a novel multi-tier architecture applying classifier subset selection as well as training subset selection during several steps of its operation. Our experiments determined the diagnostic accuracy or precision obtained in 5 × 2 cross-validation for various options employed in MEBOSS and other classification systems. The experiments demonstrate the operation of the MEBOSS procedure invoking the most effective classifiers available in the open source software environment SageMath. The results of our experiments show that for the large DiabHealth database of CAN-related parameters MEBOSS outperformed other classification systems available in SageMath and achieved 94% to 97% precision in 5 × 2 cross-validation, correctly distinguishing any two CAN categories among up to five categorizations including control, early, definite, severe and atypical CAN. These results show that the MEBOSS architecture is effective and can be recommended for practical implementations in systems for the diagnosis of CAN progression.
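
    MEBOSS itself is a multi-tier architecture; the sketch below shows only the kernel of classifier subset selection, greedily adding a base classifier to a soft-voting ensemble whenever it improves cross-validated accuracy and stopping as soon as no candidate helps, which keeps the ensemble minimal. The classifier pool and dataset are placeholders.

      import numpy as np
      from sklearn.datasets import load_breast_cancer
      from sklearn.ensemble import RandomForestClassifier, VotingClassifier
      from sklearn.linear_model import LogisticRegression
      from sklearn.model_selection import cross_val_score
      from sklearn.naive_bayes import GaussianNB
      from sklearn.neighbors import KNeighborsClassifier
      from sklearn.tree import DecisionTreeClassifier

      X, y = load_breast_cancer(return_X_y=True)
      pool = [("lr", LogisticRegression(max_iter=5000)),
              ("nb", GaussianNB()),
              ("knn", KNeighborsClassifier()),
              ("dt", DecisionTreeClassifier(random_state=0)),
              ("rf", RandomForestClassifier(random_state=0))]

      def cv_acc(members):
          clf = members[0][1] if len(members) == 1 else VotingClassifier(members, voting="soft")
          return cross_val_score(clf, X, y, cv=5).mean()

      # Greedy forward selection of a minimal classifier subset.
      chosen, best = [], -np.inf
      while len(chosen) < len(pool):
          gains = [(cv_acc(chosen + [c]), c) for c in pool if c not in chosen]
          s, cand = max(gains, key=lambda g: g[0])
          if s <= best:
              break
          chosen.append(cand)
          best = s
      print([name for name, _ in chosen], round(best, 4))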

  20. Mono-, Di-, or Trimorphism in Black Sea Ammonia sp.

    NASA Astrophysics Data System (ADS)

    Altenbach, Alexander V.; Bassler, Barbara

    2014-05-01

    For the genus Ammonia, the size of the proloculus was considered one of the valuable taxonomic landmarks, although it may split in first alternating generations. We analysed 140 living (stained) tests of Ammonia sp. from the outer shelf of the Black Sea, collected from 5 stations on a depth gradient (138 to 206 m water depth). Samples were treated by standard technologies, such as live staining, wet sieving, volume detection, counts, and measurements by light microscopy. The size of the proloculus was determined, extended by biometric characterisation with 11 measures, 5 qualitative characters, and 4 numerical ratios. Surprisingly, the multitude of test parameters allows the definition of either one highly variable taxon, several species, or di- or trimorphism, depending exclusively on which parameters or parameter subsets are defined as 'decisive' or 'neglectable'. We followed the general taxonomic definition for species of the genus, and applied, discussed and rejected published criteria considered taxonomically important. Surprisingly, none of the species described hitherto fully correlates with the morphological roundup observed: it is a new species. This conclusion mainly results from the balance of all morphologies, and not from the selection of an ultimate subset.

  1. Plate-based diversity subset screening generation 2: an improved paradigm for high-throughput screening of large compound files.

    PubMed

    Bell, Andrew S; Bradley, Joseph; Everett, Jeremy R; Loesel, Jens; McLoughlin, David; Mills, James; Peakman, Marie-Claire; Sharp, Robert E; Williams, Christine; Zhu, Hongyao

    2016-11-01

    High-throughput screening (HTS) is an effective method for lead and probe discovery that is widely used in industry and academia to identify novel chemical matter and to initiate the drug discovery process. However, HTS can be time consuming and costly and the use of subsets as an efficient alternative to screening entire compound collections has been investigated. Subsets may be selected on the basis of chemical diversity, molecular properties, biological activity diversity or biological target focus. Previously, we described a novel form of subset screening: plate-based diversity subset (PBDS) screening, in which the screening subset is constructed by plate selection (rather than individual compound cherry-picking), using algorithms that select for compound quality and chemical diversity on a plate basis. In this paper, we describe a second-generation approach to the construction of an updated subset: PBDS2, using both plate and individual compound selection, that has an improved coverage of the chemical space of the screening file, whilst only selecting the same number of plates for screening. We describe the validation of PBDS2 and its successful use in hit and lead discovery. PBDS2 screening became the default mode of singleton (one compound per well) HTS for lead discovery in Pfizer.

  2. Inside the Mind of a Medicinal Chemist: The Role of Human Bias in Compound Prioritization during Drug Discovery

    PubMed Central

    Kutchukian, Peter S.; Vasilyeva, Nadya Y.; Xu, Jordan; Lindvall, Mika K.; Dillon, Michael P.; Glick, Meir; Coley, John D.; Brooijmans, Natasja

    2012-01-01

    Medicinal chemists’ “intuition” is critical for success in modern drug discovery. Early in the discovery process, chemists select a subset of compounds for further research, often from many viable candidates. These decisions determine the success of a discovery campaign, and ultimately what kind of drugs are developed and marketed to the public. Surprisingly little is known about the cognitive aspects of chemists’ decision-making when they prioritize compounds. We investigate 1) how and to what extent chemists simplify the problem of identifying promising compounds, 2) whether chemists agree with each other about the criteria used for such decisions, and 3) how accurately chemists report the criteria they use for these decisions. Chemists were surveyed and asked to select chemical fragments that they would be willing to develop into a lead compound from a set of ∼4,000 available fragments. Based on each chemist’s selections, computational classifiers were built to model each chemist’s selection strategy. Results suggest that chemists greatly simplified the problem, typically using only 1–2 of many possible parameters when making their selections. Although chemists tended to use the same parameters to select compounds, differing value preferences for these parameters led to an overall lack of consensus in compound selections. Moreover, what little agreement there was among the chemists was largely in what fragments were undesirable. Furthermore, chemists were often unaware of the parameters (such as compound size) which were statistically significant in their selections, and overestimated the number of parameters they employed. A critical evaluation of the problem space faced by medicinal chemists and cognitive models of categorization were especially useful in understanding the low consensus between chemists. PMID:23185259

  3. A new approach to identify the sensitivity and importance of physical parameters combination within numerical models using the Lund-Potsdam-Jena (LPJ) model as an example

    NASA Astrophysics Data System (ADS)

    Sun, Guodong; Mu, Mu

    2017-05-01

    An important source of uncertainty, which causes further uncertainty in numerical simulations, resides in the parameters describing physical processes in numerical models. Therefore, finding the subset of the numerous physical parameters in atmospheric and oceanic models that is relatively more sensitive and important, and reducing the errors in the parameters of this subset, would be a far more efficient way to reduce the uncertainties involved in simulations. In this context, we present a new approach based on the conditional nonlinear optimal perturbation related to parameter (CNOP-P) method. The approach provides a framework to ascertain the subset of the relatively more sensitive and important physical parameters. The Lund-Potsdam-Jena (LPJ) dynamical global vegetation model was utilized to test the validity of the new approach in China. The results imply that nonlinear interactions among parameters play a key role in the identification of sensitive parameters in arid and semi-arid regions of China compared to those in northern, northeastern, and southern China. The uncertainties in the numerical simulations were reduced considerably by reducing the errors of the subset of relatively more sensitive and important parameters. The results demonstrate that our approach not only offers a new route to identify relatively more sensitive and important physical parameters but also makes it viable to apply "target observations" to reduce the uncertainties in model parameters.
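
    (Aside: for reference, CNOP-P is commonly defined as the parameter perturbation that maximizes a nonlinear cost function under a magnitude constraint. The LaTeX sketch below follows the general form used in the CNOP-P literature; the notation is an assumption, not a quotation from this paper.)

      % Sketch of the usual CNOP-P definition (notation assumed):
      % p_delta is a CNOP-P if it maximizes J over the constraint ball.
      J(p_{\delta}) \;=\; \max_{\|p\| \le \delta} J(p),
      \qquad
      J(p) \;=\; \left\| M_{T}\!\left(U_{0};\, P + p\right) - M_{T}\!\left(U_{0};\, P\right) \right\|

    Here M_T is the nonlinear model propagated to time T, U_0 the initial state, P the reference parameter vector, and delta the constraint radius; parameters are then ranked by how strongly their admissible perturbations can change the simulation.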

  4. Comparison of Genetic Algorithm, Particle Swarm Optimization and Biogeography-based Optimization for Feature Selection to Classify Clusters of Microcalcifications

    NASA Astrophysics Data System (ADS)

    Khehra, Baljit Singh; Pharwaha, Amar Partap Singh

    2017-04-01

    Ductal carcinoma in situ (DCIS) is one type of breast cancer. Clusters of microcalcifications (MCCs) are symptoms of DCIS that are recognized by mammography. Selection of a robust feature vector is the process of selecting an optimal subset of features from a large number of available features in a given problem domain, after feature extraction and before any classification scheme. Feature selection reduces the feature space, which improves the performance of the classifier and decreases the computational burden imposed by using many features. Selecting an optimal subset of features from a large number of available features is a difficult search problem: for n features, the total number of possible feature subsets is 2^n, so the selection problem is NP-hard. In this paper, an attempt is made to find the optimal subset of MCC features from all possible subsets using a genetic algorithm (GA), particle swarm optimization (PSO) and biogeography-based optimization (BBO). For simulation, a total of 380 benign and malignant MCC samples were selected from mammogram images of the DDSM database. A total of 50 features extracted from the benign and malignant MCC samples are used in this study. In these algorithms, the fitness function is the correct classification rate of the classifier; a support vector machine is used as the classifier. The experimental results show that the PSO-based and BBO-based algorithms select an optimal subset of features for classifying MCCs as benign or malignant better than the GA-based algorithm.
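
    (Aside: a wrapper of this kind is straightforward to prototype. The Python sketch below is a bare-bones genetic algorithm whose fitness is the cross-validated classification rate of an SVM on the selected feature columns; the population size, rates and synthetic data are assumptions, and PSO or BBO variants would differ only in how the candidate bit-strings are updated.)

      # Bare-bones GA wrapper for feature selection with an SVM fitness
      # function; population size, rates and data are assumptions.
      import numpy as np
      from sklearn.datasets import make_classification
      from sklearn.model_selection import cross_val_score
      from sklearn.svm import SVC

      rng = np.random.default_rng(0)
      X, y = make_classification(n_samples=380, n_features=50,
                                 n_informative=10, random_state=0)

      def fitness(mask):
          # Correct classification rate of the classifier on the subset.
          return cross_val_score(SVC(), X[:, mask], y, cv=3).mean() if mask.any() else 0.0

      pop = rng.random((20, X.shape[1])) < 0.5         # random bit-string chromosomes
      for gen in range(10):
          scores = np.array([fitness(ind) for ind in pop])
          parents = pop[np.argsort(scores)[::-1][:8]]  # truncation selection
          children = []
          while len(children) < len(pop) - len(parents):
              a, b = parents[rng.integers(len(parents), size=2)]
              cut = rng.integers(1, X.shape[1])        # one-point crossover
              child = np.concatenate([a[:cut], b[cut:]])
              child ^= rng.random(X.shape[1]) < 0.02   # bit-flip mutation
              children.append(child)
          pop = np.vstack([parents, children])

      best = max(pop, key=fitness)
      print("selected features:", np.flatnonzero(best))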

  5. URS DataBase: universe of RNA structures and their motifs.

    PubMed

    Baulin, Eugene; Yacovlev, Victor; Khachko, Denis; Spirin, Sergei; Roytberg, Mikhail

    2016-01-01

    The Universe of RNA Structures DataBase (URSDB) stores information obtained from all RNA-containing PDB entries (2935 entries in October 2015). The content of the database is updated regularly. The database consists of 51 tables containing indexed data on various elements of the RNA structures. The database provides a web interface allowing the user to select a subset of structures with desired features and to obtain various statistical data for a selected subset of structures or for all structures. In particular, one can easily obtain statistics on geometric parameters of base pairs, on structural motifs (stems, loops, etc.) or on different types of pseudoknots. The user can also view and get information on an individual structure or its selected parts, e.g. RNA-protein hydrogen bonds. URSDB employs a new original definition of loops in RNA structures. That definition fits both pseudoknot-free and pseudoknotted secondary structures and coincides with the classical definition in case of pseudoknot-free structures. To our knowledge, URSDB is the first database supporting searches based on topological classification of pseudoknots and on extended loop classification. Database URL: http://server3.lpm.org.ru/urs/. © The Author(s) 2016. Published by Oxford University Press.

  6. URS DataBase: universe of RNA structures and their motifs

    PubMed Central

    Baulin, Eugene; Yacovlev, Victor; Khachko, Denis; Spirin, Sergei; Roytberg, Mikhail

    2016-01-01

    The Universe of RNA Structures DataBase (URSDB) stores information obtained from all RNA-containing PDB entries (2935 entries in October 2015). The content of the database is updated regularly. The database consists of 51 tables containing indexed data on various elements of the RNA structures. The database provides a web interface allowing the user to select a subset of structures with desired features and to obtain various statistical data for a selected subset of structures or for all structures. In particular, one can easily obtain statistics on geometric parameters of base pairs, on structural motifs (stems, loops, etc.) or on different types of pseudoknots. The user can also view and get information on an individual structure or its selected parts, e.g. RNA–protein hydrogen bonds. URSDB employs a new original definition of loops in RNA structures. That definition fits both pseudoknot-free and pseudoknotted secondary structures and coincides with the classical definition in case of pseudoknot-free structures. To our knowledge, URSDB is the first database supporting searches based on topological classification of pseudoknots and on extended loop classification. Database URL: http://server3.lpm.org.ru/urs/ PMID:27242032

  7. Application of identified sensitive physical parameters in reducing the uncertainty of numerical simulation

    NASA Astrophysics Data System (ADS)

    Sun, Guodong; Mu, Mu

    2016-04-01

    An important source of uncertainty, which then causes further uncertainty in numerical simulations, resides in the parameters describing physical processes in numerical models. There are many physical parameters in numerical models in the atmospheric and oceanic sciences, and it would cost a great deal to reduce the uncertainties in all of them. Therefore, finding the subset of these parameters that is relatively more sensitive and important, and reducing the errors in the parameters of this subset, would be a far more efficient way to reduce the uncertainties involved in simulations. In this context, we present a new approach based on the conditional nonlinear optimal perturbation related to parameter (CNOP-P) method. The approach provides a framework to ascertain the subset of the relatively more sensitive and important physical parameters. The Lund-Potsdam-Jena (LPJ) dynamical global vegetation model was utilized to test the validity of the new approach. The results imply that nonlinear interactions among parameters play a key role in the uncertainty of numerical simulations in arid and semi-arid regions of China compared to those in northern, northeastern and southern China. The uncertainties in the numerical simulations were reduced considerably by reducing the errors of the subset of relatively more sensitive and important parameters. The results demonstrate that our approach not only offers a new route to identify relatively more sensitive and important physical parameters but also makes it viable to apply "target observations" to reduce the uncertainties in model parameters.

  8. Optimal Artificial Boundary Condition Configurations for Sensitivity-Based Model Updating and Damage Detection

    DTIC Science & Technology

    2010-09-01

    matrix is used in many methods, like Jacobi or Gauss-Seidel, for solving linear systems. Also, no partial pivoting is necessary for a strictly column... problems that arise during the procedure, which, in general, converges to the solving of a linear system. The most common issue with the solution is the... iterative procedure to find an appropriate subset of parameters that produce an optimal solution, commonly known as forward selection. Then, the

  9. Supernova Cosmology Inference with Probabilistic Photometric Redshifts (SCIPPR)

    NASA Astrophysics Data System (ADS)

    Peters, Christina; Malz, Alex; Hlozek, Renée

    2018-01-01

    The Bayesian Estimation Applied to Multiple Species (BEAMS) framework employs probabilistic supernova type classifications to do photometric SN cosmology. This work extends BEAMS to replace high-confidence spectroscopic redshifts with photometric redshift probability density functions, a capability that will be essential in the era of the Large Synoptic Survey Telescope and other next-generation photometric surveys, where it will not be possible to perform spectroscopic follow-up on every SN. We present the Supernova Cosmology Inference with Probabilistic Photometric Redshifts (SCIPPR) Bayesian hierarchical model for constraining the cosmological parameters from photometric lightcurves and host galaxy photometry, which includes selection effects and is extensible to uncertainty in the redshift-dependent supernova type proportions. We create a pair of realistic mock catalogs of joint posteriors over supernova type, redshift, and distance modulus informed by photometric supernova lightcurves, and over redshift from simulated host galaxy photometry. We perform inference under our model to obtain a joint posterior probability distribution over the cosmological parameters and compare our results with other methods, namely: a spectroscopic subset, a subset of high-probability photometrically classified supernovae, and reducing the photometric redshift probability to a single measurement and error bar.

  10. Adaptive Neuro-Fuzzy Determination of the Effect of Experimental Parameters on Vehicle Agent Speed Relative to Vehicle Intruder.

    PubMed

    Shamshirband, Shahaboddin; Banjanovic-Mehmedovic, Lejla; Bosankic, Ivan; Kasapovic, Suad; Abdul Wahab, Ainuddin Wahid Bin

    2016-01-01

    Intelligent Transportation Systems rely on understanding, predicting and affecting the interactions between vehicles. The goal of this paper is to choose a small subset from the larger set of recorded variables so that the resulting regression model is simple, yet has good predictive ability for vehicle agent speed relative to the vehicle intruder. The method of ANFIS (adaptive neuro-fuzzy inference system) was applied to the data resulting from these measurements. The ANFIS process for variable selection was implemented in order to detect the predominant variables affecting the prediction of agent speed relative to intruder. This process includes several ways to discover a subset of the total set of recorded parameters showing good predictive capability. The ANFIS network was used to perform a variable search. Then, it was used to determine how 9 parameters (intruder front sensors active (boolean), intruder rear sensors active (boolean), agent front sensors active (boolean), agent rear sensors active (boolean), RSSI signal intensity/strength (integer), elapsed time (seconds), distance between agent and intruder (m), angle of agent relative to intruder (°), altitude difference between agent and intruder (m)) influence the prediction of agent speed relative to intruder. The results indicated that the distance between the vehicle agent and the vehicle intruder (m) and the angle of the vehicle agent relative to the vehicle intruder (°) are the most influential parameters for predicting vehicle agent speed relative to the vehicle intruder.

  11. Skin lesion computational diagnosis of dermoscopic images: Ensemble models based on input feature manipulation.

    PubMed

    Oliveira, Roberta B; Pereira, Aledir S; Tavares, João Manuel R S

    2017-10-01

    The number of deaths worldwide due to melanoma has risen in recent times, in part because melanoma is the most aggressive type of skin cancer. Computational systems have been developed to assist dermatologists in the early diagnosis of skin cancer, or even to monitor skin lesions. However, improving classifiers for the diagnosis of such skin lesions remains a challenge. The main objective of this article is to evaluate different ensemble classification models based on input feature manipulation to diagnose skin lesions. The input feature manipulation processes are based on feature subset selection from shape properties, colour variation and texture analysis to generate diversity for the ensemble models. Three subset selection models are presented here: (1) a subset selection model based on specific feature groups, (2) a correlation-based subset selection model, and (3) a subset selection model based on feature selection algorithms. Each ensemble classification model is generated using an optimum-path forest classifier and integrated with a majority voting strategy. The proposed models were applied to a set of 1104 dermoscopic images using a cross-validation procedure. The best results were obtained by the first ensemble classification model, which generates a feature subset ensemble based on specific feature groups. The skin lesion diagnosis computational system achieved 94.3% accuracy, 91.8% sensitivity and 96.7% specificity. The input feature manipulation process based on specific feature subsets generated the greatest diversity for the ensemble classification model, with very promising results. Copyright © 2017 Elsevier B.V. All rights reserved.

  12. Comparison of machine-learning algorithms to build a predictive model for detecting undiagnosed diabetes - ELSA-Brasil: accuracy study.

    PubMed

    Olivera, André Rodrigues; Roesler, Valter; Iochpe, Cirano; Schmidt, Maria Inês; Vigo, Álvaro; Barreto, Sandhi Maria; Duncan, Bruce Bartholow

    2017-01-01

    Type 2 diabetes is a chronic disease associated with a wide range of serious health complications that have a major impact on overall health. The aims here were to develop and validate predictive models for detecting undiagnosed diabetes using data from the Longitudinal Study of Adult Health (ELSA-Brasil) and to compare the performance of different machine-learning algorithms in this task. Comparison of machine-learning algorithms to develop predictive models using data from ELSA-Brasil. After selecting a subset of 27 candidate variables from the literature, models were built and validated in four sequential steps: (i) parameter tuning with tenfold cross-validation, repeated three times; (ii) automatic variable selection using forward selection, a wrapper strategy with four different machine-learning algorithms and tenfold cross-validation (repeated three times), to evaluate each subset of variables; (iii) error estimation of model parameters with tenfold cross-validation, repeated ten times; and (iv) generalization testing on an independent dataset. The models were created with the following machine-learning algorithms: logistic regression, artificial neural network, naïve Bayes, K-nearest neighbor and random forest. The best models were created using artificial neural networks and logistic regression. These achieved mean areas under the curve of, respectively, 75.24% and 74.98% in the error estimation step, and 74.17% and 74.41% in the generalization testing step. Most of the predictive models produced similar results and demonstrated the feasibility of identifying individuals with the highest probability of having undiagnosed diabetes through easily obtained clinical data.
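
    (Aside: step (ii), wrapper-style forward selection with cross-validation, can be reproduced with scikit-learn's SequentialFeatureSelector. The Python sketch below pairs it with logistic regression on simulated data; it is illustrative only, not the ELSA-Brasil code, and the 27 candidate variables are synthetic.)

      # Sketch of wrapper-based forward selection with 10-fold CV,
      # assuming scikit-learn; the 27 candidate variables are simulated.
      from sklearn.datasets import make_classification
      from sklearn.feature_selection import SequentialFeatureSelector
      from sklearn.linear_model import LogisticRegression
      from sklearn.model_selection import StratifiedKFold

      X, y = make_classification(n_samples=1000, n_features=27,
                                 n_informative=8, random_state=0)

      selector = SequentialFeatureSelector(
          LogisticRegression(max_iter=1000),
          n_features_to_select="auto", tol=1e-3,  # stop when CV score plateaus
          direction="forward",
          cv=StratifiedKFold(n_splits=10, shuffle=True, random_state=0),
      )
      selector.fit(X, y)
      print("selected variables:", selector.get_support(indices=True))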

  13. Interactive model evaluation tool based on IPython notebook

    NASA Astrophysics Data System (ADS)

    Balemans, Sophie; Van Hoey, Stijn; Nopens, Ingmar; Seuntjes, Piet

    2015-04-01

    In hydrological modelling, some form of parameter optimization is usually performed. This can be the selection of a single best parameter set, a split into behavioural and non-behavioural parameter sets based on a selected threshold, or a posterior parameter distribution derived with a formal Bayesian approach. The selection of the criterion to measure the goodness of fit (likelihood or any objective function) is an essential step in all of these methodologies and will affect the final selected parameter subset. Moreover, the discriminative power of the objective function also depends on the time period used. In practice, the optimization process is an iterative procedure, and in the course of the modelling process an increasing number of simulations is performed. However, the information carried by these simulation outputs is not always fully exploited. In this respect, we developed and present an interactive environment that enables the user to intuitively evaluate the model performance. The aim is to explore the parameter space graphically and to visualize the impact of the selected objective function on model behaviour. First, a set of model simulation results is loaded along with the corresponding parameter sets and a data set of the same variable as the model outcome (mostly discharge). The ranges of the loaded parameter sets define the parameter space. The user selects the two parameters to visualise. Furthermore, an objective function and a time period of interest need to be selected. Based on this information, a two-dimensional parameter response surface is created, which shows a scatter plot of the parameter combinations and assigns a color scale corresponding to the goodness of fit of each parameter combination. Finally, a slider is available to change the color mapping of the points: the slider provides a threshold to exclude non-behavioural parameter sets, and the color scale is only attributed to the remaining parameter sets. By interactively changing the settings and interpreting the graph, the user gains insight into the model's structural behaviour, and a more deliberate choice of objective function and of periods with high information content can be made. The environment is written in an IPython notebook and uses the interactive functions provided by the IPython community. As such, the power of the IPython notebook as a development environment for scientific computing is illustrated (Shen, 2014).
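
    (Aside: a minimal version of such an interactive response surface takes only a few lines with ipywidgets and matplotlib in an IPython/Jupyter notebook. In the sketch below, the parameter samples and objective values are simulated stand-ins for model simulation results, and the slider threshold plays the behavioural/non-behavioural role described above.)

      # Minimal interactive parameter response surface for a notebook,
      # assuming ipywidgets and matplotlib; the parameter sets and
      # objective values are simulated.
      import matplotlib.pyplot as plt
      import numpy as np
      from ipywidgets import FloatSlider, interact

      rng = np.random.default_rng(1)
      params = rng.uniform(0, 1, size=(500, 2))      # two selected parameters
      objective = ((params - 0.5) ** 2).sum(axis=1)  # stand-in goodness of fit

      @interact(threshold=FloatSlider(min=0.05, max=0.5, step=0.01, value=0.5))
      def response_surface(threshold):
          keep = objective <= threshold              # slider excludes the rest
          plt.scatter(*params[~keep].T, c="lightgrey", s=10)
          plt.scatter(*params[keep].T, c=objective[keep], cmap="viridis", s=15)
          plt.colorbar(label="objective function")
          plt.xlabel("parameter 1"); plt.ylabel("parameter 2")
          plt.show()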

  14. Diagnosis of Chronic Kidney Disease Based on Support Vector Machine by Feature Selection Methods.

    PubMed

    Polat, Huseyin; Danaei Mehr, Homay; Cetin, Aydin

    2017-04-01

    As Chronic Kidney Disease progresses slowly, early detection and effective treatment are the only way to reduce the mortality rate. Machine learning techniques are gaining significance in medical diagnosis because of their classification ability with high accuracy rates. The accuracy of classification algorithms depends on the use of correct feature selection algorithms to reduce the dimension of datasets. In this study, the Support Vector Machine classification algorithm was used to diagnose Chronic Kidney Disease. To diagnose the disease, two essential types of feature selection methods, namely wrapper and filter approaches, were chosen to reduce the dimension of the Chronic Kidney Disease dataset. In the wrapper approach, the classifier subset evaluator with the greedy stepwise search engine and the wrapper subset evaluator with the Best First search engine were used. In the filter approach, the correlation feature selection subset evaluator with the greedy stepwise search engine and the filtered subset evaluator with the Best First search engine were used. The results showed that the Support Vector Machine classifier achieved the highest accuracy rate (98.5%) in the diagnosis of Chronic Kidney Disease when used with the filtered subset evaluator and the Best First search engine, compared to the other selected methods.

  15. An evaluation of exact methods for the multiple subset maximum cardinality selection problem.

    PubMed

    Brusco, Michael J; Köhn, Hans-Friedrich; Steinley, Douglas

    2016-05-01

    The maximum cardinality subset selection problem requires finding the largest possible subset from a set of objects, such that one or more conditions are satisfied. An important extension of this problem is to extract multiple subsets, where the addition of one more object to a larger subset would always be preferred to increases in the size of one or more smaller subsets. We refer to this as the multiple subset maximum cardinality selection problem (MSMCSP). A recently published branch-and-bound algorithm solves the MSMCSP as a partitioning problem. Unfortunately, the computational requirement associated with the algorithm is often enormous, thus rendering the method infeasible from a practical standpoint. In this paper, we present an alternative approach that successively solves a series of binary integer linear programs to obtain a globally optimal solution to the MSMCSP. Computational comparisons of the methods using published similarity data for 45 food items reveal that the proposed sequential method is computationally far more efficient than the branch-and-bound approach. © 2016 The British Psychological Society.
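
    (Aside: as a concrete illustration of the binary integer linear programming ingredient, the Python sketch below uses PuLP to find one maximum-cardinality subset in which every selected pair satisfies a minimum-dissimilarity condition. The condition and the random data are assumptions, and the paper's successive multi-subset procedure is not reproduced.)

      # Sketch of a single maximum-cardinality subset selection as a binary
      # integer program, assuming PuLP; the pairwise condition and the
      # random data are placeholders.
      import itertools
      import numpy as np
      import pulp

      rng = np.random.default_rng(0)
      n = 45
      d = rng.uniform(size=(n, n))
      d = (d + d.T) / 2                                # symmetric dissimilarities

      prob = pulp.LpProblem("max_cardinality_subset", pulp.LpMaximize)
      x = pulp.LpVariable.dicts("x", range(n), cat="Binary")
      prob += pulp.lpSum(x.values())                   # maximize subset size
      for i, j in itertools.combinations(range(n), 2):
          if d[i, j] < 0.2:                            # conflicting pair:
              prob += x[i] + x[j] <= 1                 # select at most one

      prob.solve(pulp.PULP_CBC_CMD(msg=False))
      print("selected objects:", [i for i in range(n) if x[i].value() == 1])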

  16. [Varicocele and coincidental abacterial prostato-vesiculitis: negative role about the sperm output].

    PubMed

    Vicari, Enzo; La Vignera, Sandro; Tracia, Angelo; Cardì, Francesco; Donati, Angelo

    2003-03-01

    To evaluate the frequency and the role of a coincidentally expressed abacterial prostato-vesiculitis (PV) on sperm output in patients with left varicocele (Vr). We evaluated 143 selected infertile patients (mean age 27 years, range 21-43) with oligo- and/or astheno- and/or teratozoospermia (OAT), subdivided into two groups. Group A included 76 patients with previous varicocelectomy and persistent OAT. Group B included 67 infertile patients (mean age 26 years, range 21-37) with OAT who had not undergone varicocelectomy. Patients with Vr and coincidental didymo-epididymal ultrasound (US) abnormalities were excluded from the study. Following rectal prostato-vesicular ultrasonography, each group was subdivided into two subsets on the basis of the absence (group A: subset Vr-/PV-; group B: subset Vr+/PV-) or the presence of an abacterial PV (group A: subset Vr-/PV+; group B: subset Vr+/PV+). PV was present in 47.4% and 41.8% of the patients of groups A and B, respectively; this coincidental pathology was ipsilateral with Vr in 61% of the cases. Semen analysis was performed in all patients. Patients of group A showed a total sperm number significantly higher than that found in group B. In the presence of PV, sperm parameters were not significantly different between matched subsets (Vr-/PV+ vs. Vr+/PV+). In the absence of PV, the sperm density, the total sperm number and the percentage of forward motility in the subset with previous varicocelectomy (Vr-/PV-) were significantly higher than in the matched subset (Vr+/PV-). Sperm analysis alone performed in patients with left Vr is not a useful prognostic post-varicocelectomy marker. Since a lack of sperm response following varicocelectomy could mask another coincidental pathology, identification of a possible PV through US scans may be mandatory. On the other hand, an integrated uro-andrological approach, including US scans, makes it possible to identify subsets of patients with Vr alone, who can be expected to show a better sperm response following Vr repair.

  17. Near-Infrared 1064 nm Laser Modulates Migratory Dendritic Cells To Augment the Immune Response to Intradermal Influenza Vaccine.

    PubMed

    Morse, Kaitlyn; Kimizuka, Yoshifumi; Chan, Megan P K; Shibata, Mai; Shimaoka, Yusuke; Takeuchi, Shu; Forbes, Benjamin; Nirschl, Christopher; Li, Binghao; Zeng, Yang; Bronson, Roderick T; Katagiri, Wataru; Shigeta, Ayako; Sîrbulescu, Ruxandra F; Chen, Huabiao; Tan, Rhea Y Y; Tsukada, Kosuke; Brauns, Timothy; Gelfand, Jeffrey; Sluder, Ann; Locascio, Joseph J; Poznansky, Mark C; Anandasabapathy, Niroshana; Kashiwagi, Satoshi

    2017-08-15

    Brief exposure of skin to near-infrared (NIR) laser light has been shown to augment the immune response to intradermal vaccination and thus act as an immunologic adjuvant. Although evidence indicates that the NIR laser adjuvant has the capacity to activate innate subsets including dendritic cells (DCs) in skin as conventional adjuvants do, the precise immunological mechanism by which the NIR laser adjuvant acts is largely unknown. In this study we sought to identify the cellular target of the NIR laser adjuvant by using an established mouse model of intradermal influenza vaccination and examining the alteration of responses resulting from genetic ablation of specific DC populations. We found that a continuous wave (CW) NIR laser adjuvant broadly modulates migratory DC (migDC) populations, specifically increasing and activating the Lang+ and CD11b-Lang- subsets in skin, and that the Ab responses augmented by the CW NIR laser are dependent on DC subsets expressing CCR2 and Langerin. In comparison, a pulsed wave NIR laser adjuvant showed limited effects on the migDC subsets. Our vaccination study demonstrated that the efficacy of the CW NIR laser is significantly better than that of the pulsed wave laser, indicating that the CW NIR laser offers a desirable immunostimulatory microenvironment for migDCs. These results demonstrate the unique ability of the NIR laser adjuvant to selectively target specific migDC populations in skin depending on its parameters, and highlight the importance of optimization of laser parameters for desirable immune protection induced by an NIR laser-adjuvanted vaccine. Copyright © 2017 by The American Association of Immunologists, Inc.

  18. Stochastic subset selection for learning with kernel machines.

    PubMed

    Rhinelander, Jason; Liu, Xiaoping P

    2012-06-01

    Kernel machines have gained much popularity in applications of machine learning. Support vector machines (SVMs) are a subset of kernel machines and generalize well for classification, regression, and anomaly detection tasks. The training procedure for traditional SVMs involves solving a quadratic programming (QP) problem. The QP problem scales super-linearly in computational effort with the number of training samples and is often used for the offline batch processing of data. Kernel machines operate by retaining a subset of observed data during training; the data vectors contained within this subset are referred to as support vectors (SVs). The work presented in this paper introduces a subset selection method for the use of kernel machines in online, changing environments. Our algorithm uses a stochastic indexing technique to select the subset of SVs used in computing the kernel expansion. The work described here is novel because it separates the selection of kernel basis functions from the training algorithm used, so the subset selection algorithm presented here can be used in conjunction with any online training technique. It is important for online kernel machines to be computationally efficient due to the real-time requirements of online environments. Our algorithm is an important contribution because it scales linearly with the number of training samples and is compatible with current training techniques. It outperforms standard techniques in terms of computational efficiency and provides increased recognition accuracy in our experiments. We provide results from experiments using both simulated and real-world data sets to verify our algorithm.
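
    (Aside: the core idea, evaluating the kernel expansion over a stochastically indexed subset of the stored support vectors rather than all of them, can be sketched in a few lines of Python. The RBF kernel, uniform sampling and rescaling below are assumptions, not the authors' exact algorithm.)

      # Sketch of a kernel expansion evaluated over a stochastically
      # selected subset of support vectors; the RBF kernel, uniform
      # sampling and rescaling are assumptions, not the paper's algorithm.
      import numpy as np

      rng = np.random.default_rng(0)

      def rbf(x, z, gamma=1.0):
          return np.exp(-gamma * np.sum((x - z) ** 2, axis=-1))

      # Stored support vectors and coefficients (e.g., from online training).
      sv = rng.normal(size=(500, 4))
      alpha = rng.normal(size=500)

      def predict(x, m=50):
          # Approximate f(x) = sum_i alpha_i k(sv_i, x) from m sampled SVs,
          # rescaled by n/m so the subset sum estimates the full expansion.
          idx = rng.choice(len(sv), size=m, replace=False)
          return (len(sv) / m) * np.sum(alpha[idx] * rbf(sv[idx], x))

      query = rng.normal(size=4)
      print("full:", np.sum(alpha * rbf(sv, query)), "subset:", predict(query))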

  19. Design and Application of Drought Indexes in Highly Regulated Mediterranean Water Systems

    NASA Astrophysics Data System (ADS)

    Castelletti, A.; Zaniolo, M.; Giuliani, M.

    2017-12-01

    Costs of drought are progressively increasing due to the ongoing alteration of hydro-meteorological regimes induced by climate change. Although drought management is largely studied in the literature, most traditional drought indexes fail to detect critical events in highly regulated systems, which generally rely on ad hoc formulations that cannot be generalized to different contexts. In this study, we contribute a novel framework for the design of a basin-customized drought index. This index represents a surrogate of the state of the basin and is computed by combining the available information about the water available in the system to reproduce a representative target variable for the drought condition of the basin (e.g., water deficit). To select the relevant variables and combinations thereof, we use an advanced feature extraction algorithm called Wrapper for Quasi Equally Informative Subset Selection (W-QEISS). W-QEISS relies on a multi-objective evolutionary algorithm to find Pareto-efficient subsets of variables by maximizing the wrapper accuracy, minimizing the number of selected variables, and optimizing relevance and redundancy of the subset. The accuracy objective is evaluated through the calibration of an extreme learning machine of the water deficit for each candidate subset of variables, with the index selected from the resulting solutions identifying a suitable compromise between accuracy, cardinality, relevance, and redundancy. The approach is tested on Lake Como, Italy, a regulated lake mainly operated for irrigation supply. In the absence of an institutional drought monitoring system, we constructed the combined index using all the hydrological variables from the existing monitoring system as well as common drought indicators at multiple time aggregations. The soil moisture deficit in the root zone, computed by a distributed-parameter water balance model of the agricultural districts, is used as the target variable. Numerical results show that our combined drought index successfully reproduces the deficit. The index provides valuable information for supporting appropriate drought management strategies, including the possibility of directly informing the lake operations about drought conditions and improving the overall reliability of the irrigation supply system.

  20. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Alam, Ujjaini; Lasue, Jeremie, E-mail: ujjaini.alam@gmail.com, E-mail: jeremie.lasue@irap.omp.eu

    We examine three SNe Type Ia datasets, Union2.1, JLA and Panstarrs, to check their consistency using cosmology-blind statistical analyses as well as cosmological parameter fitting. We find that the Panstarrs dataset is the most stable of the three to changes in the data, although it does not, at the moment, go to high enough redshifts to tightly constrain the equation of state of dark energy, w. The Union2.1, drawn from several different sources, appears to be somewhat susceptible to changes within the dataset. The JLA reconstructs well for a smaller number of cosmological parameters. At higher degrees of freedom, the dependence of its errors on redshift can lead to varying results between subsets. Panstarrs is inconsistent with the other two datasets at about the 2σ confidence level, and JLA and Union2.1 are about 1σ away from each other. For the Ω_{0m} − w cosmological reconstruction, with no additional data, the 1σ range of values in w for selected subsets of each dataset is two times larger for JLA and Union2.1 as compared to Panstarrs. The range in Ω_{0m} for the same subsets remains approximately similar for all three datasets. We find that although there are differences in the fitting and correction techniques used in the different samples, the most important criterion is the selection of the SNe: a slightly different SNe selection can lead to noticeably different results, both in the purely statistical analysis and in the cosmological reconstruction. We note that a single, high quality low redshift sample could help decrease the uncertainties in the result. We also note that lack of homogeneity in the magnitude errors may bias the results and should either be modeled, or its effect neutralized by using other, complementary datasets. A supernova sample with high quality data at both high and low redshifts, constructed from a few surveys to avoid heterogeneity in the sample, and with homogeneous errors, would result in a more robust cosmological reconstruction.

  1. Validation and upgrading of physically based mathematical models

    NASA Technical Reports Server (NTRS)

    Duval, Ronald

    1992-01-01

    The validation of the results of physically-based mathematical models against experimental results was discussed. Systematic techniques are used for: (1) isolating subsets of the simulator mathematical model and comparing the response of each subset to its experimental response for the same input conditions; (2) evaluating the response error to determine whether it is the result of incorrect parameter values, incorrect structure of the model subset, or unmodeled external effects of cross coupling; and (3) modifying and upgrading the model and its parameter values to determine the most physically appropriate combination of changes.

  2. A regularized variable selection procedure in additive hazards model with stratified case-cohort design.

    PubMed

    Ni, Ai; Cai, Jianwen

    2018-07-01

    Case-cohort designs are commonly used in large epidemiological studies to reduce the cost associated with covariate measurement. In many such studies the number of covariates is very large, so an efficient variable selection method is needed for case-cohort studies where the covariates are only observed in a subset of the sample. The current literature on this topic has focused on the proportional hazards model. However, in many studies the additive hazards model is preferred, either because the proportional hazards assumption is violated or because the additive hazards model provides more relevant information for the research question. Motivated by one such study, the Atherosclerosis Risk in Communities (ARIC) study, we investigate the properties of a regularized variable selection procedure in stratified case-cohort design under an additive hazards model with a diverging number of parameters. We establish the consistency and asymptotic normality of the penalized estimator and prove its oracle property. Simulation studies are conducted to assess the finite sample performance of the proposed method with a modified cross-validation tuning parameter selection method. We apply the variable selection procedure to the ARIC study to demonstrate its practical use.

  3. YamiPred: A Novel Evolutionary Method for Predicting Pre-miRNAs and Selecting Relevant Features.

    PubMed

    Kleftogiannis, Dimitrios; Theofilatos, Konstantinos; Likothanassis, Spiros; Mavroudi, Seferina

    2015-01-01

    MicroRNAs (miRNAs) are small non-coding RNAs which play a significant role in gene regulation. Predicting miRNA genes is a challenging bioinformatics problem, and existing experimental and computational methods fail to deal with it effectively. We developed YamiPred, an embedded classification method that combines the efficiency and robustness of support vector machines (SVM) with genetic algorithms (GA) for feature selection and parameter optimization. YamiPred was tested on a new and realistic human dataset and compared with state-of-the-art computational intelligence approaches and the prevalent SVM-based tools for miRNA prediction. Experimental results indicate that YamiPred outperforms existing approaches in terms of accuracy and geometric mean of sensitivity and specificity. The embedded feature selection component selects a compact feature subset that contributes to the performance optimization. Further experimentation with this minimal feature subset achieved very high classification performance and revealed the minimum number of samples required for developing a robust predictor. YamiPred also confirmed the important role of commonly used features such as entropy and enthalpy, and uncovered the significance of newly introduced features, such as %A-U aggregate nucleotide frequency and positional entropy. The best model trained on human data successfully predicted pre-miRNAs in other organisms, including viruses.

  4. Longitudinal analyses of correlated response efficiencies of fillet traits in Nile tilapia.

    PubMed

    Turra, E M; Fernandes, A F A; de Alvarenga, E R; Teixeira, E A; Alves, G F O; Manduca, L G; Murphy, T W; Silva, M A

    2018-03-01

    Recent studies with Nile tilapia have shown divergent results regarding the possibility of selecting on morphometric measurements to promote indirect genetic gains in fillet yield (FY). Indirect selection for fillet traits is important because these traits are only measurable after harvesting. Random regression models are a powerful tool in association studies to identify the best time point to measure and select animals. Random regression models can also be applied in a multiple-trait approach to analyze indirect response to selection, which would avoid the need to sacrifice candidate fish. Therefore, the aim of this study was to investigate the genetic relationships between several body measurements, weight and fillet traits throughout the growth period, and to evaluate the possibility of indirect selection for fillet traits in Nile tilapia. Data were collected from 2042 fish and divided into two subsets. The first subset was used to estimate genetic parameters, including the permanent environmental effect, for BW and body measurements (8758 records for each body measurement, as each fish was individually weighed and measured a maximum of six times). The second subset (2042 records for each trait) was used to estimate genetic correlations and heritabilities, which enabled the calculation of correlated response efficiencies between body measurements and the fillet traits. Heritability estimates across ages ranged from 0.05 to 0.5 for height, 0.02 to 0.48 for corrected length (CL), 0.05 to 0.68 for width, 0.08 to 0.57 for fillet weight (FW) and 0.12 to 0.42 for FY. All genetic correlation estimates between body measurements and FW were positive and strong (0.64 to 0.98). The estimates of genetic correlation between body measurements and FY were positive (except for CL at some ages), but weak to moderate (-0.08 to 0.68). These estimates resulted in strong and favorable correlated response efficiencies for FW, and positive but moderate efficiencies for FY. These results indicate the possibility of achieving indirect genetic gains for FW by selecting for morphometric traits, but low efficiency for FY when compared with direct selection.

  5. Detection of Nitrogen Content in Rubber Leaves Using Near-Infrared (NIR) Spectroscopy with Correlation-Based Successive Projections Algorithm (SPA).

    PubMed

    Tang, Rongnian; Chen, Xupeng; Li, Chuang

    2018-05-01

    Near-infrared spectroscopy is an efficient, low-cost technology with the potential to be an accurate method for detecting the nitrogen content of natural rubber leaves. The successive projections algorithm (SPA) is a widely used variable selection method for multivariate calibration, which uses projection operations to select a variable subset with minimum multi-collinearity. However, due to the fluctuation of correlation between variables, high collinearity may still exist among non-adjacent variables of the subset obtained by basic SPA. Based on an analysis of the correlation matrix of the spectral data, this paper proposes a correlation-based SPA (CB-SPA) that applies the successive projections algorithm within regions of consistent correlation. The results show that CB-SPA selects variable subsets with more valuable variables and less multi-collinearity. Meanwhile, models established on the CB-SPA subset outperform those on basic SPA subsets in predicting nitrogen content, in terms of both cross-validation and external prediction. Moreover, CB-SPA is more efficient: the time cost of its selection procedure is one-twelfth that of basic SPA.
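
    (Aside: basic SPA is compact: starting from an initial variable, each step projects the remaining spectral columns onto the orthogonal complement of those already selected and keeps the column with the largest residual norm. The NumPy sketch below implements this basic step only; the correlation-region restriction that defines CB-SPA is not reproduced, and the spectra are simulated.)

      # Minimal NumPy sketch of the basic successive projections algorithm
      # (SPA); the correlation-region restriction of CB-SPA is not included.
      import numpy as np

      def spa(X, k, start=0):
          # Select k column indices of X with minimal multi-collinearity.
          selected = [start]
          P = np.eye(X.shape[0])                 # projector onto the orthogonal
          for _ in range(k - 1):                 # complement of selected columns
              v = P @ X[:, selected[-1]]
              P = P - np.outer(v, v) / (v @ v)   # deflate by the newest direction
              norms = np.linalg.norm(P @ X, axis=0)
              norms[selected] = -1.0             # exclude already-selected columns
              selected.append(int(np.argmax(norms)))
          return selected

      rng = np.random.default_rng(0)
      spectra = rng.normal(size=(100, 256))      # samples x wavelengths
      print("selected wavelengths:", spa(spectra, k=8))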

  6. Hematology and biochemistry reference intervals for Ontario commercial nursing pigs close to the time of weaning

    PubMed Central

    Perri, Amanda M.; O’Sullivan, Terri L.; Harding, John C.S.; Wood, R. Darren; Friendship, Robert M.

    2017-01-01

    The evaluation of pig hematology and biochemistry parameters is rarely done, largely due to the costs associated with laboratory testing and labor and the limited availability of the reference intervals needed for interpretation. Within-herd and between-herd biological variation of these values also makes it difficult to establish reference intervals. Regardless, baseline reference intervals are important to aid veterinarians in the interpretation of blood parameters for the diagnosis and treatment of diseased swine. The objective of this research was to provide reference intervals for hematology and biochemistry parameters of 3-week-old commercial nursing piglets in Ontario. A total of 1032 pigs lacking clinical signs of disease from 20 swine farms were sampled for hematology and iron panel evaluation, with biochemistry analysis performed on a subset of 189 randomly selected pigs. The 95% reference interval, mean, median, range, and 90% confidence intervals were calculated for each parameter. PMID:28373729

  7. Estimating genetic and phenotypic parameters of cellular immune-associated traits in dairy cows.

    PubMed

    Denholm, Scott J; McNeilly, Tom N; Banos, Georgios; Coffey, Mike P; Russell, George C; Bagnall, Ainsley; Mitchell, Mairi C; Wall, Eileen

    2017-04-01

    Data collected from an experimental Holstein-Friesian research herd were used to determine genetic and phenotypic parameters of innate and adaptive cellular immune-associated traits. Relationships between immune-associated traits and production, health, and fertility traits were also investigated. Repeated blood leukocyte records were analyzed in 546 cows for 9 cellular immune-associated traits, including percent T cell subsets, B cells, NK cells, and granulocytes. Variance components were estimated by univariate analysis. Heritability estimates were obtained for all 9 traits, the highest of which were observed in the T cell subsets percent CD4+, percent CD8+, CD4+:CD8+ ratio, and percent NKp46+ cells (0.46, 0.41, 0.43 and 0.42, respectively), with between-individual variation accounting for 59 to 81% of total phenotypic variance. Associations between immune-associated traits and production, health, and fertility traits were investigated with bivariate analyses. Strong genetic correlations were observed between percent NKp46+ and stillbirth rate (0.61), and between lameness episodes and percent CD8+ (-0.51). Regarding production traits, the strongest relationships were between CD4+:CD8+ ratio and weight phenotypes (-0.52 for live weight; -0.51 for empty body weight). Associations between feed conversion traits and immune-associated traits were also observed. Our results provide evidence that cellular immune-associated traits are heritable and repeatable, and the noticeable variation between animals would permit selection for altered trait values, particularly in the case of the T cell subsets. The associations we observed between immune-associated, health, fertility, and production traits suggest that genetic selection for cellular immune-associated traits could provide a useful tool in improving animal health, fitness, and fertility. The Authors. Published by the Federation of Animal Science Societies and Elsevier Inc. on behalf of the American Dairy Science Association®. This is an open access article under the CC BY 2.0 license (http://creativecommons.org/licenses/by/2.0/).

  8. Vibration and acoustic frequency spectra for industrial process modeling using selective fusion multi-condition samples and multi-source features

    NASA Astrophysics Data System (ADS)

    Tang, Jian; Qiao, Junfei; Wu, ZhiWei; Chai, Tianyou; Zhang, Jian; Yu, Wen

    2018-01-01

    Frequency spectral data of mechanical vibration and acoustic signals relate to difficult-to-measure production quality and quantity parameters of complex industrial processes. A selective ensemble (SEN) algorithm can be used to build a soft sensor model of these process parameters by selectively fusing valued information from different perspectives. However, a combination of several optimized ensemble sub-models with SEN cannot guarantee the best prediction model. In this study, we construct a data-driven model of industrial process parameters from mechanical vibration and acoustic frequency spectra, based on the selective fusion of multi-condition samples and multi-source features. A multi-layer SEN (MLSEN) strategy is used to simulate the domain expert's cognitive process. A genetic algorithm and kernel partial least squares are used to construct the inside-layer SEN sub-model based on each mechanical vibration and acoustic frequency spectral feature subset. Branch-and-bound and adaptive weighted fusion algorithms are integrated to select and combine the outputs of the inside-layer SEN sub-models, and the outside-layer SEN is then constructed. Thus, "sub-sampling training examples"-based and "manipulating input features"-based ensemble construction methods are integrated, thereby realizing selective information fusion based on multi-condition history samples and multi-source input features. This novel approach is applied to a laboratory-scale ball mill grinding process. A comparison with other methods indicates that the proposed MLSEN approach effectively models mechanical vibration and acoustic signals.

  9. Selection of a Representative Subset of Global Climate Models that Captures the Profile of Regional Changes for Integrated Climate Impacts Assessment

    NASA Technical Reports Server (NTRS)

    Ruane, Alex C.; Mcdermid, Sonali P.

    2017-01-01

    We present the Representative Temperature and Precipitation (T&P) GCM Subsetting Approach, developed within the Agricultural Model Intercomparison and Improvement Project (AgMIP), to select a practical subset of global climate models (GCMs) for regional integrated assessment of climate impacts when resource limitations do not permit the full ensemble of GCMs to be evaluated, given the need to also focus on impact-sector and economics models. Subsetting inherently leads to a loss of information but can free up resources to explore important uncertainties in the integrated assessment that would otherwise be prohibitive. The Representative T&P GCM Subsetting Approach identifies five individual GCMs that capture the profile of the full ensemble of temperature and precipitation change within the growing season, while maintaining information about the probability that basic classes of climate change (relatively cool/wet, cool/dry, middle, hot/wet, and hot/dry) are projected in the full GCM ensemble. We demonstrate the selection methodology for maize impacts in Ames, Iowa, and discuss limitations and situations when additional information may be required to select representative GCMs. We then classify 29 GCMs over all land areas to identify regions and seasons with characteristic diagonal skewness related to surface moisture, as well as extreme skewness connected to snow-albedo feedbacks and GCM uncertainty. Finally, we employ this basic approach to recognize that GCM projections demonstrate coherence across space, time, and greenhouse gas concentration pathway. The Representative T&P GCM Subsetting Approach provides a quantitative basis for the determination of useful GCM subsets, provides a practical and coherent approach where previous assessments selected solely on the availability of scenarios, and may be extended for application to a range of scales and sectoral impacts.
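
    (Aside: the selection step can be sketched compactly: given each GCM's growing-season temperature and precipitation changes, assign models to the five classes and take the model nearest each class target. The Python sketch below is a simplification of the published procedure; the median splits, target points and random ensemble are assumptions.)

      # Simplified sketch of representative GCM subsetting from
      # growing-season temperature (dT) and precipitation (dP) changes;
      # median splits, class targets and the random ensemble are
      # assumptions, not the published AgMIP procedure.
      import numpy as np

      rng = np.random.default_rng(0)
      names = [f"GCM{i:02d}" for i in range(29)]
      dT = rng.uniform(0.5, 4.0, 29)     # projected warming (deg C)
      dP = rng.uniform(-20, 20, 29)      # projected precipitation change (%)
      t_med, p_med = np.median(dT), np.median(dP)

      def nearest(target_t, target_p, mask=None):
          # Model closest to a class target in standardized (dT, dP) space;
          # assumes each class quadrant contains at least one model.
          idx = np.flatnonzero(mask) if mask is not None else np.arange(29)
          z = ((dT[idx] - target_t) / dT.std()) ** 2 \
              + ((dP[idx] - target_p) / dP.std()) ** 2
          return names[idx[np.argmin(z)]]

      subset = {
          "cool/wet": nearest(dT.min(), dP.max(), (dT < t_med) & (dP > p_med)),
          "cool/dry": nearest(dT.min(), dP.min(), (dT < t_med) & (dP < p_med)),
          "middle":   nearest(dT.mean(), dP.mean()),
          "hot/wet":  nearest(dT.max(), dP.max(), (dT > t_med) & (dP > p_med)),
          "hot/dry":  nearest(dT.max(), dP.min(), (dT > t_med) & (dP < p_med)),
      }
      print(subset)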

  10. Bone Pose Estimation in the Presence of Soft Tissue Artifact Using Triangular Cosserat Point Elements.

    PubMed

    Solav, Dana; Rubin, M B; Cereatti, Andrea; Camomilla, Valentina; Wolf, Alon

    2016-04-01

    Accurate estimation of the position and orientation (pose) of a bone from a cluster of skin markers is limited mostly by the relative motion between the bone and the markers, which is known as the soft tissue artifact (STA). This work presents a method, based on continuum mechanics, to describe the kinematics of a cluster affected by STA. The cluster is characterized by triangular Cosserat point elements (TCPEs), defined by all combinations of three markers. The effects of the STA on the TCPEs are quantified using three parameters describing the strain in each TCPE and the relative rotation and translation between TCPEs. The method was evaluated using previously collected ex vivo kinematic data. Femur pose was estimated from 12 skin markers on the thigh, while its reference pose was measured using bone pins. Analysis revealed that instantaneous subsets of TCPEs exist which estimate bone position and orientation more accurately than the Procrustes superimposition applied to the cluster of all markers. It was shown that some of these parameters correlate well with femur pose errors, which suggests that they can be used to select, at each instant, subsets of TCPEs leading to an improved estimation of the underlying bone pose.

  11. Using learning automata to determine proper subset size in high-dimensional spaces

    NASA Astrophysics Data System (ADS)

    Seyyedi, Seyyed Hossein; Minaei-Bidgoli, Behrouz

    2017-03-01

    In this paper, we offer a new method called FSLA (Finding the best candidate Subset using Learning Automata), which combines the filter and wrapper approaches for feature selection in high-dimensional spaces. Considering the difficulties of dimension reduction in high-dimensional spaces, FSLA's multi-objective goal is to determine, in an efficient manner, a feature subset that leads to an appropriate tradeoff between the learning algorithm's accuracy and efficiency. First, using an existing weighting function, the feature list is sorted and subsets of the list of different sizes are considered. Then, a learning automaton verifies the performance of each subset when it is used as the input space of the learning algorithm, estimating its fitness from the algorithm's accuracy and the subset size, which determines the algorithm's efficiency. Finally, FSLA introduces the fittest subset as the best choice. We tested FSLA in the framework of text classification. The results confirm its promising performance in attaining the identified goal.

  12. Two-stage atlas subset selection in multi-atlas based image segmentation.

    PubMed

    Zhao, Tingting; Ruan, Dan

    2015-06-01

    Fast-growing access to large databases and cloud-stored data presents a unique opportunity for multi-atlas based image segmentation, but also presents challenges in heterogeneous atlas quality and computation burden. This work aims to develop a novel two-stage method tailored to the special needs of a large atlas collection with varied quality, so that high-accuracy segmentation can be achieved with low computational cost. An atlas subset selection scheme is proposed to substitute a significant portion of the computationally expensive full-fledged registration in the conventional scheme with a low-cost alternative. More specifically, the authors introduce a two-stage atlas subset selection method. In the first stage, an augmented subset is obtained based on a low-cost registration configuration and a preliminary relevance metric; in the second stage, the subset is further narrowed down to a fusion set of desired size, based on full-fledged registration and a refined relevance metric. An inference model is developed to characterize the relationship between the preliminary and refined relevance metrics, and a proper augmented subset size is derived to ensure that the desired atlases survive the preliminary selection with high probability. The performance of the proposed scheme has been assessed with cross-validation based on two clinical datasets consisting of manually segmented prostate and brain magnetic resonance images, respectively. The proposed scheme demonstrates comparable end-to-end segmentation performance to the conventional single-stage selection method, but with significant computation reduction. Compared with the alternative computation reduction method, their scheme improves the mean and median Dice similarity coefficient values from (0.74, 0.78) to (0.83, 0.85) and from (0.82, 0.84) to (0.95, 0.95) for prostate and corpus callosum segmentation, respectively, with statistical significance. The authors have developed a novel two-stage atlas subset selection scheme for multi-atlas based segmentation. It achieves good segmentation accuracy with significantly reduced computation cost, making it a suitable configuration in the presence of extensive heterogeneous atlases.
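
    (Aside: structurally, the scheme ranks all atlases with a cheap registration and a preliminary relevance metric, keeps an augmented subset, then re-ranks that subset with full registration and a refined metric. The Python sketch below captures this control flow with placeholder functions; the registration and relevance computations are stand-ins, not the authors' implementations.)

      # Structural sketch of two-stage atlas subset selection; registration
      # and relevance functions are placeholders, not the actual method.
      import heapq

      def two_stage_select(target, atlases, augmented_size, fusion_size,
                           cheap_register, cheap_relevance,
                           full_register, refined_relevance):
          # Stage 1: rank all atlases with low-cost registration and a
          # preliminary relevance metric; keep an augmented subset.
          prelim = [(cheap_relevance(cheap_register(a, target), target), a)
                    for a in atlases]
          augmented = [a for _, a in heapq.nlargest(augmented_size, prelim,
                                                    key=lambda t: t[0])]

          # Stage 2: full-fledged registration only on the augmented subset,
          # then narrow down to the fusion set with the refined metric.
          refined = [(refined_relevance(full_register(a, target), target), a)
                     for a in augmented]
          return [a for _, a in heapq.nlargest(fusion_size, refined,
                                               key=lambda t: t[0])]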

  13. Hash Bit Selection for Nearest Neighbor Search.

    PubMed

    Xianglong Liu; Junfeng He; Shih-Fu Chang

    2017-11-01

    To overcome the barrier of storage and computation when dealing with gigantic-scale data sets, compact hashing has been studied extensively to approximate the nearest neighbor search. Despite the recent advances, critical design issues remain open in how to select the right features, hashing algorithms, and/or parameter settings. In this paper, we address these by posing an optimal hash bit selection problem, in which an optimal subset of hash bits is selected from a pool of candidate bits generated by different features, algorithms, or parameters. Inspired by the optimization criteria used in existing hashing algorithms, we adopt bit reliability and bit complementarity as the selection criteria, which can be carefully tailored for hashing performance in different tasks. The bit selection solution is then discovered by finding the best tradeoff between search accuracy and time using a modified dynamic programming method. To further reduce the computational complexity, we employ the pairwise relationships among hash bits to approximate the high-order independence property, and formulate the problem as an efficient quadratic program that is theoretically equivalent to the normalized dominant set problem in a vertex- and edge-weighted graph. Extensive large-scale experiments have been conducted under several important application scenarios of hash techniques, where our bit selection framework achieves superior performance over both naive selection methods and state-of-the-art hashing algorithms, with relative accuracy gains ranging from 10% to 50%.
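
    (Aside: a simplified greedy variant of the idea scores each candidate bit by a reliability proxy and penalizes pairwise redundancy with already-chosen bits. The Python sketch below uses bit balance and absolute correlation as stand-ins; the paper's dynamic-programming and quadratic-programming/dominant-set formulations are not reproduced.)

      # Greedy sketch of hash bit selection balancing per-bit reliability
      # and pairwise complementarity; the scores are simple stand-ins for
      # the paper's quadratic-programming / dominant-set formulation.
      import numpy as np

      rng = np.random.default_rng(0)
      B = rng.integers(0, 2, size=(10000, 256))        # candidate bits over a dataset

      balance = 1.0 - np.abs(B.mean(axis=0) - 0.5) * 2     # reliability proxy
      corr = np.abs(np.corrcoef(B.T))                      # pairwise redundancy

      selected = [int(np.argmax(balance))]
      while len(selected) < 64:
          redundancy = corr[:, selected].max(axis=1)       # worst overlap so far
          score = balance - redundancy
          score[selected] = -np.inf
          selected.append(int(np.argmax(score)))

      print("chosen bits:", selected[:10], "...")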

  14. PyCoTools: A Python Toolbox for COPASI.

    PubMed

    Welsh, Ciaran M; Fullard, Nicola; Proctor, Carole J; Martinez-Guimera, Alvaro; Isfort, Robert J; Bascom, Charles C; Tasseff, Ryan; Przyborski, Stefan A; Shanley, Daryl P

    2018-05-22

    COPASI is an open source software package for constructing, simulating and analysing dynamic models of biochemical networks. COPASI is primarily intended to be used with a graphical user interface, but it is often desirable to access COPASI features programmatically through a high-level interface. PyCoTools is a Python package aimed at providing a high-level interface to COPASI tasks, with an emphasis on model calibration. PyCoTools enables the construction of COPASI models and the execution of a subset of COPASI tasks, including time courses, parameter scans and parameter estimations. Additional 'composite' tasks, which use COPASI tasks as building blocks, are available for increasing parameter estimation throughput, performing identifiability analysis and performing model selection. PyCoTools supports exploratory data analysis on parameter estimation data to assist with troubleshooting model calibrations. We demonstrate PyCoTools by posing a model selection problem designed to showcase PyCoTools within a realistic scenario. The aim of the model selection problem is to test the feasibility of three alternative hypotheses in explaining experimental data derived from neonatal dermal fibroblasts in response to TGF-β over time. PyCoTools is used to critically analyse the parameter estimations and propose strategies for model improvement. PyCoTools can be downloaded from the Python Package Index (PyPI) using the command 'pip install pycotools' or directly from GitHub (https://github.com/CiaranWelsh/pycotools). Documentation is available at http://pycotools.readthedocs.io. Supplementary data are available at Bioinformatics.

  15. A Systematic Approach for Model-Based Aircraft Engine Performance Estimation

    NASA Technical Reports Server (NTRS)

    Simon, Donald L.; Garg, Sanjay

    2010-01-01

    A requirement for effective aircraft engine performance estimation is the ability to account for engine degradation, generally described in terms of unmeasurable health parameters such as efficiencies and flow capacities related to each major engine module. This paper presents a linear point design methodology for minimizing the degradation-induced error in model-based aircraft engine performance estimation applications. The technique specifically focuses on the underdetermined estimation problem, where there are more unknown health parameters than available sensor measurements. A condition for Kalman filter-based estimation is that the number of health parameters estimated cannot exceed the number of sensed measurements. In this paper, the estimated health parameter vector will be replaced by a reduced order tuner vector whose dimension is equivalent to the sensed measurement vector. The reduced order tuner vector is systematically selected to minimize the theoretical mean squared estimation error of a maximum a posteriori estimator formulation. This paper derives theoretical estimation errors at steady-state operating conditions, and presents the tuner selection routine applied to minimize these values. Results from the application of the technique to an aircraft engine simulation are presented and compared to the estimation accuracy achieved through conventional maximum a posteriori and Kalman filter estimation approaches. Maximum a posteriori estimation results demonstrate that reduced order tuning parameter vectors can be found that approximate the accuracy of estimating all health parameters directly. Kalman filter estimation results based on the same reduced order tuning parameter vectors demonstrate that significantly improved estimation accuracy can be achieved over the conventional approach of selecting a subset of health parameters to serve as the tuner vector. However, additional development is necessary to fully extend the methodology to Kalman filter-based estimation applications.
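
    For intuition, a toy Monte Carlo comparison of health-parameter subsets in an underdetermined linear model is sketched below. This illustrates only the conventional subset approach that the paper improves upon; the sensitivity matrix, noise levels, and subset size are made-up assumptions:

      import itertools
      import numpy as np

      rng = np.random.default_rng(2)
      m, n = 6, 3                     # 6 health parameters, 3 sensors (underdetermined)
      H = rng.normal(size=(n, m))     # toy sensitivity matrix (assumption)
      sigma_h, sigma_v = 1.0, 0.1     # prior and measurement noise scales

      def subset_mse(S, trials=2000):
          # MSE over all m parameters when only subset S is estimated by
          # least squares and the remaining parameters are fixed at zero.
          HS = H[:, S]
          err = 0.0
          for _ in range(trials):
              h = rng.normal(scale=sigma_h, size=m)
              y = H @ h + rng.normal(scale=sigma_v, size=n)
              h_hat = np.zeros(m)
              h_hat[list(S)] = np.linalg.lstsq(HS, y, rcond=None)[0]
              err += np.sum((h - h_hat) ** 2)
          return err / trials

      best = min(itertools.combinations(range(m), n), key=subset_mse)
      print("best 3-parameter subset:", best, "MSE:", subset_mse(best))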

  16. Efficient least angle regression for identification of linear-in-the-parameters models

    PubMed Central

    Beach, Thomas H.; Rezgui, Yacine

    2017-01-01

    Least angle regression, as a promising model selection method, differentiates itself from conventional stepwise and stagewise methods, in that it is neither too greedy nor too slow. It is closely related to L1 norm optimization, which has the advantage of low prediction variance, sacrificing some model bias in order to enhance model generalization capability. In this paper, we propose an efficient least angle regression algorithm for model selection for a large class of linear-in-the-parameters models with the purpose of accelerating the model selection process. The entire algorithm works completely in a recursive manner, where the correlations between model terms and residuals, the evolving directions and other pertinent variables are derived explicitly and updated successively at every subset selection step. The model coefficients are only computed when the algorithm finishes. Direct matrix inversions are thereby avoided. A detailed computational complexity analysis indicates that the proposed algorithm possesses significant computational efficiency, compared with the original approach in which the well-known efficient Cholesky decomposition is used to solve least angle regression. Three artificial and real-world examples are employed to demonstrate the effectiveness, efficiency and numerical stability of the proposed algorithm. PMID:28293140

  17. Adaptive feature selection using v-shaped binary particle swarm optimization.

    PubMed

    Teng, Xuyang; Dong, Hongbin; Zhou, Xiurong

    2017-01-01

    Feature selection is an important preprocessing method in machine learning and data mining. This process can be used not only to reduce the amount of data to be analyzed but also to build models with stronger interpretability based on fewer features. Traditional feature selection methods evaluate the dependency and redundancy of features separately, which leads to a lack of measurement of their combined effect. Moreover, a greedy search considers only the optimization of the current round and thus cannot perform a global search. To evaluate the combined effect of different subsets in the entire feature space, an adaptive feature selection method based on V-shaped binary particle swarm optimization is proposed. In this method, the fitness function is constructed using the correlation information entropy. Feature subsets are regarded as individuals in a population, and the feature space is searched using V-shaped binary particle swarm optimization. The above procedure overcomes the hard constraint on the number of features, enables the combined evaluation of each subset as a whole, and improves the search ability of conventional binary particle swarm optimization. The proposed algorithm is an adaptive method with respect to the number of feature subsets. The experimental results show the advantages of optimizing the feature subsets using the V-shaped transfer function and confirm the effectiveness and efficiency of the feature subsets obtained under different classifiers.
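
    The core of the method's binary search dynamics is the V-shaped transfer function, which converts a velocity magnitude into a bit-flip probability. A minimal sketch of one such update follows, assuming the common |tanh(v)| transfer function and standard PSO coefficients (the authors' exact variant may differ):

      import numpy as np

      rng = np.random.default_rng(3)

      def v_transfer(v):
          # A common V-shaped transfer function, mapping velocity to [0, 1).
          return np.abs(np.tanh(v))

      def bpso_step(position, velocity, personal_best, global_best,
                    w=0.7, c1=1.5, c2=1.5):
          # Standard PSO velocity update on a 0/1 feature-subset vector.
          r1, r2 = rng.random(position.shape), rng.random(position.shape)
          velocity = (w * velocity
                      + c1 * r1 * (personal_best - position)
                      + c2 * r2 * (global_best - position))
          # V-shaped rule: flip each bit with probability V(velocity).
          flip = rng.random(position.shape) < v_transfer(velocity)
          return np.where(flip, 1 - position, position), velocity

      position = rng.integers(0, 2, 20)        # 20 candidate features
      velocity = np.zeros(20)
      global_best = 1 - position               # toy incumbent subset
      position, velocity = bpso_step(position, velocity, position, global_best)
      print(position)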

  18. Adaptive feature selection using v-shaped binary particle swarm optimization

    PubMed Central

    Dong, Hongbin; Zhou, Xiurong

    2017-01-01

    Feature selection is an important preprocessing method in machine learning and data mining. This process can be used not only to reduce the amount of data to be analyzed but also to build models with stronger interpretability based on fewer features. Traditional feature selection methods evaluate the dependency and redundancy of features separately, which leads to a lack of measurement of their combined effect. Moreover, a greedy search considers only the optimization of the current round and thus cannot perform a global search. To evaluate the combined effect of different subsets in the entire feature space, an adaptive feature selection method based on V-shaped binary particle swarm optimization is proposed. In this method, the fitness function is constructed using the correlation information entropy. Feature subsets are regarded as individuals in a population, and the feature space is searched using V-shaped binary particle swarm optimization. The above procedure overcomes the hard constraint on the number of features, enables the combined evaluation of each subset as a whole, and improves the search ability of conventional binary particle swarm optimization. The proposed algorithm is an adaptive method with respect to the number of feature subsets. The experimental results show the advantages of optimizing the feature subsets using the V-shaped transfer function and confirm the effectiveness and efficiency of the feature subsets obtained under different classifiers. PMID:28358850

  19. Fizzy: feature subset selection for metagenomics.

    PubMed

    Ditzler, Gregory; Morrison, J Calvin; Lan, Yemin; Rosen, Gail L

    2015-11-04

    Some of the current software tools for comparative metagenomics provide ecologists with the ability to investigate and explore bacterial communities using α- & β-diversity. Feature subset selection--a sub-field of machine learning--can also provide a unique insight into the differences between metagenomic or 16S phenotypes. In particular, feature subset selection methods can obtain the operational taxonomic units (OTUs), or functional features, that have a high-level of influence on the condition being studied. For example, in a previous study we have used information-theoretic feature selection to understand the differences between protein family abundances that best discriminate between age groups in the human gut microbiome. We have developed a new Python command line tool, which is compatible with the widely adopted BIOM format, for microbial ecologists that implements information-theoretic subset selection methods for biological data formats. We demonstrate the software tool's capabilities on publicly available datasets. We have made the software implementation of Fizzy available to the public under the GNU GPL license. The standalone implementation can be found at http://github.com/EESI/Fizzy.

  20. Fizzy. Feature subset selection for metagenomics

    DOE PAGES

    Ditzler, Gregory; Morrison, J. Calvin; Lan, Yemin; ...

    2015-11-04

    Background: Some of the current software tools for comparative metagenomics provide ecologists with the ability to investigate and explore bacterial communities using α– & β–diversity. Feature subset selection – a sub-field of machine learning – can also provide a unique insight into the differences between metagenomic or 16S phenotypes. In particular, feature subset selection methods can obtain the operational taxonomic units (OTUs), or functional features, that have a high-level of influence on the condition being studied. For example, in a previous study we have used information-theoretic feature selection to understand the differences between protein family abundances that best discriminate between age groups in the human gut microbiome. Results: We have developed a new Python command line tool, which is compatible with the widely adopted BIOM format, for microbial ecologists that implements information-theoretic subset selection methods for biological data formats. We demonstrate the software tool's capabilities on publicly available datasets. Conclusions: We have made the software implementation of Fizzy available to the public under the GNU GPL license. The standalone implementation can be found at http://github.com/EESI/Fizzy.

  1. Anatomical constraints on attention: Hemifield independence is a signature of multifocal spatial selection

    PubMed Central

    Alvarez, George A; Gill, Jonathan; Cavanagh, Patrick

    2012-01-01

    Previous studies have shown independent attentional selection of targets in the left and right visual hemifields during attentional tracking (Alvarez & Cavanagh, 2005) but not during a visual search (Luck, Hillyard, Mangun, & Gazzaniga, 1989). Here we tested whether multifocal spatial attention is the critical process that operates independently in the two hemifields. It is explicitly required in tracking (attend to a subset of object locations, suppress the others) but not in the standard visual search task (where all items are potential targets). We used a modified visual search task in which observers searched for a target within a subset of display items, where the subset was selected based on location (Experiments 1 and 3A) or based on a salient feature difference (Experiments 2 and 3B). The results show hemifield independence in this subset visual search task with location-based selection but not with feature-based selection; this effect cannot be explained by general difficulty (Experiment 4). Combined, these findings suggest that hemifield independence is a signature of multifocal spatial attention and highlight the need for cognitive and neural theories of attention to account for anatomical constraints on selection mechanisms. PMID:22637710

  2. Hubble Parameter and Baryon Acoustic Oscillation Measurement Constraints on the Hubble Constant, the Deviation from the Spatially Flat ΛCDM Model, the Deceleration–Acceleration Transition Redshift, and Spatial Curvature

    NASA Astrophysics Data System (ADS)

    Yu, Hai; Ratra, Bharat; Wang, Fa-Yin

    2018-03-01

    We compile a complete collection of reliable Hubble parameter H(z) data to redshift z ≤ 2.36 and use them with the Gaussian Process method to determine continuous H(z) functions for various data subsets. From these continuous H(z)'s, summarizing across the data subsets considered, we find H0 ∼ 67 ± 4 km s⁻¹ Mpc⁻¹, more consistent with the recent lower values determined using a variety of techniques. In most data subsets, we see a cosmological deceleration–acceleration transition at 2σ significance, with the data subsets' transition redshifts varying over 0.33 < z_da < 1.0 at 1σ significance. We find that the flat-ΛCDM model is consistent with the H(z) data to a z of 1.5 to 2.0, depending on the data subset considered, with 2σ deviations from flat-ΛCDM above this redshift range. Using the continuous H(z) with baryon acoustic oscillation distance-redshift observations, we constrain the current spatial curvature density parameter to be Ω_K0 = −0.03 ± 0.21, consistent with a flat universe, but the large error bar does not rule out small values of spatial curvature that are now under debate.
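
    The Gaussian Process reconstruction step can be sketched with scikit-learn. The (z, H, σ) values below are toy stand-ins for a real H(z) compilation, and the kernel choice is an assumption:

      import numpy as np
      from sklearn.gaussian_process import GaussianProcessRegressor
      from sklearn.gaussian_process.kernels import RBF, ConstantKernel

      # Illustrative (z, H, sigma_H) triples, not a real compilation.
      z = np.array([0.07, 0.4, 0.9, 1.3, 1.75, 2.34])
      H = np.array([69.0, 82.0, 69.0, 168.0, 202.0, 222.0])   # km/s/Mpc (toy)
      sigma = np.array([19.6, 9.0, 12.0, 17.0, 40.0, 7.0])

      # Heteroscedastic noise enters through alpha (per-point variances).
      kernel = ConstantKernel(100.0) * RBF(length_scale=1.0)
      gp = GaussianProcessRegressor(kernel=kernel, alpha=sigma**2,
                                    normalize_y=True)
      gp.fit(z.reshape(-1, 1), H)

      # Continuous H(z) with uncertainty; H0 is the extrapolation to z = 0.
      z_grid = np.linspace(0.0, 2.4, 100).reshape(-1, 1)
      H_mean, H_std = gp.predict(z_grid, return_std=True)
      print(f"H0 = {H_mean[0]:.1f} +/- {H_std[0]:.1f} km/s/Mpc (toy data)")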

  3. Unbiased feature selection in learning random forests for high-dimensional data.

    PubMed

    Nguyen, Thanh-Tung; Huang, Joshua Zhexue; Nguyen, Thuy Thi

    2015-01-01

    Random forests (RFs) have been widely used as a powerful classification method. However, with the randomization in both bagging samples and feature selection, the trees in the forest tend to select uninformative features for node splitting. As a result, RFs have poor accuracy when working with high-dimensional data. Moreover, RFs exhibit bias in the feature selection process, favoring multivalued features. Aiming at debiasing feature selection in RFs, we propose a new RF algorithm, called xRF, to select good features in learning RFs for high-dimensional data. We first remove the uninformative features using p-value assessment, and the subset of unbiased features is then selected based on some statistical measures. This feature subset is then partitioned into two subsets. A feature weighting sampling technique is used to sample features from these two subsets for building trees. This approach enables one to generate more accurate trees, while allowing one to reduce dimensionality and the amount of data needed for learning RFs. An extensive set of experiments has been conducted on 47 high-dimensional real-world datasets including image datasets. The experimental results have shown that RFs with the proposed approach outperformed the existing random forests in increasing the accuracy and the AUC measures.
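
    A simplified sketch of the screen-then-sample idea follows, using a two-sample t-test as a stand-in for the paper's p-value assessment and a single weighted feature draw per tree; the weighting scheme and thresholds are assumptions:

      import numpy as np
      from scipy.stats import ttest_ind

      rng = np.random.default_rng(9)
      X = rng.normal(size=(100, 500))
      y = rng.integers(0, 2, 100)
      X[y == 1, :10] += 1.0                 # 10 informative features (toy)

      # Step 1: drop uninformative features via p-value assessment
      # (two-sample t-test here; the paper's exact test is not reproduced).
      pvals = np.array([ttest_ind(X[y == 0, j], X[y == 1, j]).pvalue
                        for j in range(X.shape[1])])
      informative = np.where(pvals < 0.05)[0]

      # Step 2: feature-weighted sampling for one tree -- more significant
      # features are proportionally more likely to be drawn.
      weights = 1.0 - pvals[informative]
      weights /= weights.sum()
      tree_features = rng.choice(informative, size=min(20, len(informative)),
                                 replace=False, p=weights)
      print("features for this tree:", np.sort(tree_features))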

  4. Designing basin-customized combined drought indices via feature extraction

    NASA Astrophysics Data System (ADS)

    Zaniolo, Marta; Giuliani, Matteo; Castelletti, Andrea

    2017-04-01

    The socio-economic costs of drought are progressively increasing worldwide due to the ongoing alteration of hydro-meteorological regimes induced by climate change. Although drought management is largely studied in the literature, most of the traditional drought indices fail to detect critical events in highly regulated systems, which generally rely on ad-hoc formulations that cannot be generalized to different contexts. In this study, we contribute a novel framework for the design of a basin-customized drought index. This index represents a surrogate of the state of the basin and is computed by combining the available information about the water available in the system to reproduce a representative target variable for the drought condition of the basin (e.g., water deficit). To select the relevant variables and how to combine them, we use an advanced feature extraction algorithm called Wrapper for Quasi Equally Informative Subset Selection (W-QEISS). The W-QEISS algorithm relies on a multi-objective evolutionary algorithm to find Pareto-efficient subsets of variables by maximizing the wrapper accuracy, minimizing the number of selected variables (cardinality) and optimizing relevance and redundancy of the subset. The accuracy objective is evaluated through the calibration of a pre-defined model (i.e., an extreme learning machine) of the water deficit for each candidate subset of variables, with the index selected from the resulting solutions identifying a suitable compromise between accuracy, cardinality, relevance, and redundancy. The proposed methodology is tested in the case study of Lake Como in northern Italy, a regulated lake mainly operated for irrigation supply to four downstream agricultural districts. In the absence of an institutional drought monitoring system, we constructed the combined index using all the hydrological variables from the existing monitoring system as well as the most common drought indicators at multiple time aggregations. The soil moisture deficit in the root zone computed by a distributed-parameter water balance model of the agricultural districts is used as the target variable. Numerical results show that our framework succeeds in constructing a combined drought index that reproduces the soil moisture deficit. Moreover, this index provides valuable information for supporting appropriate drought management strategies, including the possibility of directly informing the lake operations about drought conditions and improving the overall reliability of the irrigation supply system.

  5. Effects of Sample Selection on Estimates of Economic Impacts of Outdoor Recreation

    Treesearch

    Donald B.K. English

    1997-01-01

    Estimates of the economic impacts of recreation often come from spending data provided by a self-selected subset of a random sample of site visitors. The subset is frequently less than half the onsite sample. Biased vectors of per trip spending and impact estimates can result if self-selection is related to spending patterns, and proper corrective procedures are not...

  6. Visual analytics in cheminformatics: user-supervised descriptor selection for QSAR methods.

    PubMed

    Martínez, María Jimena; Ponzoni, Ignacio; Díaz, Mónica F; Vazquez, Gustavo E; Soto, Axel J

    2015-01-01

    The design of QSAR/QSPR models is a challenging problem, where the selection of the most relevant descriptors constitutes a key step of the process. Several feature selection methods that address this step concentrate on statistical associations among descriptors and target properties, whereas chemical knowledge is left out of the analysis. For this reason, the interpretability and generality of the QSAR/QSPR models obtained by these feature selection methods are drastically affected. Therefore, an approach that integrates domain experts' knowledge into the selection process is needed to increase confidence in the final set of descriptors. In this paper we propose a software tool, named Visual and Interactive DEscriptor ANalysis (VIDEAN), that combines statistical methods with interactive visualizations for choosing a set of descriptors for predicting a target property. Domain expertise can be added to the feature selection process by means of an interactive visual exploration of data, aided by statistical tools and metrics based on information theory. Coordinated visual representations are presented for capturing different relationships and interactions among descriptors, target properties and candidate subsets of descriptors. The capabilities of the proposed software were assessed through different scenarios. These scenarios reveal how an expert can use this tool to choose one subset of descriptors from a group of candidate subsets, or how to modify existing descriptor subsets and even incorporate new descriptors according to his or her own knowledge of the target property. The reported experiences showed the suitability of our software for selecting sets of descriptors with low cardinality, high interpretability, low redundancy and high statistical performance in a visual exploratory way. Therefore, it is possible to conclude that the resulting tool allows the integration of a chemist's expertise in the descriptor selection process with low cognitive effort, in contrast with the alternative of an ad-hoc manual analysis of the selected descriptors. Graphical abstract: VIDEAN allows the visual analysis of candidate subsets of descriptors for QSAR/QSPR. In the two panels on the top, users can interactively explore numerical correlations as well as co-occurrences in the candidate subsets through two interactive graphs.

  7. Closed-form solutions for linear regulator-design of mechanical systems including optimal weighting matrix selection

    NASA Technical Reports Server (NTRS)

    Hanks, Brantley R.; Skelton, Robert E.

    1991-01-01

    This paper addresses the restriction of Linear Quadratic Regulator (LQR) solutions to the algebraic Riccati Equation to design spaces which can be implemented as passive structural members and/or dampers. A general closed-form solution to the optimal free-decay control problem is presented which is tailored for structural-mechanical systems. The solution includes, as subsets, special cases such as the Rayleigh Dissipation Function and total energy. Weighting matrix selection is a constrained choice among several parameters to obtain desired physical relationships. The closed-form solution is also applicable to active control design for systems where perfect, collocated actuator-sensor pairs exist. Some examples of simple spring-mass systems are shown to illustrate key points.
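
    The algebraic Riccati equation at the heart of LQR design can be solved directly with SciPy. The sketch below uses an illustrative single mass-spring-damper system; it demonstrates the standard LQR computation, not the paper's closed-form passive-design solution:

      import numpy as np
      from scipy.linalg import solve_continuous_are

      # Single mass-spring-damper: x = [position, velocity], force input.
      m_, k_, c_ = 1.0, 4.0, 0.2
      A = np.array([[0.0, 1.0],
                    [-k_ / m_, -c_ / m_]])
      B = np.array([[0.0],
                    [1.0 / m_]])

      # Quadratic weights; the paper's point is that Q, R must be chosen so
      # the resulting gains admit a passive (spring/damper-like) realization.
      Q = np.diag([10.0, 1.0])
      R = np.array([[1.0]])

      P = solve_continuous_are(A, B, Q, R)   # algebraic Riccati equation
      K = np.linalg.solve(R, B.T @ P)        # optimal state-feedback gain
      print("LQR gain K =", K)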

  8. Selecting climate simulations for impact studies based on multivariate patterns of climate change.

    PubMed

    Mendlik, Thomas; Gobiet, Andreas

    In climate change impact research it is crucial to carefully select the meteorological input for impact models. We present a method for model selection that enables the user to shrink the ensemble to a few representative members, conserving the model spread and accounting for model similarity. This is done in three steps: First, using principal component analysis for a multitude of meteorological parameters, to find common patterns of climate change within the multi-model ensemble. Second, detecting model similarities with regard to these multivariate patterns using cluster analysis. And third, sampling models from each cluster, to generate a subset of representative simulations. We present an application based on the ENSEMBLES regional multi-model ensemble with the aim to provide input for a variety of climate impact studies. We find that the two most dominant patterns of climate change relate to temperature and humidity patterns. The ensemble can be reduced from 25 to 5 simulations while still maintaining its essential characteristics. Having such a representative subset of simulations reduces computational costs for climate impact modeling and enhances the quality of the ensemble at the same time, as it prevents double-counting of dependent simulations that would lead to biased statistics. The online version of this article (doi:10.1007/s10584-015-1582-0) contains supplementary material, which is available to authorized users.
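
    The three steps map naturally onto standard tools. A sketch with synthetic ensemble data follows; the change-signal matrix, component count, and cluster count are assumptions, not the authors' settings:

      import numpy as np
      from sklearn.decomposition import PCA
      from sklearn.cluster import KMeans

      rng = np.random.default_rng(4)
      # Rows: 25 climate simulations; columns: multivariate climate-change
      # signals (e.g. seasonal temperature/humidity deltas) -- synthetic here.
      X = rng.normal(size=(25, 12))

      # Step 1: common patterns of change via principal component analysis.
      scores = PCA(n_components=2).fit_transform(X)

      # Step 2: group similar simulations with cluster analysis.
      labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(scores)

      # Step 3: pick one representative per cluster (closest to cluster mean).
      subset = []
      for c in range(5):
          members = np.where(labels == c)[0]
          centroid = scores[members].mean(axis=0)
          dists = np.linalg.norm(scores[members] - centroid, axis=1)
          subset.append(members[np.argmin(dists)])
      print("representative simulations:", sorted(subset))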

  9. Retinal ganglion cells with distinct directional preferences differ in molecular identity, structure, and central projections.

    PubMed

    Kay, Jeremy N; De la Huerta, Irina; Kim, In-Jung; Zhang, Yifeng; Yamagata, Masahito; Chu, Monica W; Meister, Markus; Sanes, Joshua R

    2011-05-25

    The retina contains ganglion cells (RGCs) that respond selectively to objects moving in particular directions. Individual members of a group of ON-OFF direction-selective RGCs (ooDSGCs) detect stimuli moving in one of four directions: ventral, dorsal, nasal, or temporal. Despite this physiological diversity, little is known about subtype-specific differences in structure, molecular identity, and projections. To seek such differences, we characterized mouse transgenic lines that selectively mark ooDSGCs preferring ventral or nasal motion as well as a line that marks both ventral- and dorsal-preferring subsets. We then used the lines to identify cell surface molecules, including Cadherin 6, CollagenXXVα1, and Matrix metalloprotease 17, that are selectively expressed by distinct subsets of ooDSGCs. We also identify a neuropeptide, CART (cocaine- and amphetamine-regulated transcript), that distinguishes all ooDSGCs from other RGCs. Together, this panel of endogenous and transgenic markers distinguishes the four ooDSGC subsets. Patterns of molecular diversification occur before eye opening and are therefore experience independent. They may help to explain how the four subsets obtain distinct inputs. We also demonstrate differences among subsets in their dendritic patterns within the retina and their axonal projections to the brain. Differences in projections indicate that information about motion in different directions is sent to different destinations.

  10. Optimization of image reconstruction method for SPECT studies performed using [⁹⁹mTc-EDDA/HYNIC] octreotate in patients with neuroendocrine tumors.

    PubMed

    Sowa-Staszczak, Anna; Lenda-Tracz, Wioletta; Tomaszuk, Monika; Głowa, Bogusław; Hubalewska-Dydejczyk, Alicja

    2013-01-01

    Somatostatin receptor scintigraphy (SRS) is a useful tool in the assessment of GEP-NET (gastroenteropancreatic neuroendocrine tumor) patients. The choice of appropriate settings of image reconstruction parameters is crucial in interpretation of these images. The aim of the study was to investigate how the GEP-NET lesion signal-to-noise ratio (TCS/TCB) depends on different reconstruction settings for Flash 3D software (Siemens). SRS results of 76 randomly selected patients with confirmed GEP-NET were analyzed. For SPECT studies the data were acquired using standard clinical settings 3-4 h after the injection of 740 MBq 99mTc-[EDDA/HYNIC] octreotate. To obtain final images, the OSEM 3D Flash reconstruction with different settings and FBP reconstruction were used. First, the TCS/TCB ratio in voxels was analyzed for different combinations of the number of subsets and the number of iterations of the OSEM 3D Flash reconstruction. Secondly, the same ratio was analyzed for different parameters of the Gaussian filter (with FWHM = 2-4 times the pixel size). The influence of scatter correction on the TCS/TCB ratio was also investigated. With an increasing number of subsets and iterations, an increase of the TCS/TCB ratio was observed. With increasing Gaussian filter FWHM, a decrease of the TCS/TCB ratio was observed. The use of scatter correction slightly decreases the values of this ratio. The OSEM algorithm provides a meaningfully better reconstruction of the SRS SPECT study as compared to the FBP technique. A high number of subsets improves image quality (images are smoother). An increasing number of iterations gives better contrast, and the shapes of lesions and organs are sharper. The choice of reconstruction parameters is a compromise between the qualitative appearance of the image and its quantitative accuracy and should not be modified when comparing multiple studies of the same patient.

  11. Two-stage atlas subset selection in multi-atlas based image segmentation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhao, Tingting, E-mail: tingtingzhao@mednet.ucla.edu; Ruan, Dan, E-mail: druan@mednet.ucla.edu

    2015-06-15

    Purpose: Fast growing access to large databases and cloud stored data presents a unique opportunity for multi-atlas based image segmentation and also presents challenges in heterogeneous atlas quality and computation burden. This work aims to develop a novel two-stage method tailored to the special needs in the face of a large atlas collection with varied quality, so that high-accuracy segmentation can be achieved with low computational cost. Methods: An atlas subset selection scheme is proposed to substitute a significant portion of the computationally expensive full-fledged registration in the conventional scheme with a low-cost alternative. More specifically, the authors introduce a two-stage atlas subset selection method. In the first stage, an augmented subset is obtained based on a low-cost registration configuration and a preliminary relevance metric; in the second stage, the subset is further narrowed down to a fusion set of desired size, based on full-fledged registration and a refined relevance metric. An inference model is developed to characterize the relationship between the preliminary and refined relevance metrics, and a proper augmented subset size is derived to ensure that the desired atlases survive the preliminary selection with high probability. Results: The performance of the proposed scheme has been assessed with cross validation based on two clinical datasets consisting of manually segmented prostate and brain magnetic resonance images, respectively. The proposed scheme demonstrates end-to-end segmentation performance comparable to that of the conventional single-stage selection method, but with significant computation reduction. Compared with the alternative computation reduction method, their scheme improves the mean and median Dice similarity coefficient values from (0.74, 0.78) to (0.83, 0.85) and from (0.82, 0.84) to (0.95, 0.95) for prostate and corpus callosum segmentation, respectively, with statistical significance. Conclusions: The authors have developed a novel two-stage atlas subset selection scheme for multi-atlas based segmentation. It achieves good segmentation accuracy with significantly reduced computation cost, making it a suitable configuration in the presence of extensive heterogeneous atlases.

  12. Minimally buffered data transfers between nodes in a data communications network

    DOEpatents

    Miller, Douglas R.

    2015-06-23

    Methods, apparatus, and products for minimally buffered data transfers between nodes in a data communications network are disclosed that include: receiving, by a messaging module on an origin node, a storage identifier, an origin data type, and a target data type, the storage identifier specifying application storage containing data, the origin data type describing a data subset contained in the origin application storage, the target data type describing an arrangement of the data subset in application storage on a target node; creating, by the messaging module, origin metadata describing the origin data type; selecting, by the messaging module from the origin application storage in dependence upon the origin metadata and the storage identifier, the data subset; and transmitting, by the messaging module to the target node, the selected data subset for storing in the target application storage in dependence upon the target data type without temporarily buffering the data subset.

  13. A Cancer Gene Selection Algorithm Based on the K-S Test and CFS.

    PubMed

    Su, Qiang; Wang, Yina; Jiang, Xiaobing; Chen, Fuxue; Lu, Wen-Cong

    2017-01-01

    To address the challenging problem of selecting distinguished genes from cancer gene expression datasets, this paper presents a gene subset selection algorithm based on the Kolmogorov-Smirnov (K-S) test and correlation-based feature selection (CFS) principles. The algorithm first selects distinguished genes using the K-S test, and then uses CFS to select genes from those selected by the K-S test. We adopted support vector machines (SVM) as the classification tool and used the criterion of accuracy to evaluate the performance of the classifiers on the selected gene subsets. We compared the proposed gene subset selection algorithm with the K-S test, CFS, minimum-redundancy maximum-relevance (mRMR), and ReliefF algorithms. The average experimental results of the aforementioned gene selection algorithms on 5 gene expression datasets demonstrate that, based on accuracy, the performance of the new K-S and CFS-based algorithm is better than that of the K-S test, CFS, mRMR, and ReliefF algorithms. The experimental results show that the K-S test-CFS gene selection algorithm is a very effective and promising approach compared to the K-S test, CFS, mRMR, and ReliefF algorithms.
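
    A hedged sketch of the two phases is given below, with a greedy correlation filter standing in for full CFS; the thresholds and synthetic data are assumptions:

      import numpy as np
      from scipy.stats import ks_2samp

      def ks_cfs_select(X, y, alpha=0.01, corr_thresh=0.8):
          # Phase 1: keep genes whose expression distributions differ
          # significantly between the two classes (K-S test).
          keep = [j for j in range(X.shape[1])
                  if ks_2samp(X[y == 0, j], X[y == 1, j]).pvalue < alpha]
          # Phase 2: greedily drop genes highly correlated with an
          # already-selected gene (redundancy removal, CFS-like in spirit).
          selected = []
          for j in keep:
              if all(abs(np.corrcoef(X[:, j], X[:, s])[0, 1]) < corr_thresh
                     for s in selected):
                  selected.append(j)
          return selected

      rng = np.random.default_rng(5)
      y = rng.integers(0, 2, 60)
      X = rng.normal(size=(60, 200))
      X[y == 1, :5] += 2.0                 # 5 informative "genes" (toy)
      print(ks_cfs_select(X, y))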

  14. The Cross-Entropy Based Multi-Filter Ensemble Method for Gene Selection.

    PubMed

    Sun, Yingqiang; Lu, Chengbo; Li, Xiaobo

    2018-05-17

    Gene expression profiles are characterized by high dimensionality, small sample sizes, and continuous values, and it is a great challenge to use gene expression profile data for the classification of tumor samples. This paper proposes a cross-entropy based multi-filter ensemble (CEMFE) method for microarray data classification. Firstly, multiple filters are used to select from the microarray data in order to obtain a plurality of pre-selected feature subsets with different classification abilities. The top N genes with the highest rank in each subset are integrated so as to form a new data set. Secondly, the cross-entropy algorithm is used to remove redundant data from the data set. Finally, the wrapper method, which is based on forward feature selection, is used to select the best feature subset. The experimental results show that the proposed method is more efficient than other gene selection methods and that it can achieve a higher classification accuracy with fewer characteristic genes.

  15. Analysis of Information Content in High-Spectral Resolution Sounders using Subset Selection Analysis

    NASA Technical Reports Server (NTRS)

    Velez-Reyes, Miguel; Joiner, Joanna

    1998-01-01

    In this paper, we summarize the results of the sensitivity analysis and data reduction carried out to determine the information content of AIRS and IASI channels. The analysis and data reduction were based on the use of subset selection techniques developed in the linear algebra and statistical community to study linear dependencies in high dimensional data sets. We applied the subset selection method to study dependency among channels by studying the dependency among their weighting functions. Also, we applied the technique to study the information provided by the different levels at which the atmosphere is discretized for retrievals and analysis. Results from the method correlate well with intuition in many respects and point to possible modifications for band selection in sensor design and for the number and location of levels in the analysis process.
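
    A classic linear-algebra subset selection technique of the kind referred to here is rank-revealing QR with column pivoting, sketched below on synthetic weighting functions; the channel count, level count, and rank tolerance are assumptions:

      import numpy as np
      from scipy.linalg import qr

      rng = np.random.default_rng(6)
      # Columns: weighting functions of 50 hypothetical channels sampled on
      # 40 atmospheric levels, constructed with strong linear dependencies.
      basis = rng.normal(size=(40, 8))
      W = basis @ rng.normal(size=(8, 50)) + 1e-9 * rng.normal(size=(40, 50))

      # Rank-revealing QR with column pivoting orders the channels so that
      # the leading pivots span the column space (least redundant first).
      _, R, pivots = qr(W, pivoting=True)

      tol = 1e-6 * abs(R[0, 0])              # rank tolerance (assumption)
      k = int(np.sum(np.abs(np.diag(R)) > tol))
      print("approximate rank:", k)
      print("least redundant channels:", pivots[:k])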

  16. On the applicability of surrogate-based Markov chain Monte Carlo-Bayesian inversion to the Community Land Model: Case studies at flux tower sites

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Huang, Maoyi; Ray, Jaideep; Hou, Zhangshuan

    2016-07-04

    The Community Land Model (CLM) has been widely used in climate and Earth system modeling. Accurate estimation of model parameters is needed for reliable model simulations and predictions under current and future conditions, respectively. In our previous work, a subset of hydrological parameters has been identified to have significant impact on surface energy fluxes at selected flux tower sites based on parameter screening and sensitivity analysis, which indicate that the parameters could potentially be estimated from surface flux observations at the towers. To date, such estimates do not exist. In this paper, we assess the feasibility of applying a Bayesian model calibration technique to estimate CLM parameters at selected flux tower sites under various site conditions. The parameters are estimated as a joint probability density function (PDF) that provides estimates of uncertainty of the parameters being inverted, conditional on climatologically-average latent heat fluxes derived from observations. We find that the simulated mean latent heat fluxes from CLM using the calibrated parameters are generally improved at all sites when compared to those obtained with CLM simulations using default parameter sets. Further, our calibration method also results in credibility bounds around the simulated mean fluxes which bracket the measured data. The modes (or maximum a posteriori values) and 95% credibility intervals of the site-specific posterior PDFs are tabulated as suggested parameter values for each site. Analysis of relationships between the posterior PDFs and site conditions suggests that the parameter values are likely correlated with the plant functional type, which needs to be confirmed in future studies by extending the approach to more sites.

  17. On the applicability of surrogate-based MCMC-Bayesian inversion to the Community Land Model: Case studies at Flux tower sites

    DOE PAGES

    Huang, Maoyi; Ray, Jaideep; Hou, Zhangshuan; ...

    2016-06-01

    The Community Land Model (CLM) has been widely used in climate and Earth system modeling. Accurate estimation of model parameters is needed for reliable model simulations and predictions under current and future conditions, respectively. In our previous work, a subset of hydrological parameters has been identified to have significant impact on surface energy fluxes at selected flux tower sites based on parameter screening and sensitivity analysis, which indicate that the parameters could potentially be estimated from surface flux observations at the towers. To date, such estimates do not exist. In this paper, we assess the feasibility of applying a Bayesian model calibration technique to estimate CLM parameters at selected flux tower sites under various site conditions. The parameters are estimated as a joint probability density function (PDF) that provides estimates of uncertainty of the parameters being inverted, conditional on climatologically average latent heat fluxes derived from observations. We find that the simulated mean latent heat fluxes from CLM using the calibrated parameters are generally improved at all sites when compared to those obtained with CLM simulations using default parameter sets. Further, our calibration method also results in credibility bounds around the simulated mean fluxes which bracket the measured data. The modes (or maximum a posteriori values) and 95% credibility intervals of the site-specific posterior PDFs are tabulated as suggested parameter values for each site. As a result, analysis of relationships between the posterior PDFs and site conditions suggests that the parameter values are likely correlated with the plant functional type, which needs to be confirmed in future studies by extending the approach to more sites.

  18. On the applicability of surrogate-based MCMC-Bayesian inversion to the Community Land Model: Case studies at Flux tower sites

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Huang, Maoyi; Ray, Jaideep; Hou, Zhangshuan

    The Community Land Model (CLM) has been widely used in climate and Earth system modeling. Accurate estimation of model parameters is needed for reliable model simulations and predictions under current and future conditions, respectively. In our previous work, a subset of hydrological parameters has been identified to have significant impact on surface energy fluxes at selected flux tower sites based on parameter screening and sensitivity analysis, which indicate that the parameters could potentially be estimated from surface flux observations at the towers. To date, such estimates do not exist. In this paper, we assess the feasibility of applying a Bayesian model calibration technique to estimate CLM parameters at selected flux tower sites under various site conditions. The parameters are estimated as a joint probability density function (PDF) that provides estimates of uncertainty of the parameters being inverted, conditional on climatologically average latent heat fluxes derived from observations. We find that the simulated mean latent heat fluxes from CLM using the calibrated parameters are generally improved at all sites when compared to those obtained with CLM simulations using default parameter sets. Further, our calibration method also results in credibility bounds around the simulated mean fluxes which bracket the measured data. The modes (or maximum a posteriori values) and 95% credibility intervals of the site-specific posterior PDFs are tabulated as suggested parameter values for each site. As a result, analysis of relationships between the posterior PDFs and site conditions suggests that the parameter values are likely correlated with the plant functional type, which needs to be confirmed in future studies by extending the approach to more sites.

  19. On the applicability of surrogate-based Markov chain Monte Carlo-Bayesian inversion to the Community Land Model: Case studies at flux tower sites

    NASA Astrophysics Data System (ADS)

    Huang, Maoyi; Ray, Jaideep; Hou, Zhangshuan; Ren, Huiying; Liu, Ying; Swiler, Laura

    2016-07-01

    The Community Land Model (CLM) has been widely used in climate and Earth system modeling. Accurate estimation of model parameters is needed for reliable model simulations and predictions under current and future conditions, respectively. In our previous work, a subset of hydrological parameters has been identified to have significant impact on surface energy fluxes at selected flux tower sites based on parameter screening and sensitivity analysis, which indicate that the parameters could potentially be estimated from surface flux observations at the towers. To date, such estimates do not exist. In this paper, we assess the feasibility of applying a Bayesian model calibration technique to estimate CLM parameters at selected flux tower sites under various site conditions. The parameters are estimated as a joint probability density function (PDF) that provides estimates of uncertainty of the parameters being inverted, conditional on climatologically average latent heat fluxes derived from observations. We find that the simulated mean latent heat fluxes from CLM using the calibrated parameters are generally improved at all sites when compared to those obtained with CLM simulations using default parameter sets. Further, our calibration method also results in credibility bounds around the simulated mean fluxes which bracket the measured data. The modes (or maximum a posteriori values) and 95% credibility intervals of the site-specific posterior PDFs are tabulated as suggested parameter values for each site. Analysis of relationships between the posterior PDFs and site conditions suggests that the parameter values are likely correlated with the plant functional type, which needs to be confirmed in future studies by extending the approach to more sites.
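
    The calibration step common to the records above can be illustrated with a toy random-walk Metropolis sampler over a cheap surrogate. Everything in this sketch, including the surrogate, the observation, and the prior bounds, is an assumption standing in for CLM and its surrogate model:

      import numpy as np

      rng = np.random.default_rng(7)

      def surrogate_latent_heat(theta):
          # Stand-in for a cheap surrogate of CLM's climatological latent
          # heat flux as a function of one hydrological parameter (toy).
          return 80.0 + 30.0 * theta - 10.0 * theta**2

      obs, obs_sigma = 95.0, 3.0      # "observed" flux and uncertainty (toy)

      def log_post(theta):
          if not 0.0 <= theta <= 2.0:  # uniform prior on [0, 2] (assumption)
              return -np.inf
          resid = (surrogate_latent_heat(theta) - obs) / obs_sigma
          return -0.5 * resid**2

      # Random-walk Metropolis over the parameter.
      theta, lp = 1.0, log_post(1.0)
      samples = []
      for _ in range(20000):
          prop = theta + rng.normal(scale=0.1)
          lp_prop = log_post(prop)
          if np.log(rng.random()) < lp_prop - lp:
              theta, lp = prop, lp_prop
          samples.append(theta)

      post = np.array(samples[5000:])  # discard burn-in
      print(f"posterior median: {np.median(post):.2f} "
            f"(95% CI {np.percentile(post, 2.5):.2f}"
            f"-{np.percentile(post, 97.5):.2f})")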

  20. The predictive consequences of parameterization

    NASA Astrophysics Data System (ADS)

    White, J.; Hughes, J. D.; Doherty, J. E.

    2013-12-01

    In numerical groundwater modeling, parameterization is the process of selecting the aspects of a computer model that will be allowed to vary during history matching. This selection process is dependent on professional judgment and is, therefore, inherently subjective. Ideally, a robust parameterization should be commensurate with the spatial and temporal resolution of the model and should include all uncertain aspects of the model. Limited computing resources typically require reducing the number of adjustable parameters so that only a subset of the uncertain model aspects are treated as estimable parameters; the remaining aspects are treated as fixed parameters during history matching. We use linear subspace theory to develop expressions for the predictive error incurred by fixing parameters. The predictive error is comprised of two terms. The first term arises directly from the sensitivity of a prediction to fixed parameters. The second term arises from prediction-sensitive adjustable parameters that are forced to compensate for fixed parameters during history matching. The compensation is accompanied by inappropriate adjustment of otherwise uninformed, null-space parameter components. Unwarranted adjustment of null-space components away from prior maximum likelihood values may produce bias if a prediction is sensitive to those components. The potential for subjective parameterization choices to corrupt predictions is examined using a synthetic model. Several strategies are evaluated, including use of piecewise constant zones, use of pilot points with Tikhonov regularization and use of the Karhunen-Loeve transformation. The best choice of parameterization (as defined by minimum error variance) is strongly dependent on the types of predictions to be made by the model.

  1. Integrating biological knowledge into variable selection: an empirical Bayes approach with an application in cancer biology

    PubMed Central

    2012-01-01

    Background: An important question in the analysis of biochemical data is that of identifying subsets of molecular variables that may jointly influence a biological response. Statistical variable selection methods have been widely used for this purpose. In many settings, it may be important to incorporate ancillary biological information concerning the variables of interest. Pathway and network maps are one example of a source of such information. However, although ancillary information is increasingly available, it is not always clear how it should be used nor how it should be weighted in relation to primary data. Results: We put forward an approach in which biological knowledge is incorporated using informative prior distributions over variable subsets, with prior information selected and weighted in an automated, objective manner using an empirical Bayes formulation. We employ continuous, linear models with interaction terms and exploit biochemically-motivated sparsity constraints to permit exact inference. We show an example of priors for pathway- and network-based information and illustrate our proposed method on both synthetic response data and by an application to cancer drug response data. Comparisons are also made to alternative Bayesian and frequentist penalised-likelihood methods for incorporating network-based information. Conclusions: The empirical Bayes method proposed here can aid prior elicitation for Bayesian variable selection studies and help to guard against mis-specification of priors. Empirical Bayes, together with the proposed pathway-based priors, results in an approach with a competitive variable selection performance. In addition, the overall procedure is fast, deterministic, and has very few user-set parameters, yet is capable of capturing interplay between molecular players. The approach presented is general and readily applicable in any setting with multiple sources of biological prior knowledge. PMID:22578440

  2. Association of the expression of Th cytokines with peripheral CD4 and CD8 lymphocyte subsets after vaccination with FMD vaccine in Holstein young sires.

    PubMed

    Yang, Ling; Liu, Zhichao; Li, Jianbin; He, Kaili; Kong, Lingna; Guo, Runqing; Liu, Wenjiao; Gao, Yundong; Zhong, Jifeng

    2018-05-25

    High immune response (HIR) cows have a balanced and robust host defense and lower disease incidence, and immune response is more important to consider for selecting young sires than for selecting cows. The protective immune response against foot-and-mouth disease (FMD) virus infection is T-cell-independent in an animal experimental model. However, there is no convenient method to select young sires with a HIR to FMD virus. In this study, 39 healthy Holstein young sires were vaccinated with the trivalent (A, O and Asia 1) FMD vaccine, and T-lymphocyte subsets in peripheral blood lymphocytes (PBLs) were detected using flow cytometric analysis before and after vaccination. The expression of interferon-gamma (IFN-γ), interleukin-2 (IL-2), IL-4, and IL-6 mRNA in PBLs was analyzed after stimulation by lipopolysaccharide (LPS) or Concanavalin A (ConA) after vaccination. Using the percentage of CD4+ lymphocytes and the CD4/CD8 ratio after vaccination to select the HIR young sires, the results showed that the percentages of CD3+, CD4+, and CD3+CD4+ lymphocytes and the CD4/CD8 ratio in the HIR group were higher compared to those in the medium immune response (MIR) and low immune response (LIR) groups before vaccination. Additionally, the percentage of CD4+ lymphocytes and the CD4/CD8 ratio after vaccination were positively associated with the expression level of IFN-γ mRNA in the PBLs after stimulation by LPS. In conclusion, the in vitro expression level of IFN-γ mRNA in PBLs stimulated by LPS may serve as a parameter for selecting young sires with a HIR to FMD virus.

  3. A new approach to human microRNA target prediction using ensemble pruning and rotation forest.

    PubMed

    Mousavi, Reza; Eftekhari, Mahdi; Haghighi, Mehdi Ghezelbash

    2015-12-01

    MicroRNAs (miRNAs) are small non-coding RNAs that have important functions in gene regulation. Since finding miRNA targets experimentally is costly and time-consuming, the use of machine learning methods is a growing research area for miRNA target prediction. In this paper, a new approach is proposed by using two popular ensemble strategies, i.e. Ensemble Pruning and Rotation Forest (EP-RTF), to predict human miRNA targets. For EP, the approach utilizes a Genetic Algorithm (GA). In other words, a subset of classifiers from the heterogeneous ensemble is first selected by GA. Next, the selected classifiers are trained based on the RTF method and then are combined using weighted majority voting. In addition to seeking a better subset of classifiers, the parameter of RTF is also optimized by GA. Findings of the present study confirm that the newly developed EP-RTF outperforms (in terms of classification accuracy, sensitivity, and specificity) the previously applied methods over four human miRNA target datasets. Diversity-error diagrams reveal that the proposed ensemble approach constructs individual classifiers that are more accurate and usually more diverse than those of the other ensemble approaches. Given these experimental results, we highly recommend EP-RTF for improving the performance of miRNA target prediction.

  4. Lectin Ulex europaeus agglutinin I specifically labels a subset of primary afferent fibers which project selectively to the superficial dorsal horn of the spinal cord.

    PubMed

    Mori, K

    1986-02-19

    To examine differential carbohydrate expression among different subsets of primary afferent fibers, several fluorescein-isothiocyanate conjugated lectins were used in a histochemical study of the dorsal root ganglion (DRG) and spinal cord of the rabbit. The lectin Ulex europaeus agglutinin I specifically labeled a subset of DRG cells and primary afferent fibers which projected to the superficial laminae of the dorsal horn. These results suggest that specific carbohydrates containing L-fucosyl residues are expressed selectively in small diameter primary afferent fibers which subserve nociception or thermoception.

  5. Relevance popularity: A term event model based feature selection scheme for text classification.

    PubMed

    Feng, Guozhong; An, Baiguo; Yang, Fengqin; Wang, Han; Zhang, Libiao

    2017-01-01

    Feature selection is a practical approach for improving the performance of text classification methods by optimizing the feature subsets input to classifiers. In traditional feature selection methods such as information gain and chi-square, the number of documents that contain a particular term (i.e. the document frequency) is often used. However, the frequency with which a given term appears in each document has not been fully investigated, even though it is a promising feature for producing accurate classifications. In this paper, we propose a new feature selection scheme based on a term event multinomial naive Bayes probabilistic model. According to the model assumptions, the matching score function, which is based on the prediction probability ratio, can be factorized. Finally, we derive a feature selection measurement for each term after replacing inner parameters by their estimators. On a benchmark English text dataset (20 Newsgroups) and a Chinese text dataset (MPH-20), numerical experiments with two widely used text classifiers (naive Bayes and support vector machine) demonstrate that our method outperforms representative feature selection methods.

  6. Multisensor-based real-time quality monitoring by means of feature extraction, selection and modeling for Al alloy in arc welding

    NASA Astrophysics Data System (ADS)

    Zhang, Zhifen; Chen, Huabin; Xu, Yanling; Zhong, Jiyong; Lv, Na; Chen, Shanben

    2015-08-01

    Multisensory data fusion-based online welding quality monitoring has gained increasing attention in intelligent welding processes. This paper focuses on the automatic detection of typical welding defects for Al alloy in gas tungsten arc welding (GTAW) by means of analyzing arc spectrum, sound and voltage signals. Based on the developed algorithms in the time and frequency domains, 41 feature parameters were successively extracted from these signals to characterize the welding process and seam quality. Then, the proposed feature selection approach, i.e., a hybrid Fisher-based filter and wrapper, was successfully utilized to evaluate the sensitivity of each feature and reduce the feature dimensions. Finally, the optimal feature subset with 19 features was selected to obtain the highest accuracy, i.e., 94.72%, using the established classification model. This study provides a guideline for feature extraction, selection and dynamic modeling based on heterogeneous multisensory data to achieve a reliable online defect detection system in arc welding.

  7. Computerized stratified random site-selection approaches for design of a ground-water-quality sampling network

    USGS Publications Warehouse

    Scott, J.C.

    1990-01-01

    Computer software was written to randomly select sites for a ground-water-quality sampling network. The software uses digital cartographic techniques and subroutines from a proprietary geographic information system. The report presents the approaches, computer software, and sample applications. It is often desirable to collect ground-water-quality samples from various areas in a study region that have different values of a spatial characteristic, such as land-use or hydrogeologic setting. A stratified network can be used for testing hypotheses about relations between spatial characteristics and water quality, or for calculating statistical descriptions of water-quality data that account for variations that correspond to the spatial characteristic. In the software described, a study region is subdivided into areal subsets that have a common spatial characteristic to stratify the population into several categories from which sampling sites are selected. Different numbers of sites may be selected from each category of areal subsets. A population of potential sampling sites may be defined by either specifying a fixed population of existing sites, or by preparing an equally spaced population of potential sites. In either case, each site is identified with a single category, depending on the value of the spatial characteristic of the areal subset in which the site is located. Sites are selected from one category at a time. One of two approaches may be used to select sites. Sites may be selected randomly, or the areal subsets in the category can be grouped into cells and sites selected randomly from each cell.
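
    The stratified selection logic is straightforward to express in code. The sketch below is a hypothetical illustration, not the described USGS software; the grid, category labels, and per-category counts are assumptions:

      import numpy as np

      rng = np.random.default_rng(8)

      # Potential sites: an equally spaced grid, each site tagged with the
      # category (e.g. land-use class) of the areal subset containing it.
      xs, ys = np.meshgrid(np.linspace(0, 10, 25), np.linspace(0, 10, 25))
      sites = np.column_stack([xs.ravel(), ys.ravel()])
      category = rng.integers(0, 3, len(sites))   # 3 hypothetical strata

      # Different numbers of samples per category, one category at a time.
      n_per_category = {0: 10, 1: 5, 2: 8}
      selected = []
      for cat, n in n_per_category.items():
          pool = np.where(category == cat)[0]
          selected.extend(rng.choice(pool, size=n, replace=False))

      print(f"selected {len(selected)} sites; first few: {sorted(selected)[:5]}")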

  8. Feature selection for the classification of traced neurons.

    PubMed

    López-Cabrera, José D; Lorenzo-Ginori, Juan V

    2018-06-01

    The great availability of computational tools to calculate the properties of traced neurons has led to many descriptors that allow the automated classification of neurons from these reconstructions. This situation creates the need to eliminate irrelevant features and to select the most appropriate among them, in order to improve the quality of the classification obtained. The dataset used contains a total of 318 traced neurons, classified by human experts into 192 GABAergic interneurons and 126 pyramidal cells. The features were extracted by means of the L-measure software, one of the most used computational tools in neuroinformatics for quantifying traced neurons. We review current feature selection techniques, namely filter, wrapper, embedded and ensemble methods, and measure the stability of each. For the ensemble methods, several aggregation methods based on different metrics were applied to combine the subsets obtained during the feature selection process. The subsets returned by the filter, embedded, wrapper and ensemble methods were compared and evaluated in classification tasks using supervised classifiers, among which Random Forest, C4.5, SVM, Naïve Bayes, kNN, Decision Table and the Logistic classifier were used as classification algorithms. The L-measure features EucDistanceSD, PathDistanceSD, Branch_pathlengthAve, Branch_pathlengthSD and EucDistanceAve were present in more than 60% of the selected subsets, which provides evidence of their importance in the classification of these neurons.

  9. The Fisher-Markov selector: fast selecting maximally separable feature subset for multiclass classification with applications to high-dimensional data.

    PubMed

    Cheng, Qiang; Zhou, Hongbo; Cheng, Jie

    2011-06-01

    Selecting features for multiclass classification is a critically important task for pattern recognition and machine learning applications. Especially challenging is selecting an optimal subset of features from high-dimensional data, which typically have many more variables than observations and contain significant noise, missing components, or outliers. Existing methods either cannot handle high-dimensional data efficiently or scalably, or can obtain only a local optimum instead of the global optimum. Toward selecting the globally optimal subset of features efficiently, we introduce a new selector, which we call the Fisher-Markov selector, to identify those features that are the most useful in describing essential differences among the possible groups. In particular, in this paper we present a way to represent essential discriminating characteristics together with sparsity as an optimization objective. With properly identified measures for sparseness and discriminativeness in possibly high-dimensional settings, we take a systematic approach to optimizing the measures to choose the best feature subset. We use Markov random field optimization techniques to solve the formulated objective functions for simultaneous feature selection. Our results are noncombinatorial, and they can achieve the exact global optimum of the objective function for some special kernels. The method is fast; in particular, it can be linear in the number of features and quadratic in the number of observations. We apply our procedure to a variety of real-world data, including a mid-dimensional optical handwritten digit dataset and high-dimensional microarray gene expression datasets. The effectiveness of our method is confirmed by experimental results. In pattern recognition and from a model selection viewpoint, our procedure shows that it is possible to select the most discriminating subset of variables by solving a very simple unconstrained objective function, which in fact can be obtained with an explicit expression.

  10. Approximate error conjugate gradient minimization methods

    DOEpatents

    Kallman, Jeffrey S

    2013-05-21

    In one embodiment, a method includes selecting a subset of rays from a set of all rays to use in an error calculation for a constrained conjugate gradient minimization problem, calculating an approximate error using the subset of rays, and calculating a minimum in a conjugate gradient direction based on the approximate error. In another embodiment, a system includes a processor for executing logic, logic for selecting a subset of rays from a set of all rays to use in an error calculation for a constrained conjugate gradient minimization problem, logic for calculating an approximate error using the subset of rays, and logic for calculating a minimum in a conjugate gradient direction based on the approximate error. In other embodiments, computer program products, methods, and systems are described capable of using approximate error in constrained conjugate gradient minimization problems.
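
    As a rough illustration of the idea (not the patented method itself), one minimization step for a least-squares objective could evaluate the error and the line-search minimum on a sampled subset of rays, i.e., rows of the system matrix:

```python
import numpy as np

def subset_cg_step(A, b, x, n_rays, rng=np.random.default_rng(0)):
    """One steepest-descent/CG-style step using an approximate error
    computed from a random subset of rays (rows of A)."""
    idx = rng.choice(A.shape[0], size=n_rays, replace=False)
    As, bs = A[idx], b[idx]
    g = As.T @ (bs - As @ x)              # approximate (negative) gradient from the subset
    Ag = As @ g
    alpha = (g @ g) / (Ag @ Ag + 1e-12)   # exact line minimum along g for the subset objective
    return x + alpha * g
```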

  11. An overview of the NASA Langley Atmospheric Data Center: Online tools to effectively disseminate Earth science data products

    NASA Astrophysics Data System (ADS)

    Parker, L.; Dye, R. A.; Perez, J.; Rinsland, P.

    2012-12-01

    Over the past decade the Atmospheric Science Data Center (ASDC) at NASA Langley Research Center has archived and distributed a variety of satellite mission and aircraft campaign data sets. These datasets posed unique challenges to the user community at large due to the sheer volume and variety of the data and the lack of intuitive features in the ordering tools available to the investigator. Some of these data sets also lacked sufficient metadata to support rudimentary data discovery. To meet the needs of emerging users, the ASDC addressed issues in data discovery and delivery through the use of standards in data and access methods, and distribution through appropriate portals. The ASDC is currently undergoing a refresh of its webpages and ordering tools that will leverage updated collection-level metadata to enhance the user experience, and it now provides search and subset capability for key mission satellite data sets. The ASDC has collaborated with Science Teams to accommodate prospective science users in the climate and modeling communities, and it uses a common framework that enables more rapid development and deployment of search and subset tools with enhanced access features. Features of the Search and Subset web application enable a more sophisticated approach to selecting and ordering data subsets by parameter, date, time, and geographic area. The ASDC has also applied key practices from satellite missions to the multi-campaign aircraft missions executed for Earth Venture-1 and MEaSUREs.

  12. Choice: 36 band feature selection software with applications to multispectral pattern recognition

    NASA Technical Reports Server (NTRS)

    Jones, W. C.

    1973-01-01

    Feature selection software developed at the Earth Resources Laboratory is capable of inputting up to 36 channels and selecting channel subsets according to several criteria based on divergence. One of the criteria used is compatible with the table look-up classifier requirements. The software indicates which channel subset best separates (based on average divergence) each class from all other classes. It employs an exhaustive search technique, and computer time is not prohibitive: a typical task, selecting the best 4 of 22 channels for 12 classes, takes 9 minutes on a Univac 1108 computer.
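
    The Univac-era software is long gone, but the exhaustive search it describes, evaluating every k-channel subset and keeping the one with the highest average divergence, is a few lines today (the divergence function is a placeholder for whichever separability measure is used):

```python
from itertools import combinations

def best_channel_subset(channels, k, divergence, class_pairs):
    """Exhaustively search all k-channel subsets, maximizing average divergence.

    divergence(subset, ci, cj) -> separability of classes ci and cj on the subset.
    class_pairs : iterable of (ci, cj) class pairs to average over.
    """
    def avg_div(subset):
        return sum(divergence(subset, ci, cj) for ci, cj in class_pairs) / len(class_pairs)

    return max(combinations(channels, k), key=avg_div)

# best 4 of 22 channels means C(22, 4) = 7315 subsets -- still exhaustive-search territory
```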

  13. A data driven partial ambiguity resolution: Two step success rate criterion, and its simulation demonstration

    NASA Astrophysics Data System (ADS)

    Hou, Yanqing; Verhagen, Sandra; Wu, Jie

    2016-12-01

    Ambiguity Resolution (AR) is a key technique in GNSS precise positioning. In the case of weak models (i.e., low-precision data), however, the success rate of AR may be low, which may introduce large errors into the baseline solution when ambiguities are fixed incorrectly. Partial Ambiguity Resolution (PAR) has therefore been proposed so that the baseline precision can be improved by fixing only a subset of ambiguities with a high success rate. This contribution proposes a new PAR strategy that selects the subset maximizing the expected precision gain among a set of pre-selected subsets, while at the same time controlling the failure rate. These pre-selected subsets are assumed to have the highest success rate among those of the same size. The strategy is called the Two-step Success Rate Criterion (TSRC): it first tries to fix a relatively large subset, using the fixed failure rate ratio test (FFRT) to decide on acceptance or rejection. In case of rejection, a smaller subset is fixed and validated by the ratio test so as to fulfill the overall failure rate criterion. It is shown how the method can be used in practice without a large additional computational effort and, more importantly, how it can improve (or at least not degrade) the availability in terms of baseline precision compared to the classical Success Rate Criterion (SRC) PAR strategy, based on a simulation validation. In the simulation, significant improvements are obtained for single-GNSS on short baselines with dual-frequency observations. For dual-constellation GNSS, the improvement for single-frequency observations on short baselines is very significant, on average 68%; for medium to long baselines, the average improvement is around 20-30%.
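
    Setting aside the GNSS machinery, the two-step control flow the authors describe can be sketched as follows; the fix and ratio-test routines and both thresholds are placeholders, and this is only a schematic of the TSRC logic, not the authors' implementation:

```python
def tsrc_partial_ambiguity_resolution(float_solution, large_subset, small_subset,
                                      fix, ratio_test, ffrt_threshold, src_threshold):
    """Two-step Success Rate Criterion (TSRC) sketch.

    Step 1: try to fix a relatively large ambiguity subset, validated by FFRT.
    Step 2: on rejection, fall back to a smaller, higher-success-rate subset.
    Returns the fixed solution, or None to keep the float solution.
    """
    candidate = fix(float_solution, large_subset)
    if ratio_test(candidate, ffrt_threshold):       # fixed failure-rate ratio test
        return candidate
    candidate = fix(float_solution, small_subset)   # smaller subset, higher success rate
    if ratio_test(candidate, src_threshold):
        return candidate
    return None                                     # both validations failed
```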

  14. Simultaneous detection of circulating immunological parameters and tumor biomarkers in early stage breast cancer patients during adjuvant chemotherapy.

    PubMed

    Rovati, B; Mariucci, S; Delfanti, S; Grasso, D; Tinelli, C; Torre, C; De Amici, M; Pedrazzoli, P

    2016-06-01

    Chemotherapy-induced immune suppression has mainly been studied in patients with advanced cancer, but the influence of chemotherapy on the immune system in early stage cancer patients has so far not been studied systematically. The aim of the present study was to monitor the immune system during anthracycline- and taxane-based adjuvant chemotherapy in early stage breast cancer patients, to assess the impact of circulating tumor cells on selected immune parameters and to reveal putative angiogenic effects of circulating endothelial cells. Peripheral blood samples from 20 early stage breast cancer patients were analyzed using a multi-color flow cytometric panel of antibodies to enumerate lymphocyte and dendritic cell subsets, as well as endothelial and tumor cells. An enzyme-linked immunosorbent assay (ELISA) was used to measure the levels of various serological factors. During chemotherapy, all immunological parameters and angiogenesis surrogate biomarkers showed significant decreases. The numbers of circulating tumor cells showed significant inverse correlations with the numbers of T helper cells, a lymphocyte subset directly related to effective anti-tumor responses. Reduced T helper cell numbers may contribute to systemic immunosuppression and, as such, the activation of dormant tumor cells. From our results we conclude that adjuvant chemotherapy suppresses immune function in early stage breast cancer patients. In addition, we conclude that the presence of circulating tumor cells, defined as pan-cytokeratin(+), CD326(+), CD45(-) cells, may serve as an important indicator of a patient's immune status. Further investigations are needed to firmly define circulating tumor cells as a predictor for the success of breast cancer adjuvant chemotherapy.

  15. Identifying Typhoon Tracks based on Event Synchronization derived Spatially Embedded Climate Networks

    NASA Astrophysics Data System (ADS)

    Ozturk, Ugur; Marwan, Norbert; Kurths, Jürgen

    2017-04-01

    Complex networks are commonly used for investigating the spatiotemporal dynamics of complex systems, e.g., extreme rainfall. Directed networks in particular are very effective tools for identifying climatic patterns on spatially embedded networks. They can capture the network flux, and thus the principal dynamics of how significant phenomena spread. Network measures, such as network divergence, reveal the source-receptor relation of directed networks. However, it remains a challenge to capture fast-evolving atmospheric events such as typhoons. In this study, we propose a new technique, Radial Ranks, to detect the general pattern of a typhoon's forward direction based on the strength parameter of event synchronization over Japan. We subset a circular zone of high correlation around a selected grid point based on the strength parameter. Radial sums of the strength parameter along vectors within this zone, the radial ranks, are measured for potential directions, which allows us to trace the network flux over long distances. We also employ the delay parameter of event synchronization to identify and separate the individual behaviors of frontal storms and typhoons.

  16. Defining an essence of structure determining residue contacts in proteins.

    PubMed

    Sathyapriya, R; Duarte, Jose M; Stehr, Henning; Filippis, Ioannis; Lappe, Michael

    2009-12-01

    The network of native non-covalent residue contacts determines the three-dimensional structure of a protein. However, not all contacts are of equal structural significance, and little knowledge exists about a minimal, yet sufficient, subset required to define the global features of a protein. Characterisation of this "structural essence" has remained elusive so far: no algorithmic strategy has been devised to date that could outperform a random selection in terms of 3D reconstruction accuracy (measured as the Cα RMSD). It is not only of theoretical interest (i.e., for design of advanced statistical potentials) to identify the number and nature of essential native contacts; such a subset of spatial constraints is very useful in a number of novel experimental methods (like EPR) which rely heavily on constraint-based protein modelling. To derive accurate three-dimensional models from distance constraints, we implemented a reconstruction pipeline using distance geometry. We selected a test set of 12 protein structures from the four major SCOP fold classes and performed our reconstruction analysis. As a reference set, a series of random subsets (ranging from 10% to 90% of native contacts) is generated for each protein, and the reconstruction accuracy is computed for each subset. We have developed a rational strategy, termed "cone-peeling", that combines sequence features and network descriptors to select minimal subsets that outperform the reference sets. We present, for the first time, a rational strategy to derive a structural essence of residue contacts and provide an estimate of the size of this minimal subset. Our algorithm computes sparse subsets capable of determining the tertiary structure at approximately 4.8 Å Cα RMSD with as little as 8% of the native contacts (Cα-Cα and Cβ-Cβ). At the same time, a randomly chosen subset of native contacts needs about twice as many contacts to reach the same level of accuracy. This "structural essence" opens new avenues in the fields of structure prediction, empirical potentials and docking.

  17. Surveillance system and method having parameter estimation and operating mode partitioning

    NASA Technical Reports Server (NTRS)

    Bickford, Randall L. (Inventor)

    2005-01-01

    A system and method for monitoring an apparatus or process asset including creating a process model comprised of a plurality of process submodels each correlative to at least one training data subset partitioned from an unpartitioned training data set and each having an operating mode associated thereto; acquiring a set of observed signal data values from the asset; determining an operating mode of the asset for the set of observed signal data values; selecting a process submodel from the process model as a function of the determined operating mode of the asset; calculating a set of estimated signal data values from the selected process submodel for the determined operating mode; and determining asset status as a function of the calculated set of estimated signal data values for providing asset surveillance and/or control.

  18. Selection and collection of multi parameter physiological data for cardiac rhythm diagnostic algorithm development

    NASA Astrophysics Data System (ADS)

    Bostock, J.; Weller, P.; Cooklin, M.

    2010-07-01

    Automated diagnostic algorithms are used in implantable cardioverter-defibrillators (ICDs) to detect abnormal heart rhythms. These algorithms can misdiagnose, and improved specificity is needed to prevent inappropriate therapy. Knowledge engineering (KE) and artificial intelligence (AI) could improve this. A pilot study of KE was performed with an artificial neural network (ANN) as the AI system. A case note review analysed arrhythmic events stored in patients' ICD memories; 13.2% of patients received inappropriate therapy. The best ICD algorithm had sensitivity 1.00 and specificity 0.69 (p<0.001 versus the gold standard). A subset of the data was used to train and test an ANN. A feed-forward, back-propagation network with 7 inputs, a 4-node hidden layer and 1 output had sensitivity 1.00 and specificity 0.71 (p<0.001). A prospective study was then performed using KE to list arrhythmias, factors and indicators, for which measurable parameters were evaluated and the results reviewed by a domain expert. Waveforms from electrodes in the heart, thoracic bio-impedance, temperature and motion data were collected from 65 patients during cardiac electrophysiological studies. Five datasets were incomplete due to technical failures. We concluded that KE successfully guided the selection of parameters and the ANN produced a usable system, but that complex data collection carries a greater risk of technical failure, leading to data loss.
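
    For reference, a feed-forward network of the stated size (7 inputs, one 4-node hidden layer, 1 output) can be trained in a few lines with scikit-learn; the dataset variables are placeholders, and the original study used its own back-propagation setup:

```python
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# 7 measured parameters per arrhythmic episode in, one binary rhythm class out
model = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(4,), activation="logistic",
                  solver="sgd", max_iter=5000, random_state=0),
)
# model.fit(X_train, y_train)   # X_train: (n_episodes, 7), y_train: 0/1 labels
# model.score(X_test, y_test)
```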

  19. Selecting predictors for discriminant analysis of species performance: an example from an amphibious softwater plant.

    PubMed

    Vanderhaeghe, F; Smolders, A J P; Roelofs, J G M; Hoffmann, M

    2012-03-01

    Selecting an appropriate variable subset in linear multivariate methods is an important methodological issue for ecologists. Interest often exists in obtaining general predictive capacity or in finding causal inferences from predictor variables. Because of a lack of solid knowledge about a studied phenomenon, scientists explore predictor variables in order to find the most meaningful (i.e., discriminating) ones. As an example, we modelled the response of the amphibious softwater plant Eleocharis multicaulis using canonical discriminant function analysis. We asked how variables can be selected through a comparison of several methods: univariate Pearson chi-square screening, principal components analysis (PCA) and step-wise analysis, as well as combinations of some methods. We expected PCA to perform best. The selected methods were evaluated through the fit and stability of the resulting discriminant functions and through correlations between these functions and the predictor variables. The chi-square subset, at P < 0.05, followed by a step-wise sub-selection, gave the best results. Contrary to expectations, PCA performed poorly, as did step-wise analysis. The different chi-square subset methods all yielded ecologically meaningful variables, while probable noise variables were also selected by PCA and step-wise analysis. We advise against the simple use of PCA or step-wise discriminant analysis to obtain an ecologically meaningful variable subset; the former because it does not take the response variable into account, the latter because noise variables are likely to be selected. We suggest that univariate screening techniques are a worthwhile alternative for variable selection in ecology.
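
    A univariate chi-square screen of the kind that performed best here is straightforward to sketch with scipy (variable names are hypothetical; P < 0.05 follows the study):

```python
import numpy as np
from scipy.stats import chi2_contingency

def chi_square_screen(X_cat, y, alpha=0.05):
    """Keep predictors whose chi-square test against the response has p < alpha.

    X_cat : (n_samples, n_predictors) array of categorical predictor codes
    y     : (n_samples,) presence/absence response
    """
    keep = []
    for j in range(X_cat.shape[1]):
        values, classes = np.unique(X_cat[:, j]), np.unique(y)
        if len(values) < 2:
            continue                      # constant predictor: nothing to test
        table = np.array([[np.sum((X_cat[:, j] == v) & (y == c)) for c in classes]
                          for v in values])
        _, p, _, _ = chi2_contingency(table)
        if p < alpha:
            keep.append(j)
    return keep                           # indices of retained predictors
```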

  1. VARIABLE SELECTION FOR QUALITATIVE INTERACTIONS IN PERSONALIZED MEDICINE WHILE CONTROLLING THE FAMILY-WISE ERROR RATE

    PubMed Central

    Gunter, Lacey; Zhu, Ji; Murphy, Susan

    2012-01-01

    For many years, subset analysis has been a popular topic in the biostatistics and clinical trials literature. In more recent years, the discussion has focused on finding subsets of the genome which play a role in the effect of treatment, often referred to as stratified or personalized medicine. Though highly sought after, methods for detecting subsets with differing treatment effects are limited and lacking in power. In this article we discuss variable selection for qualitative interactions with the aim of discovering these critical patient subsets. We propose a new technique designed specifically to find these interaction variables among a large set of variables while still controlling the number of false discoveries. We compare this new method against standard qualitative interaction tests using simulations and give an example of its use on data from a randomized controlled trial for the treatment of depression. PMID:22023676

  2. Classification of Medical Datasets Using SVMs with Hybrid Evolutionary Algorithms Based on Endocrine-Based Particle Swarm Optimization and Artificial Bee Colony Algorithms.

    PubMed

    Lin, Kuan-Cheng; Hsieh, Yi-Hsiu

    2015-10-01

    The classification and analysis of data is an important issue in today's research. Selecting a suitable set of features makes it possible to classify an enormous quantity of data quickly and efficiently. Feature selection is generally viewed as a feature subset selection problem, i.e., a combinatorial optimization problem. Evolutionary algorithms using random search methods have proven highly effective in obtaining solutions to optimization problems in a diversity of applications. In this study, we developed a hybrid evolutionary algorithm based on endocrine-based particle swarm optimization (EPSO) and artificial bee colony (ABC) algorithms in conjunction with a support vector machine (SVM) to select optimal feature subsets for the classification of datasets. Experiments using UCI medical datasets demonstrate that the classification accuracy of the proposed hybrid evolutionary algorithm is superior to that of the basic PSO, EPSO and ABC algorithms, using subsets with a reduced number of features.

  3. A hybrid feature selection method using multiclass SVM for diagnosis of erythemato-squamous disease

    NASA Astrophysics Data System (ADS)

    Maryam, Setiawan, Noor Akhmad; Wahyunggoro, Oyas

    2017-08-01

    The diagnosis of erythemato-squamous disease is a complex problem in dermatology, and the disease is difficult to detect. Moreover, it is a major cause of skin cancer. Data mining in the medical field helps experts diagnose precisely, accurately, and inexpensively. In this research, we use data mining techniques to develop a diagnosis model for erythemato-squamous disease based on multiclass SVM with a novel hybrid feature selection method. Our hybrid feature selection method, named ChiGA (Chi-Square and Genetic Algorithm), combines the advantages of filter and wrapper methods to select the optimal feature subset from the original features. Chi-square is used as a filter method to remove redundant features, and a GA is used as a wrapper method to select the ideal feature subset, with an SVM as the classifier. Experiments were performed with 10-fold cross-validation on the erythemato-squamous disease dataset taken from the University of California Irvine (UCI) machine learning database. The results show that the proposed multiclass SVM model with Chi-square and GA yields an optimal feature subset of 18 features with 99.18% accuracy.
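
    ChiGA itself is not distributed as code; the filter-then-wrapper pattern it follows can be sketched as a chi-square filter (scikit-learn) feeding a tiny genetic algorithm whose fitness is 10-fold cross-validated SVM accuracy. All GA settings below are illustrative assumptions:

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def ga_wrapper(X, y, pop_size=20, generations=30, p_mut=0.05):
    """Tiny GA over feature bitmasks; fitness = 10-fold CV accuracy of an SVM."""
    n = X.shape[1]
    pop = rng.integers(0, 2, size=(pop_size, n), dtype=bool)

    def fitness(mask):
        return cross_val_score(SVC(), X[:, mask], y, cv=10).mean() if mask.any() else 0.0

    for _ in range(generations):
        fit = np.array([fitness(m) for m in pop])
        parents = pop[np.argsort(fit)[-pop_size // 2:]]           # keep the fitter half
        cuts = rng.integers(1, n, size=pop_size // 2)
        children = np.array([np.concatenate([parents[i % len(parents)][:c],
                                             parents[(i + 1) % len(parents)][c:]])
                             for i, c in enumerate(cuts)])         # one-point crossover
        children ^= rng.random(children.shape) < p_mut             # bit-flip mutation
        pop = np.vstack([parents, children])
    fit = np.array([fitness(m) for m in pop])
    return pop[fit.argmax()]

# filter stage first (chi-square keeps the top-k features), then the GA wrapper:
# X_f = SelectKBest(chi2, k=25).fit_transform(X, y)
# best_mask = ga_wrapper(X_f, y)
```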

  4. Enhancing the Discrimination Ability of a Gas Sensor Array Based on a Novel Feature Selection and Fusion Framework.

    PubMed

    Deng, Changjian; Lv, Kun; Shi, Debo; Yang, Bo; Yu, Song; He, Zhiyi; Yan, Jia

    2018-06-12

    In this paper, a novel feature selection and fusion framework is proposed to enhance the discrimination ability of gas sensor arrays for odor identification. First, we put forward an efficient feature selection method based on separability and dissimilarity to determine the feature selection order for each type of feature as the dimension of the selected feature subset increases. Second, the K-nearest neighbor (KNN) classifier is applied to determine the dimensions of the optimal feature subsets for the different types of features. Finally, for feature fusion, we propose a classification-dominance feature fusion strategy that builds on an effective basic feature. Experimental results on two datasets show recognition rates of 97.5% on Database I and 80.11% on Database II when k = 1 for the KNN classifier and the distance metric is correlation distance (COR), which demonstrates the superiority of the proposed feature selection and fusion framework in representing signal features. The proposed feature selection method can effectively select feature subsets that are conducive to classification, while the feature fusion framework can fuse various features that describe different characteristics of the sensor signals, enhancing the discrimination ability of gas sensors and, to a certain extent, suppressing the drift effect.
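
    The dimension-selection step can be outlined as follows: for each candidate subset size, evaluate a 1-NN classifier with correlation distance by cross-validation and keep the size that maximizes accuracy. The feature ordering is assumed to come from the separability/dissimilarity ranking; this is a sketch, not the authors' code:

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def best_subset_dimension(X, y, feature_order, max_dim=None):
    """Pick the feature-subset dimension maximizing CV accuracy of 1-NN
    with correlation distance (computed by brute force)."""
    max_dim = max_dim or len(feature_order)
    knn = KNeighborsClassifier(n_neighbors=1, metric="correlation", algorithm="brute")
    dims = range(2, max_dim + 1)          # correlation distance needs >= 2 dimensions
    accs = [cross_val_score(knn, X[:, list(feature_order[:d])], y, cv=5).mean()
            for d in dims]
    best = int(np.argmax(accs))
    return dims[best], accs[best]         # optimal dimension and its CV accuracy
```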

  5. Effects of Number of Animals Monitored on Representations of Cattle Group Movement Characteristics and Spatial Occupancy

    PubMed Central

    Liu, Tong; Green, Angela R.; Rodríguez, Luis F.; Ramirez, Brett C.; Shike, Daniel W.

    2015-01-01

    The number of animals required to represent the collective characteristics of a group remains a concern in animal movement monitoring with GPS. Monitoring a subset of animals from a group instead of all animals can reduce costs and labor; however, incomplete data may cause information losses and inaccuracy in subsequent data analyses. In cattle studies, little work has been conducted to determine the number of cattle within a group needed to be instrumented considering subsequent analyses. Two different groups of cattle (a mixed group of 24 beef cows and heifers, and another group of 8 beef cows) were monitored with GPS collars at 4 min intervals on intensively managed pastures and corn residue fields in 2011. The effects of subset group size on cattle movement characterization and spatial occupancy analysis were evaluated by comparing the results between subset groups and the entire group for a variety of summarization parameters. As expected, more animals yield better results for all parameters. Results show the average group travel speed and daily travel distances are overestimated as subset group size decreases, while the average group radius is underestimated. Accuracy of group centroid locations and group radii are improved linearly as subset group size increases. A kernel density estimation was performed to quantify the spatial occupancy by cattle via GPS location data. Results show animals among the group had high similarity of spatial occupancy. Decisions regarding choosing an appropriate subset group size for monitoring depend on the specific use of data for subsequent analysis: a small subset group may be adequate for identifying areas visited by cattle; larger subset group size (e.g. subset group containing more than 75% of animals) is recommended to achieve better accuracy of group movement characteristics and spatial occupancy for the use of correlating cattle locations with other environmental factors. PMID:25647571
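
    The kernel density step is standard; a sketch of estimating spatial occupancy from the GPS fixes of a subset of collared animals (coordinate arrays and grid size are hypothetical):

```python
import numpy as np
from scipy.stats import gaussian_kde

def occupancy_density(easting, northing, grid_size=100):
    """Kernel density estimate of spatial occupancy from GPS fixes."""
    kde = gaussian_kde(np.vstack([easting, northing]))  # bandwidth via Scott's rule
    gx, gy = np.meshgrid(
        np.linspace(easting.min(), easting.max(), grid_size),
        np.linspace(northing.min(), northing.max(), grid_size),
    )
    density = kde(np.vstack([gx.ravel(), gy.ravel()])).reshape(gx.shape)
    return gx, gy, density

# comparing `density` maps between a subset group and the full herd quantifies
# how well the subset reproduces the herd's spatial occupancy
```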

  6. Two methods for parameter estimation using multiple-trait models and beef cattle field data.

    PubMed

    Bertrand, J K; Kriese, L A

    1990-08-01

    Two methods are presented for estimating variances and covariances from beef cattle field data using multiple-trait sire models. Both methods require that the first trait have no missing records and that the contemporary groups for the second trait be subsets of the contemporary groups for the first trait; however, the second trait may have missing records. One method uses pseudo expectations involving quadratics composed of the solutions and the right-hand sides of the mixed model equations. The other method is an extension of Henderson's Simple Method to the multiple trait case. Neither of these methods requires any inversions of large matrices in the computation of the parameters; therefore, both methods can handle very large sets of data. Four simulated data sets were generated to evaluate the methods. In general, both methods estimated genetic correlations and heritabilities that were close to the Restricted Maximum Likelihood estimates and the true data set values, even when selection within contemporary groups was practiced. The estimates of residual correlations by both methods, however, were biased by selection. These two methods can be useful in estimating variances and covariances from multiple-trait models in large populations that have undergone a minimal amount of selection within contemporary groups.

  7. Criteria to Extract High-Quality Protein Data Bank Subsets for Structure Users.

    PubMed

    Carugo, Oliviero; Djinović-Carugo, Kristina

    2016-01-01

    It is often necessary to build subsets of the Protein Data Bank to extract structural trends and average values. For this purpose it is mandatory that the subsets be non-redundant and of high quality. The first problem can be solved relatively easily at the sequence level or at the structural level. The second, on the contrary, needs special attention. It is not sufficient, in fact, to consider only the crystallographic resolution; other features must be taken into account: the absence of strings of residues from the electron density maps and from the files deposited in the Protein Data Bank; the B-factor values; the appropriate validation of the structural models; the quality of the electron density maps, which is not uniform; and the temperature of the diffraction experiments. More stringent criteria produce smaller subsets, which can be enlarged with more tolerant selection criteria. The incessant growth of the Protein Data Bank, and especially of the number of high-resolution structures, is allowing the use of more stringent selection criteria, with a consequent improvement in the quality of the subsets of the Protein Data Bank.
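
    In practice, these criteria amount to a chain of filters over structure-level metadata; a sketch with pandas, where the column names and thresholds are purely illustrative:

```python
import pandas as pd

def high_quality_subset(entries: pd.DataFrame) -> pd.DataFrame:
    """Filter a table of PDB entries on quality criteria (illustrative thresholds)."""
    return entries[
        (entries["resolution"] <= 2.0)            # crystallographic resolution (angstrom)
        & (entries["mean_b_factor"] <= 40.0)      # B-factor cutoff
        & (entries["missing_residues"] == 0)      # no gaps in the deposited model
        & (entries["validated"])                  # passed structure validation
        & (entries["temperature_K"] <= 110.0)     # cryo diffraction experiments only
    ]

# More stringent thresholds produce smaller subsets; relax them to enlarge the set.
```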

  8. Compact cancer biomarkers discovery using a swarm intelligence feature selection algorithm.

    PubMed

    Martinez, Emmanuel; Alvarez, Mario Moises; Trevino, Victor

    2010-08-01

    Biomarker discovery is a typical application of functional genomics. Due to the large number of genes studied simultaneously in microarray data, feature selection is a key step. Swarm intelligence has emerged as a solution to the feature selection problem; however, typical swarm intelligence settings for feature selection fail to select small feature subsets. We have proposed a swarm intelligence feature selection algorithm based on the initialization and update of only a subset of particles in the swarm. In this study, we tested our algorithm on 11 microarray datasets for brain, leukemia, lung, prostate, and other cancers. We show that the proposed swarm intelligence algorithm successfully increases classification accuracy and decreases the number of selected features compared to other swarm intelligence methods.

  9. Chronic Low-Grade Inflammation in Childhood Obesity Is Associated with Decreased IL-10 Expression by Monocyte Subsets.

    PubMed

    Mattos, Rafael T; Medeiros, Nayara I; Menezes, Carlos A; Fares, Rafaelle C G; Franco, Eliza P; Dutra, Walderez O; Rios-Santos, Fabrício; Correa-Oliveira, Rodrigo; Gomes, Juliana A S

    2016-01-01

    Chronic low-grade inflammation is related to the development of comorbidities and poor prognosis in obesity. Monocytes are main sources of cytokines and play a pivotal role in inflammation. We evaluated monocyte frequency, phenotype and the cytokine profile of monocyte subsets to determine their association with the pathogenesis of childhood obesity. Children with obesity were evaluated for biochemical and anthropometric parameters. Monocyte subsets were characterized by flow cytometry, considering cytokine production and activation/recognition molecules. Correlation analysis between clinical parameters and immunological data delineated the monocytes' contribution to low-grade inflammation. We observed a higher frequency of non-classical monocytes in the childhood obesity group (CO) than in the normal-weight group (NW). All subsets displayed higher TLR4 expression in CO, but their recognition and antigen presentation functions seem to be diminished due to lower expression of CD40, CD80/86 and HLA-DR. All subsets showed lower expression of IL-10 in CO, and correlation analyses showed changes in the IL-10 expression profile. The lower expression of IL-10 may be decisive for the maintenance of the low-grade inflammation status in CO, especially given the alterations in the non-classical monocyte profile. These cells may contribute to sustaining inflammation and the loss of regulation in the immune response of children with obesity.

  10. Decoys Selection in Benchmarking Datasets: Overview and Perspectives

    PubMed Central

    Réau, Manon; Langenfeld, Florent; Zagury, Jean-François; Lagarde, Nathalie; Montes, Matthieu

    2018-01-01

    Virtual Screening (VS) is designed to prospectively help identify potential hits, i.e., compounds capable of interacting with a given target and potentially modulating its activity, out of large compound collections. Among the variety of methodologies, it is crucial to select the protocol that is the most adapted to the query/target system under study and that yields the most reliable output. To this aim, the performance of VS methods is commonly evaluated and compared by computing their ability to retrieve active compounds in benchmarking datasets. The benchmarking datasets contain a subset of known active compounds together with a subset of decoys, i.e., assumed non-active molecules. The composition of both the active and the decoy compound subsets is critical for limiting biases in the evaluation of VS methods. In this review, we focus on the selection of decoy compounds, which has changed considerably over the years, from randomly selected compounds to highly customized or experimentally validated negative compounds. We first outline the evolution of decoy selection in benchmarking databases, as well as current benchmarking databases that tend to minimize the introduction of biases, and second, we propose recommendations for the selection and design of benchmarking datasets. PMID:29416509

  11. An improved wrapper-based feature selection method for machinery fault diagnosis

    PubMed Central

    2017-01-01

    A major issue in machinery fault diagnosis using vibration signals is that it is over-reliant on personnel knowledge and experience in interpreting the signal. Thus, machine learning has been adopted for machinery fault diagnosis. The quantity and quality of the input features, however, influence the fault classification performance. Feature selection plays a vital role in selecting the most representative feature subset for the machine learning algorithm. However, in wrapper-based feature selection (WFS), a trade-off between the capability to find the best feature subset and the computational effort is inevitable. This paper proposes an improved WFS technique integrated with a support vector machine (SVM) classifier as a complete fault diagnosis system for a rolling element bearing case study. The bearing vibration dataset made available by the Case Western Reserve University Bearing Data Centre was processed using the proposed WFS, and its performance has been analysed and discussed. The results reveal that the proposed WFS secures the best feature subset with lower computational effort by eliminating redundant re-evaluation. The proposed WFS has therefore been found capable of carrying out feature selection tasks efficiently. PMID:29261689

  12. New TES Search and Subset Application

    Atmospheric Science Data Center

    2017-08-23

    Wednesday, September 19, 2012. The Atmospheric Science Data Center (ASDC) at NASA Langley Research Center, in collaboration ..., is pleased to announce the release of the TES Search and Subset Web Application for select TES Level 2 products. Features of the Search and ...

  13. A Simple Joint Estimation Method of Residual Frequency Offset and Sampling Frequency Offset for DVB Systems

    NASA Astrophysics Data System (ADS)

    Kwon, Ki-Won; Cho, Yongsoo

    This letter presents a simple joint estimation method for the residual frequency offset (RFO) and sampling frequency offset (SFO) in OFDM-based digital video broadcasting (DVB) systems. The proposed method selects a continual pilot (CP) subset from an unsymmetrically and non-uniformly distributed CP set to obtain an unbiased estimator. Simulation results show that the proposed method using a properly selected CP subset is unbiased and performs robustly.

  14. Proof of concept of a novel SMA cage actuator

    NASA Astrophysics Data System (ADS)

    Deyer, Christopher W.; Brei, Diann E.

    2001-06-01

    Numerous industrial applications that currently utilize expensive solenoids or slow wax motors are good candidates for smart material actuation. Many of these applications require millimeter-scale displacement and low cost, thereby eliminating piezoelectric technologies. Fortunately, there is a subset of these applications that can tolerate the slower response of shape memory alloys. This paper details a proof-of-concept study of a novel SMA cage actuator intended for proportional braking in commercial appliances. The chosen actuator architecture consists of an SMA wire cage enclosing a return spring. To develop an understanding of the influence of key design parameters on the actuator response time and displacement amplitude, a half-factorial 2^5 Design of Experiments (DOE) study was conducted utilizing eight differently configured prototypes. The DOE results guided the selection of the design parameters for the final proof-of-concept actuator. This actuator was built and experimentally characterized for stroke, proportional control and response time.
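
    A half-fraction of a 2^5 factorial (16 of the 32 runs) can be generated from a defining relation such as E = ABCD; the relation and factor coding below are illustrative, not necessarily the study's design:

```python
from itertools import product

def half_factorial_2_5():
    """Generate the 16 runs of a 2^(5-1) design with defining relation E = ABCD."""
    runs = []
    for a, b, c, d in product((-1, 1), repeat=4):
        e = a * b * c * d               # aliased fifth factor
        runs.append((a, b, c, d, e))
    return runs

# 16 runs instead of 32; each row sets the five design parameters high (+1) or low (-1)
print(len(half_factorial_2_5()))        # -> 16
```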

  15. A new item response theory model to adjust data allowing examinee choice

    PubMed Central

    Costa, Marcelo Azevedo; Braga Oliveira, Rivert Paulo

    2018-01-01

    In a typical questionnaire testing situation, examinees are not allowed to choose which items they answer because of a technical issue in obtaining satisfactory statistical estimates of examinee ability and item difficulty. This paper introduces a new item response theory (IRT) model that incorporates information from a novel representation of questionnaire data using network analysis. Three scenarios in which examinees select a subset of items were simulated. In the first scenario, the assumptions required to apply the standard Rasch model are met, thus establishing a reference for parameter accuracy. The second and third scenarios include five increasing levels of violating those assumptions. The results show substantial improvements over the standard model in item parameter recovery. Furthermore, the accuracy was closer to the reference in almost every evaluated scenario. To the best of our knowledge, this is the first proposal to obtain satisfactory IRT statistical estimates in the last two scenarios. PMID:29389996

  16. SU-E-J-128: Two-Stage Atlas Selection in Multi-Atlas-Based Image Segmentation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhao, T; Ruan, D

    2015-06-15

    Purpose: In the new era of big data, multi-atlas-based image segmentation is challenged by heterogeneous atlas quality and the high computation burden from extensive atlas collections, demanding efficient identification of the most relevant atlases. This study aims to develop a two-stage atlas selection scheme to achieve computational economy with a performance guarantee. Methods: We develop a low-cost fusion set selection scheme by introducing a preliminary selection to trim the full atlas collection into an augmented subset, alleviating the need for extensive full-fledged registrations. More specifically, fusion set selection is performed in two successive steps: preliminary selection and refinement. An augmented subset is first roughly selected from the whole atlas collection with a simple registration scheme and the corresponding preliminary relevance metric; the augmented subset is then refined into the desired fusion set size, using full-fledged registration and the associated relevance metric. The main novelty of this work is the introduction of an inference model relating the preliminary and refined relevance metrics, from which the augmented subset size is rigorously derived to ensure the desired atlases survive the preliminary selection with high probability. Results: The performance and complexity of the proposed two-stage atlas selection method were assessed using a collection of 30 prostate MR images. It achieved segmentation accuracy comparable to the conventional one-stage method with full-fledged registration, but significantly reduced computation time to 1/3 (from 30.82 to 11.04 min per segmentation). Compared with an alternative one-stage cost-saving approach, the proposed scheme yielded superior performance, with mean and median DSC of (0.83, 0.85) versus (0.74, 0.78). Conclusion: This work has developed a model-guided two-stage atlas selection scheme that achieves significant cost reduction while guaranteeing high segmentation accuracy. The benefit in both complexity and performance is expected to be most pronounced with large-scale heterogeneous data.
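
    Stripped of the registration machinery, the two-stage scheme reduces to a cheap preliminary ranking that trims the collection, followed by a full-cost refinement; the scoring functions here are placeholders for the registration-based relevance metrics:

```python
def two_stage_atlas_selection(target, atlases, cheap_score, full_score,
                              augmented_size, fusion_size):
    """Trim with a cheap relevance metric, then refine with the full-cost one.

    augmented_size should be chosen (per the paper's inference model) so the
    truly relevant atlases survive stage 1 with high probability.
    """
    # Stage 1: preliminary selection with a simple registration / cheap metric
    prelim = sorted(atlases, key=lambda a: cheap_score(target, a), reverse=True)
    augmented = prelim[:augmented_size]
    # Stage 2: full-fledged registration only on the augmented subset
    refined = sorted(augmented, key=lambda a: full_score(target, a), reverse=True)
    return refined[:fusion_size]
```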

  17. Neurobehavioral, autonomic nervous function and lymphocyte subsets among aluminum electrolytic workers.

    PubMed

    He, S C; Qiao, N; Sheng, W

    2003-01-01

    The purpose of our study was to determine alterations in neurobehavioral parameters, autonomic nervous function and lymphocyte subsets in aluminum electrolytic workers with long-term aluminum exposure. The exposed group comprised 33 men aged 35.16 +/- 2.95 (mean +/- S.D.) years, occupationally exposed to aluminum for 14.91 +/- 6.31 years. Air Al levels and urinary aluminum concentrations were measured by graphite furnace atomic absorption spectrophotometry. A normal reference group was selected from a flour plant. The neurobehavioral core test battery (NCTB) recommended by the WHO was utilized, and the autonomic nervous function test battery recommended by Ewing DJ was conducted on the subjects. A FACScan flow cytometer was used to measure the lymphocyte subsets of peripheral blood. The mean air aluminum level in the workshop was 6.36 mg/m3, ranging from 2.90 to 11.38 mg/m3. Urinary aluminum in the Al electrolytic workers (40.08 +/- 9.36 microgram/mg.cre) was markedly higher than in the control group (26.84 +/- 8.93 microgram/mg.cre). Neurobehavioral results showed that the scores of DSY, PAC and PA in Al electrolytic workers were significantly lower than those of the control group, while the scores of POMSC, POMSF and SRT among Al-exposed workers were significantly higher. Autonomic nervous function tests showed that the R-R interval variability (maximum ratio on immediately standing up) in Al electrolytic workers was decreased compared with the control group, while BP-IS, HR-V, HR-DB and R30:15 showed no significant change. Peripheral blood lymphocyte subset tests showed that the CD4-CD8+ T lymphocyte subset in Al electrolytic workers was increased. This study suggests that Al exposure exerts adverse effects on neurobehavioral performance, especially movement coordination and negative mood, and on parasympathetic nervous function; moreover, it increases the CD4-CD8+ T lymphocyte subset.

  18. GenoCore: A simple and fast algorithm for core subset selection from large genotype datasets.

    PubMed

    Jeong, Seongmun; Kim, Jae-Yoon; Jeong, Soon-Chun; Kang, Sung-Taeg; Moon, Jung-Kyung; Kim, Namshin

    2017-01-01

    Selecting core subsets from plant genotype datasets is important for enhancing cost-effectiveness and shortening the time required for analyses such as genome-wide association studies (GWAS) and genomics-assisted breeding of crop species. Recently, large numbers of genetic markers (>100,000 single nucleotide polymorphisms) have been identified from high-density SNP arrays and next-generation sequencing (NGS) data. However, no software has been available for picking an efficient and consistent core subset from such a huge dataset; software is needed that can coherently extract genetically important samples from a population. We present a new program, GenoCore, which can quickly and efficiently find a core subset representing the entire population. We introduce simple measures of coverage and diversity scores, which reflect genotype errors and genetic variations, and help to select samples rapidly and accurately for crop genotype datasets. Comparisons of our method to other core collection software on example datasets validate its performance with respect to genetic distance, diversity, coverage, required system resources, and the number of selected samples. GenoCore selects the smallest, most consistent, and most representative core collection from all samples, using less memory, with more efficient scores, and with greater genetic coverage compared to the other software tested. GenoCore was written in R, and can be accessed online with an example dataset and test results at https://github.com/lovemun/Genocore.
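
    GenoCore's coverage and diversity scores are defined in the paper; the greedy pattern such core selectors follow, repeatedly adding the sample that most increases a coverage score until a target is met, can be sketched as follows (the coverage function is a stand-in):

```python
def greedy_core_subset(samples, coverage, target=0.99):
    """Greedily build a core subset until the coverage score reaches `target`.

    coverage(subset) -> float in [0, 1], e.g. the fraction of marker alleles
    in the full population that the subset represents (must accept []).
    """
    core, remaining = [], list(samples)
    while remaining and coverage(core) < target:
        best = max(remaining, key=lambda s: coverage(core + [s]))  # biggest gain
        core.append(best)
        remaining.remove(best)
    return core
```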

  19. Performance Analysis of Relay Subset Selection for Amplify-and-Forward Cognitive Relay Networks

    PubMed Central

    Qureshi, Ijaz Mansoor; Malik, Aqdas Naveed; Zubair, Muhammad

    2014-01-01

    Cooperative communication is regarded as a key technology in wireless networks, including cognitive radio networks (CRNs); it increases the diversity order of the signal to combat the unfavorable effects of fading channels by allowing distributed terminals to collaborate through sophisticated signal processing. In underlay CRNs, strict interference constraints are imposed on the secondary users (SUs) active in the frequency band of the primary users (PUs), which limits their transmit power and coverage area. Relay selection offers a potential solution to the challenges faced by underlay networks by selecting either the single best relay or a subset of the potential relay set under different design requirements and assumptions. The best relay selection schemes proposed in the literature for amplify-and-forward (AF) based underlay cognitive relay networks have been well studied in terms of outage probability (OP) and bit error rate (BER), whereas multiple relay selection schemes have received far less attention. The novelty of this work is to study the outage behavior of multiple relay selection in the underlay CRN and derive closed-form expressions for the OP and BER through the cumulative distribution function (CDF) of the SNR received at the destination. The effectiveness of relay subset selection is shown through simulation results. PMID:24737980
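
    The paper derives closed-form expressions; as an independent sanity check, the outage probability of selecting the m best of n relay links can be estimated by Monte Carlo under a deliberately simplified channel model (all parameters illustrative):

```python
import numpy as np

def outage_probability(n_relays=6, m=2, snr_db=10.0, rate=1.0, trials=200_000,
                       rng=np.random.default_rng(1)):
    """Monte Carlo outage of selecting the m best of n relay links.

    Simplification: each end-to-end relay SNR is modeled as a single
    exponential (Rayleigh) gain; selected branches are combined by summing
    SNRs (maximal-ratio combining).
    """
    snr = 10 ** (snr_db / 10)
    gains = rng.exponential(scale=1.0, size=(trials, n_relays)) * snr
    best_m = np.sort(gains, axis=1)[:, -m:].sum(axis=1)   # pick the m strongest links
    capacity = 0.5 * np.log2(1 + best_m)                  # 1/2 pre-log: two-hop relaying
    return np.mean(capacity < rate)                       # fraction of outage events

print(outage_probability())
```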

  1. In Silico Syndrome Prediction for Coronary Artery Disease in Traditional Chinese Medicine

    PubMed Central

    Lu, Peng; Chen, Jianxin; Zhao, Huihui; Gao, Yibo; Luo, Liangtao; Zuo, Xiaohan; Shi, Qi; Yang, Yiping; Yi, Jianqiang; Wang, Wei

    2012-01-01

    Coronary artery disease (CAD) is one of the leading causes of death in the world. The differentiation of syndrome (ZHENG) is the criterion for diagnosis and therapy in Traditional Chinese Medicine (TCM); therefore, in silico syndrome prediction can improve the performance of treatment. In this paper, we present a Bayesian network framework to construct a high-confidence syndrome predictor based on an optimal symptom subset collected by Support Vector Machine (SVM) feature selection. Syndromes of CAD can be divided into asthenia and sthenia syndromes. Following the hierarchical structure of syndromes, we first label every case with one of three syndrome types (asthenia, sthenia, or both) to handle patients presenting several syndromes. On the basis of these three syndrome classes, we design SVM feature selection to obtain the optimal symptom subset and compare this subset with Markov blanket feature selection using ROC analysis. Using this subset, six predictors of CAD syndromes are constructed by the Bayesian network technique. We also compare Naïve Bayes, C4.5, Logistic, and radial basis function (RBF) network classifiers with the Bayesian network. In conclusion, the Bayesian network method based on the optimal symptom subset provides a practical way to predict the six syndromes of CAD in TCM. PMID:22567030

  2. Verification Techniques for Parameter Selection and Bayesian Model Calibration Presented for an HIV Model

    NASA Astrophysics Data System (ADS)

    Wentworth, Mami Tonoe

    Uncertainty quantification plays an important role when making predictive estimates of model responses. In this context, uncertainty quantification is defined as quantifying and reducing uncertainties, and the objective is to quantify uncertainties in parameters, models and measurements, and to propagate those uncertainties through the model, so that one can make a predictive estimate with quantified uncertainties. Two aspects of uncertainty quantification that must be performed prior to propagating uncertainties are model calibration and parameter selection. There are several efficient techniques for these processes; however, the accuracy of these methods is often not verified. This is the motivation for our work, and in this dissertation, we present and illustrate verification frameworks for model calibration and parameter selection in the context of biological and physical models. First, HIV models, developed and improved by [2, 3, 8], describe the viral infection dynamics of HIV disease. These are also used to make predictive estimates of viral loads and T-cell counts and to construct an optimal control for drug therapy. Estimating input parameters is an essential step prior to uncertainty quantification. However, not all the parameters are identifiable, implying that they cannot be uniquely determined by the observations. These unidentifiable parameters can be partially removed by performing parameter selection, a process in which parameters that have minimal impact on the model response are determined. We provide verification techniques for Bayesian model calibration and parameter selection for an HIV model. As an example of a physical model, we employ a heat model with experimental measurements presented in [10]. A steady-state heat model represents prototypical behavior for the heat conduction and diffusion processes involved in a thermal-hydraulic model, which is part of a nuclear reactor model. We employ this simple heat model to illustrate verification techniques for model calibration. For Bayesian model calibration, we employ adaptive Metropolis algorithms to construct densities for input parameters in the heat model and the HIV model. To quantify the uncertainty in the parameters, we employ two MCMC algorithms: Delayed Rejection Adaptive Metropolis (DRAM) [33] and Differential Evolution Adaptive Metropolis (DREAM) [66, 68]. The densities obtained using these methods are compared to those obtained through direct numerical evaluation of Bayes' formula. We also combine uncertainties in input parameters and measurement errors to construct predictive estimates for a model response. A significant emphasis is on the development and illustration of techniques to verify the accuracy of sampling-based Metropolis algorithms. We verify the accuracy of DRAM and DREAM by comparing the chains, densities and correlations obtained using DRAM, DREAM and the direct evaluation of Bayes' formula. We also perform similar analyses for credible and prediction intervals for responses. Once the parameters are estimated, we employ energy statistics tests [63, 64] to compare the densities obtained by the different methods for the HIV model. The energy statistics are used to test the equality of distributions. We also consider parameter selection and verification techniques for models having one or more parameters that are noninfluential in the sense that they minimally impact model outputs.
We illustrate these techniques for a dynamic HIV model but note that the parameter selection and verification framework is applicable to a wide range of biological and physical models. To accommodate the nonlinear input to output relations, which are typical for such models, we focus on global sensitivity analysis techniques, including those based on partial correlations, Sobol indices based on second-order model representations, and Morris indices, as well as a parameter selection technique based on standard errors. A significant objective is to provide verification strategies to assess the accuracy of those techniques, which we illustrate in the context of the HIV model. Finally, we examine active subspace methods as an alternative to parameter subset selection techniques. The objective of active subspace methods is to determine the subspace of inputs that most strongly affect the model response, and to reduce the dimension of the input space. The major difference between active subspace methods and parameter selection techniques is that parameter selection identifies influential parameters whereas subspace selection identifies a linear combination of parameters that impacts the model responses significantly. We employ active subspace methods discussed in [22] for the HIV model and present a verification that the active subspace successfully reduces the input dimensions.
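
    A minimal sketch of the active subspace construction referenced above, assuming gradient samples of the model response are available; this is a generic gradient-based estimate in Python, not the specific implementation of [22], and the function name and interfaces are illustrative:

        import numpy as np

        def active_subspace(grad_samples, k):
            """Estimate a k-dimensional active subspace from sampled gradients.

            grad_samples: (M, p) array; row i is the gradient of the model
            response with respect to the p input parameters at sample i.
            """
            # C = E[grad grad^T], approximated by the sample average.
            C = grad_samples.T @ grad_samples / grad_samples.shape[0]
            # eigh returns eigenvalues of the symmetric matrix C in ascending order.
            eigvals, eigvecs = np.linalg.eigh(C)
            order = np.argsort(eigvals)[::-1]
            # The leading k eigenvectors span the active subspace; a large gap in
            # the spectrum after the k-th eigenvalue justifies the reduction.
            return eigvals[order], eigvecs[:, order[:k]]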

  3. Data Point Averaging for Computational Fluid Dynamics Data

    NASA Technical Reports Server (NTRS)

    Norman, Jr., David (Inventor)

    2016-01-01

    A system and method for generating fluid flow parameter data for use in aerodynamic heating analysis. Computational fluid dynamics data is generated for a number of points in an area on a surface to be analyzed. Sub-areas corresponding to areas of the surface for which an aerodynamic heating analysis is to be performed are identified. A computer system automatically determines a sub-set of the number of points corresponding to each of the number of sub-areas and determines a value for each of the number of sub-areas using the data for the sub-set of points corresponding to each of the number of sub-areas. The value is determined as an average of the data for the sub-set of points corresponding to each of the number of sub-areas. The resulting parameter values then may be used to perform an aerodynamic heating analysis.
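
    A minimal sketch of the averaging step the patent describes, assuming for illustration that each sub-area is an axis-aligned rectangle on the surface (the patent does not fix the sub-area geometry):

        import numpy as np

        def subarea_averages(points, values, subareas):
            """Average CFD point data over sub-areas.

            points:   (N, 2) surface coordinates of the CFD data points
            values:   (N,) flow parameter value at each point
            subareas: iterable of (xmin, xmax, ymin, ymax) bounds
            """
            averages = []
            for xmin, xmax, ymin, ymax in subareas:
                # Determine the subset of points falling inside this sub-area.
                inside = ((points[:, 0] >= xmin) & (points[:, 0] <= xmax) &
                          (points[:, 1] >= ymin) & (points[:, 1] <= ymax))
                # One representative value per sub-area: the mean of its points.
                averages.append(values[inside].mean() if inside.any() else np.nan)
            return np.array(averages)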

  4. Data Point Averaging for Computational Fluid Dynamics Data

    NASA Technical Reports Server (NTRS)

    Norman, David, Jr. (Inventor)

    2014-01-01

    A system and method for generating fluid flow parameter data for use in aerodynamic heating analysis. Computational fluid dynamics data is generated for a number of points in an area on a surface to be analyzed. Sub-areas corresponding to areas of the surface for which an aerodynamic heating analysis is to be performed are identified. A computer system automatically determines a sub-set of the number of points corresponding to each of the number of sub-areas and determines a value for each of the number of sub-areas using the data for the sub-set of points corresponding to each of the number of sub-areas. The value is determined as an average of the data for the sub-set of points corresponding to each of the number of sub-areas. The resulting parameter values then may be used to perform an aerodynamic heating analysis.

  5. Parameter sensitivity and identifiability for a biogeochemical model of hypoxia in the northern Gulf of Mexico

    EPA Science Inventory

    Local sensitivity analyses and identifiable parameter subsets were used to describe numerical constraints of a hypoxia model for bottom waters of the northern Gulf of Mexico. The sensitivity of state variables differed considerably with parameter changes, although most variables ...

  6. Large-scale investigation of the parameters in response to Eimeria maxima challenge in broilers.

    PubMed

    Hamzic, E; Bed'Hom, B; Juin, H; Hawken, R; Abrahamsen, M S; Elsen, J M; Servin, B; Pinard-van der Laan, M H; Demeure, O

    2015-04-01

    Coccidiosis, a parasitic disease of the intestinal tract caused by members of the genera Eimeria and Isospora, is one of the most common and costly diseases in chickens. The aims of this study were to assess the effect of the challenge and the level of variability of the measured parameters in chickens during challenge with Eimeria maxima. Furthermore, this study aimed to investigate which parameters are the most relevant indicators of health status. Finally, the study also aimed to estimate the accuracy of prediction for traits that cannot be measured on a large scale (such as intestinal lesion score and fecal oocyst count) using parameters that can easily be measured on all animals. The study was performed in 2 parts: a pilot challenge on 240 animals followed by a large-scale challenge on 2,024 animals. In both experiments, animals were challenged with 50,000 Eimeria maxima oocysts at 16 d of age. In the pilot challenge, all animals were measured for BW gain, plasma coloration, hematocrit, and rectal temperature, and, in addition, a subset of 48 animals was measured for oocyst count and intestinal lesion score. All animals from the second challenge were measured for BW gain, plasma coloration, and hematocrit, whereas a subset of 184 animals was measured for intestinal lesion score, fecal oocyst count, blood parameters, and plasma protein content and composition. Most of the parameters measured were significantly affected by the challenge. Lesion scores for duodenum and jejunum (P < 0.001), oocyst count (P < 0.05), plasma coloration for the optical density values between 450 and 490 nm (P < 0.001), albumin (P < 0.001), α1-globulin (P < 0.01), α2-globulin (P < 0.001), α3-globulin (P < 0.01), and β2-globulin (P < 0.001) were the most strongly affected parameters and showed the greatest levels of variation. Plasma protein profiles proved to be a new, reliable parameter for measuring the response to Eimeria maxima. Prediction of intestinal lesion score and fecal oocyst count using the other parameters measured was not very precise (R² < 0.7). The study was successfully performed in real raising conditions on a large scale. Finally, we observed high variability in the response to the challenge, suggesting that broilers' response to Eimeria maxima has a strong genetic determinism, which may be improved by genetic selection.

  7. Choosing non-redundant representative subsets of protein sequence data sets using submodular optimization.

    PubMed

    Libbrecht, Maxwell W; Bilmes, Jeffrey A; Noble, William Stafford

    2018-04-01

    Selecting a non-redundant representative subset of sequences is a common step in many bioinformatics workflows, such as the creation of non-redundant training sets for sequence and structural models or selection of "operational taxonomic units" from metagenomics data. Previous methods for this task, such as CD-HIT, PISCES, and UCLUST, apply a heuristic threshold-based algorithm that has no theoretical guarantees. We propose a new approach based on submodular optimization. Submodular optimization, a discrete analogue to continuous convex optimization, has been used with great success for other representative set selection problems. We demonstrate that the submodular optimization approach results in representative protein sequence subsets with greater structural diversity than sets chosen by existing methods, using as a gold standard the SCOPe library of protein domain structures. In this setting, submodular optimization consistently yields protein sequence subsets that include more SCOPe domain families than sets of the same size selected by competing approaches. We also show how the optimization framework allows us to design a mixture objective function that performs well for both large and small representative sets. The framework we describe is the best possible in polynomial time (under some assumptions), and it is flexible and intuitive because it applies a suite of generic methods to optimize one of a variety of objective functions. © 2018 Wiley Periodicals, Inc.
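
    To make the approach concrete, the sketch below runs the standard greedy algorithm for monotone submodular maximization on a facility-location objective over a pairwise similarity matrix; this is one plausible instance of the framework, and the paper's exact objective functions, including its mixture objective, may differ:

        import numpy as np

        def greedy_facility_location(sim, k):
            """Greedily pick k representatives maximizing the facility-location
            objective f(S) = sum_i max_{j in S} sim[i, j].

            sim: (N, N) nonnegative pairwise similarity matrix. Because f is
            monotone submodular, the greedy solution carries the classic
            (1 - 1/e) approximation guarantee.
            """
            n = sim.shape[0]
            selected = []
            coverage = np.zeros(n)  # max similarity of each item to the set so far
            for _ in range(k):
                # Marginal gain of adding each candidate column j.
                gains = np.maximum(sim, coverage[:, None]).sum(axis=0) - coverage.sum()
                gains[selected] = -np.inf  # do not reselect
                best = int(np.argmax(gains))
                selected.append(best)
                coverage = np.maximum(coverage, sim[:, best])
            return selected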

  8. An Active RBSE Framework to Generate Optimal Stimulus Sequences in a BCI for Spelling

    NASA Astrophysics Data System (ADS)

    Moghadamfalahi, Mohammad; Akcakaya, Murat; Nezamfar, Hooman; Sourati, Jamshid; Erdogmus, Deniz

    2017-10-01

    A class of brain-computer interfaces (BCIs) employs noninvasive recordings of electroencephalography (EEG) signals to enable users with severe speech and motor impairments to interact with their environment and social network. For example, EEG-based BCIs for typing popularly utilize event related potentials (ERPs) for inference. Presentation paradigm designs in current ERP-based letter-by-letter typing BCIs typically query the user with an arbitrary subset of characters. However, both typing accuracy and typing speed can potentially be enhanced with more informed subset selection and flash assignment. In this manuscript, we introduce the active recursive Bayesian state estimation (active-RBSE) framework for inference and sequence optimization. Prior to presentation in each iteration, rather than showing a subset of randomly selected characters, the developed framework optimally selects a subset based on a query function. The selected queries are adapted to the user during each intent detection. Through a simulation-based study, we assess the effect of active-RBSE on the performance of a language-model-assisted typing BCI in terms of typing speed and accuracy. To provide a baseline for comparison, we also utilize standard presentation paradigms, namely the row-and-column matrix presentation paradigm and random rapid serial visual presentation paradigms. The results show that utilization of active-RBSE can enhance the online performance of the system, both in terms of typing accuracy and speed.

  9. Variable selection with stepwise and best subset approaches

    PubMed Central

    2016-01-01

    While purposeful selection is performed partly by software and partly by hand, the stepwise and best subset approaches are automatically performed by software. Two R functions, stepAIC() and bestglm(), are well designed for stepwise and best subset regression, respectively. The stepAIC() function begins with a full or null model, and the method for stepwise regression can be specified in the direction argument with the character values “forward”, “backward” and “both”. The bestglm() function begins with a data frame containing the explanatory variables and the response variable. The response variable should be in the last column. A variety of goodness-of-fit criteria can be specified in the IC argument. The Bayesian information criterion (BIC) usually results in a more parsimonious model than the Akaike information criterion (AIC). PMID:27162786
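
    For readers working outside R, the same best subset idea can be sketched in a few lines of Python; this is an illustrative analogue of bestglm() built on statsmodels, not the package itself, and the exhaustive search is only feasible for a modest number of predictors:

        import itertools
        import numpy as np
        import statsmodels.api as sm

        def best_subset_ols(X, y, max_size=None):
            """Exhaustive best subset selection by AIC; use model.bic instead
            for the more parsimonious BIC-based choice noted above."""
            p = X.shape[1]
            max_size = max_size or p
            best_aic, best_cols = np.inf, None
            for r in range(1, max_size + 1):
                for cols in itertools.combinations(range(p), r):
                    model = sm.OLS(y, sm.add_constant(X[:, list(cols)])).fit()
                    if model.aic < best_aic:
                        best_aic, best_cols = model.aic, cols
            return best_cols, best_aic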

  10. Long-Term Variations of the EOP and ICRF2

    NASA Technical Reports Server (NTRS)

    Zharov, Vladimir; Sazhin, Mikhail; Sementsov, Valerian; Sazhina, Olga

    2010-01-01

    We analyzed the time series of the coordinates of the ICRF radio sources. We show that some of the radio sources, including the defining sources, exhibit significant apparent motion. The stability of the celestial reference frame is provided by a no-net-rotation condition applied to the defining sources. In our case this condition leads to a rotation of the frame axes with time. We calculated the effect of this rotation on the Earth orientation parameters (EOP). In order to improve the stability of the celestial reference frame we suggest a new method for the selection of the defining sources. The method consists of two criteria: the first we call cosmological and the second kinematical. It is shown that a subset of the ICRF sources selected according to the cosmological criterion provides the most stable reference frame for the next decade.

  11. Editorial highlighting and highly cited papers

    NASA Astrophysics Data System (ADS)

    Antonoyiannakis, Manolis

    Editorial highlighting (the process whereby journal editors select, at the time of publication, a small subset of papers that are ostensibly of higher quality, importance, or interest) is by now a widespread practice among major scientific journal publishers. Depending on the venue, and the extent to which editorial resources are invested in the process, highlighted papers appear as News & Views, Research Highlights, Perspectives, Editors' Choice, IOP Select, Editors' Summary, Spotlight on Optics, Editors' Picks, Viewpoints, Synopses, Editors' Suggestions, etc. Here, we look at the relation between highlighted papers and highly influential papers, which we define at two levels: having received enough citations to be among (i) the top few percent of their journal, or (ii) the top 1% of all physics papers. Using multiple linear regression and multilevel regression modeling, we examine the parameters associated with highly influential papers. We briefly comment on cause-and-effect relationships between the citedness and highlighting of papers.

  12. Joint constraints on galaxy bias and σ_8 through the N-pdf of the galaxy number density

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Arnalte-Mur, Pablo; Martínez, Vicent J.; Vielva, Patricio

    We present a full description of the N-probability density function (N-pdf) of the galaxy number density fluctuations. This N-pdf is given in terms, on the one hand, of the cold dark matter correlations and, on the other hand, of the galaxy bias parameter. The method relies on the commonly adopted assumption that the dark matter density fluctuations follow a local non-linear transformation of the initial energy density perturbations. The N-pdf of the galaxy number density fluctuations allows for an optimal estimation of the bias parameter (e.g., via maximum-likelihood estimation, or Bayesian inference if there exists any a priori information on the bias parameter), and of those parameters defining the dark matter correlations, in particular its amplitude (σ_8). It also provides the proper framework to perform model selection between two competitive hypotheses. The parameter estimation capabilities of the N-pdf are demonstrated with SDSS-like simulations (both ideal log-normal simulations and mocks obtained from Las Damas simulations), showing that our estimator is unbiased. We apply our formalism to the 7th release of the SDSS main sample (for a volume-limited subset with absolute magnitudes M_r ≤ −20). We obtain b̂ = 1.193 ± 0.074 and σ̄_8 = 0.862 ± 0.080 for galaxy number density fluctuations in cells of size 30 h⁻¹ Mpc. Different model selection criteria show that galaxy biasing is clearly favoured.

  13. Sensitivity analysis of simulated SOA loadings using a variance-based statistical approach: SENSITIVITY ANALYSIS OF SOA

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shrivastava, Manish; Zhao, Chun; Easter, Richard C.

    We investigate the sensitivity of secondary organic aerosol (SOA) loadings simulated by a regional chemical transport model to 7 selected tunable model parameters: 4 involving emissions of anthropogenic and biogenic volatile organic compounds, anthropogenic semi-volatile and intermediate volatility organics (SIVOCs), and NOx; 2 involving dry deposition of SOA precursor gases; and 1 involving particle-phase transformation of SOA to low volatility. We adopt a quasi-Monte Carlo sampling approach to effectively sample the high-dimensional parameter space, and perform a 250-member ensemble of simulations using a regional model, accounting for some of the latest advances in SOA treatments based on our recent work. We then conduct a variance-based sensitivity analysis using the generalized linear model method to study the responses of simulated SOA loadings to the tunable parameters. Analysis of SOA variance from all 250 simulations shows that the volatility transformation parameter, which controls whether particle-phase transformation of SOA from semi-volatile to non-volatile is on or off, is the dominant contributor to the variance of simulated surface-level daytime SOA (65% domain-average contribution). We also split the simulations into 2 subsets of 125 each, depending on whether the volatility transformation is turned on or off. For each subset, the SOA variances are dominated by the parameters involving biogenic VOC and anthropogenic SIVOC emissions. Furthermore, biogenic VOC emissions have a larger contribution to SOA variance when the SOA transformation to non-volatile is on, while anthropogenic SIVOC emissions have a larger contribution when the transformation is off. NOx contributes less than 4.3% to SOA variance, and this low contribution is mainly attributed to the dominance of intermediate to high NOx conditions throughout the simulated domain. The two parameters related to dry deposition of SOA precursor gases also have very low contributions to SOA variance. This study highlights the large sensitivity of SOA loadings to the particle-phase transformation of SOA volatility, which is neglected in most previous models.
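
    A hedged sketch of the core quantity behind such variance-based analysis: the first-order index S_i = Var(E[Y | X_i]) / Var(Y) for each tunable parameter, estimated here by a crude quantile-binning approximation of the conditional mean rather than the generalized linear model method the study actually uses:

        import numpy as np

        def first_order_sensitivity(X, y, bins=20):
            """First-order variance-based sensitivity indices from an ensemble.

            X: (M, p) parameter samples; y: (M,) simulated outputs.
            """
            S = np.empty(X.shape[1])
            for i in range(X.shape[1]):
                edges = np.quantile(X[:, i], np.linspace(0.0, 1.0, bins + 1))
                idx = np.digitize(X[:, i], edges[1:-1])  # bin labels 0..bins-1
                var_cond = 0.0
                for b in range(bins):
                    mask = idx == b
                    if mask.any():
                        # Probability-weighted squared deviation of the bin mean.
                        var_cond += mask.mean() * (y[mask].mean() - y.mean()) ** 2
                S[i] = var_cond / y.var()
            return S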

  14. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Anderson, Mark A.; Bigelow, Matthew; Gilkey, Jeff C.

    The Super Strypi SWIL is a six-degree-of-freedom (6DOF) simulation for the Super Strypi Launch Vehicle that includes a subset of the Super Strypi NGC software (guidance, ACS, and sequencer). Aerodynamic and propulsive forces, mass properties, ACS (attitude control system) parameters, guidance parameters, and Monte Carlo parameters are defined in input files. Output parameters are saved to a Matlab .mat file.

  15. An enhancement of binary particle swarm optimization for gene selection in classifying cancer classes

    PubMed Central

    2013-01-01

    Background Gene expression data can be of considerable help in developing proficient cancer diagnosis and classification platforms. Recently, many researchers have analyzed gene expression data using diverse computational intelligence methods to select a small subset of informative genes from the data for cancer classification. Many computational methods face difficulties in selecting small subsets due to the small number of samples compared to the huge number of genes (high dimensionality), irrelevant genes, and noisy genes. Methods We propose an enhanced binary particle swarm optimization to perform the selection of small subsets of informative genes, which is significant for cancer classification. Particle speed, a rule, and a modified sigmoid function are introduced in this proposed method to increase the probability that the bits in a particle's position are zero. The method was empirically applied to a suite of ten well-known benchmark gene expression data sets. Results The performance of the proposed method proved superior to previous related work, including the conventional version of binary particle swarm optimization (BPSO), in terms of classification accuracy and the number of selected genes. The proposed method also requires less computational time than BPSO. PMID:23617960
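
    For orientation, the sketch below is plain BPSO for subset selection; the paper's enhancement replaces the standard sigmoid update used here with a modified sigmoid, together with particle speed rules, that biases bits toward zero so smaller gene subsets are favoured. The fitness signature is an assumption for illustration:

        import numpy as np

        def bpso_select(fitness, n_genes, n_particles=30, iters=100,
                        w=0.9, c1=2.0, c2=2.0, seed=0):
            """Plain binary PSO; fitness(mask) returns a score to maximize,
            e.g. classifier accuracy penalized by subset size (mask is 0/1)."""
            rng = np.random.default_rng(seed)
            X = (rng.random((n_particles, n_genes)) < 0.5).astype(float)
            V = rng.uniform(-4.0, 4.0, (n_particles, n_genes))
            pbest = X.copy()
            pbest_f = np.array([fitness(x) for x in X])
            gbest = pbest[np.argmax(pbest_f)].copy()
            for _ in range(iters):
                r1, r2 = rng.random(V.shape), rng.random(V.shape)
                V = np.clip(w * V + c1 * r1 * (pbest - X) + c2 * r2 * (gbest - X),
                            -4.0, 4.0)
                # Standard sigmoid rule: each bit is 1 with probability sigmoid(V).
                X = (rng.random(V.shape) < 1.0 / (1.0 + np.exp(-V))).astype(float)
                f = np.array([fitness(x) for x in X])
                better = f > pbest_f
                pbest[better], pbest_f[better] = X[better], f[better]
                gbest = pbest[np.argmax(pbest_f)].copy()
            return gbest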

  16. Weak-lensing mass calibration of redMaPPer galaxy clusters in Dark Energy Survey Science Verification data

    DOE PAGES

    Melchior, P.; Gruen, D.; McClintock, T.; ...

    2017-05-16

    Here, we use weak-lensing shear measurements to determine the mean mass of optically selected galaxy clusters in Dark Energy Survey Science Verification data. In a blinded analysis, we split the sample of more than 8000 redMaPPer clusters into 15 subsets, spanning ranges in the richness parameter 5 ≤ λ ≤ 180 and redshift 0.2 ≤ z ≤ 0.8, and fit the averaged mass density contrast profiles with a model that accounts for seven distinct sources of systematic uncertainty: shear measurement and photometric redshift errors; cluster-member contamination; miscentring; deviations from the NFW halo profile; halo triaxiality and line-of-sight projections.

  17. Weak-lensing mass calibration of redMaPPer galaxy clusters in Dark Energy Survey Science Verification data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Melchior, P.; Gruen, D.; McClintock, T.

    Here, we use weak-lensing shear measurements to determine the mean mass of optically selected galaxy clusters in Dark Energy Survey Science Verification data. In a blinded analysis, we split the sample of more than 8000 redMaPPer clusters into 15 subsets, spanning ranges in the richness parameter 5 ≤ λ ≤ 180 and redshift 0.2 ≤ z ≤ 0.8, and fit the averaged mass density contrast profiles with a model that accounts for seven distinct sources of systematic uncertainty: shear measurement and photometric redshift errors; cluster-member contamination; miscentring; deviations from the NFW halo profile; halo triaxiality and line-of-sight projections.

  18. Aggregating job exit statuses of a plurality of compute nodes executing a parallel application

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Aho, Michael E.; Attinella, John E.; Gooding, Thomas M.

    Aggregating job exit statuses of a plurality of compute nodes executing a parallel application, including: identifying a subset of compute nodes in the parallel computer to execute the parallel application; selecting one compute node in the subset of compute nodes in the parallel computer as a job leader compute node; initiating execution of the parallel application on the subset of compute nodes; receiving an exit status from each compute node in the subset of compute nodes, where the exit status for each compute node includes information describing execution of some portion of the parallel application by the compute node; aggregating each exit status from each compute node in the subset of compute nodes; and sending an aggregated exit status for the subset of compute nodes in the parallel computer.

  19. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ditzler, Gregory; Morrison, J. Calvin; Lan, Yemin

    Background: Some of the current software tools for comparative metagenomics provide ecologists with the ability to investigate and explore bacterial communities using α- and β-diversity. Feature subset selection, a sub-field of machine learning, can also provide unique insight into the differences between metagenomic or 16S phenotypes. In particular, feature subset selection methods can identify the operational taxonomic units (OTUs), or functional features, that have a high level of influence on the condition being studied. For example, in a previous study we used information-theoretic feature selection to understand the differences between protein family abundances that best discriminate between age groups in the human gut microbiome. Results: We have developed a new Python command line tool for microbial ecologists, compatible with the widely adopted BIOM format, that implements information-theoretic subset selection methods for biological data formats. We demonstrate the software tool's capabilities on publicly available datasets. Conclusions: We have made the software implementation of Fizzy available to the public under the GNU GPL license. The standalone implementation can be found at http://github.com/EESI/Fizzy.
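
    As a minimal stand-in for what such information-theoretic selection looks like in Python (Fizzy implements a suite of criteria; this sketch uses scikit-learn's mutual information estimator rather than Fizzy's own code, and the function name is illustrative):

        import numpy as np
        from sklearn.feature_selection import mutual_info_classif

        def top_features_by_mi(abundance, phenotype, k=25):
            """Rank OTU or functional-feature columns by estimated mutual
            information with the phenotype label and keep the top k."""
            mi = mutual_info_classif(abundance, phenotype)
            return np.argsort(mi)[::-1][:k], mi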

  20. Subset selective search on the basis of color and preview.

    PubMed

    Donk, Mieke

    2017-01-01

    In the preview paradigm observers are presented with one set of elements (the irrelevant set) followed by the addition of a second set among which the target is presented (the relevant set). Search efficiency in such a preview condition has been demonstrated to be higher than that in a full-baseline condition in which both sets are simultaneously presented, suggesting that a preview of the irrelevant set reduces its influence on the search process. However, numbers of irrelevant and relevant elements are typically not independently manipulated. Moreover, subset selective search also occurs when both sets are presented simultaneously but differ in color. The aim of the present study was to investigate how numbers of irrelevant and relevant elements contribute to preview search in the absence and presence of a color difference between subsets. In two experiments it was demonstrated that a preview reduced the influence of the number of irrelevant elements in the absence but not in the presence of a color difference between subsets. In the presence of a color difference, a preview lowered the effect of the number of relevant elements but only when the target was defined by a unique feature within the relevant set (Experiment 1); when the target was defined by a conjunction of features (Experiment 2), search efficiency as a function of the number of relevant elements was not modulated by a preview. Together the results are in line with the idea that subset selective search is based on different simultaneously operating mechanisms.

  1. Finding minimum gene subsets with heuristic breadth-first search algorithm for robust tumor classification

    PubMed Central

    2012-01-01

    Background Previous studies on tumor classification based on gene expression profiles suggest that gene selection plays a key role in improving classification performance. Moreover, finding important tumor-related genes with the highest accuracy is a very important task, because these genes might serve as tumor biomarkers, which is of great benefit not only to tumor molecular diagnosis but also to drug development. Results This paper proposes a novel gene selection method with rich biomedical meaning based on a Heuristic Breadth-first Search Algorithm (HBSA) to find as many optimal gene subsets as possible. Due to the curse of dimensionality, this type of method can suffer from over-fitting and selection bias problems. To address these potential problems, an HBSA-based ensemble classifier is constructed using a majority voting strategy from individual classifiers built on the selected gene subsets, and a novel HBSA-based gene ranking method is designed to find important tumor-related genes by measuring the significance of genes via their occurrence frequencies in the selected gene subsets. The experimental results on nine tumor datasets, including three pairs of cross-platform datasets, indicate that the proposed method can not only obtain better generalization performance but also find many important tumor-related genes. Conclusions It is found that the frequencies of the selected genes follow a power-law distribution, indicating that only a few top-ranked genes can be used as potential diagnostic biomarkers. Moreover, the top-ranked genes leading to very high prediction accuracy are closely related to specific tumor subtypes and even hub genes. Compared with other related methods, the proposed method achieves higher prediction accuracy with fewer genes. The findings are further justified by analyses of the top-ranked genes in the context of individual gene function, biological pathways, and protein-protein interaction networks. PMID:22830977

  2. Evaluation of intranuclear BrdU detection procedures for use in multicolor flow cytometry*

    PubMed Central

    Rothaeusler, Kristina; Baumgarth, Nicole

    2010-01-01

    Background Measurement of cell proliferation via BrdU incorporation in combination with multicolor cell surface staining would facilitate studies on cell subsets that require multiple markers for their identification. However, the extent to which the often harsh cell preparation procedures required affect the staining quality of more recently developed fluorescent dyes has not been assessed. Methods Three cell preparation protocols for BrdU measurement were compared for their ability to maintain fluorescent surface staining and scatter parameters of in vivo BrdU-labeled cells by flow cytometry. A 10-color fluorescent panel was developed to test the quality of surface staining following cell treatment and the ability to perform BrdU measurements on even small B lymphocyte subsets. Results All cell preparation procedures affected the quality of fluorescent and/or scatter parameters to varying degrees. Paraformaldehyde / saponin-based procedures preserved sufficient fluorescent surface staining to determine BrdU incorporation rates among all splenic B cell subsets, including B-1a cells, which constitute roughly 0.5% of cells. Turnover rates of B-1a cells were similar to immature B cells and higher than those of the other mature B cell subsets. Conclusion Paraformaldehyde / saponin-based cell preparation procedures facilitate detailed cell turnover studies on small cell subsets in vivo, revealing new functional information on rare cell populations. PMID:16538653

  3. Feature Selection for Speech Emotion Recognition in Spanish and Basque: On the Use of Machine Learning to Improve Human-Computer Interaction

    PubMed Central

    Arruti, Andoni; Cearreta, Idoia; Álvarez, Aitor; Lazkano, Elena; Sierra, Basilio

    2014-01-01

    Study of emotions in human–computer interaction is a growing research area. This paper presents an attempt to select the most significant features for emotion recognition in spoken Basque and Spanish using different methods for feature selection. The RekEmozio database was used as the experimental data set. Several Machine Learning paradigms were used for the emotion classification task. Experiments were executed in three phases, using different sets of features as classification variables in each phase. Moreover, feature subset selection was applied at each phase in order to seek the most relevant feature subset. The three-phase approach was selected to check the validity of the proposed approach. The achieved results show that an instance-based learning algorithm using feature subset selection techniques based on evolutionary algorithms is the best Machine Learning paradigm for automatic emotion recognition across all feature sets, obtaining a mean emotion recognition rate of 80.05% in Basque and 74.82% in Spanish. In order to check the goodness of the proposed process, a greedy search approach (FSS-Forward) was also applied and a comparison between the two is provided. Based on the achieved results, a set of the most relevant non-speaker-dependent features is proposed for both languages and new perspectives are suggested. PMID:25279686

  4. Reference ranges of hematology and lymphocyte subsets in healthy Korean native cattle (Hanwoo) and Holstein dairy cattle.

    PubMed

    Kim, Yun-Mi; Lee, Jin-A; Jung, Bock-Gie; Kim, Tae-Hoon; Lee, Bong-Joo; Suh, Guk-Hyun

    2016-06-01

    There are no accurate reference ranges for hematology parameters and lymphocyte subsets in Korean native beef cattle (Hanwoo). This study was performed to establish reliable reference ranges of hematology and lymphocyte subsets using a large number of Hanwoo cattle (n = 350) and to compare differences between Hanwoo and Holstein dairy cattle (n = 334). Additionally, age-related changes in lymphocyte subsets were studied. Bovine leukocyte subpopulation analysis was performed using mono or dual color flow cytometry. The leukocyte subpopulations investigated in healthy cattle included: CD2(+) cells, sIgM(+) cells, MHC class II(+) cells, CD3(+) CD4(+) cells, CD3(+) CD8(+) cells, and WC1(+) cells. Although Hanwoo and Holstein cattle are the same species, results showed several differences in hematology and lymphocyte subsets between Hanwoo and Holstein cattle. This study is the first report to establish reference ranges of hematology and lymphocyte subsets in adult Hanwoo cattle. © 2015 Japanese Society of Animal Science.

  5. Algorithms for Learning Preferences for Sets of Objects

    NASA Technical Reports Server (NTRS)

    Wagstaff, Kiri L.; desJardins, Marie; Eaton, Eric

    2010-01-01

    A method is being developed that provides for an artificial-intelligence system to learn a user's preferences for sets of objects and to thereafter automatically select subsets of objects according to those preferences. The method was originally intended to enable automated selection, from among large sets of images acquired by instruments aboard spacecraft, of image subsets considered to be scientifically valuable enough to justify use of limited communication resources for transmission to Earth. The method is also applicable to other sets of objects: examples of sets of objects considered in the development of the method include food menus, radio-station music playlists, and assortments of colored blocks for creating mosaics. The method does not require the user to perform the often-difficult task of quantitatively specifying preferences; instead, the user provides examples of preferred sets of objects. This method goes beyond related prior artificial-intelligence methods for learning which individual items are preferred by the user: this method supports a concept of set-based preferences, which include not only preferences for individual items but also preferences regarding types and degrees of diversity of items in a set. Consideration of diversity in this method involves recognition that members of a set may interact with each other in the sense that when considered together, they may be regarded as being complementary, redundant, or incompatible to various degrees. The effects of such interactions are loosely summarized in the term portfolio effect. The learning method relies on a preference representation language, denoted DD-PREF, to express set-based preferences. In DD-PREF, a preference is represented by a tuple that includes quality (depth) functions to estimate how desired a specific value is, weights for each feature preference, the desired diversity of feature values, and the relative importance of diversity versus depth. The system applies statistical concepts to estimate quantitative measures of the user's preferences from training examples (preferred subsets) specified by the user. Once preferences have been learned, the system uses those preferences to select preferred subsets from new sets. The method was found to be viable when tested in computational experiments on menus, music playlists, and rover images. Contemplated future development efforts include further tests on more diverse sets and development of a sub-method for (a) estimating the parameter that represents the relative importance of diversity versus depth, and (b) incorporating background knowledge about the nature of quality functions, which are special functions that specify depth preferences for features.

  6. Identification of features in indexed data and equipment therefore

    DOEpatents

    Jarman, Kristin H [Richland, WA; Daly, Don Simone [Richland, WA; Anderson, Kevin K [Richland, WA; Wahl, Karen L [Richland, WA

    2002-04-02

    Embodiments of the present invention provide methods of identifying a feature in an indexed dataset. Such embodiments encompass selecting an initial subset of indices, the initial subset of indices being encompassed by an initial window-of-interest and comprising at least one beginning index and at least one ending index; computing an intensity weighted measure of dispersion for the subset of indices using a subset of responses corresponding to the subset of indices; and comparing the intensity weighted measure of dispersion to a dispersion critical value determined from an expected value of the intensity weighted measure of dispersion under a null hypothesis of no transient feature present. Embodiments of the present invention also encompass equipment configured to perform the methods of the present invention.
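
    One plausible reading of the statistic, sketched below: treat the responses as weights on the indices in the window-of-interest and compute a weighted variance, flagging a transient feature when the statistic is extreme relative to a critical value from its null distribution. The exact functional form in the patent may differ:

        import numpy as np

        def intensity_weighted_dispersion(indices, responses):
            """Intensity-weighted dispersion over a window-of-interest.

            A peak-like feature concentrates intensity near one index, giving
            low dispersion; under the null hypothesis of no transient feature,
            intensity is spread across the whole window.
            """
            w = responses / responses.sum()           # normalize to weights
            mean = np.sum(w * indices)                # intensity-weighted center
            return np.sum(w * (indices - mean) ** 2)  # weighted variance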

  7. Evaluation of biochemical and haematological parameters and prevalence of selected pathogens in feral cats from urban and rural habitats in South Korea.

    PubMed

    Hwang, Jusun; Gottdenker, Nicole; Min, Mi-Sook; Lee, Hang; Chun, Myung-Sun

    2016-06-01

    In this study, we evaluated the potential association between the habitat types of feral cats and the prevalence of selected infectious pathogens and health status based on a set of blood parameters. We live-trapped 72 feral cats from two different habitat types: an urban area (n = 48) and a rural agricultural area (n = 24). We compared blood values and the prevalence of feline immunodeficiency virus (FIV), feline leukaemia virus (FeLV) and haemotropic Mycoplasma infection in feral cats from the two contrasting habitats. Significant differences were observed in several blood values (haematocrit, red blood cells, blood urea nitrogen, creatinine) depending on the habitat type and/or sex of the cat. Two individuals from the urban area were seropositive for FIV (3.0%), and eight (12.1%) were positive for FeLV infection (five from an urban habitat and three from a rural habitat). Haemoplasma infection was more common. Based on molecular analysis, 38 cats (54.3%) were positive for haemoplasma, with a significantly higher infection rate in cats from rural habitats (70.8%) compared with urban cats (47.8%). Our study recorded haematological and serum biochemical values, and prevalence of selected pathogens in feral cat populations from two different habitat types. A subset of important laboratory parameters from rural cats showed values under or above the corresponding reference intervals for healthy domestic cats, suggesting potential differences in the health status of feral cats depending on the habitat type. Our findings provide information about the association between 1) blood values (haematological and serum biochemistry parameters) and 2) prevalence of selected pathogen infections and different habitat types; this may be important for veterinarians who work with feral and/or stray cats and for overall cat welfare management. © ISFM and AAFP 2015.

  8. Efficient feature subset selection with probabilistic distance criteria. [pattern recognition

    NASA Technical Reports Server (NTRS)

    Chittineni, C. B.

    1979-01-01

    Recursive expressions are derived for efficiently computing the commonly used probabilistic distance measures as a change in the criteria both when a feature is added to and when a feature is deleted from the current feature subset. A combinatorial algorithm for generating all possible r-feature combinations from a given set of s features in (s choose r) steps, with a change of a single feature at each step, is presented. These expressions can also be used for both forward and backward sequential feature selection.
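
    To make the setting concrete, the sketch below runs forward sequential selection with the Bhattacharyya distance between two Gaussian classes as the probabilistic distance criterion; note that the paper's contribution is recursive update formulas that avoid recomputing the criterion from scratch at every step, whereas this naive sketch recomputes it:

        import numpy as np

        def bhattacharyya(mu0, S0, mu1, S1):
            """Bhattacharyya distance between two Gaussian class densities."""
            S = 0.5 * (S0 + S1)
            d = mu1 - mu0
            return (0.125 * d @ np.linalg.solve(S, d)
                    + 0.5 * np.log(np.linalg.det(S)
                                   / np.sqrt(np.linalg.det(S0) * np.linalg.det(S1))))

        def forward_select(mu0, S0, mu1, S1, r):
            """Add, at each step, the feature whose inclusion most increases
            the class separability criterion."""
            selected, remaining = [], list(range(len(mu0)))
            for _ in range(r):
                scores = []
                for f in remaining:
                    cand = selected + [f]
                    idx = np.ix_(cand, cand)
                    scores.append(bhattacharyya(mu0[cand], S0[idx],
                                                mu1[cand], S1[idx]))
                best = remaining[int(np.argmax(scores))]
                selected.append(best)
                remaining.remove(best)
            return selected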

  9. Variable screening via quantile partial correlation

    PubMed Central

    Ma, Shujie; Tsai, Chih-Ling

    2016-01-01

    In quantile linear regression with ultra-high dimensional data, we propose an algorithm for screening all candidate variables and subsequently selecting the relevant predictors. Specifically, we first employ quantile partial correlation for screening, and then we apply the extended Bayesian information criterion (EBIC) for best subset selection. Our proposed method can successfully select predictors when the variables are highly correlated, and it can also identify variables that make a contribution to the conditional quantiles but are marginally uncorrelated or weakly correlated with the response. Theoretical results show that the proposed algorithm can yield the sure screening set. By controlling the false selection rate, model selection consistency can be achieved theoretically. In practice, we propose using EBIC for best subset selection so that the resulting model is screening consistent. Simulation studies demonstrate that the proposed algorithm performs well, and an empirical example is presented. PMID:28943683

  10. Flight Regime Recognition Analysis for the Army UH-60A IMDS Usage

    DTIC Science & Technology

    2006-12-01

    [Only front-matter residue of this report survives extraction: Figure 34, "The Behavior of the Parameter Weight.On.Wheels"; Figure 35, "The Behavior of a Take-off Regime in Subsetting Process"; Figure 36, "Subsetting the Big Data into Smaller Sets (WOW, Flags)". A surviving abstract fragment notes that the lifetime of components can be extended to their true lifetime (Bechhoefer, n.d.), which is directly related to the accurate representation of regimes.]

  11. The efficacy of calibrating hydrologic model using remotely sensed evapotranspiration and soil moisture for streamflow prediction

    NASA Astrophysics Data System (ADS)

    Kunnath-Poovakka, A.; Ryu, D.; Renzullo, L. J.; George, B.

    2016-04-01

    Calibration of spatially distributed hydrologic models is frequently limited by the availability of ground observations. Remotely sensed (RS) hydrologic information provides an alternative source of observations to inform models and extend modelling capability beyond the limits of ground observations. This study examines the capability of RS evapotranspiration (ET) and soil moisture (SM) to calibrate a hydrologic model and their efficacy in improving streamflow predictions. SM retrievals from the Advanced Microwave Scanning Radiometer-EOS (AMSR-E) and daily ET estimates from the CSIRO MODIS ReScaled potential ET (CMRSET) are used to calibrate a selection of parameters of a simplified Australian Water Resource Assessment - Landscape model (AWRA-L). The Shuffled Complex Evolution algorithm (SCE-UA) is employed for parameter estimation at eleven catchments in eastern Australia. A subset of parameters for calibration is selected based on the variance-based Sobol' sensitivity analysis. The efficacy of 15 objective functions for calibration is assessed based on streamflow predictions relative to control cases, and the relative merits of each are discussed. Synthetic experiments were conducted to examine the effect of bias in RS ET observations on calibration. The objective function containing the root mean square deviation (RMSD) of ET results in the best streamflow predictions, and its efficacy is greatest for catchments with medium to high average runoff. The synthetic experiments revealed that an accurate ET product can improve streamflow predictions in catchments with low average runoff.

  12. EmpiriciSN: Re-sampling Observed Supernova/Host Galaxy Populations Using an XD Gaussian Mixture Model

    NASA Astrophysics Data System (ADS)

    Holoien, Thomas W.-S.; Marshall, Philip J.; Wechsler, Risa H.

    2017-06-01

    We describe two new open-source tools written in Python for performing extreme deconvolution Gaussian mixture modeling (XDGMM) and using a conditioned model to re-sample observed supernova and host galaxy populations. XDGMM is a new program that uses Gaussian mixtures to perform density estimation of noisy data using extreme deconvolution (XD) algorithms. Additionally, it has functionality not available in other XD tools. It allows the user to select between the AstroML and Bovy et al. fitting methods and is compatible with scikit-learn machine learning algorithms. Most crucially, it allows the user to condition a model based on the known values of a subset of parameters. This gives the user the ability to produce a tool that can predict unknown parameters based on a model that is conditioned on known values of other parameters. EmpiriciSN is an exemplary application of this functionality, which can be used to fit an XDGMM model to observed supernova/host data sets and predict likely supernova parameters using a model conditioned on observed host properties. It is primarily intended to simulate realistic supernovae for LSST data simulations based on empirical galaxy properties.
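
    The conditioning step rests on the standard Gaussian conditional formulas applied per mixture component, with the component weights re-scaled by the likelihood of the known values. A self-contained sketch of that operation (not XDGMM's actual API):

        import numpy as np
        from scipy.stats import multivariate_normal

        def condition_gmm(weights, means, covs, known_idx, known_vals):
            """Condition a Gaussian mixture on known values of some dimensions."""
            unknown_idx = np.setdiff1d(np.arange(means.shape[1]), known_idx)
            new_w, new_means, new_covs = [], [], []
            for w, mu, S in zip(weights, means, covs):
                Saa = S[np.ix_(known_idx, known_idx)]
                Sba = S[np.ix_(unknown_idx, known_idx)]
                Sbb = S[np.ix_(unknown_idx, unknown_idx)]
                gain = Sba @ np.linalg.inv(Saa)
                # Conditional mean and covariance of the unknown block.
                new_means.append(mu[unknown_idx] + gain @ (known_vals - mu[known_idx]))
                new_covs.append(Sbb - gain @ Sba.T)
                # Re-weight by how well this component explains the known values.
                new_w.append(w * multivariate_normal.pdf(known_vals, mu[known_idx], Saa))
            new_w = np.asarray(new_w)
            return new_w / new_w.sum(), np.asarray(new_means), np.asarray(new_covs)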

  13. EmpiriciSN: Re-sampling Observed Supernova/Host Galaxy Populations Using an XD Gaussian Mixture Model

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Holoien, Thomas W. -S.; Marshall, Philip J.; Wechsler, Risa H.

    We describe two new open-source tools written in Python for performing extreme deconvolution Gaussian mixture modeling (XDGMM) and using a conditioned model to re-sample observed supernova and host galaxy populations. XDGMM is a new program that uses Gaussian mixtures to perform density estimation of noisy data using extreme deconvolution (XD) algorithms. Additionally, it has functionality not available in other XD tools. It allows the user to select between the AstroML and Bovy et al. fitting methods and is compatible with scikit-learn machine learning algorithms. Most crucially, it allows the user to condition a model based on the known values of a subset of parameters. This gives the user the ability to produce a tool that can predict unknown parameters based on a model that is conditioned on known values of other parameters. EmpiriciSN is an exemplary application of this functionality, which can be used to fit an XDGMM model to observed supernova/host data sets and predict likely supernova parameters using a model conditioned on observed host properties. It is primarily intended to simulate realistic supernovae for LSST data simulations based on empirical galaxy properties.

  14. Evaluating experimental design for soil-plant model selection using a Bootstrap Filter and Bayesian model averaging

    NASA Astrophysics Data System (ADS)

    Wöhling, T.; Schöniger, A.; Geiges, A.; Nowak, W.; Gayler, S.

    2013-12-01

    The objective selection of appropriate models for realistic simulations of coupled soil-plant processes is a challenging task, since the processes are complex, not fully understood at larger scales, and highly non-linear. Also, comprehensive data sets are scarce and measurements are uncertain. In the past decades, a variety of models have been developed that exhibit a wide range of complexity in how they approximate the processes in the coupled model compartments. We present a method for evaluating experimental design for maximum confidence in the model selection task. The method considers uncertainty in parameters, measurements and model structures. Advancing the ideas behind Bayesian Model Averaging (BMA), we analyze the changes in posterior model weights and posterior model choice uncertainty when more data are made available. This allows assessing the power of different data types, data densities and data locations in identifying the best model structure from among a suite of plausible models. The models considered in this study are the crop models CERES, SUCROS, GECROS and SPASS, which are coupled to identical routines for simulating soil processes within the modelling framework Expert-N. The four models differ considerably in the degree of detail at which crop growth and root water uptake are represented. Monte Carlo simulations were conducted for each of these models, considering their uncertainty in soil hydraulic properties and selected crop model parameters. Using a Bootstrap Filter (BF), the models were then conditioned on field measurements of soil moisture, matric potential, leaf-area index, and evapotranspiration rates (from eddy-covariance measurements) during a vegetation period of winter wheat at a field site on the Swabian Alb in southwestern Germany. Following our new method, we derived model weights when using all data or different subsets thereof. We discuss to what degree the posterior mean outperforms the prior mean and all individual posterior models, how informative the data types were for reducing the prediction uncertainty of evapotranspiration and deep drainage, and how well the model structure can be identified based on the different data types and subsets. We further analyze the impact of measurement uncertainty and systematic model errors on the effective sample size of the BF and the resulting model weights.

  15. Identification of selection signatures in cattle breeds selected for dairy production.

    PubMed

    Stella, Alessandra; Ajmone-Marsan, Paolo; Lazzari, Barbara; Boettcher, Paul

    2010-08-01

    The genomics revolution has spurred the undertaking of HapMap studies of numerous species, allowing for population genomics to increase the understanding of how selection has created genetic differences between subspecies populations. The objectives of this study were to (1) develop an approach to detect signatures of selection in subsets of phenotypically similar breeds of livestock by comparing single nucleotide polymorphism (SNP) diversity between the subset and a larger population, (2) verify this method in breeds selected for simply inherited traits, and (3) apply this method to the dairy breeds in the International Bovine HapMap (IBHM) study. The data consisted of genotypes for 32,689 SNPs of 497 animals from 19 breeds. For a given subset of breeds, the test statistic was the parametric composite log likelihood (CLL) of the differences in allelic frequencies between the subset and the IBHM for a sliding window of SNPs. The null distribution was obtained by calculating CLL for 50,000 random subsets (per chromosome) of individuals. The validity of this approach was confirmed by obtaining extremely large CLLs at the sites of causative variation for polled (BTA1) and black-coat-color (BTA18) phenotypes. Across the 30 bovine chromosomes, 699 putative selection signatures were detected. The largest CLL was on BTA6 and corresponded to KIT, which is responsible for the piebald phenotype present in four of the five dairy breeds. Potassium channel-related genes were at the site of the largest CLL on three chromosomes (BTA14, -16, and -25) whereas integrins (BTA18 and -19) and serine/arginine rich splicing factors (BTA20 and -23) each had the largest CLL on two chromosomes. On the basis of the results of this study, the application of population genomics to farm animals seems quite promising. Comparisons between breed groups have the potential to identify genomic regions influencing complex traits with no need for complex equipment and the collection of extensive phenotypic records and can contribute to the identification of candidate genes and to the understanding of the biological mechanisms controlling complex traits.

  16. Geographical traceability of wild Boletus edulis based on data fusion of FT-MIR and ICP-AES coupled with data mining methods (SVM)

    NASA Astrophysics Data System (ADS)

    Li, Yun; Zhang, Ji; Li, Tao; Liu, Honggao; Li, Jieqing; Wang, Yuanzhong

    2017-04-01

    In this work, the data fusion strategy of Fourier transform mid-infrared (FT-MIR) spectroscopy and inductively coupled plasma-atomic emission spectrometry (ICP-AES) was used in combination with Support Vector Machines (SVM) to determine the geographic origin of Boletus edulis collected from nine regions of Yunnan Province in China. Firstly, competitive adaptive reweighted sampling (CARS) was used to select an optimal combination of key wavenumbers of the second-derivative FT-MIR spectra, and thirteen elements were ranked by variable importance in projection (VIP) scores. Secondly, thirteen multi-element subsets with the best VIP scores were generated, and each subset was fused with FT-MIR. Finally, the classification models were established by SVM, and the combination of the SVM parameters C and γ (gamma) was optimized by grid search (GS) and by a genetic algorithm (GA). The results showed that both the GS-SVM and GA-SVM models achieved good performance based on the #9 subset, with prediction accuracies in the calibration and validation sets of 81.40% and 90.91%, respectively. In conclusion, the data fusion strategy of FT-MIR and ICP-AES coupled with SVM can be used as a reliable tool for accurate identification of B. edulis, and it offers a useful approach for the quality control of edible mushrooms.
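
    The GS half of that tuning step corresponds to a routine cross-validated grid search over (C, γ); a sketch with scikit-learn, where the grid ranges are illustrative rather than those used in the paper:

        import numpy as np
        from sklearn.model_selection import GridSearchCV
        from sklearn.svm import SVC

        def tune_svm(X, y):
            """Grid search over the RBF-SVM parameters C and gamma."""
            grid = {"C": 2.0 ** np.arange(-5, 16, 2),
                    "gamma": 2.0 ** np.arange(-15, 4, 2)}
            search = GridSearchCV(SVC(kernel="rbf"), grid, cv=5)
            search.fit(X, y)
            return search.best_params_, search.best_score_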

  17. Geographical traceability of wild Boletus edulis based on data fusion of FT-MIR and ICP-AES coupled with data mining methods (SVM).

    PubMed

    Li, Yun; Zhang, Ji; Li, Tao; Liu, Honggao; Li, Jieqing; Wang, Yuanzhong

    2017-04-15

    In this work, the data fusion strategy of Fourier transform mid-infrared (FT-MIR) spectroscopy and inductively coupled plasma-atomic emission spectrometry (ICP-AES) was used in combination with Support Vector Machines (SVM) to determine the geographic origin of Boletus edulis collected from nine regions of Yunnan Province in China. Firstly, competitive adaptive reweighted sampling (CARS) was used to select an optimal combination of key wavenumbers of the second-derivative FT-MIR spectra, and thirteen elements were ranked by variable importance in projection (VIP) scores. Secondly, thirteen multi-element subsets with the best VIP scores were generated, and each subset was fused with FT-MIR. Finally, the classification models were established by SVM, and the combination of the SVM parameters C and γ (gamma) was optimized by grid search (GS) and by a genetic algorithm (GA). The results showed that both the GS-SVM and GA-SVM models achieved good performance based on the #9 subset, with prediction accuracies in the calibration and validation sets of 81.40% and 90.91%, respectively. In conclusion, the data fusion strategy of FT-MIR and ICP-AES coupled with SVM can be used as a reliable tool for accurate identification of B. edulis, and it offers a useful approach for the quality control of edible mushrooms. Copyright © 2017. Published by Elsevier B.V.

  18. Identification of petroleum hydrocarbons using a reduced number of PAHs selected by Procrustes rotation.

    PubMed

    Fernández-Varela, R; Andrade, J M; Muniategui, S; Prada, D; Ramírez-Villalobos, F

    2010-04-01

    Identifying petroleum-related products released into the environment is a complex and difficult task, for which polycyclic aromatic hydrocarbons (PAHs) are nowadays of outstanding importance. Although traditional quantitative fingerprinting uses straightforward univariate statistical analyses to differentiate among oils and to assess their sources, a multivariate strategy based on Procrustes rotation (PR) was applied in this paper. The aim of PR is to select a reduced subset of PAHs that is still capable of satisfactorily identifying petroleum-related hydrocarbons. PR selected two subsets of three (C(2)-naphthalene, C(2)-dibenzothiophene and C(2)-phenanthrene) and five (C(1)-decahidronaphthalene, naphthalene, C(2)-phenanthrene, C(3)-phenanthrene and C(2)-fluoranthene) PAHs for the two datasets studied here. The classification abilities of each subset of PAHs were tested using principal components analysis, hierarchical cluster analysis and Kohonen neural networks, and it was demonstrated that they unraveled the same patterns as the overall set of PAHs. (c) 2009 Elsevier Ltd. All rights reserved.

  19. Accuracy of direct genomic values in Holstein bulls and cows using subsets of SNP markers

    PubMed Central

    2010-01-01

    Background At the current price, the use of high-density single nucleotide polymorphism (SNP) genotyping assays in genomic selection of dairy cattle is limited to applications involving elite sires and dams. The objective of this study was to evaluate the use of low-density assays to predict direct genomic value (DGV) for five milk production traits, an overall conformation trait, a survival index, and two profit index traits (APR, ASI). Methods Dense SNP genotypes were available for 42,576 SNP for 2,114 Holstein bulls and 510 cows. A subset of 1,847 bulls born between 1955 and 2004 was used as a training set to fit models with various sets of pre-selected SNP. A group of 297 bulls born between 2001 and 2004 and all cows born between 1992 and 2004 were used to evaluate the accuracy of DGV prediction. Ridge regression (RR) and partial least squares regression (PLSR) were used to derive prediction equations and to rank SNP based on the absolute values of their regression coefficients. Four alternative strategies were applied to select subsets of SNP, namely: subsets of the highest ranked SNP for each individual trait, or a single subset of evenly spaced SNP, where SNP were selected based on their rank for ASI, APR or minor allele frequency within intervals of approximately equal length. Results RR and PLSR performed very similarly in predicting DGV, with PLSR performing better for low-density assays and RR for higher-density SNP sets. When using all SNP, DGV predictions for production traits, which have a higher heritability, were more accurate (0.52-0.64) than for survival (0.19-0.20), which has a low heritability. The gain in accuracy using subsets that included the highest ranked SNP for each trait was marginal (5-6%) over a common set of evenly spaced SNP when at least 3,000 SNP were used. Subsets containing 3,000 SNP provided more than 90% of the accuracy that could be achieved with a high-density assay for cows, and 80% of the high-density accuracy for young bulls. Conclusions Accurate genomic evaluation of the broader bull and cow population can be achieved with a single genotyping assay containing ~3,000 to 5,000 evenly spaced SNP. PMID:20950478
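
    A compact sketch of the two subset strategies compared above, using ridge regression (RR) for the prediction equations; the regularization strength and the assumption that SNP columns are in map order are illustrative:

        import numpy as np
        from sklearn.linear_model import Ridge

        def dgv_with_snp_subset(G_train, y_train, G_test, n_snp, strategy="ranked"):
            """Fit RR on all SNP, keep either the n_snp markers with the largest
            |effect| or an evenly spaced subset, refit on the subset, and return
            direct genomic value predictions for the test animals."""
            full = Ridge(alpha=1.0).fit(G_train, y_train)
            if strategy == "ranked":
                keep = np.argsort(np.abs(full.coef_))[::-1][:n_snp]
            else:  # evenly spaced across the (map-ordered) SNP columns
                keep = np.linspace(0, G_train.shape[1] - 1, n_snp).astype(int)
            sub = Ridge(alpha=1.0).fit(G_train[:, keep], y_train)
            return sub.predict(G_test[:, keep]), keep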

  20. Predictive ability of direct genomic values for lifetime net merit of Holstein sires using selected subsets of single nucleotide polymorphism markers.

    PubMed

    Weigel, K A; de los Campos, G; González-Recio, O; Naya, H; Wu, X L; Long, N; Rosa, G J M; Gianola, D

    2009-10-01

    The objective of the present study was to assess the predictive ability of subsets of single nucleotide polymorphism (SNP) markers for development of low-cost, low-density genotyping assays in dairy cattle. Dense SNP genotypes of 4,703 Holstein bulls were provided by the USDA Agricultural Research Service. A subset of 3,305 bulls born from 1952 to 1998 was used to fit various models (training set), and a subset of 1,398 bulls born from 1999 to 2002 was used to evaluate their predictive ability (testing set). After editing, data included genotypes for 32,518 SNP and August 2003 and April 2008 predicted transmitting abilities (PTA) for lifetime net merit (LNM$), the latter resulting from progeny testing. The Bayesian least absolute shrinkage and selection operator method was used to regress August 2003 PTA on marker covariates in the training set to arrive at estimates of marker effects and direct genomic PTA. The coefficient of determination (R(2)) from regressing the April 2008 progeny test PTA of bulls in the testing set on their August 2003 direct genomic PTA was 0.375. Subsets of 300, 500, 750, 1,000, 1,250, 1,500, and 2,000 SNP were created by choosing equally spaced and highly ranked SNP, with the latter based on the absolute value of their estimated effects obtained from the training set. The SNP effects were re-estimated from the training set for each subset of SNP, and the 2008 progeny test PTA of bulls in the testing set were regressed on corresponding direct genomic PTA. The R(2) values for subsets of 300, 500, 750, 1,000, 1,250, 1,500, and 2,000 SNP with largest effects (evenly spaced SNP) were 0.184 (0.064), 0.236 (0.111), 0.269 (0.190), 0.289 (0.179), 0.307 (0.228), 0.313 (0.268), and 0.322 (0.291), respectively. These results indicate that a low-density assay comprising selected SNP could be a cost-effective alternative for selection decisions and that significant gains in predictive ability may be achieved by increasing the number of SNP allocated to such an assay from 300 or fewer to 1,000 or more.
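    The contrast between highest-ranked and evenly spaced marker subsets can be sketched as follows (Python, scikit-learn), with two hedges up front: the study used the Bayesian LASSO, whereas this sketch substitutes ordinary LASSO for ranking and ridge regression for re-estimation, and all data are synthetic.

    ```python
    import numpy as np
    from sklearn.linear_model import Lasso, Ridge
    from sklearn.metrics import r2_score

    rng = np.random.default_rng(2)
    n_train, n_test, p = 800, 300, 5000
    X = rng.integers(0, 3, size=(n_train + n_test, p)).astype(float)
    beta = np.zeros(p)
    beta[rng.choice(p, 100, replace=False)] = rng.normal(size=100)
    y = X @ beta + rng.normal(scale=2.0, size=n_train + n_test)
    Xtr, ytr, Xte, yte = X[:n_train], y[:n_train], X[n_train:], y[n_train:]

    # Rank markers by the absolute value of effects estimated in the training set
    effects = Lasso(alpha=0.1, max_iter=5000).fit(Xtr, ytr).coef_

    for k in (300, 1000, 2000):
        top = np.argsort(np.abs(effects))[::-1][:k]       # highest-ranked SNP
        spaced = np.linspace(0, p - 1, k).astype(int)     # evenly spaced SNP
        for name, idx in (("top-ranked", top), ("evenly spaced", spaced)):
            model = Ridge(alpha=10.0).fit(Xtr[:, idx], ytr)  # re-estimate on subset
            print(k, name, round(r2_score(yte, model.predict(Xte[:, idx])), 3))
    ```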

  1. Optimization of OSEM parameters in myocardial perfusion imaging reconstruction as a function of body mass index: a clinical approach*

    PubMed Central

    de Barros, Pietro Paolo; Metello, Luis F.; Camozzato, Tatiane Sabriela Cagol; Vieira, Domingos Manuel da Silva

    2015-01-01

    Objective The present study aims to contribute to identifying the most appropriate OSEM parameters for generating myocardial perfusion imaging reconstructions with the best diagnostic quality, correlating them with patients’ body mass index. Materials and Methods The study included 28 adult patients submitted to myocardial perfusion imaging in a public hospital. The OSEM method was utilized in the image reconstruction with six different combinations of iteration and subset numbers. The images were analyzed by nuclear cardiology specialists, who took their diagnostic value into consideration and indicated the most appropriate images in terms of diagnostic quality. Results An overall scoring analysis demonstrated that the combination of four iterations and four subsets generated the most appropriate images in terms of diagnostic quality for all classes of body mass index; however, the combination of six iterations and four subsets stood out for the higher body mass index classes. Conclusion The use of optimized parameters seems to play a relevant role in generating images with better diagnostic quality, supporting the diagnosis and, consequently, appropriate and effective treatment for the patient. PMID:26543282
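    For readers unfamiliar with the reconstruction algorithm itself, a bare-bones OSEM loop looks roughly like the following (Python/NumPy); the system matrix, data, and the 4-iteration x 4-subset setting are toy stand-ins for a real SPECT geometry.

    ```python
    import numpy as np

    def osem(A, counts, n_iter=4, n_subsets=4, eps=1e-12):
        """Ordered-subsets EM: cycle multiplicative EM updates over
        row subsets of the system matrix A ((m, n); counts: (m,))."""
        m, n = A.shape
        x = np.ones(n)
        subsets = [np.arange(s, m, n_subsets) for s in range(n_subsets)]
        for _ in range(n_iter):
            for idx in subsets:
                As = A[idx]
                ratio = counts[idx] / np.maximum(As @ x, eps)
                x *= (As.T @ ratio) / np.maximum(As.sum(axis=0), eps)
        return x

    # Tiny synthetic example: 64 projection bins, 16-pixel "image"
    rng = np.random.default_rng(3)
    A = rng.random((64, 16))
    x_true = rng.random(16)
    counts = rng.poisson(A @ x_true * 100) / 100.0
    print(np.round(osem(A, counts), 2))
    ```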

  2. Probabilistic streamflow forecasting for hydroelectricity production: A comparison of two non-parametric system identification algorithms

    NASA Astrophysics Data System (ADS)

    Pande, Saket; Sharma, Ashish

    2014-05-01

    This study is motivated by the need to robustly specify, identify, and forecast runoff generation processes for hydroelectricity production. At a minimum, this requires identifying significant predictors of runoff generation and the influence of each such predictor on the runoff response. To this end, we compare two non-parametric algorithms for predictor subset selection. One is an information-theoretic approach that assesses predictor significance (and hence selection) based on the Partial Information (PI) rationale of Sharma and Mehrotra (2014). The other is a frequentist approach that uses the bounds-on-probability-of-error concept of Pande (2005), assesses all possible predictor subsets on the fly, and converges to a predictor subset in a computationally efficient manner. Both algorithms approximate the underlying system by locally constant functions and select predictor subsets corresponding to these functions. The performance of the two algorithms is compared on a set of synthetic case studies as well as a real-world case study of inflow forecasting. References: Sharma, A., and R. Mehrotra (2014), An information theoretic alternative to model a natural system using observational information alone, Water Resources Research, 49, doi:10.1002/2013WR013845. Pande, S. (2005), Generalized local learning in water resource management, PhD dissertation, Utah State University, UT-USA, 148p.
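    As a loose illustration only, the sketch below ranks candidate predictors of runoff by marginal mutual information (Python, scikit-learn). This is a crude stand-in for the PI and bounds-on-error algorithms compared in the study, which condition on already-selected predictors rather than scoring each predictor in isolation; all variables are synthetic.

    ```python
    import numpy as np
    from sklearn.feature_selection import mutual_info_regression

    rng = np.random.default_rng(4)
    n = 1000
    rain = rng.gamma(2.0, 5.0, n)               # candidate predictors (synthetic)
    temp = rng.normal(15, 5, n)
    soil = 0.5 * rain + rng.normal(0, 2, n)
    noise = rng.normal(0, 1, n)
    runoff = 0.7 * rain + 0.2 * soil + rng.normal(0, 3, n)

    X = np.column_stack([rain, temp, soil, noise])
    names = ["rain", "temp", "soil", "noise"]
    mi = mutual_info_regression(X, runoff, random_state=0)
    for name, score in sorted(zip(names, mi), key=lambda t: -t[1]):
        print(f"{name}: {score:.3f}")
    ```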

  3. Ordering Elements and Subsets: Examples for Student Understanding

    ERIC Educational Resources Information Center

    Mellinger, Keith E.

    2004-01-01

    Teaching the art of counting can be quite difficult. Many undergraduate students have difficulty separating the ideas of permutation, combination, repetition, etc. This article develops some examples to help explain some of the underlying theory while looking carefully at the selection of various subsets of objects from a larger collection. The…

  4. Monocyte Subset Dynamics in Human Atherosclerosis Can Be Profiled with Magnetic Nano-Sensors

    PubMed Central

    Wildgruber, Moritz; Lee, Hakho; Chudnovskiy, Aleksey; Yoon, Tae-Jong; Etzrodt, Martin; Pittet, Mikael J.; Nahrendorf, Matthias; Croce, Kevin; Libby, Peter; Weissleder, Ralph; Swirski, Filip K.

    2009-01-01

    Monocytes are circulating macrophage and dendritic cell precursors that populate healthy and diseased tissue. In humans, monocytes consist of at least two subsets whose proportions in the blood fluctuate in response to coronary artery disease, sepsis, and viral infection. Animal studies have shown that specific shifts in the monocyte subset repertoire either exacerbate or attenuate disease, suggesting a role for monocyte subsets as biomarkers and therapeutic targets. Assays are therefore needed that can selectively and rapidly enumerate monocytes and their subsets. This study shows that two major human monocyte subsets express similar levels of the receptor for macrophage colony stimulating factor (MCSFR) but differ in their phagocytic capacity. We exploit these properties and custom-engineer magnetic nanoparticles for ex vivo sensing of monocytes and their subsets. We present a two-dimensional enumerative mathematical model that simultaneously reports number and proportion of monocyte subsets in a small volume of human blood. Using a recently described diagnostic magnetic resonance (DMR) chip with 1 µl sample size and high throughput capabilities, we then show that application of the model accurately quantifies subset fluctuations that occur in patients with atherosclerosis. PMID:19461894

  5. Uncertainties in Integrated Climate Change Impact Assessments by Sub-setting GCMs Based on Annual as well as Crop Growing Period under Rice Based Farming System of Indo-Gangetic Plains of India

    NASA Astrophysics Data System (ADS)

    Pillai, S. N.; Singh, H.; Panwar, A. S.; Meena, M. S.; Singh, S. V.; Singh, B.; Paudel, G. P.; Baigorria, G. A.; Ruane, A. C.; McDermid, S.; Boote, K. J.; Porter, C.; Valdivia, R. O.

    2016-12-01

    Integrated assessment of climate change impact on agricultural productivity is a challenge to the scientific community due to uncertainties in the input data, particularly the climate, soil, crop calibration and socio-economic datasets. However, the selection of GCMs is the major source of uncertainty, owing to the complex underlying processes involved in the initial and boundary conditions of the air-sea interactions. Under the Agricultural Model Intercomparison and Improvement Project (AgMIP), the Indo-Gangetic Plains Regional Research Team investigated the uncertainties caused by the selection of GCMs through sub-setting based on the annual period as well as the crop-growth period of rice-wheat systems in the AgMIP Integrated Assessment methodology. The AgMIP Phase II protocols were used to link climate, crop and economic models for two study sites, Meerut and Karnal, to analyse the sensitivity of current production systems to climate change. Climate change projections were made using 29 CMIP5 GCMs under RCP4.5 and RCP8.5 for the mid-century period (2040-2069). Two crop models (APSIM and DSSAT) were used, and the TOA-MD economic model was used for integrated assessment. Based on RAPs (Representative Agricultural Pathways), some parameters that cannot be obtained through modeling were derived from the literature and from interactions with stakeholders and incorporated into the TOA-MD model for the integrated assessment.

  6. Selecting informative subsets of sparse supermatrices increases the chance to find correct trees.

    PubMed

    Misof, Bernhard; Meyer, Benjamin; von Reumont, Björn Marcus; Kück, Patrick; Misof, Katharina; Meusemann, Karen

    2013-12-03

    Character matrices with extensive missing data are frequently used in phylogenomics, with potentially detrimental effects on the accuracy and robustness of tree inference. Therefore, many investigators select taxa and genes with high data coverage. The drawback of these selections is their exclusive reliance on data coverage without consideration of actual signal in the data, which might thus not deliver optimal data matrices in terms of potential phylogenetic signal. In order to circumvent this problem, we have developed a heuristic, implemented in a software tool called mare, which (1) assesses the information content of genes in supermatrices using a measure of potential signal combined with data coverage, and (2) reduces supermatrices with a simple hill-climbing procedure to submatrices with high total information content. We conducted simulation studies using matrices of 50 taxa × 50 genes with heterogeneous phylogenetic signal among genes and data coverage between 10-30%; with these matrices, Maximum Likelihood (ML) tree reconstructions failed to recover correct trees. Selecting a data subset with the approach proposed here increased the chance of recovering correct partial trees more than 10-fold. The selection of data subsets with the proposed simple hill-climbing procedure performed well whether it considered the information content or just simple presence/absence information of genes. We also applied our approach to an empirical data set addressing questions of vertebrate systematics. With this empirical dataset, selecting a data subset with high information content that supported a tree with high average bootstrap support was most successful when the information content of genes was considered. Our analyses of simulated and empirical data demonstrate that sparse supermatrices can be reduced on a formal basis, outperforming the usual simple selections of taxa and genes with high data coverage.
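    A toy version of the reduction idea (Python/NumPy), hedged heavily: mare's actual information measure is based on phylogenetic signal, whereas here a random per-cell "signal" weighted by coverage stands in as the objective; only the greedy hill-climbing skeleton is faithful.

    ```python
    import numpy as np

    def total_info(mask, signal):
        """Toy objective: summed signal of present cells times coverage."""
        if mask.size == 0:
            return 0.0
        return (mask * signal).sum() * mask.mean()

    def hill_climb_reduce(mask, signal, min_taxa=10):
        """Greedily drop the taxon (row) or gene (column) whose removal most
        increases total information content; stop at a local optimum."""
        taxa = list(range(mask.shape[0]))
        genes = list(range(mask.shape[1]))
        while True:
            current = total_info(mask[np.ix_(taxa, genes)], signal[np.ix_(taxa, genes)])
            best_gain, best_move = 0.0, None
            if len(taxa) > min_taxa:
                for i in taxa:
                    rows = [t for t in taxa if t != i]
                    gain = total_info(mask[np.ix_(rows, genes)],
                                      signal[np.ix_(rows, genes)]) - current
                    if gain > best_gain:
                        best_gain, best_move = gain, ("taxon", i)
            for j in genes:
                cols = [g for g in genes if g != j]
                gain = total_info(mask[np.ix_(taxa, cols)],
                                  signal[np.ix_(taxa, cols)]) - current
                if gain > best_gain:
                    best_gain, best_move = gain, ("gene", j)
            if best_move is None:
                return taxa, genes
            kind, k = best_move
            (taxa if kind == "taxon" else genes).remove(k)

    rng = np.random.default_rng(5)
    mask = (rng.random((50, 50)) < 0.2).astype(float)   # ~20% data coverage
    signal = rng.random((50, 50))                       # per-cell signal proxy
    taxa, genes = hill_climb_reduce(mask, signal)
    print(len(taxa), "taxa x", len(genes), "genes retained")
    ```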

  7. Generating a Simulated Fluid Flow over a Surface Using Anisotropic Diffusion

    NASA Technical Reports Server (NTRS)

    Rodriguez, David L. (Inventor); Sturdza, Peter (Inventor)

    2016-01-01

    A fluid-flow simulation over a computer-generated surface is generated using a diffusion technique. The surface is comprised of a surface mesh of polygons. A boundary-layer fluid property is obtained for a subset of the polygons of the surface mesh. A gradient vector is determined for a selected polygon, the selected polygon belonging to the surface mesh but not one of the subset of polygons. A maximum and minimum diffusion rate is determined along directions determined using the gradient vector corresponding to the selected polygon. A diffusion-path vector is defined between a point in the selected polygon and a neighboring point in a neighboring polygon. An updated fluid property is determined for the selected polygon using a variable diffusion rate, the variable diffusion rate based on the minimum diffusion rate, maximum diffusion rate, and the gradient vector.

  8. An Iris Segmentation Algorithm based on Edge Orientation for Off-angle Iris Recognition

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Karakaya, Mahmut; Barstow, Del R; Santos-Villalobos, Hector J

    Iris recognition is known as one of the most accurate and reliable biometrics. However, the accuracy of iris recognition systems depends on the quality of data capture and is negatively affected by several factors such as angle, occlusion, and dilation. In this paper, we present a segmentation algorithm for off-angle iris images that uses edge detection, edge elimination, edge classification, and ellipse fitting techniques. In our approach, we first detect all candidate edges in the iris image by using the Canny edge detector; this collection contains edges from the iris and pupil boundaries as well as eyelashes, eyelids, iris texture, etc. Edge orientation is used to eliminate the edges that cannot be part of the iris or pupil. Then, we classify the remaining edge points into two sets, pupil edges and iris edges. Finally, we randomly generate subsets of iris and pupil edge points, fit ellipses for each subset, select ellipses with similar parameters, and average them to form the resultant ellipses. Based on the results from real experiments, the proposed method shows effective segmentation of off-angle iris images.
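    The final consensus step can be sketched as follows (Python, assuming OpenCV and NumPy are available); the sampling sizes, the 10% similarity tolerance, and the synthetic boundary points are invented for illustration, and a production system would handle angle wrap-around more carefully.

    ```python
    import numpy as np
    import cv2

    def robust_ellipse(edge_points, n_trials=200, sample_size=10, seed=0):
        """Fit ellipses to random subsets of edge points, keep fits whose
        centre and axes lie near the median, and average the survivors."""
        rng = np.random.default_rng(seed)
        fits = []
        for _ in range(n_trials):
            idx = rng.choice(len(edge_points), sample_size, replace=False)
            pts = edge_points[idx].astype(np.float32).reshape(-1, 1, 2)
            (cx, cy), (ma, mb), ang = cv2.fitEllipse(pts)
            fits.append([cx, cy, ma, mb, ang])
        fits = np.array(fits)
        med = np.median(fits, axis=0)
        keep = np.all(np.abs(fits[:, :4] - med[:4]) < 0.1 * med[:4] + 1e-9, axis=1)
        return fits[keep].mean(axis=0)   # averaged (centre, axes, angle)

    # Noisy synthetic "pupil boundary" points
    t = np.linspace(0, 2 * np.pi, 300)
    pts = np.column_stack([100 + 40 * np.cos(t), 120 + 25 * np.sin(t)])
    pts += np.random.default_rng(1).normal(0, 0.5, pts.shape)
    print(np.round(robust_ellipse(pts), 1))
    ```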

  9. Subjective study of preferred listening conditions in Italian Catholic churches

    NASA Astrophysics Data System (ADS)

    Martellotta, Francesco

    2008-10-01

    The paper describes the results of research aimed at investigating preferred subjective listening conditions inside churches. The effect of different musical motifs (ranging from Gregorian chant to symphonic music) was investigated, and regression analysis was performed in order to point out the relationship between subjective ratings and acoustical parameters. In order to present realistic listening conditions to the subjects, a small subset of nine churches was selected from a larger set of acoustic data collected in several Italian churches during a widespread on-site survey. The subset represented different architectural styles and shapes, and was characterized by average listening conditions. For each church a single source-receiver combination with fixed relative positions was chosen. Measured binaural impulse responses were cross-talk cancelled and then convolved with five anechoic motifs. Paired comparisons were finally performed, asking a trained panel of subjects for their preferences. Factor analysis pointed out a substantially common underlying pattern characterizing the subjective responses. The results show that preferred listening conditions vary as a function of the musical motif, depending on early decay time for choral music and on a combination of initial time delay and lateral energy for instrumental music.

  10. Elucidation of Seventeen Human Peripheral Blood B cell Subsets and Quantification of the Tetanus Response Using a Density-Based Method for the Automated Identification of Cell Populations in Multidimensional Flow Cytometry Data

    PubMed Central

    Qian, Yu; Wei, Chungwen; Lee, F. Eun-Hyung; Campbell, John; Halliley, Jessica; Lee, Jamie A.; Cai, Jennifer; Kong, Megan; Sadat, Eva; Thomson, Elizabeth; Dunn, Patrick; Seegmiller, Adam C.; Karandikar, Nitin J.; Tipton, Chris; Mosmann, Tim; Sanz, Iñaki; Scheuermann, Richard H.

    2011-01-01

    Background Advances in multi-parameter flow cytometry (FCM) now allow for the independent detection of larger numbers of fluorochromes on individual cells, generating data with increasingly higher dimensionality. The increased complexity of these data has made it difficult to identify cell populations from high-dimensional FCM data using traditional manual gating strategies based on single-color or two-color displays. Methods To address this challenge, we developed a novel program, FLOCK (FLOw Clustering without K), that uses a density-based clustering approach to algorithmically identify biologically relevant cell populations from multiple samples in an unbiased fashion, thereby eliminating operator-dependent variability. Results FLOCK was used to objectively identify seventeen distinct B cell subsets in a human peripheral blood sample and to identify and quantify novel plasmablast subsets responding transiently to tetanus and other vaccinations in peripheral blood. FLOCK has been implemented in the publicly available Immunology Database and Analysis Portal – ImmPort (http://www.immport.org) for open use by the immunology research community. Conclusions FLOCK is able to identify cell subsets in experiments that use multi-parameter flow cytometry through an objective, automated computational approach. The use of algorithms like FLOCK for FCM data analysis obviates the need for subjective and labor-intensive manual gating to identify and quantify cell subsets. Novel populations identified by these computational approaches can serve as hypotheses for further experimental study. PMID:20839340

  11. A SOAP Web Service for accessing MODIS land product subsets

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    SanthanaVannan, Suresh K; Cook, Robert B; Pan, Jerry Yun

    2011-01-01

    Remote sensing data from satellites have provided valuable information on the state of the earth for several decades. Since March 2000, the Moderate Resolution Imaging Spectroradiometer (MODIS) sensor on board NASA's Terra and Aqua satellites has been providing estimates of several land parameters useful in understanding earth system processes at global, continental, and regional scales. However, the HDF-EOS file format, the specialized software needed to process HDF-EOS files, the data volume, and the high spatial and temporal resolution of MODIS data make it difficult for users wanting to extract small but valuable amounts of information from the MODIS record. To overcome this usability issue, the NASA-funded Distributed Active Archive Center (DAAC) for Biogeochemical Dynamics at Oak Ridge National Laboratory (ORNL) developed a Web service that provides subsets of MODIS land products using the Simple Object Access Protocol (SOAP). The ORNL DAAC MODIS subsetting Web service is a unique way of serving satellite data that exploits an established and popular Internet protocol to allow users access to massive amounts of remote sensing data. The Web service provides MODIS land product subsets up to 201 x 201 km in a non-proprietary comma-delimited text file format. Users can programmatically query the Web service to extract MODIS land parameters for real-time data integration into models and decision support tools, or connect it to workflow software. Information regarding the MODIS SOAP subsetting Web service is available on the World Wide Web (WWW) at http://daac.ornl.gov/modiswebservice.

  12. On Subset Selection Procedures for Poisson Processes and Some Applications to the Binomial and Multinomial Problems

    DTIC Science & Technology

    1976-07-01

    PURDUE UNIVERSITY, DEPARTMENT OF STATISTICS, DIVISION OF MATHEMATICAL SCIENCES. On Subset Selection Procedures for Poisson Processes and Some Applications to the Binomial and Multinomial Problems. Mimeograph Series #457, July 1976. This research was supported by the Office of Naval Research under Contract N00014-75-C-0455 at Purdue University.

  13. Phylogenetic signal in the acoustic parameters of the advertisement calls of four clades of anurans.

    PubMed

    Gingras, Bruno; Mohandesan, Elmira; Boko, Drasko; Fitch, W Tecumseh

    2013-07-01

    Anuran vocalizations, especially their advertisement calls, are largely species-specific and can be used to identify taxonomic affiliations. Because anurans are not vocal learners, their vocalizations are generally assumed to have a strong genetic component. This suggests that the degree of similarity between advertisement calls may be related to large-scale phylogenetic relationships. To test this hypothesis, advertisement calls from 90 species belonging to four large clades (Bufo, Hylinae, Leptodactylus, and Rana) were analyzed. Phylogenetic distances were estimated based on the DNA sequences of the 12S mitochondrial ribosomal RNA gene, and, for a subset of 49 species, on the rhodopsin gene. Mean values for five acoustic parameters (coefficient of variation of root-mean-square amplitude, dominant frequency, spectral flux, spectral irregularity, and spectral flatness) were computed for each species. We then tested for phylogenetic signal on the body-size-corrected residuals of these five parameters, using three statistical tests (Moran's I, Mantel, and Blomberg's K) and three models of genetic distance (pairwise distances, Abouheif's proximities, and the variance-covariance matrix derived from the phylogenetic tree). A significant phylogenetic signal was detected for most acoustic parameters on the 12S dataset, across statistical tests and genetic distance models, both for the entire sample of 90 species and within clades in several cases. A further analysis on a subset of 49 species using genetic distances derived from rhodopsin and from 12S broadly confirmed the results obtained on the larger sample, indicating that the phylogenetic signals observed in these acoustic parameters can be detected using a variety of genetic distance models derived either from a variable mitochondrial sequence or from a conserved nuclear gene. We found a robust relationship, in a large number of species, between anuran phylogenetic relatedness and acoustic similarity in the advertisement calls in a taxon with no evidence for vocal learning, even after correcting for the effect of body size. This finding, covering a broad sample of species whose vocalizations are fairly diverse, indicates that the intense selection on certain call characteristics observed in many anurans does not eliminate all acoustic indicators of relatedness. Our approach could potentially be applied to other vocal taxa.

  14. Profiling dendritic cell subsets in head and neck squamous cell tonsillar cancer and benign tonsils.

    PubMed

    Abolhalaj, Milad; Askmyr, David; Sakellariou, Christina Alexandra; Lundberg, Kristina; Greiff, Lennart; Lindstedt, Malin

    2018-05-23

    Dendritic cells (DCs) have a key role in orchestrating immune responses and are considered important targets for immunotherapy against cancer. In order to develop effective cancer vaccines, detailed knowledge of the micromilieu in cancer lesions is warranted. In this study, flow cytometry and human transcriptome arrays were used to characterize subsets of DCs in head and neck squamous cell tonsillar cancer and compare them to their counterparts in benign tonsils, in order to evaluate subset-selective biomarkers associated with tonsillar cancer. We describe, for the first time, four subsets of DCs in tonsillar cancer: CD123+ plasmacytoid DCs (pDC) and CD1c+, CD141+, and CD1c-CD141- myeloid DCs (mDC). An increased frequency of DCs and an elevated mDC/pDC ratio were shown in malignant compared to benign tonsillar tissue. The microarray data demonstrate characteristics specific to tonsillar cancer DC subsets, including expression of immunosuppressive molecules and lower expression levels of genes involved in the development of effector immune responses in DCs in malignant, compared with benign, tonsillar tissue. Finally, we present target candidates selectively expressed by different DC subsets in malignant tonsils and confirm expression of CD206/MRC1 and CD207/Langerin on CD1c+ DCs at the protein level. This study describes DC characteristics in the context of head and neck cancer and adds valuable steps towards future DC-based therapies against tonsillar cancer.

  15. Identifying developmental toxicity pathways for a subset of ToxCast chemicals using human embryonic stem cells and metabolomics

    EPA Science Inventory

    Metabolomics analysis was performed on the supernatant of human embryonic stem (hES) cell cultures exposed to a blinded subset of 11 chemicals selected from the chemical library of EPA's ToxCast™ chemical screening and prioritization research project. Metabolites from hES cultur...

  16. Canonical Measure of Correlation (CMC) and Canonical Measure of Distance (CMD) between sets of data. Part 3. Variable selection in classification.

    PubMed

    Ballabio, Davide; Consonni, Viviana; Mauri, Andrea; Todeschini, Roberto

    2010-01-11

    In multivariate regression and classification problems, variable selection is an important procedure used to select an optimal subset of variables with the aim of producing more parsimonious and possibly more predictive models. Variable selection is often necessary when dealing with methodologies that produce thousands of variables, such as Quantitative Structure-Activity Relationships (QSARs) and highly dimensional analytical procedures. In this paper a novel method for variable selection for classification purposes is introduced. This method exploits the recently proposed Canonical Measure of Correlation between two sets of variables (CMC index). The CMC index is in this case calculated for two specific sets of variables, the former comprising the independent variables and the latter the unfolded class matrix. The CMC values, calculated by considering one variable at a time, can be sorted, yielding a ranking of the variables on the basis of their class-discrimination capabilities. Alternatively, the CMC index can be calculated for all possible combinations of variables and the variable subset with the maximal CMC can be selected, but this procedure is computationally more demanding and the classification performance of the selected subset is not always the best. The effectiveness of the CMC index in selecting variables with discriminative ability was compared with that of other well-known strategies for variable selection, such as Wilks' Lambda, the VIP index based on Partial Least Squares-Discriminant Analysis, and the selection provided by classification trees. A variable forward selection based on the CMC index was finally used in conjunction with Linear Discriminant Analysis. This approach was tested on several chemical data sets, and the results obtained were encouraging.
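    The one-variable-at-a-time ranking can be approximated with a single-component canonical correlation between each variable and the one-hot class matrix (Python, scikit-learn). This mimics the spirit of the CMC index rather than reproducing the authors' exact formula, and the iris data set is a stand-in.

    ```python
    import numpy as np
    from sklearn.cross_decomposition import CCA
    from sklearn.datasets import load_iris

    X, y = load_iris(return_X_y=True)
    Y = np.eye(y.max() + 1)[y]            # unfolded (one-hot) class matrix

    scores = []
    for j in range(X.shape[1]):
        xj = X[:, [j]]                    # one variable at a time
        xc, yc = CCA(n_components=1).fit(xj, Y).transform(xj, Y)
        scores.append(abs(np.corrcoef(xc[:, 0], yc[:, 0])[0, 1]))

    print("variables ranked by class discrimination:", np.argsort(scores)[::-1])
    ```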

  17. Mapping tropical rainforest canopies using multi-temporal spaceborne imaging spectroscopy

    NASA Astrophysics Data System (ADS)

    Somers, Ben; Asner, Gregory P.

    2013-10-01

    The use of imaging spectroscopy for floristic mapping of forests is complicated by the spectral similarity among coexisting species. Here we evaluated an alternative spectral unmixing strategy combining a time series of EO-1 Hyperion images with an automated feature selection strategy in MESMA. Instead of using the same spectral subset to unmix each image pixel, our modified approach allowed the spectral subsets to vary on a per-pixel basis, such that each pixel is evaluated using a spectral subset tuned towards maximal separability of its specific endmember class combination or species mixture. The potential of the new approach for floristic mapping of tree species in Hawaiian rainforests was quantitatively demonstrated using both simulated and actual hyperspectral image time series. With a Cohen's Kappa coefficient of 0.65, our approach provided a more accurate tree species map than MESMA (Kappa = 0.54). In addition, through the selection of spectral subsets, our approach was about 90% faster than MESMA. The flexible or adaptive use of band sets in spectral unmixing thus provides an interesting avenue for addressing spectral similarities in complex vegetation canopies.

  18. [Soluble interleukin 2 receptor as activity parameter in serum of systemic and discoid lupus erythematosus].

    PubMed

    Blum, C; Zillikens, D; Tony, H P; Hartmann, A A; Burg, G

    1993-05-01

    The evaluation of disease activity in systemic lupus erythematosus (SLE) is important for selection of the appropriate therapeutic regimen. In addition to the clinical picture, various laboratory parameters are taken into account. However, no validated criteria for the evaluation of the disease activity in SLE have yet been established. Recently, serum levels of soluble interleukin-2 receptor (sIL-2R) have been proposed as a potential parameter for disease activity in SLE. However, the studies reported on this subject so far have focused mainly on certain subsets of the disease, and the evaluation of the disease activity was based on a very limited number of parameters. In the present study, we determined serum levels of sIL-2R in 23 patients with SLE and 30 patients with discoid LE (DLE). Evaluation of disease activity in SLE was based on a comprehensive scale which considered numerous clinical signs and laboratory parameters. In SLE, serum levels of sIL-2R showed a better correlation with disease activity than all the other parameters investigated, including proteinuria, erythrocyte sedimentation rate, serum globulin concentration, titre of antibodies against double-stranded DNA, serum albumin concentration, serum complement levels and white blood cell count. For the first time, we report on elevated serum levels of sIL-2R in DLE, which also correlated with disease activity.

  19. Application of machine learning on brain cancer multiclass classification

    NASA Astrophysics Data System (ADS)

    Panca, V.; Rustam, Z.

    2017-07-01

    Classification of brain cancer is a problem of multiclass classification. One approach to solving this problem is by first transforming it into several binary problems. The microarray gene expression dataset has the two main characteristics of medical data: extremely many features (genes) and only a small number of samples. The application of machine learning to microarray gene expression datasets mainly consists of two steps: feature selection and classification. In this paper, the features are selected using a method based on the support vector machine recursive feature elimination (SVM-RFE) principle, improved to solve multiclass classification and called multiple multiclass SVM-RFE. Instead of using only the selected features on a single classifier, this method combines the results of multiple classifiers. The features are divided into subsets and SVM-RFE is used on each subset. Then, the features selected on each subset are fed to separate classifiers. This method enhances the feature selection ability of each single SVM-RFE. Twin support vector machine (TWSVM) is used as the classification method to reduce computational complexity. While ordinary SVM finds a single optimal hyperplane, the main objective of Twin SVM is to find two non-parallel optimal hyperplanes. The experiment on the brain cancer microarray gene expression dataset shows this method could classify 71.4% of the overall test data correctly, using 100 and 1000 genes selected by the multiple multiclass SVM-RFE feature selection method. Furthermore, the per-class results show that this method could classify data of the normal and MD classes with 100% accuracy.
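    The subset-wise RFE scheme can be outlined as below (Python, scikit-learn), with two substitutions flagged up front: a linear SVM stands in for the paper's TWSVM classifier, and the dataset is synthetic rather than the brain cancer microarray data.

    ```python
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.feature_selection import RFE
    from sklearn.model_selection import train_test_split
    from sklearn.svm import LinearSVC

    # Synthetic stand-in for a microarray dataset: many genes, few samples
    X, y = make_classification(n_samples=90, n_features=2000, n_informative=40,
                               n_classes=3, n_clusters_per_class=1, random_state=0)
    Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, random_state=0)

    # Divide the genes into subsets and run SVM-RFE within each subset
    blocks = np.array_split(np.arange(X.shape[1]), 4)
    selected = []
    for block in blocks:
        rfe = RFE(LinearSVC(max_iter=5000), n_features_to_select=25, step=0.2)
        rfe.fit(Xtr[:, block], ytr)
        selected.append(block[rfe.support_])

    # One classifier per subset; combine their predictions by majority vote
    preds = np.array([LinearSVC(max_iter=5000).fit(Xtr[:, idx], ytr).predict(Xte[:, idx])
                      for idx in selected])
    vote = np.apply_along_axis(lambda c: np.bincount(c).argmax(), 0, preds)
    print("ensemble accuracy:", round(float((vote == yte).mean()), 3))
    ```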

  20. Blinded evaluation of farnesoid X receptor (FXR) ligands binding using molecular docking and free energy calculations

    NASA Astrophysics Data System (ADS)

    Selwa, Edithe; Elisée, Eddy; Zavala, Agustin; Iorga, Bogdan I.

    2018-01-01

    Our participation in the D3R Grand Challenge 2 involved a two-step protocol, with an initial analysis of the available structural data from the PDB allowing the selection of the most appropriate combination of docking software and scoring function. Subsequent docking calculations showed that pose prediction can be carried out with a certain precision, but this is dependent on the specific nature of the ligands. The correct ranking of docking poses is still a problem and cannot succeed in the absence of good pose predictions. Our free energy calculations on two different subsets provided contrasting results, which might have their origin in non-optimal force field parameters associated with the sulfonamide chemical moiety.

  1. On determining important aspects of mathematical models: Application to problems in physics and chemistry

    NASA Technical Reports Server (NTRS)

    Rabitz, Herschel

    1987-01-01

    The use of parametric and functional gradient sensitivity analysis techniques is considered for models described by partial differential equations. By interchanging appropriate dependent and independent variables, questions of inverse sensitivity may be addressed to gain insight into the inversion of observational data for parameter and function identification in mathematical models. It may be argued that the presence of a subset of dominant, strongly coupled dependent variables will result in the overall system sensitivity behavior collapsing into a simple set of scaling and self-similarity relations amongst elements of the entire matrix of sensitivity coefficients. These tools are generic in nature, but herein their application to problems arising in selected areas of physics and chemistry is presented.

  2. Dissipative particle dynamics: Systematic parametrization using water-octanol partition coefficients

    NASA Astrophysics Data System (ADS)

    Anderson, Richard L.; Bray, David J.; Ferrante, Andrea S.; Noro, Massimo G.; Stott, Ian P.; Warren, Patrick B.

    2017-09-01

    We present a systematic, top-down, thermodynamic parametrization scheme for dissipative particle dynamics (DPD) using water-octanol partition coefficients, supplemented by water-octanol phase equilibria and pure liquid phase density data. We demonstrate the feasibility of computing the required partition coefficients in DPD using brute-force simulation, within an adaptive, semi-automatic, staged optimization scheme. We test the methodology by fitting to experimental partition coefficient data for twenty-one small molecules in five classes comprising alcohols and poly-alcohols, amines, ethers and simple aromatics, and alkanes (i.e., hexane). Finally, we illustrate the transferability of a subset of the determined parameters by calculating the critical micelle concentrations and mean aggregation numbers of selected alkyl ethoxylate surfactants, in good agreement with reported experimental values.

  3. Selecting electrode configurations for image-guided cochlear implant programming using template matching.

    PubMed

    Zhang, Dongqing; Zhao, Yiyuan; Noble, Jack H; Dawant, Benoit M

    2018-04-01

    Cochlear implants (CIs) are neural prostheses that restore hearing using an electrode array implanted in the cochlea. After implantation, the CI processor is programmed by an audiologist. One factor that negatively impacts outcomes and can be addressed by programming is cross-electrode neural stimulation overlap (NSO). We have proposed a system to assist the audiologist in programming the CI that we call image-guided CI programming (IGCIP). IGCIP permits using CT images to detect NSO and recommend deactivation of a subset of electrodes to avoid NSO. We have shown that IGCIP significantly improves hearing outcomes. Most of the IGCIP steps are robustly automated but electrode configuration selection still sometimes requires manual intervention. With expertise, distance-versus-frequency curves, which are a way to visualize the spatial relationship learned from CT between the electrodes and the nerves they stimulate, can be used to select the electrode configuration. We propose an automated technique for electrode configuration selection. A comparison between this approach and one we have previously proposed shows that our method produces results that are as good as those obtained with our previous method while being generic and requiring fewer parameters.

  4. A global optimization approach to multi-polarity sentiment analysis.

    PubMed

    Li, Xinmiao; Li, Jing; Wu, Yukeng

    2015-01-01

    Following the rapid development of social media, sentiment analysis has become an important social media mining technique. The performance of automatic sentiment analysis primarily depends on feature selection and sentiment classification. While information gain (IG) and support vector machines (SVM) are two important techniques, few studies have optimized both approaches in sentiment analysis. The effectiveness of applying a global optimization approach to sentiment analysis remains unclear. We propose a global optimization-based sentiment analysis (PSOGO-Senti) approach to improve sentiment analysis with IG for feature selection and SVM as the learning engine. The PSOGO-Senti approach utilizes a particle swarm optimization algorithm to obtain a global optimal combination of feature dimensions and parameters in the SVM. We evaluated the PSOGO-Senti model on two datasets from different fields. The experimental results showed that the PSOGO-Senti model can improve binary and multi-polarity Chinese sentiment analysis. We compared the optimal feature subset selected by PSOGO-Senti with the features in the sentiment dictionary. The results of this comparison indicated that PSOGO-Senti can effectively remove redundant and noisy features and can select a domain-specific feature subset with higher explanatory power for a particular sentiment analysis task. The experimental results showed that the PSOGO-Senti approach is effective and robust for sentiment analysis tasks in different domains. By comparing the improvements of two-polarity, three-polarity and five-polarity sentiment analysis results, we found that the five-polarity sentiment analysis delivered the largest improvement and the two-polarity sentiment analysis the smallest. We conclude that PSOGO-Senti achieves a larger improvement for more complicated sentiment analysis tasks.

  5. Application of validation data for assessing spatial interpolation methods for 8-h ozone or other sparsely monitored constituents.

    PubMed

    Joseph, John; Sharif, Hatim O; Sunil, Thankam; Alamgir, Hasanat

    2013-07-01

    The adverse health effects of high concentrations of ground-level ozone are well-known, but estimating exposure is difficult due to the sparseness of urban monitoring networks. This sparseness discourages the reservation of a portion of the monitoring stations for validation of interpolation techniques precisely when the risk of overfitting is greatest. In this study, we test a variety of simple spatial interpolation techniques for 8-h ozone with thousands of randomly selected subsets of data from two urban areas with monitoring stations sufficiently numerous to allow for true validation. Results indicate that ordinary kriging with only the range parameter calibrated in an exponential variogram is the generally superior method, and yields reliable confidence intervals. Sparse data sets may contain sufficient information for calibration of the range parameter even if the Moran I p-value is close to unity. R script is made available to apply the methodology to other sparsely monitored constituents. Copyright © 2013 Elsevier Ltd. All rights reserved.
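    A compact version of the winning method (Python/NumPy): an exponential variogram in which only the range parameter would be calibrated, plugged into the ordinary kriging system. The sill, range, and monitoring data below are invented; the study's R script should be consulted for the actual implementation.

    ```python
    import numpy as np

    def exp_variogram(h, sill, rng_param):
        """Exponential variogram; in the study only the range is calibrated."""
        return sill * (1.0 - np.exp(-h / rng_param))

    def ordinary_kriging(xy, z, xy0, sill=1.0, rng_param=10.0):
        """Predict z at location xy0 from observations (xy, z)."""
        n = len(z)
        d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
        K = np.empty((n + 1, n + 1))
        K[:n, :n] = exp_variogram(d, sill, rng_param)
        K[n, :] = 1.0
        K[:, n] = 1.0
        K[n, n] = 0.0                                   # unbiasedness constraint
        g0 = exp_variogram(np.linalg.norm(xy - xy0, axis=1), sill, rng_param)
        w = np.linalg.solve(K, np.append(g0, 1.0))
        pred = w[:n] @ z
        var = w @ np.append(g0, 1.0)                    # kriging variance
        return pred, var

    rng = np.random.default_rng(6)
    xy = rng.uniform(0, 50, size=(12, 2))               # monitor locations
    z = np.sin(xy[:, 0] / 10) + rng.normal(0, 0.1, 12)  # synthetic 8-h ozone
    print(ordinary_kriging(xy, z, np.array([25.0, 25.0])))
    ```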

  6. Initial Navigation Alignment of Optical Instruments on GOES-R

    NASA Technical Reports Server (NTRS)

    Isaacson, Peter J.; DeLuccia, Frank J.; Reth, Alan D.; Igli, David A.; Carter, Delano R.

    2016-01-01

    Post-launch alignment errors for the Advanced Baseline Imager (ABI) and Geospatial Lightning Mapper (GLM) on GOES-R may be too large for the image navigation and registration (INR) processing algorithms to function without an initial adjustment to calibration parameters. We present an approach that leverages a combination of user-selected image-to-image tie points and image correlation algorithms to estimate this initial launch-induced offset and calculate adjustments to the Line of Sight Motion Compensation (LMC) parameters. We also present an approach to generate synthetic test images, to which shifts and rotations of known magnitude are applied. Results of applying the initial alignment tools to a subset of these synthetic test images are presented. The results for both ABI and GLM are within the specifications established for these tools, and indicate that application of these tools during the post-launch test (PLT) phase of GOES-R operations will enable the automated INR algorithms for both instruments to function as intended.

  7. Domino: Extracting, Comparing, and Manipulating Subsets across Multiple Tabular Datasets

    PubMed Central

    Gratzl, Samuel; Gehlenborg, Nils; Lex, Alexander; Pfister, Hanspeter; Streit, Marc

    2016-01-01

    Answering questions about complex issues often requires analysts to take into account information contained in multiple interconnected datasets. A common strategy in analyzing and visualizing large and heterogeneous data is dividing it into meaningful subsets. Interesting subsets can then be selected and the associated data and the relationships between the subsets visualized. However, neither the extraction and manipulation nor the comparison of subsets is well supported by state-of-the-art techniques. In this paper we present Domino, a novel multiform visualization technique for effectively representing subsets and the relationships between them. By providing comprehensive tools to arrange, combine, and extract subsets, Domino allows users to create both common visualization techniques and advanced visualizations tailored to specific use cases. In addition to the novel technique, we present an implementation that enables analysts to manage the wide range of options that our approach offers. Innovative interactive features such as placeholders and live previews support rapid creation of complex analysis setups. We introduce the technique and the implementation using a simple example and demonstrate scalability and effectiveness in a use case from the field of cancer genomics. PMID:26356916

  8. Intraclonal Cell Expansion and Selection Driven by B Cell Receptor in Chronic Lymphocytic Leukemia

    PubMed Central

    Colombo, Monica; Cutrona, Giovanna; Reverberi, Daniele; Fabris, Sonia; Neri, Antonino; Fabbi, Marina; Quintana, Giovanni; Quarta, Giovanni; Ghiotto, Fabio; Fais, Franco; Ferrarini, Manlio

    2011-01-01

    The mutational status of the immunoglobulin heavy-chain variable region (IGHV) genes utilized by chronic lymphocytic leukemia (CLL) clones defines two disease subgroups. Patients with unmutated IGHV have a more aggressive disease and a worse outcome than patients with cells having somatic IGHV gene mutations. Moreover, up to 30% of the unmutated CLL clones exhibit very similar or identical B cell receptors (BcR), often encoded by the same IG genes. These “stereotyped” BcRs have been classified into defined subsets. The presence of an IGHV gene somatic mutation and the utilization of a skewed gene repertoire compared with normal B cells together with the expression of stereotyped receptors by unmutated CLL clones may indicate stimulation/selection by antigenic epitopes. This antigenic stimulation may occur prior to or during neoplastic transformation, but it is unknown whether this stimulation/selection continues after leukemogenesis has ceased. In this study, we focused on seven CLL cases with stereotyped BcR Subset #8 found among a cohort of 700 patients; in six, the cells expressed IgG and utilized IGHV4-39 and IGKV1-39/IGKV1D-39 genes, as reported for Subset #8 BcR. One case exhibited special features, including expression of IgM or IgG by different subclones consequent to an isotype switch, allelic inclusion at the IGH locus in the IgM-expressing cells and a particular pattern of cytogenetic lesions. Collectively, the data indicate a process of antigenic stimulation/selection of the fully transformed CLL cells leading to the expansion of the Subset #8 IgG-bearing subclone. PMID:21541442

  9. Generating a Simulated Fluid Flow Over an Aircraft Surface Using Anisotropic Diffusion

    NASA Technical Reports Server (NTRS)

    Rodriguez, David L. (Inventor); Sturdza, Peter (Inventor)

    2013-01-01

    A fluid-flow simulation over a computer-generated aircraft surface is generated using a diffusion technique. The surface is comprised of a surface mesh of polygons. A boundary-layer fluid property is obtained for a subset of the polygons of the surface mesh. A pressure-gradient vector is determined for a selected polygon, the selected polygon belonging to the surface mesh but not one of the subset of polygons. A maximum and minimum diffusion rate is determined along directions determined using a pressure gradient vector corresponding to the selected polygon. A diffusion-path vector is defined between a point in the selected polygon and a neighboring point in a neighboring polygon. An updated fluid property is determined for the selected polygon using a variable diffusion rate, the variable diffusion rate based on the minimum diffusion rate, maximum diffusion rate, and angular difference between the diffusion-path vector and the pressure-gradient vector.
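    One plausible reading of the variable-rate construction, sketched in Python/NumPy: the rate is blended between the minimum and maximum diffusion rates according to the angular difference between the diffusion-path vector and the pressure-gradient vector. The cosine-squared blend, the mesh, and all constants are assumptions for illustration, not the patented formulation.

    ```python
    import numpy as np

    def variable_rate(path_vec, grad_vec, d_min, d_max):
        """Assumed blend: d_min along the pressure gradient, d_max across it."""
        cos_t = np.dot(path_vec, grad_vec) / (
            np.linalg.norm(path_vec) * np.linalg.norm(grad_vec) + 1e-12)
        return d_min * cos_t**2 + d_max * (1.0 - cos_t**2)

    def diffuse_step(values, neighbors, centers, grads, d_min, d_max, dt=0.1):
        """One explicit update of a per-polygon boundary-layer property."""
        out = values.copy()
        for i, nbrs in enumerate(neighbors):
            for j in nbrs:
                path = centers[j] - centers[i]        # diffusion-path vector
                rate = variable_rate(path, grads[i], d_min, d_max)
                out[i] += dt * rate * (values[j] - values[i])
        return out

    # Toy 1-D strip of "polygons" with a uniform pressure gradient
    centers = np.array([[float(i), 0.0] for i in range(10)])
    neighbors = [[j for j in (i - 1, i + 1) if 0 <= j < 10] for i in range(10)]
    grads = np.tile([1.0, 0.0], (10, 1))
    values = np.zeros(10)
    values[0] = 1.0                                   # seeded boundary-layer value
    for _ in range(20):
        values = diffuse_step(values, neighbors, centers, grads, 0.01, 0.5)
    print(np.round(values, 3))
    ```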

  10. System and method for progressive band selection for hyperspectral images

    NASA Technical Reports Server (NTRS)

    Fisher, Kevin (Inventor)

    2013-01-01

    Disclosed herein are systems, methods, and non-transitory computer-readable storage media for progressive band selection for hyperspectral images. A system having a module configured to control a processor to practice the method calculates the virtual dimensionality of a hyperspectral image having multiple bands to determine a quantity Q of how many bands are needed for a threshold level of information, ranks each band based on a statistical measure, selects Q bands from the multiple bands to generate a subset of bands based on the virtual dimensionality, and generates a reduced image based on the subset of bands. This approach can create reduced datasets of full hyperspectral images tailored to individual applications. The system uses a metric specific to a target application to rank the image bands, and then selects the most useful bands. The number of bands selected can be specified manually or calculated from the hyperspectral image's virtual dimensionality.
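    In outline, the selection stage might look like this (Python/NumPy); band variance stands in for the application-specific ranking metric, and Q is given directly instead of being estimated from the image's virtual dimensionality.

    ```python
    import numpy as np

    def progressive_band_selection(cube, q):
        """Rank the bands of a (rows, cols, bands) cube by a statistic and
        keep the top q bands to form the reduced image."""
        flat = cube.reshape(-1, cube.shape[-1])
        scores = flat.var(axis=0)               # stand-in ranking statistic
        keep = np.sort(np.argsort(scores)[::-1][:q])
        return cube[:, :, keep], keep

    rng = np.random.default_rng(7)
    cube = rng.random((64, 64, 200))            # synthetic hyperspectral image
    reduced, kept = progressive_band_selection(cube, q=30)
    print(reduced.shape, kept[:10])
    ```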

  11. Efficient Simulation Budget Allocation for Selecting an Optimal Subset

    NASA Technical Reports Server (NTRS)

    Chen, Chun-Hung; He, Donghai; Fu, Michael; Lee, Loo Hay

    2008-01-01

    We consider a class of the subset selection problem in ranking and selection. The objective is to identify the top m out of k designs based on simulated output. Traditional procedures are conservative and inefficient. Using the optimal computing budget allocation framework, we formulate the problem as that of maximizing the probability of correctly selecting all of the top-m designs subject to a constraint on the total number of samples available. For an approximation of this correct selection probability, we derive an asymptotically optimal allocation and propose an easy-to-implement heuristic sequential allocation procedure. Numerical experiments indicate that the resulting allocations are superior to other methods in the literature that we tested, and the relative efficiency increases for larger problems. In addition, preliminary numerical results indicate that the proposed new procedure has the potential to enhance computational efficiency for simulation optimization.

  12. Enhancing the Performance of LibSVM Classifier by Kernel F-Score Feature Selection

    NASA Astrophysics Data System (ADS)

    Sarojini, Balakrishnan; Ramaraj, Narayanasamy; Nickolas, Savarimuthu

    Medical data mining is the search for relationships and patterns within medical datasets that could provide useful knowledge for effective clinical decisions. The inclusion of irrelevant, redundant and noisy features in the process model results in poor predictive accuracy. Much research work in data mining has gone into improving the predictive accuracy of classifiers by applying feature selection techniques. Feature selection in medical data mining is valuable because the diagnosis of the disease can then be carried out in this patient-care activity with a minimum number of significant features. The objective of this work is to show that selecting the more significant features improves the performance of the classifier. We empirically evaluate the classification effectiveness of the LibSVM classifier on the reduced feature subset of a diabetes dataset. The evaluations suggest that the selected feature subset improves the predictive accuracy of the classifier and reduces false negatives and false positives.
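    For reference, the classical (non-kernel) F-score that this work builds on can be computed as below (Python/NumPy); the paper's kernel F-score evaluates a similar separation ratio in a kernel-induced feature space, which this sketch does not do, and the data are synthetic.

    ```python
    import numpy as np

    def f_score(X, y):
        """Per-feature F-score: separation of the two class means from the
        overall mean, over the pooled within-class variance."""
        pos, neg = X[y == 1], X[y == 0]
        m, mp, mn = X.mean(0), pos.mean(0), neg.mean(0)
        num = (mp - m) ** 2 + (mn - m) ** 2
        den = pos.var(0, ddof=1) + neg.var(0, ddof=1)
        return num / np.maximum(den, 1e-12)

    rng = np.random.default_rng(8)
    informative = rng.normal(0, 1, (200, 3))
    y = (informative.sum(1) > 0).astype(int)
    X = np.hstack([informative, rng.normal(0, 1, (200, 5))])  # + noise features
    print(np.argsort(f_score(X, y))[::-1])  # informative features rank first
    ```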

  13. High Aldehyde Dehydrogenase Activity Identifies a Subset of Human Mesenchymal Stromal Cells with Vascular Regenerative Potential.

    PubMed

    Sherman, Stephen E; Kuljanin, Miljan; Cooper, Tyler T; Putman, David M; Lajoie, Gilles A; Hess, David A

    2017-06-01

    During culture expansion, multipotent mesenchymal stromal cells (MSCs) differentially express aldehyde dehydrogenase (ALDH), an intracellular detoxification enzyme that protects long-lived cells against oxidative stress. Thus, MSC selection based on ALDH activity may be used to reduce heterogeneity and distinguish MSC subsets with improved regenerative potency. After expansion of human bone marrow-derived MSCs, cell progeny was purified based on low versus high ALDH activity (ALDHlo vs. ALDHhi) by fluorescence-activated cell sorting, and each subset was compared for multipotent stromal and provascular regenerative functions. Both ALDHlo and ALDHhi MSC subsets demonstrated similar expression of stromal cell (>95% CD73+, CD90+, CD105+) and pericyte (>95% CD146+) surface markers and showed multipotent differentiation into bone, cartilage, and adipose cells in vitro. Conditioned media (CDM) generated by ALDHhi MSCs demonstrated a potent proliferative and prosurvival effect on human microvascular endothelial cells (HMVECs) under serum-free conditions and augmented HMVEC tube-forming capacity in growth factor-reduced matrices. After subcutaneous transplantation within directed in vivo angiogenesis assay implants into immunodeficient mice, ALDHhi MSC or CDM produced by ALDHhi MSC significantly augmented murine vascular cell recruitment and perfused vessel infiltration compared with ALDHlo MSC. Although both subsets demonstrated strikingly similar mRNA expression patterns, quantitative proteomic analyses performed on subset-specific CDM revealed that the ALDHhi MSC subset uniquely secreted multiple proangiogenic cytokines (vascular endothelial growth factor beta, platelet derived growth factor alpha, and angiogenin) and actively produced multiple factors with chemoattractant (transforming growth factor-β, C-X-C motif chemokine ligands 1, 2, and 3 (GRO), C-C motif chemokine ligand 5 (RANTES), monocyte chemotactic protein 1 (MCP-1), interleukin [IL]-6, IL-8) and matrix-modifying functions (tissue inhibitors of metalloproteinase 1 & 2 (TIMP1/2)). Collectively, MSCs selected for high ALDH activity demonstrated enhanced proangiogenic secretory functions and represent a purified MSC subset amenable to vascular regenerative applications. Stem Cells 2017;35:1542-1553. © 2017 AlphaMed Press.

  14. Gait adaptations with aging in healthy participants and people with knee-joint osteoarthritis.

    PubMed

    Duffell, Lynsey D; Jordan, Stevan J; Cobb, Justin P; McGregor, Alison H

    2017-09-01

    The relationship between age and gait characteristics in people with and without medial compartment osteoarthritis (OA) remains unclear. We aimed to characterize this relationship and to relate biomechanical and structural parameters in a subset of OA patients. Twenty-five participants with diagnosed unilateral medial knee OA and 84 healthy participants with no known knee pathology were recruited. 3D motion capture was used to analyse sagittal and coronal plane gait parameters while participants walked at a comfortable speed. Participants were categorized according to age (18-30, 31-59 and 60+ years), and those with and without OA were compared between and within age groups. In a subset of OA patients, clinically available Computed Tomography images were used to assess joint structure. Differences in coronal plane kinematics at the hip and knee, as well as increased knee moments, were noted in participants with OA, particularly older participants, compared with our healthy controls. The knee adduction moment correlated with structural parameters in the subset of OA patients. Increased knee moments and altered kinematics were observed only in older participants presenting with OA, and seem to be related to morphological changes in the joint due to OA, as opposed to the initial cause of medial knee OA. Copyright © 2017. Published by Elsevier B.V.

  15. Optimized hyperspectral band selection using hybrid genetic algorithm and gravitational search algorithm

    NASA Astrophysics Data System (ADS)

    Zhang, Aizhu; Sun, Genyun; Wang, Zhenjie

    2015-12-01

    The serious information redundancy in hyperspectral images (HIs) does not improve data analysis accuracy; instead, it requires expensive computational resources. Consequently, to identify the most useful and valuable information in HIs, and thereby improve the accuracy of data analysis, this paper proposes a novel hyperspectral band selection method using a hybrid genetic algorithm and gravitational search algorithm (GA-GSA). In the proposed method, the GA-GSA is first mapped to the binary space. Then, the accuracy of a support vector machine (SVM) classifier and the number of selected spectral bands are used to measure the discriminative capability of the band subset. Finally, the band subset with the smallest number of spectral bands that still covers the most useful and valuable information is obtained. To verify the effectiveness of the proposed method, studies conducted on an AVIRIS image against two recently proposed state-of-the-art GSA variants are presented. The experimental results revealed the superiority of the proposed method and indicated that it can considerably reduce data storage costs and efficiently identify band subsets with stable and high classification precision.

  16. Comparative study of feature selection with ensemble learning using SOM variants

    NASA Astrophysics Data System (ADS)

    Filali, Ameni; Jlassi, Chiraz; Arous, Najet

    2017-03-01

    Ensemble learning has improved the stability and accuracy of clustering, but its runtime prevents it from scaling up to real-world applications. This study addresses the problem of selecting, for every cluster, a subset of the most pertinent features of a dataset. The proposed method extends the Random Forests approach to unlabeled data using self-organizing map (SOM) variants, estimating out-of-bag feature importance from a set of partitions. Every partition is created using a different bootstrap sample and a random subset of the features. We then show that the internal estimates used to measure variable relevance in Random Forests are also applicable to feature selection in unsupervised learning. This approach aims at dimensionality reduction, visualization and cluster characterization at the same time. We provide empirical results on nineteen benchmark data sets indicating that RFS can lead to significant improvements in clustering accuracy over several state-of-the-art unsupervised methods, with a very limited subset of features. The approach shows promise for very broad domains.

  17. The Subset Sum game.

    PubMed

    Darmann, Andreas; Nicosia, Gaia; Pferschy, Ulrich; Schauer, Joachim

    2014-03-16

    In this work we address a game theoretic variant of the Subset Sum problem, in which two decision makers (agents/players) compete for the usage of a common resource represented by a knapsack capacity. Each agent owns a set of integer weighted items and wants to maximize the total weight of its own items included in the knapsack. The solution is built as follows: Each agent, in turn, selects one of its items (not previously selected) and includes it in the knapsack if there is enough capacity. The process ends when the remaining capacity is too small for including any item left. We look at the problem from a single agent point of view and show that finding an optimal sequence of items to select is an NP-hard problem. Therefore we propose two natural heuristic strategies and analyze their worst-case performance when (1) the opponent is able to play optimally and (2) the opponent adopts a greedy strategy. From a centralized perspective we observe that some known results on the approximation of the classical Subset Sum can be effectively adapted to the multi-agent version of the problem.
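
    The following sketch simulates one plausible reading of this sequential protocol with both agents playing the greedy strategy analysed in the paper (always propose the heaviest remaining item that fits). Item weights and capacity are illustrative, not taken from the paper.

    ```python
    # Two agents alternate turns; each greedily places its heaviest remaining
    # item that still fits in the knapsack. The game ends when neither agent
    # can place any remaining item.
    def greedy_pick(items, remaining):
        fitting = [w for w in items if w <= remaining]
        return max(fitting) if fitting else None

    def play(items_a, items_b, capacity):
        items = {"A": list(items_a), "B": list(items_b)}
        totals = {"A": 0, "B": 0}
        turn = "A"
        while True:
            other = "B" if turn == "A" else "A"
            pick = greedy_pick(items[turn], capacity)
            if pick is not None:
                items[turn].remove(pick)
                totals[turn] += pick
                capacity -= pick
            elif greedy_pick(items[other], capacity) is None:
                break  # neither agent can place any remaining item
            turn = other
        return totals

    print(play([7, 5, 4], [6, 3, 2], capacity=15))  # -> {'A': 7, 'B': 8}
    ```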

  18. The Subset Sum game

    PubMed Central

    Darmann, Andreas; Nicosia, Gaia; Pferschy, Ulrich; Schauer, Joachim

    2014-01-01

    In this work we address a game theoretic variant of the Subset Sum problem, in which two decision makers (agents/players) compete for the usage of a common resource represented by a knapsack capacity. Each agent owns a set of integer weighted items and wants to maximize the total weight of its own items included in the knapsack. The solution is built as follows: Each agent, in turn, selects one of its items (not previously selected) and includes it in the knapsack if there is enough capacity. The process ends when the remaining capacity is too small for including any item left. We look at the problem from a single agent point of view and show that finding an optimal sequence of items to select is an NP-hard problem. Therefore we propose two natural heuristic strategies and analyze their worst-case performance when (1) the opponent is able to play optimally and (2) the opponent adopts a greedy strategy. From a centralized perspective we observe that some known results on the approximation of the classical Subset Sum can be effectively adapted to the multi-agent version of the problem. PMID:25844012

  19. Aeroelastic Model Structure Computation for Envelope Expansion

    NASA Technical Reports Server (NTRS)

    Kukreja, Sunil L.

    2007-01-01

    Structure detection is a procedure for selecting a subset of candidate terms, from a full model description, that best describes the observed output. This is a necessary procedure for computing an efficient system description which may afford greater insight into the functionality of the system or a simpler controller design. Structure computation as a tool for black-box modelling may be of critical importance in the development of robust, parsimonious models for the flight-test community. Moreover, this approach may lead to efficient strategies for rapid envelope expansion which may save significant development time and costs. In this study, a least absolute shrinkage and selection operator (LASSO) technique is investigated for computing efficient model descriptions of nonlinear aeroelastic systems. The LASSO minimises the residual sum of squares with the addition of an l(sub 1) penalty term on the parameter vector of the traditional l(sub 2) minimisation problem. Its use for structure detection is a natural extension of this constrained minimisation approach to pseudolinear regression problems, as it produces some model parameters that are exactly zero and therefore yields a parsimonious system description. The applicability of this technique to model structure computation for the F/A-18 Active Aeroelastic Wing is shown using flight test data at several flight conditions (Mach numbers), by identifying a parsimonious system description with a high percent fit for cross-validated data.
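
    A minimal sketch of LASSO-based structure detection on synthetic data (not the F/A-18 flight-test setup): fit with an l(sub 1) penalty and keep only the candidate regressors with nonzero coefficients.

    ```python
    # LASSO zeroes out coefficients of irrelevant candidate terms, so the
    # surviving nonzero coefficients define the detected model structure.
    import numpy as np
    from sklearn.linear_model import Lasso

    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 10))           # 10 candidate model terms
    true_coef = np.zeros(10)
    true_coef[[0, 3, 7]] = [1.5, -2.0, 0.8]  # sparse "true" structure
    y = X @ true_coef + 0.05 * rng.normal(size=200)

    model = Lasso(alpha=0.05).fit(X, y)
    selected = np.flatnonzero(model.coef_)   # indices of retained terms
    print("selected terms:", selected)       # typically [0, 3, 7]
    ```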

  20. Better diagnostic accuracy of neuropathy in obesity: A new challenge for neurologists.

    PubMed

    Callaghan, Brian C; Xia, Rong; Reynolds, Evan; Banerjee, Mousumi; Burant, Charles; Rothberg, Amy; Pop-Busui, Rodica; Villegas-Umana, Emily; Feldman, Eva L

    2018-03-01

    To determine the comparative diagnostic characteristics of neuropathy measures in an obese population, we recruited obese participants from the University of Michigan's Weight Management Program. Receiver operating characteristic analysis determined the area under the curve (AUC) of neuropathy measures for distal symmetric polyneuropathy (DSP), small fiber neuropathy (SFN), and cardiovascular autonomic neuropathy (CAN). The best test combinations were determined using stepwise and score-based subset selection models. We enrolled 120 obese participants. For DSP, seven of 42 neuropathy measures (Utah Early Neuropathy Score (UENS, N = 62), Michigan Neuropathy Screening Instrument (MNSI) reduced combined index, MNSI examination, nerve fiber density (NFD) leg, tibial F response, MNSI questionnaire, peroneal distal motor latency) had AUCs ≥ 0.75. Three of 19 small fiber nerve measures for SFN (UENS, NFD leg, Sudoscan feet (N = 70)) and zero of 16 CAN measures had AUCs ≥ 0.75. Combinations of tests performed better than individual tests, with AUCs of 0.82 for DSP (two parameters) and 0.84 for SFN (three parameters). Many neuropathy measures demonstrate good test performance for DSP in obese participants. A select few small fiber nerve measures performed well for SFN, and none for CAN. Specific combinations of tests should be used in research studies to maximize diagnostic performance in obese cohorts. Published by Elsevier B.V.

  1. Evaluating low pass filters on SPECT reconstructed cardiac orientation estimation

    NASA Astrophysics Data System (ADS)

    Dwivedi, Shekhar

    2009-02-01

    Low pass filters can affect the quality of clinical SPECT images by smoothing. Appropriate filter and parameter selection leads to optimum smoothing, which in turn yields better quantification, correct diagnosis, and accurate interpretation by the physician. This study aims at evaluating low pass filters used with SPECT reconstruction algorithms. The criterion for evaluating the filters is the estimate of the SPECT-reconstructed cardiac azimuth and elevation angles. The low pass filters studied are Butterworth, Gaussian, Hamming, Hanning and Parzen. Experiments are conducted using three reconstruction algorithms, FBP (filtered back projection), MLEM (maximum likelihood expectation maximization) and OSEM (ordered subsets expectation maximization), on four gated cardiac patient projections (two patients, each with stress and rest projections). Each filter is applied with varying cutoff and order for each reconstruction algorithm (only Butterworth is used for MLEM and OSEM). The azimuth and elevation angles are calculated from the reconstructed volume, and the variation observed in the angles with varying filter parameters is reported. Our results demonstrate that the behavior of the Hamming, Hanning and Parzen filters (used with FBP) with varying cutoff is similar for all the datasets. The Butterworth filter (cutoff > 0.4) behaves similarly for all the datasets using all the algorithms, whereas with OSEM for a cutoff < 0.4 it fails to generate the cardiac orientation due to oversmoothing, and it gives an unstable response with FBP and MLEM. This study of the effect of low pass filter cutoff and order on cardiac orientation, using three different reconstruction algorithms, provides an interesting insight into optimal selection of filter parameters.
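
    The sketch below illustrates, on a synthetic 1D noisy profile rather than SPECT projection data, how Butterworth cutoff (in normalised frequency, as in the study) and order control the degree of smoothing.

    ```python
    # Lower cutoff and higher order both remove more high-frequency content;
    # the printed residual std shows how much of the signal was smoothed away.
    import numpy as np
    from scipy.signal import butter, filtfilt

    x = np.linspace(0, 1, 256)
    signal = np.sin(2 * np.pi * 3 * x) + 0.3 * np.random.default_rng(2).normal(size=256)

    for cutoff, order in [(0.2, 4), (0.4, 4), (0.4, 8)]:
        b, a = butter(order, cutoff)        # low-pass by default
        smoothed = filtfilt(b, a, signal)
        print(cutoff, order, np.std(signal - smoothed).round(3))
    ```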

  2. ProSelection: A Novel Algorithm to Select Proper Protein Structure Subsets for in Silico Target Identification and Drug Discovery Research.

    PubMed

    Wang, Nanyi; Wang, Lirong; Xie, Xiang-Qun

    2017-11-27

    Molecular docking is widely applied to computer-aided drug design and has become relatively mature in recent decades. Applications of docking in modeling vary from single lead compound optimization to large-scale virtual screening. The performance of molecular docking is highly dependent on the protein structures selected. Structure selection is especially challenging for large-scale target prediction research when multiple structures are available for a single target. We have therefore established ProSelection, a docking-preferred protein selection algorithm, in order to generate proper structure subset(s). With the ProSelection algorithm, protein structures of "weak selectors" are filtered out whereas structures of "strong selectors" are kept. Specifically, a structure with good statistical performance in distinguishing active from inactive ligands is defined as a strong selector. In this study, 249 protein structures of 14 autophagy-related targets were investigated. Surflex-dock was used as the docking engine to distinguish active and inactive compounds against these protein structures. Both the t test and the Mann-Whitney U test were used to distinguish strong from weak selectors, based on the normality of the docking score distribution. The suggested docking score threshold for active ligands (SDA) was generated for each strong selector structure according to the receiver operating characteristic (ROC) curve. The performance of ProSelection was further validated by predicting the potential off-targets of 43 U.S. Food and Drug Administration approved small molecule antineoplastic drugs. Overall, ProSelection will accelerate the computational work in protein structure selection and could be a useful tool for molecular docking, target prediction, and protein-chemical database establishment research.
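
    As a sketch of the "strong selector" test (assuming, for illustration, that higher docking scores indicate better predicted binding and that a 0.05 significance threshold is used; the scores below are synthetic):

    ```python
    # A structure is kept as a strong selector if its docking scores separate
    # active from inactive ligands significantly (Mann-Whitney U test).
    import numpy as np
    from scipy.stats import mannwhitneyu

    rng = np.random.default_rng(3)
    active_scores = rng.normal(7.5, 1.0, size=50)     # docking scores of actives
    inactive_scores = rng.normal(5.5, 1.0, size=200)  # docking scores of inactives

    stat, p = mannwhitneyu(active_scores, inactive_scores, alternative="greater")
    print("strong selector" if p < 0.05 else "weak selector", p)
    ```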

  3. Subunit-dependent postsynaptic expression of kainate receptors on hippocampal interneurons in area CA1

    PubMed Central

    Wondolowski, Joyce; Frerking, Matthew

    2009-01-01

    Kainate receptors (KARs) contribute to postsynaptic excitation in only a select subset of neurons. To define the parameters that specify the postsynaptic expression of KARs, we examined the contribution of KARs to EPSCs on hippocampal interneurons in area CA1. Interneurons in stratum radiatum/lacunosum-moleculare (SR/SLM) express KARs both with and without the GluR5 subunit, but KAR-mediated EPSCs are generated mainly, if not entirely, by GluR5-containing KARs. Extrasynaptic glutamate spillover profoundly recruits AMPARs with little effect on KARs, indicating that KARs are targeted at the synapse more precisely than AMPARs. However, spontaneous EPSCs with a conventional AMPAR component did not have a resolvable contribution of KARs, suggesting that the KARs that contribute to the evoked EPSCs are at a distinct set of synapses. GluR5-containing KARs on interneurons in stratum oriens do not contribute substantially to the EPSC. We conclude that KARs are localized to synapses by cell type-, synapse-, and subunit-selective mechanisms. PMID:19144856

  4. A Multifeatures Fusion and Discrete Firefly Optimization Method for Prediction of Protein Tyrosine Sulfation Residues.

    PubMed

    Guo, Song; Liu, Chunhua; Zhou, Peng; Li, Yanling

    2016-01-01

    Tyrosine sulfation is one of the ubiquitous protein posttranslational modifications, in which sulfate groups are added to tyrosine residues. It plays significant roles in various physiological processes in eukaryotic cells. To explore the molecular mechanism of tyrosine sulfation, one of the prerequisites is to correctly identify possible protein tyrosine sulfation residues. In this paper, a novel method is presented to predict protein tyrosine sulfation residues from primary sequences. By means of informative feature construction and an elaborate feature selection and parameter optimization scheme, the proposed predictor achieved promising results and outperformed many other state-of-the-art predictors. Using the optimal feature subset, the proposed method achieved a mean MCC of 94.41% on the benchmark dataset, and an MCC of 90.09% on the independent dataset. The experimental performance indicates that the new proposed method can be effective in identifying important protein posttranslational modifications and that the feature selection scheme will be powerful in protein functional residue prediction research.

  5. A Multifeatures Fusion and Discrete Firefly Optimization Method for Prediction of Protein Tyrosine Sulfation Residues

    PubMed Central

    Liu, Chunhua; Zhou, Peng; Li, Yanling

    2016-01-01

    Tyrosine sulfation is one of the ubiquitous protein posttranslational modifications, in which sulfate groups are added to tyrosine residues. It plays significant roles in various physiological processes in eukaryotic cells. To explore the molecular mechanism of tyrosine sulfation, one of the prerequisites is to correctly identify possible protein tyrosine sulfation residues. In this paper, a novel method is presented to predict protein tyrosine sulfation residues from primary sequences. By means of informative feature construction and an elaborate feature selection and parameter optimization scheme, the proposed predictor achieved promising results and outperformed many other state-of-the-art predictors. Using the optimal feature subset, the proposed method achieved a mean MCC of 94.41% on the benchmark dataset, and an MCC of 90.09% on the independent dataset. The experimental performance indicates that the new proposed method can be effective in identifying important protein posttranslational modifications and that the feature selection scheme will be powerful in protein functional residue prediction research. PMID:27034949

  6. PSYCHE Pure Shift NMR Spectroscopy.

    PubMed

    Foroozandeh, Mohammadali; Morris, Gareth; Nilsson, Mathias

    2018-03-13

    Broadband homodecoupling techniques in NMR, also known as "pure shift" methods, aim to enhance spectral resolution by suppressing the effects of homonuclear coupling interactions to turn multiplet signals into singlets. Such techniques typically work by selecting a subset of "active" nuclear spins to observe, and selectively inverting the remaining, "passive", spins to reverse the effects of coupling. Pure Shift Yielded by Chirp Excitation (PSYCHE) is one such method; it is relatively recent, but has already been successfully implemented in a range of different NMR experiments. Paradoxically, PSYCHE is one of the trickiest of pure shift NMR techniques to understand but one of the easiest to use. Here we offer some insights into theoretical and practical aspects of the method, and into the effects and importance of the experimental parameters. Some recent improvements that enhance the spectral purity of PSYCHE spectra will be presented, and some experimental frameworks including examples in 1D and 2D NMR spectroscopy, for the implementation of PSYCHE will be introduced. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  7. Delineation of soil temperature regimes from HCMM data

    NASA Technical Reports Server (NTRS)

    Day, R. L.; Petersen, G. W. (Principal Investigator)

    1981-01-01

    Supplementary data including photographs as well as topographic, geologic, and soil maps were obtained and evaluated for ground truth purposes and control point selection. A study area (approximately 450 by 450 pixels) was subset from LANDSAT scene No. 2477-17142. Geometric corrections and scaling were performed. Initial enhancement techniques were applied to aid control point selection and soils interpretation. The SUBSET program was modified to read HCMM tapes, and HCMM data were reformatted to be compatible with the ORSER system. Initial NMAP products of geometrically corrected and scaled raw data tapes (unregistered) of the study area were produced.

  8. Decision Aids for Airborne Intercept Operations in Advanced Aircraft

    NASA Technical Reports Server (NTRS)

    Madni, A.; Freedy, A.

    1981-01-01

    A tactical decision aid (TDA) for the F-14 aircrew, i.e., the naval flight officer and pilot, in conducting a multitarget attack during the performance of a Combat Air Patrol (CAP) role is presented. The TDA employs hierarchical multiattribute utility models for characterizing mission objectives in operationally measurable terms, rule based AI-models for tactical posture selection, and fast time simulation for maneuver consequence prediction. The TDA makes aspect maneuver recommendations, selects and displays the optimum mission posture, evaluates attackable and potentially attackable subsets, and recommends the 'best' attackable subset along with the required course perturbation.

  9. Evaluation of reconstruction techniques in regional cerebral blood flow SPECT using trade-off plots: a Monte Carlo study.

    PubMed

    Olsson, Anna; Arlig, Asa; Carlsson, Gudrun Alm; Gustafsson, Agnetha

    2007-09-01

    The image quality of single photon emission computed tomography (SPECT) depends on the reconstruction algorithm used. The purpose of the present study was to evaluate parameters in ordered subset expectation maximization (OSEM) and to compare it systematically with filtered back-projection (FBP) for reconstruction of regional cerebral blood flow (rCBF) SPECT, incorporating attenuation and scatter correction. The evaluation was based on the trade-off between contrast recovery and statistical noise, using different subset sizes, numbers of iterations and filter parameters. Monte Carlo simulated SPECT studies of a digital human brain phantom were used. Contrast recovery was calculated as measured contrast divided by true contrast. Statistical noise in the reconstructed images was calculated as the coefficient of variation in pixel values. A constant contrast level was reached above 195 equivalent maximum likelihood expectation maximization iterations. The choice of subset size was not crucial as long as there were ≥2 projections per subset. OSEM reconstruction was found to give 5-14% higher contrast recovery than FBP at all clinically relevant noise levels in rCBF SPECT. The Butterworth filter, power 6, achieved the highest stable contrast recovery level at all clinically relevant noise levels. The cut-off frequency should be chosen according to the noise level accepted in the image. Trade-off plots are shown to be a practical way of deciding the number of iterations and the subset size for OSEM reconstruction, and can be used for other examination types in nuclear medicine.
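
    The two axes of such a trade-off plot can be computed as below (the "equivalent MLEM iterations" referred to above are conventionally the OSEM iteration count multiplied by the number of subsets). The ROI arrays here are synthetic stand-ins for pixel values sampled from a reconstructed phantom image.

    ```python
    # Contrast recovery = measured contrast / true contrast;
    # statistical noise = coefficient of variation of background pixels.
    import numpy as np

    def contrast_recovery(roi_hot, roi_bkg, true_contrast):
        measured = (roi_hot.mean() - roi_bkg.mean()) / roi_bkg.mean()
        return measured / true_contrast

    def noise_cov(roi_bkg):
        return roi_bkg.std() / roi_bkg.mean()

    rng = np.random.default_rng(4)
    hot = rng.normal(80.0, 6.0, size=500)    # pixels in a high-uptake region
    bkg = rng.normal(50.0, 5.0, size=2000)   # background pixels
    print(contrast_recovery(hot, bkg, true_contrast=0.8), noise_cov(bkg))
    ```

    Plotting these two values for each reconstruction setting (iterations, subset size, filter) traces the trade-off curves used in the study.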

  10. Can We Predict Patient Wait Time?

    PubMed

    Pianykh, Oleg S; Rosenthal, Daniel I

    2015-10-01

    The importance of patient wait-time management and predictability can hardly be overestimated: for most hospitals, it is the patient queues that drive and define every bit of clinical workflow. The objective of this work was to study the predictability of patient wait time and to identify its most influential predictors. To solve this problem, we developed a comprehensive list of 25 wait-related parameters, suggested in earlier work and observed in our own experiments. All parameters were chosen to be derivable from a typical Hospital Information System dataset. The parameters were fed into several time-predicting models, and the best parameter subsets, discovered through exhaustive model search, were applied to a large sample of actual patient wait data. We were able to discover the most efficient wait-time prediction factors and models, such as the line-size models introduced in this work. Moreover, these models proved to be equally accurate and computationally efficient. Finally, the selected models were implemented in our patient waiting areas, displaying predicted wait times on monitors located at the front desks. The limitations of these models are also discussed. Optimal regression models based on wait-line sizes can provide accurate and efficient predictions of patient wait time. Copyright © 2015 American College of Radiology. Published by Elsevier Inc. All rights reserved.
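
    A hedged sketch of a line-size model in the spirit described above: predict each patient's wait from the number of patients currently ahead in the queue. The data, the single-predictor choice, and the coefficients are illustrative, not the paper's fitted model.

    ```python
    # Simple regression of wait time on queue length ("line size").
    import numpy as np
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(5)
    line_size = rng.integers(0, 15, size=300).reshape(-1, 1)
    wait_min = 6.0 * line_size.ravel() + rng.normal(0, 5, size=300)  # synthetic

    model = LinearRegression().fit(line_size, wait_min)
    print(f"~{model.coef_[0]:.1f} min per patient ahead; "
          f"predicted wait for 5 ahead: {model.predict([[5]])[0]:.0f} min")
    ```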

  11. Association analysis using USDA diverse rice (Oryza sativa L.) germplasm collections to identify loci influencing grain quality traits

    USDA-ARS?s Scientific Manuscript database

The USDA rice (Oryza sativa L.) core subset (RCS) was assembled to represent the genetic diversity of the entire USDA-ARS National Small Grains Collection and consists of 1,794 accessions from 114 countries. The USDA rice mini-core (MC) is a subset of 217 accessions from the RCS and was selected to ...

  12. Targeting Stereotyped B Cell Receptors from Chronic Lymphocytic Leukemia Patients with Synthetic Antigen Surrogates.

    PubMed

    Sarkar, Mohosin; Liu, Yun; Qi, Junpeng; Peng, Haiyong; Morimoto, Jumpei; Rader, Christoph; Chiorazzi, Nicholas; Kodadek, Thomas

    2016-04-01

    Chronic lymphocytic leukemia (CLL) is a disease in which a single B-cell clone proliferates relentlessly in peripheral lymphoid organs, bone marrow, and blood. DNA sequencing experiments have shown that about 30% of CLL patients have stereotyped antigen-specific B-cell receptors (BCRs) with a high level of sequence homology in the variable domains of the heavy and light chains. These include many of the most aggressive cases that have IGHV-unmutated BCRs whose sequences have not diverged significantly from the germ line. This suggests a personalized therapy strategy in which a toxin or immune effector function is delivered selectively to the pathogenic B-cells but not to healthy B-cells. To execute this strategy, serum-stable, drug-like compounds able to target the antigen-binding sites of most or all patients in a stereotyped subset are required. We demonstrate here the feasibility of this approach with the discovery of selective, high affinity ligands for CLL BCRs of the aggressive, stereotyped subset 7P that cross-react with the BCRs of several CLL patients in subset 7P, but not with BCRs from patients outside this subset. © 2016 by The American Society for Biochemistry and Molecular Biology, Inc.

  13. The Model Human Processor and the Older Adult: Parameter Estimation and Validation within a Mobile Phone Task

    ERIC Educational Resources Information Center

    Jastrzembski, Tiffany S.; Charness, Neil

    2007-01-01

    The authors estimate weighted mean values for nine information processing parameters for older adults using the Card, Moran, and Newell (1983) Model Human Processor model. The authors validate a subset of these parameters by modeling two mobile phone tasks using two different phones and comparing model predictions to a sample of younger (N = 20;…

  14. Evaluation of the effect of selective serotonin-reuptake inhibitors on lymphocyte subsets in patients with a major depressive disorder.

    PubMed

    Hernandez, Maria Eugenia; Martinez-Fong, Daniel; Perez-Tapia, Mayra; Estrada-Garcia, Iris; Estrada-Parra, Sergio; Pavón, Lenin

    2010-02-01

    To date, only the effect of a short-term antidepressant treatment (<12 weeks) on neuroendocrinoimmune alterations in patients with a major depressive disorder has been evaluated. Our objective was to determine the effect of a 52-week long treatment with selective serotonin-reuptake inhibitors on lymphocyte subsets. The participants were thirty-one patients and twenty-two healthy volunteers. The final number of patients (10) resulted from selection and course, as detailed in the enrollment scheme. Methods used to psychiatrically analyze the participants included the Mini-International Neuropsychiatric Interview, Hamilton Depression Scale and Beck Depression Inventory. The peripheral lymphocyte subsets were measured in peripheral blood using flow cytometry. Before treatment, increased counts of natural killer (NK) cells in patients were statistically significant when compared with those of healthy volunteers (312 ± 29 versus 158 ± 30 cells/mL), but no differences in the populations of T and B cells were found. The patients showed remission of depressive episodes after 20 weeks of treatment along with an increase in NK cell and B cell populations, which remained increased until the end of the study. At the 52nd week of treatment, patients showed an increase in the counts of NK cells (396 ± 101 cells/mL) and B cells (268 ± 64 cells/mL) compared to healthy volunteers (NK, 159 ± 30 cells/mL; B cells, 179 ± 37 cells/mL). We conclude that long-term treatment with selective serotonin-reuptake inhibitors not only causes remission of depressive symptoms, but also affects lymphocyte subset populations. The physiopathological consequence of these changes remains to be determined.

  15. Updating estimates of low streamflow statistics to account for possible trends

    NASA Astrophysics Data System (ADS)

    Blum, A. G.; Archfield, S. A.; Hirsch, R. M.; Vogel, R. M.; Kiang, J. E.; Dudley, R. W.

    2017-12-01

    Given evidence of both increasing and decreasing trends in low flows in many streams, methods are needed to update estimators of low-flow statistics used in water resources management. One such metric is the 10-year annual low-flow statistic (7Q10), calculated as the annual minimum seven-day streamflow which is exceeded in nine out of ten years on average. Historical streamflow records may not be representative of current conditions at a site if environmental conditions are changing. We present a new approach to frequency estimation under nonstationary conditions that applies a stationary nonparametric quantile estimator to a subset of the annual minimum flow record. Monte Carlo simulation experiments were used to evaluate this approach across a range of trend and no-trend scenarios. Relative to the standard practice of using the entire available streamflow record, use of a nonparametric quantile estimator combined with selection of the most recent 30 or 50 years for 7Q10 estimation was found to improve accuracy and reduce bias. Benefits of the data subset selection approaches were greater for higher magnitude trends and for annual minimum flow records with lower coefficients of variation. A nonparametric trend test approach for subset selection did not significantly improve upon always selecting the last 30 years of record. At 174 stream gages in the Chesapeake Bay region, 7Q10 estimators based on the most recent 30 years of flow record were compared to estimators based on the entire period of record. Given the availability of long records of low streamflow, using only a subset of the flow record (~30 years) can update 7Q10 estimators to better reflect current streamflow conditions.
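
    A minimal sketch of the proposed estimator, assuming a Weibull plotting-position quantile (the specific nonparametric estimator is an assumption, and `method="weibull"` requires NumPy 1.22 or later); the flow record is synthetic.

    ```python
    # 7Q10 = the 0.1 nonexceedance quantile of annual minimum 7-day flows,
    # here estimated from only the most recent 30 years of record.
    import numpy as np

    def q7_10(annual_min_7day, window=30):
        recent = np.asarray(annual_min_7day)[-window:]
        return np.quantile(recent, 0.1, method="weibull")

    rng = np.random.default_rng(6)
    flows = rng.lognormal(mean=2.0, sigma=0.4, size=80)  # 80 years of record
    flows[-30:] *= 0.9                                   # mild downward trend
    print(q7_10(flows))
    ```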

  16. Spectral Band Selection for Urban Material Classification Using Hyperspectral Libraries

    NASA Astrophysics Data System (ADS)

    Le Bris, A.; Chehata, N.; Briottet, X.; Paparoditis, N.

    2016-06-01

    In urban areas, information concerning very high resolution land cover, and especially material maps, is necessary for several city modelling or monitoring applications. That is to say, knowledge concerning the roofing materials or the different kinds of ground areas is required. Airborne remote sensing techniques appear to be convenient for providing such information at a large scale. However, results obtained using most traditional processing methods based on usual red-green-blue-near infrared multispectral images remain limited for such applications. A possible way to improve classification results is to enhance the imagery's spectral resolution using superspectral or hyperspectral sensors. This study is intended to help design a superspectral sensor dedicated to urban material classification, and the work focuses particularly on the selection of optimal spectral band subsets for such a sensor. First, reflectance spectral signatures of urban materials were collected from 7 spectral libraries. Then, spectral optimization was performed using this data set. The band selection workflow included two steps: first optimising the number of spectral bands using an incremental method, and then examining several possible optimised band subsets using a stochastic algorithm. The same wrapper relevance criterion, relying on a confidence measure of a Random Forests classifier, was used at both steps. To cope with the limited number of available spectra for several classes, additional synthetic spectra were generated from the collection of reference spectra: intra-class variability was simulated by multiplying reference spectra by a random coefficient. Finally, the selected band subsets were evaluated considering the classification quality reached using an RBF SVM classifier. It was confirmed that a limited band subset is sufficient to classify common urban materials. The important contribution of bands from the Short Wave Infra-Red (SWIR) spectral domain (1000-2400 nm) to material classification was also shown.
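
    A sketch of the first, incremental step: greedy forward band selection scored by a Random Forest wrapper. The out-of-bag score stands in for the paper's confidence-based criterion, and the data are synthetic.

    ```python
    # Greedy forward selection: at each step, add the band that most improves
    # the Random Forest out-of-bag score of the current subset.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    def forward_select(X, y, n_bands):
        selected, remaining = [], list(range(X.shape[1]))
        for _ in range(n_bands):
            def score(b):
                rf = RandomForestClassifier(n_estimators=100, oob_score=True,
                                            random_state=0)
                return rf.fit(X[:, selected + [b]], y).oob_score_
            best = max(remaining, key=score)
            selected.append(best)
            remaining.remove(best)
        return selected

    rng = np.random.default_rng(7)
    X = rng.normal(size=(150, 30))              # 150 spectra x 30 bands (synthetic)
    y = (X[:, 3] + X[:, 11] > 0).astype(int)
    print(forward_select(X, y, n_bands=2))      # often recovers bands 3 and 11
    ```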

  17. Collectively loading an application in a parallel computer

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Aho, Michael E.; Attinella, John E.; Gooding, Thomas M.

    Collectively loading an application in a parallel computer, the parallel computer comprising a plurality of compute nodes, including: identifying, by a parallel computer control system, a subset of compute nodes in the parallel computer to execute a job; selecting, by the parallel computer control system, one of the subset of compute nodes in the parallel computer as a job leader compute node; retrieving, by the job leader compute node from computer memory, an application for executing the job; and broadcasting, by the job leader to the subset of compute nodes in the parallel computer, the application for executing the job.
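
    A hedged sketch of this job-leader pattern using mpi4py (an illustration only; the record does not name an implementation): rank 0 of a subset communicator reads the application image once and broadcasts it, so only one node touches the filesystem. The subset choice and the path `app.bin` are hypothetical, and the script must run under an MPI launcher (e.g. `mpirun -n 8 python load.py`).

    ```python
    from mpi4py import MPI

    world = MPI.COMM_WORLD
    in_job = world.rank < 4                        # hypothetical subset: first 4 ranks
    subset = world.Split(color=0 if in_job else 1)

    if in_job:
        payload = None
        if subset.rank == 0:                       # the job-leader compute node
            with open("app.bin", "rb") as f:       # hypothetical application image
                payload = f.read()
        payload = subset.bcast(payload, root=0)    # every subset node now holds it
    ```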

  18. Non-native Speech Perception Training Using Vowel Subsets: Effects of Vowels in Sets and Order of Training

    PubMed Central

    Nishi, Kanae; Kewley-Port, Diane

    2008-01-01

    Purpose Nishi and Kewley-Port (2007) trained Japanese listeners to perceive nine American English monophthongs and showed that a protocol using all nine vowels (fullset) produced better results than one using only the three more difficult vowels (subset). The present study extended the target population to Koreans and examined whether protocols combining the two stimulus sets would provide more effective training. Method Three groups of five Korean listeners were trained on American English vowels for nine days using one of three protocols: fullset only, the first three days on subset then six days on fullset, or the first six days on fullset then three days on subset. Participants' performance was assessed by pre- and post-training tests, as well as by a mid-training test. Results 1) Fullset training was also effective for Koreans; 2) no advantage was found for the two combined protocols over the fullset-only protocol; and 3) sustained “non-improvement” was observed for training using one of the combined protocols. Conclusions In using subsets for training American English vowels, care should be taken not only in the selection of subset vowels, but also in the order in which subsets are trained. PMID:18664694

  19. A local segmentation parameter optimization approach for mapping heterogeneous urban environments using VHR imagery

    NASA Astrophysics Data System (ADS)

    Grippa, Tais; Georganos, Stefanos; Lennert, Moritz; Vanhuysse, Sabine; Wolff, Eléonore

    2017-10-01

    Mapping large heterogeneous urban areas using object-based image analysis (OBIA) remains challenging, especially with respect to the segmentation process. This can be explained both by the complex arrangement of heterogeneous land-cover classes and by the high diversity of urban patterns encountered throughout the scene. In this context, obtaining satisfying segmentation results for the whole scene with a single segmentation parameter can be impossible. Nonetheless, it is possible to subdivide the whole city into smaller local zones that are rather homogeneous in their urban pattern. These zones can then be used to optimize the segmentation parameter locally, instead of using the whole image or a single representative spatial subset. This paper assesses the contribution of a local approach to segmentation parameter optimization compared with a global approach. Ouagadougou, located in sub-Saharan Africa, is used as a case study. First, the whole scene is segmented using a single globally optimized segmentation parameter. Second, the city is subdivided into 283 local zones, homogeneous in terms of building size and building density. Each local zone is then segmented using a locally optimized segmentation parameter. Unsupervised segmentation parameter optimization (USPO), relying on an optimization function that tends to maximize both intra-object homogeneity and inter-object heterogeneity, is used to select the segmentation parameter automatically in both approaches. Finally, a land-use/land-cover classification is performed using the Random Forest (RF) classifier. The results reveal that the local approach outperforms the global one, especially by limiting confusion between buildings and their bare-soil neighbors.
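
    A sketch of the USPO idea for one zone, assuming the two quality measures (weighted intra-segment variance for homogeneity, Moran's I of neighbouring segments for heterogeneity) have already been computed for each candidate parameter; both are normalised so that lower raw values score higher, and the candidate with the best combined score is kept. The numbers are illustrative.

    ```python
    # Combined objective: normalised homogeneity + normalised heterogeneity,
    # following the common (max - x) / (max - min) normalisation.
    import numpy as np

    def uspo_select(params, weighted_variance, morans_i):
        wv = np.asarray(weighted_variance, float)
        mi = np.asarray(morans_i, float)
        f_homog = (wv.max() - wv) / (wv.max() - wv.min())  # low variance is good
        f_heter = (mi.max() - mi) / (mi.max() - mi.min())  # low Moran's I is good
        return params[int(np.argmax(f_homog + f_heter))]

    # One zone, five candidate parameter values:
    print(uspo_select([0.02, 0.05, 0.1, 0.2, 0.4],
                      [12.0, 9.5, 8.0, 9.0, 14.0],
                      [0.60, 0.45, 0.30, 0.35, 0.70]))   # -> 0.1
    ```

    Repeating this per local zone yields the locally optimized parameters used in the paper's second approach.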

  20. Rough sets and Laplacian score based cost-sensitive feature selection

    PubMed Central

    Yu, Shenglong

    2018-01-01

    Cost-sensitive feature selection learning is an important preprocessing step in machine learning and data mining. Recently, most existing cost-sensitive feature selection algorithms are heuristic algorithms, which evaluate the importance of each feature individually and select features one by one. Obviously, these algorithms do not consider the relationship among features. In this paper, we propose a new algorithm for minimal cost feature selection called the rough sets and Laplacian score based cost-sensitive feature selection. The importance of each feature is evaluated by both rough sets and Laplacian score. Compared with heuristic algorithms, the proposed algorithm takes into consideration the relationship among features with locality preservation of Laplacian score. We select a feature subset with maximal feature importance and minimal cost when cost is undertaken in parallel, where the cost is given by three different distributions to simulate different applications. Different from existing cost-sensitive feature selection algorithms, our algorithm simultaneously selects a predetermined number of “good” features. Extensive experimental results show that the approach is efficient and able to effectively obtain the minimum cost subset. In addition, the results of our method are more promising than the results of other cost-sensitive feature selection algorithms. PMID:29912884

  1. Rough sets and Laplacian score based cost-sensitive feature selection.

    PubMed

    Yu, Shenglong; Zhao, Hong

    2018-01-01

    Cost-sensitive feature selection learning is an important preprocessing step in machine learning and data mining. Recently, most existing cost-sensitive feature selection algorithms are heuristic algorithms, which evaluate the importance of each feature individually and select features one by one. Obviously, these algorithms do not consider the relationship among features. In this paper, we propose a new algorithm for minimal cost feature selection called the rough sets and Laplacian score based cost-sensitive feature selection. The importance of each feature is evaluated by both rough sets and Laplacian score. Compared with heuristic algorithms, the proposed algorithm takes into consideration the relationship among features with locality preservation of Laplacian score. We select a feature subset with maximal feature importance and minimal cost when cost is undertaken in parallel, where the cost is given by three different distributions to simulate different applications. Different from existing cost-sensitive feature selection algorithms, our algorithm simultaneously selects a predetermined number of "good" features. Extensive experimental results show that the approach is efficient and able to effectively obtain the minimum cost subset. In addition, the results of our method are more promising than the results of other cost-sensitive feature selection algorithms.
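
    For reference, a minimal sketch of the Laplacian score used in these two records (lower scores indicate better locality preservation). The k-NN heat-kernel graph and its hyperparameters are assumptions, not the papers' exact construction.

    ```python
    # Laplacian score per feature: L_r = f~' L f~ / f~' D f~, where f~ is the
    # feature with its D-weighted mean removed, L = D - W the graph Laplacian.
    import numpy as np
    from sklearn.neighbors import kneighbors_graph

    def laplacian_scores(X, k=5, t=1.0):
        W = kneighbors_graph(X, k, mode="distance").toarray()
        W = np.where(W > 0, np.exp(-W**2 / t), 0.0)   # heat-kernel weights
        W = np.maximum(W, W.T)                        # symmetrise
        D = np.diag(W.sum(axis=1))
        L = D - W
        ones = np.ones(len(X))
        scores = []
        for f in X.T:
            f_hat = f - (f @ D @ ones) / (ones @ D @ ones)  # remove weighted mean
            scores.append((f_hat @ L @ f_hat) / (f_hat @ D @ f_hat))
        return np.array(scores)

    X = np.random.default_rng(8).normal(size=(60, 8))
    print(laplacian_scores(X).round(3))
    ```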

  2. Comparison of Different EHG Feature Selection Methods for the Detection of Preterm Labor

    PubMed Central

    Alamedine, D.; Khalil, M.; Marque, C.

    2013-01-01

    Numerous types of linear and nonlinear features have been extracted from the electrohysterogram (EHG) in order to classify labor and pregnancy contractions. As a result, the number of available features is now very large. The goal of this study is to reduce the number of features by selecting only the relevant ones which are useful for solving the classification problem. This paper presents three methods for feature subset selection that can be applied to choose the best subsets for classifying labor and pregnancy contractions: an algorithm using the Jeffrey divergence (JD) distance, a sequential forward selection (SFS) algorithm, and a binary particle swarm optimization (BPSO) algorithm. The last two methods are based on a classifier and were tested with three types of classifiers. These methods have allowed us to identify common features which are relevant for contraction classification. PMID:24454536

  3. HIV-1 protease cleavage site prediction based on two-stage feature selection method.

    PubMed

    Niu, Bing; Yuan, Xiao-Cheng; Roeper, Preston; Su, Qiang; Peng, Chun-Rong; Yin, Jing-Yuan; Ding, Juan; Li, HaiPeng; Lu, Wen-Cong

    2013-03-01

    Knowledge of the mechanism of HIV protease cleavage specificity is critical to the design of specific and effective HIV inhibitors. An accurate, robust, and rapid method to correctly predict the cleavage sites in proteins is crucial when searching for possible HIV inhibitors. In this article, HIV-1 protease specificity was studied using the correlation-based feature subset (CfsSubset) selection method combined with a genetic algorithm. Thirty important biochemical features were found based on a jackknife test from the original data set containing 4,248 features. Using the AdaBoost method with the thirty selected features, the prediction model yields an accuracy of 96.7% for the jackknife test and 92.1% for an independent set test, with increased accuracy over the original dataset of 6.7% and 77.4%, respectively. Our feature selection scheme could be a useful technique for finding effective competitive inhibitors of HIV protease.

  4. Selective principal component regression analysis of fluorescence hyperspectral image to assess aflatoxin contamination in corn

    USDA-ARS?s Scientific Manuscript database

    Selective principal component regression analysis (SPCR) uses a subset of the original image bands for principal component transformation and regression. For optimal band selection before the transformation, this paper used genetic algorithms (GA). In this case, the GA process used the regression co...

  5. Sample size determination for bibliographic retrieval studies

    PubMed Central

    Yao, Xiaomei; Wilczynski, Nancy L; Walter, Stephen D; Haynes, R Brian

    2008-01-01

    Background Research for developing search strategies to retrieve high-quality clinical journal articles from MEDLINE is expensive and time-consuming. The objective of this study was to determine the minimal number of high-quality articles in a journal subset that would need to be hand-searched to update or create new MEDLINE search strategies for treatment, diagnosis, and prognosis studies. Methods The desired width of the 95% confidence intervals (W) for the lowest sensitivity among existing search strategies was used to calculate the number of high-quality articles needed to reliably update search strategies. New search strategies were derived in journal subsets formed by 2 approaches: random sampling of journals and top journals (having the most high-quality articles). The new strategies were tested in both the original large journal database and in a low-yielding journal (having few high-quality articles) subset. Results For treatment studies, if W was 10% or less for the lowest sensitivity among our existing search strategies, a subset of 15 randomly selected journals or 2 top journals were adequate for updating search strategies, based on each approach having at least 99 high-quality articles. The new strategies derived in 15 randomly selected journals or 2 top journals performed well in the original large journal database. Nevertheless, the new search strategies developed using the random sampling approach performed better than those developed using the top journal approach in a low-yielding journal subset. For studies of diagnosis and prognosis, no journal subset had enough high-quality articles to achieve the expected W (10%). Conclusion The approach of randomly sampling a small subset of journals that includes sufficient high-quality articles is an efficient way to update or create search strategies for high-quality articles on therapy in MEDLINE. The concentrations of diagnosis and prognosis articles are too low for this approach. PMID:18823538
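
    A worked sketch of the sample-size logic described above, assuming a Wald-style 95% confidence interval so that the total width is W = 2 × 1.96 × sqrt(p(1-p)/n) for sensitivity p; the exact formula the authors used is an assumption.

    ```python
    # Invert the CI-width formula to get the number of high-quality articles n
    # needed for a desired total CI width on the lowest sensitivity p.
    import math

    def articles_needed(p, width):
        return math.ceil((2 * 1.96) ** 2 * p * (1 - p) / width ** 2)

    print(articles_needed(p=0.93, width=0.10))  # -> 101, of the order reported (~99)
    ```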

  6. What MISR data are available for field experiments?

    Atmospheric Science Data Center

    2014-12-08

    MISR data and imagery are available for many field campaigns. Select data products are subset for the region and dates of interest. Special gridded regional products may be available as well as Local Mode data for select targets...

  7. Efficiently Selecting the Best Web Services

    NASA Astrophysics Data System (ADS)

    Goncalves, Marlene; Vidal, Maria-Esther; Regalado, Alfredo; Yacoubi Ayadi, Nadia

    Emerging technologies and linked data initiatives have motivated the publication of a large number of datasets, and provide the basis for publishing Web services and tools to manage the available data. This wealth of resources opens a world of possibilities to satisfy user requests. However, Web services may have similar functionality but different performance; it is therefore necessary to identify, among the Web services that satisfy a user request, the ones with the best quality. In this paper we propose a hybrid approach that combines reasoning tasks with ranking techniques, aiming at the selection of the Web services that best implement a user request. Web service functionalities are described in terms of input and output attributes annotated with existing ontologies, non-functional properties are represented as Quality of Service (QoS) parameters, and user requests correspond to conjunctive queries whose sub-goals impose restrictions on the functionality and quality of the services to be selected. The ontology annotations are used in different reasoning tasks to infer implicit service properties and to augment the size of the service search space. Furthermore, the QoS parameters are considered by a ranking metric to classify the services according to how well they meet a user's non-functional condition. We assume that all the QoS parameters of the non-functional condition are equally important, and apply the Top-k Skyline approach to select the k services that best meet this condition. Our proposal relies on a two-fold solution: a deductive engine that performs different reasoning tasks to discover the services that satisfy the requested functionality, and an efficient implementation of the Top-k Skyline approach to compute the top-k services that meet the majority of the QoS constraints. Our Top-k Skyline solution exploits the properties of the Skyline Frequency metric and identifies the top-k services by analyzing only a subset of the services that meet the non-functional condition. We report on the effects of the proposed reasoning tasks, the quality of the top-k services selected by the ranking metric, and the performance of the proposed ranking techniques. Our results suggest that the number of services can be augmented by up to two orders of magnitude. In addition, our ranking techniques are able to identify services that have the best values in at least half of the QoS parameters, while performance is improved.
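
    As a sketch of the core selection step: compute the Pareto skyline over QoS vectors (lower is better on every dimension, an assumption here) and then rank skyline members. Ranking by the number of services each member dominates is used below as a simple stand-in for the Skyline Frequency metric; the QoS values are illustrative.

    ```python
    # Pareto skyline + top-k ranking over (latency, error-rate) QoS vectors.
    def dominates(a, b):
        return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

    def top_k_skyline(services, k):
        skyline = [s for s in services
                   if not any(dominates(o, s) for o in services)]
        ranked = sorted(skyline,
                        key=lambda s: sum(dominates(s, o) for o in services),
                        reverse=True)
        return ranked[:k]

    qos = [(120, 0.02), (90, 0.05), (150, 0.01), (95, 0.04), (130, 0.05)]
    print(top_k_skyline(qos, k=2))
    ```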

  8. An unsupervised technique for optimal feature selection in attribute profiles for spectral-spatial classification of hyperspectral images

    NASA Astrophysics Data System (ADS)

    Bhardwaj, Kaushal; Patra, Swarnajyoti

    2018-04-01

    The inclusion of spatial information along with spectral features plays a significant role in the classification of remote sensing images. Attribute profiles have already proved their ability to represent spatial information. In order to incorporate proper spatial information, multiple attributes are required, and for each attribute large profiles need to be constructed by varying the filter parameter values within a wide range. Thus, the constructed profiles that represent the spectral-spatial information of a hyperspectral image have huge dimension, which leads to the Hughes phenomenon and increases the computational burden. To mitigate these problems, this work presents an unsupervised feature selection technique that selects, from the constructed high dimensional multi-attribute profile, a subset of filtered images that is sufficiently informative to discriminate well among classes. To this end the proposed technique exploits genetic algorithms (GAs). The fitness function of the GA is defined in an unsupervised way with the help of mutual information. The effectiveness of the proposed technique is assessed using a one-against-all support vector machine classifier. The experiments conducted on three hyperspectral data sets show the robustness of the proposed method in terms of computation time and classification accuracy.

  9. Dancing your moves away: How memory retrieval shapes complex motor action.

    PubMed

    Tempel, Tobias; Loran, Igor; Frings, Christian

    2015-09-01

    Human memory is subject to continuous change. Besides the accumulation of contents as a consequence of encoding new information, the accessing of memory influences later accessibility. The authors investigated how retrieval-related memory-shaping processes affect intentionally acquired complex motion patterns. Dance figures served as the material to be learned. The authors found that selectively retrieving a subset of dance moves facilitated later recall of the retrieved dance figures, whereas figures that were related to these but that did not receive selective practice suffered from forgetting. These opposing effects were shown in experiments with different designs involving either the learning of only 1 set of body movements or 2 sets of movements categorized into 2 dances. A 3rd experiment showed that selective restudy also entailed a recall benefit for restudied dance figures but did not induce forgetting for related nonrestudied dance figures. The results suggest that motor programs representing the motion patterns in a format closely corresponding to parameters of movement execution were affected. The reported experiments demonstrate how retrieval determines motor memory plasticity and emphasize the importance of separating restudy and retrieval practice when teaching people new movements. (c) 2015 APA, all rights reserved.

  10. Adoptive therapy with chimeric antigen receptor-modified T cells of defined subset composition.

    PubMed

    Riddell, Stanley R; Sommermeyer, Daniel; Berger, Carolina; Liu, Lingfeng Steven; Balakrishnan, Ashwini; Salter, Alex; Hudecek, Michael; Maloney, David G; Turtle, Cameron J

    2014-01-01

    The ability to engineer T cells to recognize tumor cells through genetic modification with a synthetic chimeric antigen receptor has ushered in a new era in cancer immunotherapy. The most advanced clinical applications are in targeting CD19 on B-cell malignancies. The clinical trials of CD19 chimeric antigen receptor therapy have thus far not attempted to select defined subsets before transduction or imposed uniformity of the CD4 and CD8 cell composition of the cell products. This review will discuss the rationale for and challenges to using adoptive therapy with genetically modified T cells of defined subset and phenotypic composition.

  11. A ℓ2, 1 norm regularized multi-kernel learning for false positive reduction in Lung nodule CAD.

    PubMed

    Cao, Peng; Liu, Xiaoli; Zhang, Jian; Li, Wei; Zhao, Dazhe; Huang, Min; Zaiane, Osmar

    2017-03-01

    The aim of this paper is to describe a novel algorithm for false positive reduction in lung nodule computer aided detection (CAD). We describe a new CT lung CAD method which aims to detect solid nodules. Specifically, we propose a multi-kernel classifier with an ℓ2,1 norm regularizer for heterogeneous feature fusion and selection at the feature subset level, and we design two efficient strategies to optimize the kernel weight parameters in the non-smooth ℓ2,1 regularized multiple kernel learning algorithm. The first optimization algorithm adapts a proximal gradient method to handle the ℓ2,1 norm of the kernel weights and uses an accelerated method based on FISTA; the second employs an iterative scheme based on an approximate gradient descent method. The results demonstrate that the FISTA-style accelerated proximal descent method is efficient for the ℓ2,1 norm formulation of multiple kernel learning, with a theoretical guarantee on the convergence rate. Moreover, the experimental results demonstrate the effectiveness of the proposed methods in terms of geometric mean (G-mean) and area under the ROC curve (AUC), significantly outperforming the competing methods. The proposed approach exhibits some remarkable advantages in both the heterogeneous feature subset fusion and classification phases. Compared with fusion strategies at the feature level and decision level, the proposed ℓ2,1 norm multi-kernel learning algorithm is able to accurately fuse complementary and heterogeneous feature sets, and automatically prunes irrelevant and redundant feature subsets to form a more discriminative feature set, leading to promising classification performance. Moreover, the proposed algorithm consistently outperforms comparable classification approaches in the literature. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
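
    The non-smooth part of such a proximal-gradient or FISTA loop is the ℓ2,1 proximal operator, i.e. group soft-thresholding. The sketch below shows this standard step; the row grouping and regularization strength are illustrative, not the paper's settings.

    ```python
    # Row-wise group soft-thresholding:
    #   prox_{lam*||.||_{2,1}}(V) scales each row v by max(0, 1 - lam/||v||_2),
    # zeroing out rows (feature groups / kernel weights) with small norm.
    import numpy as np

    def prox_l21(V, lam):
        norms = np.linalg.norm(V, axis=1, keepdims=True)
        scale = np.maximum(0.0, 1.0 - lam / np.maximum(norms, 1e-12))
        return scale * V

    V = np.array([[0.3, 0.1], [2.0, -1.5], [0.05, 0.02]])
    print(prox_l21(V, lam=0.5))   # small rows shrink to zero, large rows shrink
    ```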

  12. The UXO Discrimination Study at the Former Camp Sibert

    DTIC Science & Technology

    2009-01-01

    Extrinsic and intrinsic parameters of the target of interest are estimated. Extrinsic parameters include the target's location (easting and northing) and orientation. Only a subset of the demonstration points is discussed here; the remaining points are discussed in the summary final report produced by ESTCP [15].

  13. Column Subset Selection, Matrix Factorization, and Eigenvalue Optimization

    DTIC Science & Technology

    2008-07-01

    This work draws on the factorization theorems of Pietsch and Grothendieck, which are regarded as basic instruments in modern functional analysis [Pis86], and on methods for computing these factorizations, including the Pietsch factorization and the maxcut semidefinite program [GW95]. The report focuses on the algorithmic version of the Kashin-Tzafriri theorem, in which the desired column subset is exposed by factoring a random submatrix; this factorization, invented by Pietsch, is regarded as a basic instrument of the analysis.

  14. Selection-Fusion Approach for Classification of Datasets with Missing Values

    PubMed Central

    Ghannad-Rezaie, Mostafa; Soltanian-Zadeh, Hamid; Ying, Hao; Dong, Ming

    2010-01-01

    This paper proposes a new approach based on missing value pattern discovery for classifying incomplete data. This approach is particularly designed for classification of datasets with a small number of samples and a high percentage of missing values where available missing value treatment approaches do not usually work well. Based on the pattern of the missing values, the proposed approach finds subsets of samples for which most of the features are available and trains a classifier for each subset. Then, it combines the outputs of the classifiers. Subset selection is translated into a clustering problem, allowing derivation of a mathematical framework for it. A trade-off is established between the computational complexity (number of subsets) and the accuracy of the overall classifier. To deal with this trade-off, a numerical criterion is proposed for the prediction of the overall performance. The proposed method is applied to seven datasets from the popular University of California, Irvine data mining archive and an epilepsy dataset from Henry Ford Hospital, Detroit, Michigan (total of eight datasets). Experimental results show that the classification accuracy of the proposed method is superior to that of the widely used multiple imputations method and four other methods. They also show that the level of superiority depends on the pattern and percentage of missing values. PMID:20212921

  15. Evaluation of a developmental hierarchy for breast cancer cells to assess risk-based patient selection for targeted treatment.

    PubMed

    Bliss, Sarah A; Paul, Sunirmal; Pobiarzyn, Piotr W; Ayer, Seda; Sinha, Garima; Pant, Saumya; Hilton, Holly; Sharma, Neha; Cunha, Maria F; Engelberth, Daniel J; Greco, Steven J; Bryan, Margarette; Kucia, Magdalena J; Kakar, Sham S; Ratajczak, Mariusz Z; Rameshwar, Pranela

    2018-01-10

    This study proposes that a novel developmental hierarchy of breast cancer (BC) cells (BCCs) could predict treatment response and outcome. The continued challenge of treating BC requires stratification of BCCs into distinct subsets. This would provide insights into how BCCs evade treatment and adopt dormancy for decades. We selected three subsets, based on the relative expression of octamer-binding transcription factor 4A (Oct4A), and then analysed each with an Affymetrix gene chip. Oct4A is a stem cell gene and would separate subsets based on maturation. Data analyses and gene validation identified three membrane proteins, TMEM98, GPR64 and FAT4. BCCs from cell lines and blood from BC patients were analysed for these three membrane proteins by flow cytometry, along with known markers of cancer stem cells (CSCs), CD44, CD24 and Oct4, aldehyde dehydrogenase 1 (ALDH1) activity and telomere length. A novel working hierarchy of BCCs was established, with the most immature subset as CSCs. This group was further subdivided into long- and short-term CSCs. Analyses of 20 post-treatment blood samples indicated that circulating CSCs and early BC progenitors may be associated with recurrence or early death. These results suggest that the novel hierarchy may predict treatment response and prognosis.

  16. Microbiota of the Small Intestine Is Selectively Engulfed by Phagocytes of the Lamina Propria and Peyer’s Patches

    PubMed Central

    Morikawa, Masatoshi; Tsujibe, Satoshi; Kiyoshima-Shibata, Junko; Watanabe, Yohei; Kato-Nagaoka, Noriko; Shida, Kan; Matsumoto, Satoshi

    2016-01-01

    Phagocytes such as dendritic cells and macrophages, which are distributed in the small intestinal mucosa, play a crucial role in maintaining mucosal homeostasis by sampling the luminal gut microbiota. However, there is limited information regarding microbial uptake in a steady state. We investigated the composition of murine gut microbiota that is engulfed by phagocytes of specific subsets in the small intestinal lamina propria (SILP) and Peyer’s patches (PP). Analysis of bacterial 16S rRNA gene amplicon sequences revealed that: 1) all the phagocyte subsets in the SILP primarily engulfed Lactobacillus (the most abundant microbe in the small intestine), whereas CD11bhi and CD11bhiCD11chi cell subsets in PP mostly engulfed segmented filamentous bacteria (indigenous bacteria in rodents that are reported to adhere to intestinal epithelial cells); and 2) among the Lactobacillus species engulfed by the SILP cell subsets, L. murinus was engulfed more frequently than L. taiwanensis, although both these Lactobacillus species were abundant in the small intestine under physiological conditions. These results suggest that small intestinal microbiota is selectively engulfed by phagocytes that localize in the adjacent intestinal mucosa in a steady state. These observations may provide insight into the crucial role of phagocytes in immune surveillance of the small intestinal mucosa. PMID:27701454

  17. Modulating Wnt Signaling Pathway to Enhance Allograft Integration in Orthopedic Trauma Treatment

    DTIC Science & Technology

    2013-10-01

    presented below. Quantitative output provides an extensive set of data but we have chosen to present the most relevant parameters that are reflected in...multiple parameters. Most samples have been mechanically tested and data extracted for multiple parameters. Histological evaluation of subset of...Sumner, D. R. Saline Irrigation Does Not Affect Bone Formation or Fixation Strength of Hydroxyapatite/Tricalcium Phosphate-Coated Implants in a Rat Model

  18. Optimising rigid motion compensation for small animal brain PET imaging

    NASA Astrophysics Data System (ADS)

    Spangler-Bickell, Matthew G.; Zhou, Lin; Kyme, Andre Z.; De Laat, Bart; Fulton, Roger R.; Nuyts, Johan

    2016-10-01

    Motion compensation (MC) in PET brain imaging of awake small animals is attracting increased attention in preclinical studies since it avoids the confounding effects of anaesthesia and enables behavioural tests during the scan. A popular MC technique is to use multiple external cameras to track the motion of the animal’s head, which is assumed to be represented by the motion of a marker attached to its forehead. In this study we have explored several methods to improve the experimental setup and the reconstruction procedures of this method: optimising the camera-marker separation; improving the temporal synchronisation between the motion tracker measurements and the list-mode stream; post-acquisition smoothing and interpolation of the motion data; and list-mode reconstruction with appropriately selected subsets. These techniques have been tested and verified on measurements of a moving resolution phantom and brain scans of an awake rat. The proposed techniques improved the reconstructed spatial resolution of the phantom by 27% and of the rat brain by 14%. We suggest a set of optimal parameter values to use for awake animal PET studies and discuss the relative significance of each parameter choice.

  19. Aeroelastic Model Structure Computation for Envelope Expansion

    NASA Technical Reports Server (NTRS)

    Kukreja, Sunil L.

    2007-01-01

    Structure detection is a procedure for selecting a subset of candidate terms, from a full model description, that best describes the observed output. This is a necessary procedure to compute an efficient system description which may afford greater insight into the functionality of the system or a simpler controller design. Structure computation as a tool for black-box modeling may be of critical importance in the development of robust, parsimonious models for the flight-test community. Moreover, this approach may lead to efficient strategies for rapid envelope expansion that may save significant development time and costs. In this study, a least absolute shrinkage and selection operator (LASSO) technique is investigated for computing efficient model descriptions of non-linear aeroelastic systems. The LASSO minimises the residual sum of squares with the addition of an l_1 penalty term on the parameter vector to the traditional l_2 minimisation problem. Its use for structure detection is a natural extension of this constrained minimisation approach to pseudo-linear regression problems which produces some model parameters that are exactly zero and, therefore, yields a parsimonious system description. Applicability of this technique for model structure computation for the F/A-18 (McDonnell Douglas, now The Boeing Company, Chicago, Illinois) Active Aeroelastic Wing project using flight test data is shown for several flight conditions (Mach numbers) by identifying a parsimonious system description with a high percent fit for cross-validated data.
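
    The constrained minimisation described above is straightforward to reproduce with standard tools. The following is a minimal sketch (not the paper's implementation) of LASSO-based structure detection on a pseudo-linear regression problem; the candidate term library, the synthetic data, and the regularisation level are illustrative assumptions.

      # Minimal sketch: LASSO selects a sparse subset of candidate model terms.
      import numpy as np
      from sklearn.linear_model import Lasso
      from sklearn.preprocessing import PolynomialFeatures

      rng = np.random.default_rng(0)
      x = rng.uniform(-1, 1, size=(200, 2))            # hypothetical measured states
      library = PolynomialFeatures(degree=3, include_bias=False)
      X = library.fit_transform(x)                     # full candidate term library
      # hypothetical "true" system: only two of the candidate terms are active
      y = 1.5 * x[:, 0] - 0.8 * x[:, 0] * x[:, 1] + 0.01 * rng.standard_normal(200)

      # minimise ||y - X w||_2^2 + alpha * ||w||_1, which zeros most coefficients
      model = Lasso(alpha=0.01).fit(X, y)
      terms = library.get_feature_names_out()
      print("retained terms:", [t for t, w in zip(terms, model.coef_) if abs(w) > 1e-6])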

  1. Nanolaminate microfluidic device for mobility selection of particles

    DOEpatents

    Surh, Michael P [Livermore, CA; Wilson, William D [Pleasanton, CA; Barbee, Jr., Troy W.; Lane, Stephen M [Oakland, CA

    2006-10-10

    A microfluidic device made from nanolaminate materials that is capable of electrophoretic selection of particles on the basis of their mobility. Nanolaminate materials are generally alternating layers of two materials (one conducting, one insulating) that are made by sputter coating a flat substrate with a large number of layers. Specific subsets of the conducting layers are coupled together to form a single, extended electrode, interleaved with other similar electrodes. Thereby, the subsets of conducting layers may be dynamically charged to create time-dependent potential fields that can trap or transport charged colloidal particles. The addition of time-dependence is applicable to all geometries of nanolaminate electrophoretic and electrochemical designs from sinusoidal to nearly step-like.

  2. Machine Learning Techniques for Global Sensitivity Analysis in Climate Models

    NASA Astrophysics Data System (ADS)

    Safta, C.; Sargsyan, K.; Ricciuto, D. M.

    2017-12-01

    Climate model studies are challenged not only by the compute-intensive nature of these models but also by the high dimensionality of the input parameter space. In our previous work with the land model components (Sargsyan et al., 2014) we identified subsets of 10 to 20 parameters relevant for each QoI via Bayesian compressive sensing and variance-based decomposition. Nevertheless, the algorithms were challenged by the nonlinear input-output dependencies for some of the relevant QoIs. In this work we will explore a combination of techniques to extract relevant parameters for each QoI and subsequently construct surrogate models with quantified uncertainty, necessary for future developments, e.g. model calibration and prediction studies. In the first step, we will compare the skill of machine-learning models (e.g. neural networks, support vector machines) to identify the optimal number of classes in selected QoIs and construct robust multi-class classifiers that will partition the parameter space into regions with smooth input-output dependencies. These classifiers will be coupled with techniques aimed at building sparse and/or low-rank surrogate models tailored to each class. Specifically, we will explore and compare sparse learning techniques with low-rank tensor decompositions. These models will be used to identify parameters that are important for each QoI. Surrogate accuracy requirements are higher for subsequent model calibration studies, and we will ascertain the performance of this workflow for multi-site ALM simulation ensembles.

  3. Facial Affect Recognition Using Regularized Discriminant Analysis-Based Algorithms

    NASA Astrophysics Data System (ADS)

    Lee, Chien-Cheng; Huang, Shin-Sheng; Shih, Cheng-Yuan

    2010-12-01

    This paper presents a novel and effective method for recognizing facial expressions, including happiness, disgust, fear, anger, sadness, surprise, and the neutral state. The proposed method utilizes a regularized discriminant analysis-based boosting algorithm (RDAB) with effective Gabor features to recognize the facial expressions. An entropy criterion is applied to select an effective subset of informative and non-redundant Gabor features. The proposed RDAB algorithm uses RDA as a learner in the boosting algorithm. RDA combines strengths of linear discriminant analysis (LDA) and quadratic discriminant analysis (QDA). It solves the small sample size and ill-posed problems suffered by QDA and LDA through a regularization technique. Additionally, this study uses the particle swarm optimization (PSO) algorithm to estimate the optimal parameters in RDA. Experimental results demonstrate that our approach can accurately and robustly recognize facial expressions.
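
    For concreteness, the covariance regularization at the heart of an RDA learner can be written down directly. The sketch below follows the standard Friedman-style formulation with blending parameters (lam, gamma), our reading of the approach rather than the authors' code; these two parameters are the kind of quantities the study tunes with PSO.

      # Regularized discriminant analysis covariance (standard formulation):
      # lam blends the class covariance (QDA) with the pooled one (LDA);
      # gamma shrinks toward a multiple of the identity, addressing the
      # small-sample-size / ill-posed problems mentioned above.
      import numpy as np

      def rda_covariance(S_class, S_pooled, lam, gamma):
          p = S_class.shape[0]
          S_lam = (1.0 - lam) * S_class + lam * S_pooled
          return (1.0 - gamma) * S_lam + gamma * (np.trace(S_lam) / p) * np.eye(p)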

  4. A single-index threshold Cox proportional hazard model for identifying a treatment-sensitive subset based on multiple biomarkers.

    PubMed

    He, Ye; Lin, Huazhen; Tu, Dongsheng

    2018-06-04

    In this paper, we introduce a single-index threshold Cox proportional hazard model to select and combine biomarkers to identify patients who may be sensitive to a specific treatment. A penalized smoothed partial likelihood is proposed to estimate the parameters in the model. A simple, efficient, and unified algorithm is presented to maximize this likelihood function. The estimators based on this likelihood function are shown to be consistent and asymptotically normal. Under mild conditions, the proposed estimators also achieve the oracle property. The proposed approach is evaluated through simulation analyses and application to the analysis of data from two clinical trials, one involving patients with locally advanced or metastatic pancreatic cancer and one involving patients with resectable lung cancer. Copyright © 2018 John Wiley & Sons, Ltd.

  5. A simple method for constructing the inhomogeneous quantum group IGLq(n) and its universal enveloping algebra Uq(igl(n))

    NASA Astrophysics Data System (ADS)

    Shariati, A.; Aghamohammadi, A.

    1995-12-01

    We propose a simple and concise method to construct the inhomogeneous quantum group IGLq(n) and its universal enveloping algebra Uq(igl(n)). Our technique is based on embedding an n-dimensional quantum space in an n+1-dimensional one as the set x_{n+1} = 1. This is possible only if one considers the multiparametric quantum space whose parameters are fixed in a specific way. The quantum group IGLq(n) is then the subset of GLq(n+1) which leaves the x_{n+1} = 1 subset invariant. For the deformed universal enveloping algebra Uq(igl(n)), we will show that it can also be embedded in Uq(gl(n+1)), provided one uses the multiparametric deformation of U(gl(n+1)) with a specific choice of its parameters.

  6. Upper bounds on sequential decoding performance parameters

    NASA Technical Reports Server (NTRS)

    Jelinek, F.

    1974-01-01

    This paper presents the best obtainable random coding and expurgated upper bounds on the probabilities of undetectable error, of t-order failure (advance to depth t into an incorrect subset), and of likelihood rise in the incorrect subset, applicable to sequential decoding when the metric bias G is arbitrary. Upper bounds on the Pareto exponent are also presented. The G-values optimizing each of the parameters of interest are determined, and are shown to lie in intervals that in general have nonzero widths. The G-optimal expurgated bound on undetectable error is shown to agree with that for maximum likelihood decoding of convolutional codes, and that on failure agrees with the block code expurgated bound. Included are curves evaluating the bounds for interesting choices of G and SNR for a binary-input quantized-output Gaussian additive noise channel.

  7. Selecting sequence variants to improve genomic predictions for dairy cattle

    USDA-ARS?s Scientific Manuscript database

    Millions of genetic variants have been identified by population-scale sequencing projects, but subsets are needed for routine genomic predictions or to include on genotyping arrays. Methods of selecting sequence variants were compared using both simulated sequence genotypes and actual data from run ...

  8. Analysis of blood pressure signal in patients with different ventricular ejection fraction using linear and non-linear methods.

    PubMed

    Arcentales, Andres; Rivera, Patricio; Caminal, Pere; Voss, Andreas; Bayes-Genis, Antonio; Giraldo, Beatriz F

    2016-08-01

    Changes in left ventricle function produce alternans in the hemodynamic and electric behavior of the cardiovascular system. A total of 49 cardiomyopathy patients were studied based on the blood pressure (BP) signal and classified according to the left ventricular ejection fraction (LVEF) into low risk (LR: LVEF>35%, 17 patients) and high risk (HR: LVEF≤35%, 32 patients) groups. We propose to characterize these patients using a linear and a nonlinear method, based on spectral estimation and the recurrence plot (RP), respectively. From the BP signal, we extracted each systolic time interval (STI), upward systolic slope (BPsl), and the difference between systolic and diastolic BP, defined as pulse pressure (PP). Afterwards, the best subset of parameters was obtained through the sequential feature selection (SFS) method. According to the results, the best classification was obtained using a combination of linear and nonlinear features from the STI and PP parameters. For STI, the best combination was obtained considering the frequency peak and the diagonal structures of the RP, with an area under the curve (AUC) of 79%. The same results were obtained when comparing PP values. Consequently, the use of combined linear and nonlinear parameters could improve the risk stratification of cardiomyopathy patients.
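
    The SFS step described above can be reproduced compactly with off-the-shelf tools; the estimator, the scoring metric, and the synthetic stand-in features below are assumptions for illustration, not the study's exact configuration.

      # Forward sequential feature selection over extracted BP features.
      import numpy as np
      from sklearn.feature_selection import SequentialFeatureSelector
      from sklearn.linear_model import LogisticRegression

      rng = np.random.default_rng(1)
      X = rng.standard_normal((49, 6))   # placeholder linear/nonlinear STI, BPsl, PP features
      y = rng.integers(0, 2, size=49)    # LR vs HR labels (placeholder)

      sfs = SequentialFeatureSelector(LogisticRegression(max_iter=1000),
                                      n_features_to_select=3, direction="forward",
                                      scoring="roc_auc", cv=5)
      sfs.fit(X, y)
      print("selected feature indices:", np.flatnonzero(sfs.get_support()))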

  9. Classification of urine sediment based on convolution neural network

    NASA Astrophysics Data System (ADS)

    Pan, Jingjing; Jiang, Cunbo; Zhu, Tiantian

    2018-04-01

    By designing a new convolution neural network framework, this paper removes the constraints of the original convolution neural network framework, which requires large training samples of the same size. The input images are shifted and cropped to generate sub-graphs of equal size, and dropout is then applied to the generated sub-graphs, increasing the diversity of samples and preventing overfitting. Proper subsets of the sub-graph set are randomly selected such that each proper subset contains the same number of elements and no two proper subsets are identical. The proper subsets are used as input layers for the convolution neural network. Through the convolution layers, pooling, the fully connected layer, and the output layer, the classification loss rates of the test set and training set are obtained. In an experiment classifying red blood cells, white blood cells, and calcium oxalate crystals, the classification accuracy reached 97% or more.

  10. Eradication of melanomas by targeted elimination of a minor subset of tumor cells

    PubMed Central

    Schmidt, Patrick; Kopecky, Caroline; Hombach, Andreas; Zigrino, Paola; Mauch, Cornelia; Abken, Hinrich

    2011-01-01

    Proceeding on the assumption that all cancer cells have equal malignant capacities, current regimens in cancer therapy attempt to eradicate all malignant cells of a tumor lesion. Using in vivo targeting of tumor cell subsets, we demonstrate that selective elimination of a definite, minor tumor cell subpopulation is particularly effective in eradicating established melanoma lesions irrespective of the bulk of cancer cells. Tumor cell subsets were specifically eliminated in a tumor lesion by adoptive transfer of engineered cytotoxic T cells redirected in an antigen-restricted manner via a chimeric antigen receptor. Targeted elimination of less than 2% of the tumor cells that coexpress high molecular weight melanoma-associated antigen (HMW-MAA) (melanoma-associated chondroitin sulfate proteoglycan, MCSP) and CD20 lastingly eradicated melanoma lesions, whereas targeting of any random 10% tumor cell subset was not effective. Our data challenge the biological therapy and current drug development paradigms in the treatment of cancer. PMID:21282657

  11. Orbiter Flying Qualities (OFQ) Workstation user's guide

    NASA Technical Reports Server (NTRS)

    Myers, Thomas T.; Parseghian, Zareh; Hogue, Jeffrey R.

    1988-01-01

    This project was devoted to the development of a software package, called the Orbiter Flying Qualities (OFQ) Workstation, for working with the OFQ Archives which are specially selected sets of space shuttle entry flight data relevant to flight control and flying qualities. The basic approach to creation of the workstation software was to federate and extend commercial software products to create a low cost package that operates on personal computers. Provision was made to link the workstation to large computers, but the OFQ Archive files were also converted to personal computer diskettes and can be stored on workstation hard disk drives. The primary element of the workstation developed in the project is the Interactive Data Handler (IDH) which allows the user to select data subsets from the archives and pass them to specialized analysis programs. The IDH was developed as an application in a relational database management system product. The specialized analysis programs linked to the workstation include a spreadsheet program, FREDA for spectral analysis, MFP for frequency domain system identification, and NIPIP for pilot-vehicle system parameter identification. The workstation also includes capability for ensemble analysis over groups of missions.

  12. Closed-form solutions for linear regulator design of mechanical systems including optimal weighting matrix selection

    NASA Technical Reports Server (NTRS)

    Hanks, Brantley R.; Skelton, Robert E.

    1991-01-01

    Vibration in modern structural and mechanical systems can be reduced in amplitude by increasing stiffness, redistributing stiffness and mass, and/or adding damping if design techniques are available to do so. Linear Quadratic Regulator (LQR) theory in modern multivariable control design attacks the general dissipative elastic system design problem in a global formulation. The optimal design, however, allows electronic connections and phase relations which are not physically practical or possible in passive structural-mechanical devices. The restriction of LQR solutions (to the Algebraic Riccati Equation) to design spaces which can be implemented as passive structural members and/or dampers is addressed. A general closed-form solution to the optimal free-decay control problem is presented which is tailored for structural-mechanical systems. The solution includes, as subsets, special cases such as the Rayleigh Dissipation Function and total energy. Weighting matrix selection is a constrained choice among several parameters to obtain desired physical relationships. The closed-form solution is also applicable to active control design for systems where perfect, collocated actuator-sensor pairs exist.
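
    For reference, the standard continuous-time LQR problem underlying this discussion minimises a quadratic cost, with the optimal state-feedback law obtained from the solution P of the Algebraic Riccati Equation:

      J = \int_0^\infty \left( x^T Q x + u^T R u \right) dt ,
      \qquad
      A^T P + P A - P B R^{-1} B^T P + Q = 0 ,
      \qquad
      u = -R^{-1} B^T P x .

    The weighting matrices Q and R are the free parameters whose constrained selection the paper addresses.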

  13. Chemical library subset selection algorithms: a unified derivation using spatial statistics.

    PubMed

    Hamprecht, Fred A; Thiel, Walter; van Gunsteren, Wilfred F

    2002-01-01

    If similar compounds have similar activity, rational subset selection becomes superior to random selection in screening for pharmacological lead discovery programs. Traditional approaches to this experimental design problem fall into two classes: (i) a linear or quadratic response function is assumed; (ii) some space-filling criterion is optimized. The assumptions underlying the first approach are clear but not always defensible; the second approach yields more intuitive designs but lacks a clear theoretical foundation. We model activity in a bioassay as the realization of a stochastic process and use the best linear unbiased estimator to construct spatial sampling designs that optimize the integrated mean square prediction error, the maximum mean square prediction error, or the entropy. We argue that our approach constitutes a unifying framework encompassing most proposed techniques as limiting cases and sheds light on their underlying assumptions. In particular, vector quantization is obtained, in dimensions up to eight, in the limiting case of very smooth response surfaces for the integrated mean square error criterion. Closest packing is obtained for very rough surfaces under the integrated mean square error and entropy criteria. We suggest using either the integrated mean square prediction error or the entropy as optimization criteria rather than approximations thereof, and propose a scheme for direct iterative minimization of the integrated mean square prediction error. Finally, we discuss how the quality of chemical descriptors manifests itself and clarify the assumptions underlying the selection of diverse or representative subsets.
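
    As a concrete illustration of this family of criteria, the sketch below greedily builds a design by repeatedly adding the candidate compound with the largest Gaussian-process posterior variance, a common greedy surrogate for the entropy criterion; the kernel, its length scale, and the random descriptors are assumptions, not the paper's setup.

      # Greedy spatial design sketch under a unit-variance GP prior.
      import numpy as np

      def rbf(A, B, length=0.5):
          d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
          return np.exp(-0.5 * d2 / length**2)

      def greedy_entropy_design(cands, n_pick, jitter=1e-9):
          picked = [0]                                   # arbitrary starting compound
          for _ in range(n_pick - 1):
              K = rbf(cands[picked], cands[picked]) + jitter * np.eye(len(picked))
              k_star = rbf(cands, cands[picked])
              # posterior predictive variance at every candidate location
              var = 1.0 - np.einsum("ij,jk,ik->i", k_star, np.linalg.inv(K), k_star)
              var[picked] = -np.inf                      # never re-pick a compound
              picked.append(int(np.argmax(var)))
          return picked

      rng = np.random.default_rng(2)
      descriptors = rng.uniform(size=(500, 4))           # hypothetical chemical library
      print(greedy_entropy_design(descriptors, n_pick=10))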

  14. Selection of reliable reference genes for quantitative real-time PCR gene expression analysis in Jute (Corchorus capsularis) under stress treatments

    PubMed Central

    Niu, Xiaoping; Qi, Jianmin; Zhang, Gaoyang; Xu, Jiantang; Tao, Aifen; Fang, Pingping; Su, Jianguang

    2015-01-01

    To accurately measure gene expression using quantitative reverse transcription PCR (qRT-PCR), reliable reference gene(s) are required for data normalization. Corchorus capsularis, an annual herbaceous fiber crop with predominant biodegradability and renewability, has not been investigated for the stability of reference genes with qRT-PCR. In this study, 11 candidate reference genes were selected and their expression levels were assessed using qRT-PCR. To account for the influence of experimental approach and tissue type, 22 different jute samples were selected from abiotic and biotic stress conditions as well as three different tissue types. The stability of the candidate reference genes was evaluated using the geNorm, NormFinder, and BestKeeper programs, and comprehensive rankings of gene stability were generated by aggregate analysis. For the biotic stress and NaCl stress subsets, ACT7 and RAN were suitable as stable reference genes for gene expression normalization. For the PEG stress subset, UBC and DnaJ were sufficient for accurate normalization. For the tissues subset, four reference genes, TUBβ, UBI, EF1α, and RAN, were sufficient for accurate normalization. The selected genes were further validated by comparing expression profiles of WRKY15 in various samples, and two stable reference genes were recommended for accurate normalization of qRT-PCR data. Our results provide researchers with appropriate reference genes for qRT-PCR in C. capsularis, and will facilitate gene expression study under these conditions. PMID:26528312
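
    The gene-stability measure behind geNorm, one of the programs used above, is simple enough to sketch directly: for each candidate gene it averages the standard deviation of its pairwise log2 expression ratios against every other candidate across samples (lower M = more stable). The implementation below is a simplified reading of that published criterion, not the geNorm program itself, run on synthetic data.

      # Simplified geNorm-style stability measure M for candidate reference genes.
      import numpy as np

      def genorm_M(expr):
          """expr: (n_samples, n_genes) array of relative expression quantities."""
          log_expr = np.log2(expr)
          n_genes = expr.shape[1]
          M = np.empty(n_genes)
          for j in range(n_genes):
              ratios = log_expr[:, [j]] - np.delete(log_expr, j, axis=1)
              M[j] = ratios.std(axis=0, ddof=1).mean()
          return M

      rng = np.random.default_rng(3)
      expr = 2.0 ** rng.normal(5, 1, size=(22, 11))   # 22 samples x 11 candidates
      stability = genorm_M(expr)
      print("most stable candidates:", np.argsort(stability)[:2])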

  15. Mining nutrigenetics patterns related to obesity: use of parallel multifactor dimensionality reduction.

    PubMed

    Karayianni, Katerina N; Grimaldi, Keith A; Nikita, Konstantina S; Valavanis, Ioannis K

    2015-01-01

    This paper aims to elucidate the complex etiology underlying obesity by analysing data from a large nutrigenetics study, in which nutritional and genetic factors associated with obesity were recorded for around two thousand individuals. In our previous work, these data were analysed using artificial neural network methods, which identified optimised subsets of factors to predict one's obesity status. These methods did not, however, reveal how the selected factors interact with each other in the obtained predictive models. For that reason, parallel Multifactor Dimensionality Reduction (pMDR) was used here to further analyse the pre-selected subsets of nutrigenetic factors. Within pMDR, predictive models using up to eight factors were constructed, further reducing the input dimensionality, while rules describing the interactive effects of the selected factors were derived. In this way, it was possible to identify specific genetic variations and their interactive effects with particular nutritional factors, which are now under further study.
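
    The core MDR step referenced above reduces a multifactor genotype table to a single binary high-risk/low-risk attribute by comparing each cell's case-to-control ratio with the overall ratio. The sketch below illustrates that pooling for two factors; the data, the 0/1/2 coding, and the factor names are placeholders, not the study's variables.

      # Core multifactor dimensionality reduction (MDR) pooling step, sketched.
      import numpy as np

      def mdr_high_risk_cells(f1, f2, y):
          overall = y.mean() / (1.0 - y.mean())        # overall case:control odds
          high_risk = set()
          for a in np.unique(f1):
              for b in np.unique(f2):
                  cell = (f1 == a) & (f2 == b)
                  if not cell.any():
                      continue                          # skip empty cells
                  cases = int(y[cell].sum())
                  controls = int(cell.sum()) - cases
                  if controls == 0 or cases / controls > overall:
                      high_risk.add((int(a), int(b)))
          return high_risk                              # cells pooled as "high risk"

      rng = np.random.default_rng(4)
      f1 = rng.integers(0, 3, 2000)                     # SNP genotype coded 0/1/2
      f2 = rng.integers(0, 3, 2000)                     # discretized nutritional factor
      y = rng.integers(0, 2, 2000)                      # obesity status (placeholder)
      print(mdr_high_risk_cells(f1, f2, y))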

  16. Optimisation algorithms for ECG data compression.

    PubMed

    Haugland, D; Heber, J G; Husøy, J H

    1997-07-01

    The use of exact optimisation algorithms for compressing digital electrocardiograms (ECGs) is demonstrated. As opposed to traditional time-domain methods, which use heuristics to select a small subset of representative signal samples, the problem of selecting the subset is formulated in rigorous mathematical terms. This approach makes it possible to derive algorithms guaranteeing the smallest possible reconstruction error when a bounded selection of signal samples is interpolated. The proposed model resembles well-known network models and is solved by a cubic dynamic programming algorithm. When applied to standard test problems, the algorithm produces a compressed representation for which the distortion is about one-half of that obtained by traditional time-domain compression techniques at reasonable compression ratios. This illustrates that, in terms of the accuracy of decoded signals, existing time-domain heuristics for ECG compression may be far from what is theoretically achievable. The paper is an attempt to bridge this gap.
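
    The optimisation idea is concrete enough to sketch: keep m of the n samples (including both endpoints) so that linear interpolation through the kept samples minimises the total squared reconstruction error. The dynamic program below is a simplified rendering of that formulation, assuming plain squared-error distortion; it is not the authors' code.

      # Exact sample-subset selection for piecewise-linear signal compression.
      import numpy as np

      def segment_error(x, i, j):
          """Squared error of replacing x[i..j] by the line through its endpoints."""
          t = np.arange(i, j + 1)
          line = x[i] + (x[j] - x[i]) * (t - i) / (j - i)
          return float(((x[i:j + 1] - line) ** 2).sum())

      def optimal_samples(x, m):
          n = len(x)
          E = {(i, j): segment_error(x, i, j)
               for i in range(n) for j in range(i + 1, n)}
          INF = float("inf")
          D = [[INF] * (m + 1) for _ in range(n)]   # D[j][k]: best error ending at j, k samples kept
          P = [[-1] * (m + 1) for _ in range(n)]
          D[0][1] = 0.0
          for j in range(1, n):
              for k in range(2, m + 1):
                  for i in range(j):
                      if D[i][k - 1] + E[(i, j)] < D[j][k]:
                          D[j][k] = D[i][k - 1] + E[(i, j)]
                          P[j][k] = i
          kept, j, k = [], n - 1, m                 # backtrack from the last sample
          while j >= 0 and k >= 1:
              kept.append(j)
              j, k = P[j][k], k - 1
          return kept[::-1], D[n - 1][m]

      x = np.sin(np.linspace(0, 6, 60)) \
          + 0.05 * np.random.default_rng(5).standard_normal(60)
      print(optimal_samples(x, m=8))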

  17. UNCLES: method for the identification of genes differentially consistently co-expressed in a specific subset of datasets.

    PubMed

    Abu-Jamous, Basel; Fa, Rui; Roberts, David J; Nandi, Asoke K

    2015-06-04

    Collective analysis of the increasingly emerging gene expression datasets is required. The recently proposed binarisation of consensus partition matrices (Bi-CoPaM) method can combine clustering results from multiple datasets to identify the subsets of genes which are consistently co-expressed in all of the provided datasets in a tuneable manner. However, results validation and parameter setting are issues that complicate the design of such methods. Moreover, although it is a common practice to test methods by application to synthetic datasets, the mathematical models used to synthesise such datasets are usually based on approximations which may not always be sufficiently representative of real datasets. Here, we propose an unsupervised method for the unification of clustering results from multiple datasets using external specifications (UNCLES). This method has the ability to identify the subsets of genes consistently co-expressed in a subset of datasets while being poorly co-expressed in another subset of datasets, and to identify the subsets of genes consistently co-expressed in all given datasets. We also propose the M-N scatter plots validation technique and adopt it to set the parameters of UNCLES, such as the number of clusters, automatically. Additionally, we propose an approach for the synthesis of gene expression datasets using real data profiles in a way which combines the ground-truth knowledge of synthetic data and the realistic expression values of real data, and therefore overcomes the problem of faithfulness of synthetic expression data modelling. By application to those datasets, we validate UNCLES while comparing it with other conventional clustering methods and, of particular relevance, biclustering methods. We further validate UNCLES by application to a set of 14 real genome-wide yeast datasets, as it produces focused clusters that conform well to known biological facts. Furthermore, in-silico-based hypotheses regarding the function of a few previously unknown genes in those focused clusters are drawn. The UNCLES method, the M-N scatter plots technique, and the expression data synthesis approach will have wide application for the comprehensive analysis of genomic and other sources of multiple complex biological datasets. Moreover, the derived in-silico-based biological hypotheses represent subjects for future functional studies.

  18. Heteroresistance at the single-cell level: adapting to antibiotic stress through a population-based strategy and growth-controlled interphenotypic coordination.

    PubMed

    Wang, Xiaorong; Kang, Yu; Luo, Chunxiong; Zhao, Tong; Liu, Lin; Jiang, Xiangdan; Fu, Rongrong; An, Shuchang; Chen, Jichao; Jiang, Ning; Ren, Lufeng; Wang, Qi; Baillie, J Kenneth; Gao, Zhancheng; Yu, Jun

    2014-02-11

    Heteroresistance refers to phenotypic heterogeneity of microbial clonal populations under antibiotic stress, and it has been thought to be an allocation of a subset of "resistant" cells for surviving in higher concentrations of antibiotic. The assumption fits the so-called bet-hedging strategy, where a bacterial population "hedges" its "bet" on different phenotypes to be selected by unpredictable environmental stresses. To test this hypothesis, we constructed a heteroresistance model by introducing a blaCTX-M-14 gene (coding for a cephalosporin hydrolase) into a sensitive Escherichia coli strain. We confirmed heteroresistance in this clone and that a subset of the cells expressed more hydrolase and formed more colonies in the presence of ceftriaxone (exhibited stronger "resistance"). However, subsequent single-cell-level investigation by using a microfluidic device showed that a subset of cells with a distinguishable phenotype of slowed growth and intensified hydrolase expression emerged, and they were not positively selected but increased their proportion in the population with ascending antibiotic concentrations. Therefore, heteroresistance--the gradually decreased colony-forming capability in the presence of antibiotic--was a result of a decreased growth rate rather than of selection for resistant cells. Using a mock strain without the resistance gene, we further demonstrated the existence of two nested growth-centric feedback loops that control the expression of the hydrolase and maximize population growth in various antibiotic concentrations. In conclusion, phenotypic heterogeneity is a population-based strategy beneficial for bacterial survival and propagation through task allocation and interphenotypic collaboration, and the growth rate provides a critical control for the expression of stress-related genes and an essential mechanism in responding to environmental stresses. Heteroresistance is essentially phenotypic heterogeneity, in which a population-based strategy is thought to be at work, with variable cell-to-cell resistance assumed to be selected under antibiotic stress. Exact mechanisms of heteroresistance and its roles in adaptation to antibiotic stress have yet to be fully understood at the molecular and single-cell levels. In our study, we have not been able to detect any apparent subset of "resistant" cells selected by antibiotics; on the contrary, cell populations differentiate into phenotypic subsets with variable growth statuses and hydrolase expression. The growth rate appears to be sensitive to stress intensity and plays a key role in controlling hydrolase expression at both the bulk population and single-cell levels. We have shown here, for the first time, that phenotypic heterogeneity can be beneficial to a growing bacterial population through task allocation and interphenotypic collaboration rather than by partitioning cells into different categories of selective advantage.

  19. Automatic Target Recognition: Statistical Feature Selection of Non-Gaussian Distributed Target Classes

    DTIC Science & Technology

    2011-06-01

    implementing, and evaluating many feature selection algorithms. Mucciardi and Gose compared seven different techniques for choosing subsets of pattern...[1] A. Mucciardi and E. Gose, “A comparison of seven techniques for

  20. Characterization of the CD4+ and CD8+ tumor infiltrating lymphocytes propagated with bispecific monoclonal antibodies.

    PubMed

    Wong, J T; Pinto, C E; Gifford, J D; Kurnick, J T; Kradin, R L

    1989-11-15

    To study the CD4+ and CD8+ tumor infiltrating lymphocytes (TIL) in the antitumor response, we propagated these subsets directly from tumor tissues with anti-CD3:anti-CD8 (CD3,8) and anti-CD3:anti-CD4 (CD3,4) bispecific mAb (BSMAB). CD3,8 BSMAB cause selective cytolysis of CD8+ lymphocytes by bridging the CD8 molecules of target lymphocytes to the CD3 molecular complex of cytolytic T lymphocytes, with concurrent activation and proliferation of residual CD3+CD4+ T lymphocytes. Similarly, CD3,4 BSMAB cause selective lysis of CD4+ lymphocytes while concurrently activating the residual CD3+CD8+ T cells. Small tumor fragments from four malignant melanoma and three renal cell carcinoma patients were cultured in medium containing CD3,8 + IL-2, CD3,4 + IL-2, or IL-2 alone. CD3,8 led to selective propagation of the CD4+ TIL, whereas CD3,4 led to selective propagation of the CD8+ TIL from each of the tumors. The phenotypes of the TIL subset cultures were generally stable when assayed over a 1- to 3-month period and after further expansion with anti-CD3 mAb or lectins. Specific 51Cr release of labeled target cells that were bridged to the CD3 molecular complexes of TIL suggested that both CD4+ and CD8+ TIL cultures have the capacity to mediate cytolysis via their Ti/CD3 TCR complexes. In addition, both CD4+ and CD8+ TIL cultures from most patients caused substantial (greater than 20%) lysis of the NK-sensitive K562 cell line. The majority of CD4+ but not CD8+ TIL cultures also produced substantial lysis of the NK-resistant Daudi cell line. Lysis of the autologous tumor by the TIL subsets was assessed in two patients with malignant melanoma. The CD8+ TIL from one tumor demonstrated cytotoxic activity against the autologous tumor but negligible lysis of allogeneic melanoma targets. In conclusion, immunocompetent CD4+ and CD8+ TIL subsets can be isolated and expanded directly from small tumor fragments of malignant melanoma and renal cell carcinoma using BSMAB. The resultant TIL subsets can be further expanded for detailed studies or for adoptive immunotherapy.

  1. Spatial and Functional Selectivity of Peripheral Nerve Signal Recording With the Transversal Intrafascicular Multichannel Electrode (TIME).

    PubMed

    Badia, Jordi; Raspopovic, Stanisa; Carpaneto, Jacopo; Micera, Silvestro; Navarro, Xavier

    2016-01-01

    The selection of suitable peripheral nerve electrodes for biomedical applications implies a trade-off between invasiveness and selectivity. The optimal design should provide the highest selectivity for targeting a large number of nerve fascicles with the least invasiveness and potential damage to the nerve. The transverse intrafascicular multichannel electrode (TIME), transversally inserted in the peripheral nerve, has been shown to be useful for the selective activation of subsets of axons, both at inter- and intra-fascicular levels, in the small sciatic nerve of the rat. In this study we assessed the capabilities of TIME for the selective recording of neural activity, considering the topographical selectivity and the distinction of neural signals corresponding to different sensory types. Topographical recording selectivity was proved by the differential recording of CNAPs from different subsets of nerve fibers, such as those innervating toes 2 and 4 of the hindpaw of the rat. Neural signals elicited by sensory stimuli applied to the rat paw were successfully recorded. Signal processing allowed distinguishing three different types of sensory stimuli such as tactile, proprioceptive and nociceptive ones with high performance. These findings further support the suitability of TIMEs for neuroprosthetic applications, by exploiting the transversal topographical structure of the peripheral nerves.

  2. Better physical activity classification using smartphone acceleration sensor.

    PubMed

    Arif, Muhammad; Bilal, Mohsin; Kattan, Ahmed; Ahamed, S Iqbal

    2014-09-01

    Obesity is becoming one of the serious problems for the health of the worldwide population. Social interactions on mobile phones and computers via internet-based social networks are one of the major causes of a lack of physical activity. For health specialists, it is important to track the physical activities of obese or overweight patients to supervise weight-loss programmes. In this study, the acceleration sensor present in a smartphone is used to monitor the physical activity of the user. Physical activities including walking, jogging, sitting, standing, walking upstairs and walking downstairs are classified. Time-domain features are extracted from the acceleration data recorded by the smartphone during different physical activities. The time and space complexity of the whole framework is reduced by optimal feature subset selection and pruning of instances. Classification results for the six physical activities are reported in this paper. Using simple time-domain features, 99% classification accuracy is achieved. Furthermore, attribute subset selection is used to remove redundant features and to minimize the time complexity of the algorithm. A subset of 30 features produced more than 98% classification accuracy for the six physical activities.
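
    A minimal sketch of the kind of time-domain feature extraction described above is given below; the sampling rate, window length, and the exact feature set are assumptions rather than the paper's specification.

      # Illustrative time-domain features from a windowed 3-axis accelerometer stream.
      import numpy as np

      def window_features(acc, fs=20, win_s=5):
          """acc: (n_samples, 3) array of x/y/z accelerations."""
          w = fs * win_s
          feats = []
          for start in range(0, len(acc) - w + 1, w):
              seg = acc[start:start + w]
              f = []
              for axis in range(3):
                  s = seg[:, axis]
                  f += [s.mean(), s.std(),                 # central tendency, spread
                        np.abs(np.diff(s)).mean(),         # crude jerk proxy
                        np.percentile(s, 25), np.percentile(s, 75)]
              f.append(np.linalg.norm(seg, axis=1).mean()) # mean acceleration magnitude
              feats.append(f)
          return np.asarray(feats)                         # rows -> feed to a classifier

      acc = np.random.default_rng(6).standard_normal((2000, 3))
      print(window_features(acc).shape)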

  3. A computer program for fast and easy typing of partial endoglucanase gene sequence into phylotypes and sequevars 1&2 (select agents) of Ralstonia solanacearum

    USDA-ARS?s Scientific Manuscript database

    The phytopathogen Ralstonia solanacearum is a species complex that contains a subset of strains that are quarantined or select agent pathogens. An unidentified R. solanacearum strain is considered a select agent in the US until proven otherwise, which can be done by phylogenetic analysis of a partia...

  4. Hybrid Binary Imperialist Competition Algorithm and Tabu Search Approach for Feature Selection Using Gene Expression Data.

    PubMed

    Wang, Shuaiqun; Aorigele; Kong, Wei; Zeng, Weiming; Hong, Xiaomin

    2016-01-01

    Gene expression data composed of thousands of genes play an important role in classification platforms and disease diagnosis. Hence, it is vital to select a small subset of salient features from the large amount of gene expression data. Lately, many researchers have devoted themselves to feature selection using diverse computational intelligence methods. However, in the process of selecting informative genes, many computational methods face difficulties in selecting small subsets for cancer classification due to the huge number of genes (high dimension) compared to the small number of samples, noisy genes, and irrelevant genes. In this paper, we propose a new hybrid algorithm, HICATS, incorporating the imperialist competition algorithm (ICA), which performs a global search, and tabu search (TS), which conducts a fine-tuned local search. In order to verify the performance of the proposed algorithm HICATS, we have tested it on 10 well-known benchmark gene expression classification datasets with dimensions varying from 2308 to 12600. The performance of our proposed method proved to be superior to other related works, including the conventional version of the binary optimization algorithm, in terms of classification accuracy and the number of selected genes.

  5. Choosing "Something Else" as a Sexual Identity: Evaluating Response Options on the National Health Interview Survey.

    PubMed

    Eliason, Michele J; Streed, Carl G

    2017-10-01

    Researchers struggle to find effective ways to measure sexual and gender identities to determine whether there are health differences among subsets of the LGBTQ+ population. This study examines responses on the National Health Interview Survey (NHIS) sexual identity questions among 277 LGBTQ+ healthcare providers. Eighteen percent indicated that their sexual identity was "something else" on the first question, and 57% of those also selected "something else" on the second question. Half of the genderqueer/gender variant participants and 100% of transgender-identified participants selected "something else" as their sexual identity. The NHIS question does not allow all respondents in LGBTQ+ populations to be categorized, thus we are potentially missing vital health disparity information about subsets of the LGBTQ+ population.

  6. A Novel Protocol for Model Calibration in Biological Wastewater Treatment

    PubMed Central

    Zhu, Ao; Guo, Jianhua; Ni, Bing-Jie; Wang, Shuying; Yang, Qing; Peng, Yongzhen

    2015-01-01

    Activated sludge models (ASMs) have been widely used for process design, operation and optimization in wastewater treatment plants. However, it is still a challenge to achieve an efficient calibration for reliable application by using the conventional approaches. Here, we propose a novel calibration protocol, i.e. the Numerical Optimal Approaching Procedure (NOAP), for the systematic calibration of ASMs. The NOAP consists of three key steps in an iterative scheme flow: i) global factor sensitivity analysis for factor fixing; ii) pseudo-global parameter correlation analysis for detection of non-identifiable factors; and iii) formation of a parameter subset through estimation by using a genetic algorithm. The validity and applicability are confirmed using experimental data obtained from two independent wastewater treatment systems, including a sequencing batch reactor and a continuous stirred-tank reactor. The results indicate that the NOAP can effectively determine the optimal parameter subset and successfully perform model calibration and validation for these two different systems. The proposed NOAP is expected to be used for the automatic calibration of ASMs and can potentially be applied to other ordinary differential equation models. PMID:25682959
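
    Step (iii) of such a protocol can be sketched with a global evolutionary optimizer; here SciPy's differential evolution stands in for the paper's genetic algorithm, and the toy exponential model, its parameters, and the bounds are placeholders rather than an activated sludge model.

      # Estimating the identifiable parameter subset by global least-squares search.
      import numpy as np
      from scipy.optimize import differential_evolution

      t_obs = np.linspace(0, 10, 25)
      y_obs = 3.0 * np.exp(-0.4 * t_obs) \
              + 0.05 * np.random.default_rng(11).standard_normal(25)

      def model(theta, t):
          amplitude, rate = theta          # the subset kept after steps (i) and (ii)
          return amplitude * np.exp(-rate * t)

      def objective(theta):
          return ((y_obs - model(theta, t_obs)) ** 2).sum()

      result = differential_evolution(objective,
                                      bounds=[(0.1, 10.0), (0.01, 2.0)], seed=1)
      print(result.x)                      # recovered parameter subset estimate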

  7. Mapping the Chevallier-Polarski-Linder parametrization onto physical dark energy models

    NASA Astrophysics Data System (ADS)

    Scherrer, Robert J.

    2015-08-01

    We examine the Chevallier-Polarski-Linder (CPL) parametrization, in the context of quintessence and barotropic dark energy models, to determine the subset of such models to which it can provide a good fit. The CPL parametrization gives the equation of state parameter w for the dark energy as a linear function of the scale factor a , namely w =w0+wa(1 -a ). In the case of quintessence models, we find that over most of the w0, wa parameter space the CPL parametrization maps onto a fairly narrow form of behavior for the potential V (ϕ ), while a one-dimensional subset of parameter space, for which wa=κ (1 +w0) , with κ constant, corresponds to a wide range of functional forms for V (ϕ ). For barotropic models, we show that the functional dependence of the pressure on the density, up to a multiplicative constant, depends only on wi=wa+w0 and not on w0 and wa separately. Our results suggest that the CPL parametrization may not be optimal for testing either type of model.
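
    The CPL form integrates in closed form, which makes such mapping exercises concrete: from the continuity equation d ln(rho) / d ln(a) = -3 (1 + w(a)), the dark-energy density scales as rho(a)/rho_0 = a^(-3(1+w0+wa)) * exp(-3 wa (1-a)). A small helper, with illustrative parameter values only:

      # CPL equation of state and the resulting dark-energy density scaling.
      import numpy as np

      def w_cpl(a, w0, wa):
          return w0 + wa * (1.0 - a)

      def rho_ratio(a, w0, wa):
          # closed-form integral of -3 (1 + w(a)) / a for the CPL form
          return a ** (-3.0 * (1.0 + w0 + wa)) * np.exp(-3.0 * wa * (1.0 - a))

      a = np.linspace(0.1, 1.0, 5)
      print(rho_ratio(a, w0=-0.9, wa=0.2))   # e.g. a mildly evolving case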

  8. Distinguishing Different Strategies of Across-Dimension Attentional Selection

    ERIC Educational Resources Information Center

    Huang, Liqiang; Pashler, Harold

    2012-01-01

    Selective attention in multidimensional displays has usually been examined using search tasks requiring the detection of a single target. We examined the ability to perceive a spatial structure in multi-item subsets of a display that were defined either conjunctively or disjunctively. Observers saw two adjacent displays and indicated whether the…

  9. Testing Different Model Building Procedures Using Multiple Regression.

    ERIC Educational Resources Information Center

    Thayer, Jerome D.

    The stepwise regression method of selecting predictors for computer assisted multiple regression analysis was compared with forward, backward, and best subsets regression, using 16 data sets. The results indicated the stepwise method was preferred because of its practical nature, when the models chosen by different selection methods were similar…

  10. A Metacommunity Framework for Enhancing the Effectiveness of Biological Monitoring Strategies

    PubMed Central

    Roque, Fabio O.; Cottenie, Karl

    2012-01-01

    Because of inadequate knowledge and funding, the use of biodiversity indicators is often suggested as a way to support management decisions. Consequently, many studies have analyzed the performance of certain groups as indicator taxa. However, in addition to knowing whether certain groups can adequately represent the biodiversity as a whole, we must also know whether they show similar responses to the main structuring processes affecting biodiversity. Here we present an application of the metacommunity framework for evaluating the effectiveness of biodiversity indicators. Although the metacommunity framework has contributed to a better understanding of biodiversity patterns, there is still limited discussion about its implications for conservation and biomonitoring. We evaluated the effectiveness of indicator taxa in representing spatial variation in macroinvertebrate community composition in Atlantic Forest streams, and the processes that drive this variation. We focused on analyzing whether some groups conform to environmental processes and other groups are more influenced by spatial processes, and on how this can help in deciding which indicator group or groups should be used. We showed that a relatively small subset of taxa from the metacommunity would represent 80% of the variation in community composition shown by the entire metacommunity. Moreover, this subset does not have to be composed of predetermined taxonomic groups, but rather can be defined based on random subsets. We also found that some random subsets composed of a small number of genera performed better in responding to major environmental gradients. There were also random subsets that seemed to be affected by spatial processes, which could indicate important historical processes. We were able to integrate in the same theoretical and practical framework, the selection of biodiversity surrogates, indicators of environmental conditions, and more importantly, an explicit integration of environmental and spatial processes into the selection approach. PMID:22937068

  11. Increasing signal processing sophistication in the calculation of the respiratory modulation of the photoplethysmogram (DPOP).

    PubMed

    Addison, Paul S; Wang, Rui; Uribe, Alberto A; Bergese, Sergio D

    2015-06-01

    DPOP (∆POP or Delta-POP) is a non-invasive parameter which measures the strength of respiratory modulations present in the pulse oximetry photoplethysmogram (pleth) waveform. It has been proposed as a non-invasive surrogate parameter for pulse pressure variation (PPV) used in the prediction of the response to volume expansion in hypovolemic patients. Many groups have reported on the DPOP parameter and its correlation with PPV using various semi-automated algorithmic implementations. The study reported here demonstrates the performance gains made by adding increasingly sophisticated signal processing components to a fully automated DPOP algorithm. A DPOP algorithm was coded and its performance systematically enhanced through a series of code module alterations and additions. Each algorithm iteration was tested on data from 20 mechanically ventilated OR patients. Correlation coefficients and ROC curve statistics were computed at each stage. For the purposes of the analysis we split the data into a manually selected 'stable' region subset of the data containing relatively noise free segments and a 'global' set incorporating the whole data record. Performance gains were measured in terms of correlation against PPV measurements in OR patients undergoing controlled mechanical ventilation. Through increasingly advanced pre-processing and post-processing enhancements to the algorithm, the correlation coefficient between DPOP and PPV improved from a baseline value of R = 0.347 to R = 0.852 for the stable data set, and, correspondingly, R = 0.225 to R = 0.728 for the more challenging global data set. Marked gains in algorithm performance are achievable for manually selected stable regions of the signals using relatively simple algorithm enhancements. Significant additional algorithm enhancements, including a correction for low perfusion values, were required before similar gains were realised for the more challenging global data set.
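
    The core DPOP quantity itself is simple, and the gains reported above come from the processing around it. The sketch below computes DPOP from crude beat amplitudes as DPOP = (POPmax - POPmin) / ((POPmax + POPmin)/2), a commonly stated definition; the peak detection and the synthetic pleth are placeholders, not the authors' pipeline.

      # Minimal DPOP sketch from a pulse oximetry pleth waveform.
      import numpy as np
      from scipy.signal import find_peaks

      def dpop(pleth, fs):
          peaks, _ = find_peaks(pleth, distance=int(0.4 * fs))    # crude beat maxima
          troughs, _ = find_peaks(-pleth, distance=int(0.4 * fs)) # crude beat minima
          n = min(len(peaks), len(troughs))
          pop = pleth[peaks[:n]] - pleth[troughs[:n]]             # beat-by-beat amplitude
          pop_max, pop_min = pop.max(), pop.min()
          return (pop_max - pop_min) / ((pop_max + pop_min) / 2.0)

      fs = 100
      t = np.arange(0, 30, 1.0 / fs)
      # synthetic pleth: 1.2 Hz cardiac carrier with 10% respiratory modulation
      pleth = (1 + 0.1 * np.sin(2 * np.pi * 0.25 * t)) * np.sin(2 * np.pi * 1.2 * t)
      print(dpop(pleth, fs))   # roughly 0.2 for a 10% amplitude modulation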

  12. Plausible combinations: An improved method to evaluate the covariate structure of Cormack-Jolly-Seber mark-recapture models

    USGS Publications Warehouse

    Bromaghin, Jeffrey F.; McDonald, Trent L.; Amstrup, Steven C.

    2013-01-01

    Mark-recapture models are extensively used in quantitative population ecology, providing estimates of population vital rates, such as survival, that are difficult to obtain using other methods. Vital rates are commonly modeled as functions of explanatory covariates, adding considerable flexibility to mark-recapture models, but also increasing the subjectivity and complexity of the modeling process. Consequently, model selection and the evaluation of covariate structure remain critical aspects of mark-recapture modeling. The difficulties involved in model selection are compounded in Cormack-Jolly-Seber models because they are composed of separate sub-models for survival and recapture probabilities, which are conceptualized independently even though their parameters are not statistically independent. The construction of models as combinations of sub-models, together with multiple potential covariates, can lead to a large model set. Although desirable, estimation of the parameters of all models may not be feasible. Strategies to search a model space and base inference on a subset of all models exist and enjoy widespread use. However, even though the methods used to search a model space can be expected to influence parameter estimation, the assessment of covariate importance, and therefore the ecological interpretation of the modeling results, the performance of these strategies has received limited investigation. We present a new strategy for searching the space of a candidate set of Cormack-Jolly-Seber models and explore its performance relative to existing strategies using computer simulation. The new strategy provides an improved assessment of the importance of covariates and covariate combinations used to model survival and recapture probabilities, while requiring only a modest increase in the number of models on which inference is based in comparison to existing techniques.

  13. Entropy-based gene ranking without selection bias for the predictive classification of microarray data.

    PubMed

    Furlanello, Cesare; Serafini, Maria; Merler, Stefano; Jurman, Giuseppe

    2003-11-06

    We describe the E-RFE method for gene ranking, which is useful for the identification of markers in the predictive classification of array data. The method supports a practical modeling scheme designed to avoid the construction of classification rules based on the selection of too small gene subsets (an effect known as the selection bias, in which the estimated predictive errors are too optimistic due to testing on samples already considered in the feature selection process). With E-RFE, we speed up the recursive feature elimination (RFE) with SVM classifiers by eliminating chunks of uninteresting genes using an entropy measure of the SVM weights distribution. An optimal subset of genes is selected according to a two-strata model evaluation procedure: modeling is replicated by an external stratified-partition resampling scheme, and, within each run, an internal K-fold cross-validation is used for E-RFE ranking. Also, the optimal number of genes can be estimated according to the saturation of Zipf's law profiles. Without a decrease of classification accuracy, E-RFE allows a speed-up factor of 100 with respect to standard RFE, while improving on alternative parametric RFE reduction strategies. Thus, a process for gene selection and error estimation is made practical, ensuring control of the selection bias, and providing additional diagnostic indicators of gene importance.
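
    A schematic rendering of the E-RFE idea as we read it from the description above (not the authors' code): the flatter the SVM |weight| distribution, the smaller the chunk of low-weight genes eliminated per iteration. The chunk-size rule below is an invented illustration of that principle, and the data are synthetic.

      # Entropy-driven chunked recursive feature elimination, sketched.
      import numpy as np
      from sklearn.svm import SVC

      def e_rfe_ranking(X, y):
          active = list(range(X.shape[1]))
          ranking = []
          while len(active) > 1:
              w = np.abs(SVC(kernel="linear").fit(X[:, active], y).coef_).ravel()
              p = w / w.sum()
              H = -(p * np.log(p + 1e-12)).sum() / np.log(len(p))  # normalized entropy
              chunk = min(max(1, int((1.0 - H) * 0.5 * len(active))),
                          len(active) - 1)                          # illustrative rule
              drop = set(int(i) for i in np.argsort(w)[:chunk])     # lowest-weight genes
              ranking = [active[i] for i in sorted(drop)] + ranking
              active = [g for i, g in enumerate(active) if i not in drop]
          return active + ranking                                   # best gene first

      rng = np.random.default_rng(7)
      X, y = rng.standard_normal((60, 200)), rng.integers(0, 2, 60)
      print(e_rfe_ranking(X, y)[:10])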

  14. TIMSS 2011 Student and Teacher Predictors for Mathematics Achievement Explored and Identified via Elastic Net.

    PubMed

    Yoo, Jin Eun

    2018-01-01

    A substantial body of research has been conducted on variables relating to students' mathematics achievement with TIMSS. However, most studies have employed conventional statistical methods and have focused on a select few indicators instead of utilizing the hundreds of variables TIMSS provides. This study aimed to find a prediction model for students' mathematics achievement using as many TIMSS student and teacher variables as possible. Elastic net, the machine learning technique selected in this study, takes advantage of both LASSO and ridge in terms of variable selection and multicollinearity, respectively. A logistic regression model was also employed to predict TIMSS 2011 Korean 4th graders' mathematics achievement. Ten-fold cross-validation with mean squared error was employed to determine the elastic net regularization parameter. Among 162 TIMSS variables explored, 12 student and 5 teacher variables were selected in the elastic net model, and the prediction accuracy, sensitivity, and specificity were 76.06, 70.23, and 80.34%, respectively. This study showed that the elastic net method can be successfully applied to educational large-scale data by selecting a subset of variables with reasonable prediction accuracy and finding new variables to predict students' mathematics achievement. Newly found variables via machine learning can shed light on the existing theories from a totally different perspective, which in turn propagates creation of a new theory or complement of existing ones. This study also examined the current scale development convention from a machine learning perspective.
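
    A hedged sketch of elastic-net-penalized logistic regression with 10-fold cross-validation, the model family used above; the synthetic data stand in for the TIMSS student and teacher variables, and the grid of penalties is an assumption.

      # Elastic-net logistic regression with 10-fold CV over penalty settings.
      import numpy as np
      from sklearn.linear_model import LogisticRegressionCV
      from sklearn.preprocessing import StandardScaler

      rng = np.random.default_rng(8)
      X = StandardScaler().fit_transform(rng.standard_normal((500, 162)))
      y = rng.integers(0, 2, 500)          # high vs low achievement (placeholder)

      clf = LogisticRegressionCV(Cs=5, cv=10, penalty="elasticnet", solver="saga",
                                 l1_ratios=[0.1, 0.5, 0.9], max_iter=5000).fit(X, y)
      picked = np.flatnonzero(clf.coef_.ravel() != 0)
      print(len(picked), "variables retained:", picked[:20])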

  17. Progressive Sampling Technique for Efficient and Robust Uncertainty and Sensitivity Analysis of Environmental Systems Models: Stability and Convergence

    NASA Astrophysics Data System (ADS)

    Sheikholeslami, R.; Hosseini, N.; Razavi, S.

    2016-12-01

    Modern earth and environmental models are usually characterized by a large parameter space and high computational cost. These two features prevent effective implementation of sampling-based analyses such as sensitivity and uncertainty analysis, which require running these computationally expensive models several times to adequately explore the parameter/problem space. Therefore, developing efficient sampling techniques that scale with the size of the problem, the computational budget, and users' needs is essential. In this presentation, we propose an efficient sequential sampling strategy, called Progressive Latin Hypercube Sampling (PLHS), which provides increasingly improved coverage of the parameter space while satisfying pre-defined requirements. The original Latin hypercube sampling (LHS) approach generates the entire sample set in one stage; in contrast, PLHS generates a series of smaller sub-sets (also called `slices') such that: (1) each sub-set is Latin hypercube and achieves maximum stratification in any one-dimensional projection; (2) the progressive union of sub-sets remains Latin hypercube; and thus (3) the entire sample set is Latin hypercube. Therefore, it has the capability to preserve the intended sampling properties throughout the sampling procedure. PLHS is deemed advantageous over existing methods, particularly because it nearly avoids over- or under-sampling. Through different case studies, we show that PLHS has multiple advantages over one-stage sampling approaches, including improved convergence and stability of the analysis results with fewer model runs. In addition, PLHS can help to minimize the total simulation time by running only the simulations necessary to achieve the desired level of quality (e.g., accuracy and convergence rate).
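
    The distinction PLHS addresses can be seen with a minimal SciPy illustration: independently generated LHS slices are each well stratified, but their union is generally no longer a Latin hypercube sample, which is exactly the property PLHS is designed to preserve. The sketch below uses one-stage LHS only; it does not implement PLHS itself.

    ```python
    # Minimal illustration (not the PLHS algorithm): compare the uniformity of a
    # union of independent LHS "slices" against a single one-stage LHS design.
    import numpy as np
    from scipy.stats import qmc

    d = 5
    slices = [qmc.LatinHypercube(d=d, seed=s).random(n=20) for s in range(5)]
    union = np.vstack(slices)  # 100 points assembled from five 20-point slices
    one_stage = qmc.LatinHypercube(d=d, seed=99).random(n=100)

    # Centered discrepancy as a rough uniformity measure (lower is better);
    # the slice union typically scores worse because it is no longer LHS overall.
    print("union of slices:", qmc.discrepancy(union))
    print("one-stage LHS:  ", qmc.discrepancy(one_stage))
    ```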

  18. Predicting the disease of Alzheimer with SNP biomarkers and clinical data using data mining classification approach: decision tree.

    PubMed

    Erdoğan, Onur; Aydin Son, Yeşim

    2014-01-01

    Single Nucleotide Polymorphisms (SNPs) are the most common genomic variations, where only a single nucleotide differs between individuals. Individual SNPs and SNP profiles associated with diseases can be utilized as biological markers. However, there is a need to determine the SNP subsets and patients' clinical data that are informative for diagnosis. Data mining approaches have the highest potential for extracting knowledge from genomic datasets and for selecting representative SNPs, as well as the most effective and informative clinical features, for the clinical diagnosis of diseases. In this study, we have applied one of the most widely used data mining classification methodologies, the decision tree, to associate SNP biomarkers and significant clinical data with Alzheimer's disease (AD), the most common form of dementia. Different tree construction parameters have been compared for optimization, and the most accurate tree for predicting AD is presented.
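
    A generic sketch of the workflow named above (comparing tree construction parameters and keeping the most accurate tree) might look as follows in scikit-learn; the data and parameter grid are hypothetical stand-ins for the SNP and clinical features.

    ```python
    # Hedged sketch of tree-parameter comparison via cross-validated grid search;
    # synthetic data replaces the SNP/clinical features used in the study.
    from sklearn.datasets import make_classification
    from sklearn.model_selection import GridSearchCV
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=500, n_features=40,
                               n_informative=8, random_state=1)

    param_grid = {
        "criterion": ["gini", "entropy"],  # split quality measures
        "max_depth": [3, 5, 8, None],      # tree size control
        "min_samples_leaf": [1, 5, 20],    # pruning-like regularization
    }
    search = GridSearchCV(DecisionTreeClassifier(random_state=1), param_grid, cv=5)
    search.fit(X, y)
    print(search.best_params_, round(search.best_score_, 3))
    ```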

  19. Sv-map between type I and heterotic sigma models

    NASA Astrophysics Data System (ADS)

    Fan, Wei; Fotopoulos, A.; Stieberger, S.; Taylor, T. R.

    2018-05-01

    The scattering amplitudes of gauge bosons in heterotic and open superstring theories are related by the single-valued projection which yields heterotic amplitudes by selecting a subset of multiple zeta value coefficients in the α′ (string tension parameter) expansion of open string amplitudes. In the present work, we argue that this relation holds also at the level of low-energy expansions (or individual Feynman diagrams) of the respective effective actions, by investigating the beta functions of two-dimensional sigma models describing world-sheets of open and heterotic strings. We analyze the sigma model Feynman diagrams generating identical effective action terms in both theories and show that the heterotic coefficients are given by the single-valued projection of the open ones. The single-valued projection appears as a result of summing over all radial orderings of heterotic vertices on the complex plane representing the string world-sheet.
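
    For context, the single-valued (sv) projection acts on the zeta-value coefficients of the α′ expansion; its standard action on depth-one zeta values (Brown's single-valued multiple zeta values) is shown below for orientation. This is standard background, not a result taken from the paper itself.

    ```latex
    % Standard action of the single-valued projection on depth-one zeta values,
    % e.g. sv(zeta(2)) = 0 and sv(zeta(3)) = 2*zeta(3).
    \mathrm{sv}\,\zeta(2n) = 0, \qquad
    \mathrm{sv}\,\zeta(2n+1) = 2\,\zeta(2n+1), \qquad n \ge 1.
    ```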

  20. Development of a Rapid Fluorescence-Based High-Throughput Screening Assay to Identify Novel Kynurenine 3-Monooxygenase Inhibitor Scaffolds.

    PubMed

    Jacobs, K R; Guillemin, G J; Lovejoy, D B

    2018-02-01

    Kynurenine 3-monooxygenase (KMO) is a well-validated therapeutic target for the treatment of neurodegenerative diseases, including Alzheimer's disease (AD) and Huntington's disease (HD). This work reports a facile fluorescence-based KMO assay optimized for high-throughput screening (HTS) that achieves a throughput approximately 20-fold higher than the fastest KMO assay currently reported. The screen was run with excellent performance (average Z' value of 0.80) from 110,000 compounds across 341 plates and exceeded all statistical parameters used to describe a robust HTS assay. A subset of molecules was selected for validation by ultra-high-performance liquid chromatography, resulting in the confirmation of a novel hit with an IC50 comparable to that of the well-described KMO inhibitor Ro-61-8048. A medicinal chemistry program is currently underway to further develop our novel KMO inhibitor scaffolds.
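
    The Z' value quoted above is, by the standard convention (Zhang, Chung & Oldenburg, 1999), computed from the means and standard deviations of the positive and negative plate controls. A minimal sketch of that calculation follows; the control values and function name are illustrative, not taken from the study.

    ```python
    # Standard Z'-factor from plate controls; Z' >= 0.5 is conventionally
    # taken to indicate a robust HTS assay. Control data here are synthetic.
    import numpy as np

    def z_prime(pos_ctrl, neg_ctrl):
        """Z' = 1 - 3*(sd_pos + sd_neg) / |mean_pos - mean_neg|."""
        pos, neg = np.asarray(pos_ctrl), np.asarray(neg_ctrl)
        return 1.0 - 3.0 * (pos.std(ddof=1) + neg.std(ddof=1)) / abs(pos.mean() - neg.mean())

    # Example: well-separated controls give a Z' close to 1.
    rng = np.random.default_rng(0)
    print(z_prime(rng.normal(100, 4, 32), rng.normal(20, 4, 32)))
    ```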

  1. Yield of illicit indoor cannabis cultivation in the Netherlands.

    PubMed

    Toonen, Marcel; Ribot, Simon; Thissen, Jac

    2006-09-01

    To obtain a reliable estimate of the yield of illicit indoor cannabis cultivation in The Netherlands, cannabis plants confiscated by the police were used to determine the yield of dried female flower buds. The developmental stage of the flower buds of the seized plants was described on a scale from 1 to 10, where a value of 10 indicates a fully developed flower bud ready for harvesting. Using eight additional characteristics describing the grow room and cultivation parameters, regression analysis with subset selection was carried out to develop two models for the yield of indoor cannabis cultivation. The median Dutch illicit grow room consists of 259 cannabis plants, has a plant density of 15 plants/m², and uses 510 W of growth lamps per m². For the median Dutch grow room, the predicted yield of female flower buds at the harvestable developmental stage (stage 10) was 33.7 g/plant or 505 g/m².
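
    With only eight candidate characteristics, the subset-selection step can even be run exhaustively (2^8 - 1 = 255 non-empty subsets). A minimal sketch under assumed synthetic data follows; the variables and scoring are illustrative, not the study's.

    ```python
    # Illustrative best-subset regression over eight candidate predictors,
    # scored by cross-validated R^2. Data and names are synthetic stand-ins.
    import itertools
    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 8))  # eight grow-room characteristics (stand-in)
    y = 3 * X[:, 0] + 2 * X[:, 2] + rng.normal(size=200)  # yield (stand-in)

    best_score, best_subset = -np.inf, None
    for k in range(1, 9):
        for subset in itertools.combinations(range(8), k):
            score = cross_val_score(LinearRegression(), X[:, list(subset)], y, cv=5).mean()
            if score > best_score:
                best_score, best_subset = score, subset
    print(best_subset, round(best_score, 3))
    ```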

  2. Comparing the role of shape and texture on staging hepatic fibrosis from medical imaging

    NASA Astrophysics Data System (ADS)

    Zhang, Xuejun; Louie, Ryan; Liu, Brent J.; Gao, Xin; Tan, Xiaomin; Qu, Xianghe; Long, Liling

    2016-03-01

    The purpose of this study is to investigate the role of shape and texture in the classification of hepatic fibrosis by selecting the optimal parameters for a better computer-aided diagnosis (CAD) system. Ten surface shape features are extracted from a standardized profile of the liver, while 15 texture features calculated from the gray level co-occurrence matrix (GLCM) are extracted within an ROI in the liver. Each combination of these input subsets is checked using a support vector machine (SVM) with the leave-one-case-out method to differentiate fibrosis into two groups: normal or abnormal. The classification accuracy using all 15 texture features was 66.83%, versus 85.74% using all 10 shape features. The irregularity of liver shape can indicate fibrotic grade efficiently, and combining CT texture features with shape features is not recommended for the interpretation of cirrhosis.
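
    A compact sketch of the evaluation scheme described above (an SVM scored with leave-one-case-out cross-validation, comparing the shape-feature and texture-feature subsets) is given below; the features are synthetic stand-ins, constructed so that the shape block is informative, loosely mirroring the reported result.

    ```python
    # Hedged sketch: leave-one-out SVM accuracy for two feature subsets.
    # Synthetic features replace the study's 10 shape and 15 GLCM texture measures.
    import numpy as np
    from sklearn.model_selection import LeaveOneOut, cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)
    X = rng.normal(size=(60, 25))                    # 10 shape + 15 texture features
    y = (X[:, :10].sum(axis=1) > 0).astype(int)      # labels driven by shape block

    shape_cols, texture_cols = list(range(0, 10)), list(range(10, 25))
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
    for name, cols in [("shape", shape_cols), ("texture", texture_cols)]:
        acc = cross_val_score(clf, X[:, cols], y, cv=LeaveOneOut()).mean()
        print(name, round(acc, 3))
    ```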

  3. A survey of parametrized variational principles and applications to computational mechanics

    NASA Technical Reports Server (NTRS)

    Felippa, Carlos A.

    1993-01-01

    This survey paper describes recent developments in the area of parametrized variational principles (PVP's) and selected applications to finite-element computational mechanics. A PVP is a variational principle containing free parameters that have no effect on the Euler-Lagrange equations. The theory of single-field PVP's based on gauge functions (also known as null Lagrangians) is a subset of the inverse problem of variational calculus that has limited value. On the other hand, multifield PVP's are more interesting from theoretical and practical standpoints. Following a tutorial introduction, the paper describes the recent construction of multifield PVP's in several areas of elasticity and electromagnetics. It then discusses three applications to finite-element computational mechanics: the derivation of high-performance finite elements, the development of element-level error indicators, and the constructions of finite element templates. The paper concludes with an overview of open research areas.

  4. Utility of correlation techniques in gravity and magnetic interpretation

    NASA Technical Reports Server (NTRS)

    Chandler, V. W.; Koski, J. S.; Braice, L. W.; Hinze, W. J.

    1977-01-01

    Internal correspondence uses Poisson's Theorem in a moving-window linear regression analysis between the anomalous first vertical derivative of gravity and the total magnetic field reduced to the pole. The regression parameters provide critical information on source characteristics. The correlation coefficient indicates the strength of the relation between magnetics and gravity. The slope value gives Δj/Δσ estimates of the anomalous source. The intercept furnishes information on anomaly interference. Cluster analysis consists of the classification of subsets of data into groups of similarity based on correlation of selected characteristics of the anomalies. Model studies are used to illustrate implementation and interpretation procedures of these methods, particularly internal correspondence. Analysis of the results of applying these methods to data from the midcontinent and a transcontinental profile shows they can be useful in identifying crustal provinces, providing information on horizontal and vertical variations of physical properties over province-size zones, validating long-wavelength anomalies, and isolating geomagnetic field removal problems.

  5. The Swift GRB Host Galaxy Legacy Survey

    NASA Astrophysics Data System (ADS)

    Perley, Daniel A.

    2015-01-01

    I introduce the Swift Host Galaxy Legacy Survey (SHOALS), a comprehensive multiwavelength program to characterize the demographics of the GRB host population across its entire redshift range. Using unbiased selection criteria we have designated a subset of 130 Swift gamma-ray bursts which are now being targeted with intensive observational follow-up. Deep Spitzer imaging of every field has already been obtained and analyzed, with major programs ongoing at Keck, GTC, and Gemini to obtain complementary optical/NIR photometry to enable full SED modeling and derivation of fundamental physical parameters such as mass, extinction, and star-formation rate. Using these data I will present an unbiased measurement of the GRB host-galaxy luminosity and mass functions and their evolution with redshift between z=0 and z=5, compare GRB hosts to other star-forming galaxy populations, and discuss implications for the nature of the GRB progenitor and the ability of GRBs to probe cosmic star-formation.

  6. A linear and non-linear polynomial neural network modeling of dissolved oxygen content in surface water: Inter- and extrapolation performance with inputs' significance analysis.

    PubMed

    Šiljić Tomić, Aleksandra; Antanasijević, Davor; Ristić, Mirjana; Perić-Grujić, Aleksandra; Pocajt, Viktor

    2018-01-01

    Accurate prediction of water quality parameters (WQPs) is an important task in the management of water resources. Artificial neural networks (ANNs) are frequently applied for dissolved oxygen (DO) prediction, but often only their interpolation performance is checked. The aims of this research, besides interpolation, were to determine the extrapolation performance of an ANN model developed for the prediction of DO content in the Danube River, and to assess the relationship between the significance of inputs and prediction error in the presence of values outside the range of training. The applied ANN is a polynomial neural network (PNN), which performs embedded selection of the most important inputs during learning and provides a model in the form of linear and non-linear polynomial functions, which can then be used for a detailed analysis of the significance of inputs. The available dataset, containing 1912 monitoring records for 17 water quality parameters, was split into a "regular" subset that contains normally distributed, low-variability data and an "extreme" subset that contains monitoring records with outlier values. The results revealed that the non-linear PNN model has good interpolation performance (R² = 0.82), but it was not robust in extrapolation (R² = 0.63). The analysis of extrapolation results showed that the prediction errors are correlated with the significance of inputs. Namely, out-of-training-range values of inputs with low importance do not significantly affect the PNN model performance, but their influence can be biased by the presence of multi-outlier monitoring records. Subsequently, linear PNN models were successfully applied to study the effect of water quality parameters on DO content. It was observed that DO level is mostly affected by temperature, pH, biological oxygen demand (BOD) and phosphorus concentration, while in extreme conditions the importance of alkalinity and bicarbonates rises over pH and BOD. Copyright © 2017 Elsevier B.V. All rights reserved.

  7. Relationship of spasticity to knee angular velocity and motion during gait in cerebral palsy.

    PubMed

    Damiano, Diane L; Laws, Edward; Carmines, Dave V; Abel, Mark F

    2006-01-01

    This study investigated the effects of spasticity in the hamstrings and quadriceps muscles on gait parameters, including temporal-spatial measures, knee position, excursion and angular velocity, in 25 children with spastic diplegic cerebral palsy (CP) as compared to 17 age-matched peers. While subjects were instructed to relax, an isokinetic device alternately flexed and extended the left knee at one of three constant velocities (30°/s, 60°/s and 120°/s), while surface electromyography (EMG) electrodes over the biceps femoris and the rectus femoris recorded muscle activity. Patients then participated in 3D gait analysis at a self-selected speed. Results showed that those with CP who exhibited heightened stretch responses (spasticity) in both muscles had significantly slower knee angular velocities during the swing phase of gait than those with and without CP who did not exhibit stretch responses at that joint and the tested speeds. The measured amount (torque) of resistance to passive flexion or extension was not related to gait parameters in subjects with CP; however, the rate of change in resistance torque per unit angle change (stiffness) at the fastest test speed of 120°/s showed weak to moderate relationships with knee angular velocity and motion during gait. For the subset of seven patients with CP who subsequently underwent a selective dorsal rhizotomy, knee angular extension and flexion velocity increased post-operatively, suggesting some degree of causality between spasticity and movement speed.

  8. Data Mining for Efficient and Accurate Large Scale Retrieval of Geophysical Parameters

    NASA Astrophysics Data System (ADS)

    Obradovic, Z.; Vucetic, S.; Peng, K.; Han, B.

    2004-12-01

    Our effort is devoted to developing data mining technology for improving the efficiency and accuracy of geophysical parameter retrievals by learning a mapping from observation attributes to the corresponding parameters within the framework of classification and regression. We will describe a method for efficient learning of neural-network-based classification and regression models from high-volume data streams. The proposed procedure automatically learns a series of neural networks of different complexities on smaller data stream chunks and then properly combines them into an ensemble predictor through averaging. Based on the idea of progressive sampling, the proposed approach starts with a very simple network trained on a very small chunk and then gradually increases the model complexity and the chunk size until the learning performance no longer improves. Our empirical study on aerosol retrievals from data obtained with the MISR instrument mounted on the Terra satellite suggests that the proposed method is successful in learning complex concepts from large data streams with near-optimal computational effort. We will also report on a method that complements deterministic retrievals by constructing accurate predictive algorithms and applying them to appropriately selected subsets of observed data. The method is based on developing more accurate predictors aimed to capture global and local properties synthesized in a region. The procedure starts by learning the global properties of data sampled over the entire space, and continues by constructing specialized models on selected localized regions. The global and local models are integrated through an automated procedure that determines the optimal trade-off between the two components with the objective of minimizing the overall mean square error over a specific region. Our experimental results on MISR data showed that the combined model can increase retrieval accuracy significantly. The preliminary results on various large heterogeneous spatial-temporal datasets provide evidence that the benefits of the proposed methodology for efficient and accurate learning extend beyond the retrieval of geophysical parameters.
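
    A simplified sketch of the progressive-sampling ensemble idea (growing chunk sizes paired with growing network complexity, predictions averaged) is shown below; it omits the stopping rule that halts growth when performance no longer improves, and all sizes are illustrative rather than taken from the study.

    ```python
    # Rough sketch (not the authors' implementation): train increasingly complex
    # networks on growing data chunks and average their predictions.
    import numpy as np
    from sklearn.datasets import make_regression
    from sklearn.neural_network import MLPRegressor

    X, y = make_regression(n_samples=8000, n_features=20, noise=5.0, random_state=0)
    X_train, y_train = X[:6000], y[:6000]
    X_test, y_test = X[6000:], y[6000:]

    models = []
    for chunk_size, hidden in [(500, 8), (1000, 16), (2000, 32), (4000, 64)]:
        m = MLPRegressor(hidden_layer_sizes=(hidden,), max_iter=2000, random_state=0)
        m.fit(X_train[:chunk_size], y_train[:chunk_size])  # bigger chunk, bigger net
        models.append(m)

    ensemble_pred = np.mean([m.predict(X_test) for m in models], axis=0)
    print("ensemble RMSE:", np.sqrt(np.mean((ensemble_pred - y_test) ** 2)))
    ```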

  9. [The study on the changes of serum IL- 6, TNF-α and peripheral blood T lymphocyte subsets in the pregnant women during perinatal period].

    PubMed

    Li, Juan

    2011-03-01

    To study the pattern of change of serum IL-6, TNF-α and peripheral blood T lymphocyte subsets in pregnant women during the perinatal period, 100 pregnant women in our hospital from November 2009 to October 2010 were selected as research subjects, and serum IL-6, TNF-α and peripheral blood T lymphocyte subsets before labor onset, at labor onset, and on the first and third day after delivery were analyzed and compared. According to the study, serum IL-6 and TNF-α at labor onset were higher than before labor onset and on the first and third day after delivery, while CD3(+), CD4(+), CD8(+) and CD4/CD8 decreased first and then increased (all P < 0.05); the differences were significant. The changes of serum IL-6, TNF-α and peripheral blood T lymphocyte subsets in pregnant women during the perinatal period follow a regular pattern and are worthy of clinical attention.

  10. MODIS Interactive Subsetting Tool (MIST)

    NASA Astrophysics Data System (ADS)

    McAllister, M.; Duerr, R.; Haran, T.; Khalsa, S. S.; Miller, D.

    2008-12-01

    In response to requests from the user community, NSIDC has teamed with the Oak Ridge National Laboratory Distributed Active Archive Center (ORNL DAAC) and the Moderate Resolution Data Center (MrDC) to provide time series subsets of satellite data covering stations in the Greenland Climate Network (GC-Net) and the International Arctic Systems for Observing the Atmosphere (IASOA) network. To serve these data, NSIDC created the MODIS Interactive Subsetting Tool (MIST). MIST works with 7 km by 7 km subset time series of certain Version 5 (V005) MODIS products over GC-Net and IASOA stations. User-selected data are delivered in a text Comma Separated Value (CSV) file format. MIST also provides online analysis capabilities that include generating time series and scatter plots. Currently, MIST is a Beta prototype, and NSIDC intends that user requests will drive future development of the tool. The intent of this poster is to introduce MIST to the MODIS data user audience and illustrate some of the online analysis capabilities.

  11. Fish swarm intelligent to optimize real time monitoring of chips drying using machine vision

    NASA Astrophysics Data System (ADS)

    Hendrawan, Y.; Hawa, L. C.; Damayanti, R.

    2018-03-01

    This study attempted to apply a machine vision-based chip-drying monitoring system able to optimise the drying process of cassava chips. The objective of this study is to propose a fish swarm intelligent (FSI) optimization algorithm to find the most significant set of image features suitable for predicting the water content of cassava chips during the drying process using an artificial neural network (ANN) model. Feature selection entails choosing the feature subset that maximizes the prediction accuracy of the ANN. Multi-objective optimization (MOO) was used in this study, consisting of prediction-accuracy maximization and feature-subset size minimization. The results showed that the best feature subset consists of grey mean, L(Lab) mean, a(Lab) energy, red entropy, hue contrast, and grey homogeneity. The best feature subset was tested successfully in the ANN model to describe the relationship between image features and the water content of cassava chips during drying, with an R² between real and predicted data of 0.9.

  12. The transcription factor NRSF contributes to epileptogenesis by selective repression of a subset of target genes

    PubMed Central

    McClelland, Shawn; Brennan, Gary P; Dubé, Celine; Rajpara, Seeta; Iyer, Shruti; Richichi, Cristina; Bernard, Christophe; Baram, Tallie Z

    2014-01-01

    The mechanisms generating epileptic neuronal networks following insults such as severe seizures are unknown. We have previously shown that interfering with the function of the neuron-restrictive silencer factor (NRSF/REST), an important transcription factor that influences neuronal phenotype, attenuated development of this disorder. In this study, we found that epilepsy-provoking seizures increased the low NRSF levels in mature hippocampus several fold, yet, surprisingly, provoked repression of only a subset (∼10%) of potential NRSF target genes. Accordingly, the repressed gene set was rescued when NRSF binding to chromatin was blocked. Unexpectedly, genes selectively repressed by NRSF had mid-range binding frequencies to the repressor, a property that rendered them sensitive to moderate fluctuations of NRSF levels. Genes selectively regulated by NRSF during epileptogenesis coded for ion channels, receptors, and other crucial contributors to neuronal function. Thus, dynamic, selective regulation of NRSF target genes may play a role in influencing neuronal properties in pathological and physiological contexts. DOI: http://dx.doi.org/10.7554/eLife.01267.001 PMID:25117540

  13. NASA GES DISC On-line Visualization and Analysis System for Gridded Remote Sensing Data

    NASA Technical Reports Server (NTRS)

    Leptoukh, Gregory G.; Berrick, S.; Rui, H.; Liu, Z.; Zhu, T.; Teng, W.; Shen, S.; Qin, J.

    2005-01-01

    The ability to use data stored in the current NASA Earth Observing System (EOS) archives for studying regional or global phenomena is highly dependent on having a detailed understanding of the data's internal structure and physical implementation. Gaining this understanding and applying it to data reduction is a time-consuming task that must be undertaken before the core investigation can begin. This is an especially difficult challenge when science objectives require users to deal with large multi-sensor data sets that are usually of different formats, structures, and resolutions. The NASA Goddard Earth Sciences Data and Information Services Center (GES DISC) has taken a major step towards meeting this challenge by developing an infrastructure with a Web interface that allows users to perform interactive analysis online without downloading any data: the GES-DISC Interactive Online Visualization and Analysis Infrastructure, or "Giovanni." Giovanni provides interactive, online analysis tools to facilitate users' research. Several instances of this interface have been created to serve TRMM users, aerosol scientists, and ocean color and agriculture applications users. The first generation of these tools supports gridded data only. The user selects geophysical parameters, the area of interest, and the time period, and the system generates output on screen in a matter of seconds. The currently available output options are: area plots, averaged or accumulated over any available data period for any rectangular area; time plots, i.e., time series averaged over any rectangular area; Hovmoller plots, i.e., image views of longitude-time and latitude-time cross sections; ASCII output for all plot types; and image animation for area plots. Another analysis suite deals with parameter intercomparison: scatter plots, temporal correlation maps, GIS-compatible outputs, etc. This allows users to focus on data content (i.e., science parameters) and eliminates the need for expensive learning, development, and processing tasks that are redundantly incurred by an archive's user community. The current implementation utilizes the GrADS-DODS Server (GDS) and provides subsetting and analysis services across the Internet for any GrADS-readable dataset. The subsetting capability allows users to retrieve a specified temporal and/or spatial subdomain from a large dataset, eliminating the need to download everything simply to access a small relevant portion of a dataset. The analysis capability allows users to retrieve the results of an operation applied to one or more datasets on the server. We use this approach to read pre-processed binary files and/or to read and extract the needed parts directly from HDF or HDF-EOS files. These subsets then serve as inputs to GrADS analysis scripts. The system can be used in a wide variety of Earth science applications: climate and weather event study and monitoring, and modeling. It can easily be configured for new applications.

  14. Dense mesh sampling for video-based facial animation

    NASA Astrophysics Data System (ADS)

    Peszor, Damian; Wojciechowska, Marzena

    2016-06-01

    The paper describes an approach for the selection of feature points on a three-dimensional triangle mesh obtained using various techniques from several video recordings. This approach has a dual purpose. First, it makes it possible to minimize the data stored for the purpose of facial animation, so that instead of storing the position of each vertex in each frame, one could store only a small subset of vertices per frame and calculate the positions of the others based on the subset. The second purpose is to select feature points that could be used for anthropometry-based retargeting of recorded mimicry to another model, with a sampling density beyond that which can be achieved using marker-based performance capture techniques. The developed approach was successfully tested on artificial models, models constructed using a structured light scanner, and models constructed from video recordings using stereophotogrammetry.

  15. Optimized probability sampling of study sites to improve generalizability in a multisite intervention trial.

    PubMed

    Kraschnewski, Jennifer L; Keyserling, Thomas C; Bangdiwala, Shrikant I; Gizlice, Ziya; Garcia, Beverly A; Johnston, Larry F; Gustafson, Alison; Petrovic, Lindsay; Glasgow, Russell E; Samuel-Hodge, Carmen D

    2010-01-01

    Studies of type 2 translation, the adaptation of evidence-based interventions to real-world settings, should include representative study sites and staff to improve external validity. Sites for such studies are, however, often selected by convenience sampling, which limits generalizability. We used an optimized probability sampling protocol to select an unbiased, representative sample of study sites to prepare for a randomized trial of a weight loss intervention. We invited North Carolina health departments within 200 miles of the research center to participate (N = 81). Of the 43 health departments that were eligible, 30 were interested in participating. To select a representative and feasible sample of 6 health departments that met inclusion criteria, we generated all combinations of 6 from the 30 health departments that were eligible and interested. From the subset of combinations that met inclusion criteria, we selected 1 at random. Of 593,775 possible combinations of 6 counties, 15,177 (3%) met inclusion criteria. Sites in the selected subset were similar to all eligible sites in terms of health department characteristics and county demographics. Optimized probability sampling improved generalizability by ensuring an unbiased and representative sample of study sites.
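
    The selection protocol reduces to three steps that are easy to sketch: enumerate all C(30, 6) = 593,775 combinations, filter by inclusion criteria, and draw one eligible combination at random. In the sketch below the inclusion criterion is a named placeholder, not the study's actual feasibility and representativeness checks.

    ```python
    # Sketch of the site-selection protocol: enumerate, filter, pick at random.
    import itertools
    import random

    sites = range(30)  # the 30 interested health departments

    def meets_inclusion_criteria(combo):
        # Placeholder for the study's inclusion criteria (hypothetical rule).
        return sum(combo) % 37 == 0

    eligible = [c for c in itertools.combinations(sites, 6)
                if meets_inclusion_criteria(c)]
    random.seed(2010)
    selected = random.choice(eligible)  # one unbiased draw from eligible subsets
    print(len(eligible), "eligible combinations; selected:", selected)
    ```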

  16. Fuzzy Subspace Clustering

    NASA Astrophysics Data System (ADS)

    Borgelt, Christian

    In clustering we often face the situation that only a subset of the available attributes is relevant for forming clusters, even though this may not be known beforehand. In such cases it is desirable to have a clustering algorithm that automatically weights attributes or even selects a proper subset. In this paper I study such an approach for fuzzy clustering, based on the idea of transferring an alternative to the fuzzifier (Klawonn and Höppner, What is fuzzy about fuzzy clustering? Understanding and improving the concept of the fuzzifier, In: Proc. 5th Int. Symp. on Intelligent Data Analysis, 254-264, Springer, Berlin, 2003) to attribute-weighting fuzzy clustering (Keller and Klawonn, Int J Uncertain Fuzziness Knowl Based Syst 8:735-746, 2000). In addition, by reformulating Gustafson-Kessel fuzzy clustering, a scheme for weighting and selecting principal axes can be obtained. While in Borgelt (Feature weighting and feature selection in fuzzy clustering, In: Proc. 17th IEEE Int. Conf. on Fuzzy Systems, IEEE Press, Piscataway, NJ, 2008) I already presented such an approach for a global selection of attributes and principal axes, this paper extends it to a cluster-specific selection, thus arriving at a fuzzy subspace clustering algorithm (Parsons, Haque, and Liu, 2004).

  17. Hypergraph Based Feature Selection Technique for Medical Diagnosis.

    PubMed

    Somu, Nivethitha; Raman, M R Gauthama; Kirthivasan, Kannan; Sriram, V S Shankar

    2016-11-01

    The impact of the internet and information systems across various domains has resulted in the substantial generation of multidimensional datasets. The use of data mining and knowledge discovery techniques to extract the information contained in multidimensional datasets plays a significant role in exploiting the full benefit they provide. The presence of a large number of features in high-dimensional datasets incurs high computational cost in terms of computing power and time. Hence, feature selection techniques are commonly used to build robust machine learning models by selecting a subset of relevant features that projects the maximal information content of the original dataset. In this paper, a novel Rough Set based K-Helly feature selection technique (RSKHT), which hybridizes Rough Set Theory (RST) and the K-Helly property of hypergraph representation, is designed to identify the optimal feature subset, or reduct, for medical diagnostic applications. Experiments carried out using medical datasets from the UCI repository demonstrate the superiority of RSKHT over other feature selection techniques with respect to reduct size, classification accuracy and time complexity. The performance of RSKHT was validated using the WEKA tool, showing that RSKHT is computationally attractive and flexible over massive datasets.

  18. The proposal of architecture for chemical splitting to optimize QSAR models for aquatic toxicity.

    PubMed

    Colombo, Andrea; Benfenati, Emilio; Karelson, Mati; Maran, Uko

    2008-06-01

    One of the challenges in the field of quantitative structure-activity relationship (QSAR) analysis is the correct assignment of a chemical compound to an appropriate model for the prediction of activity. Thus, in previous studies, compounds have been divided into distinct groups according to their mode of action or chemical class. In the current study, theoretical molecular descriptors were used to divide 568 organic substances, with toxicity measured as the 96-h median lethal concentration for the fathead minnow (Pimephales promelas), into subsets. Simple constitutional descriptors, such as the number of aliphatic and aromatic rings, and a quantum chemical descriptor, the maximum bond order of a carbon atom, divide the compounds into nine subsets. For each subset of compounds, automatic forward selection of descriptors was applied to construct QSAR models. Significant correlations were achieved for each subset of chemicals, and all models were validated with the leave-one-out internal validation procedure (cross-validated R² ≈ 0.80). The results encourage consideration of this alternative approach to predicting toxicity using QSAR subset models, without direct reference to the mechanism of toxic action or the traditional chemical classification.

  19. Maximizing the reliability of genomic selection by optimizing the calibration set of reference individuals: comparison of methods in two diverse groups of maize inbreds (Zea mays L.).

    PubMed

    Rincent, R; Laloë, D; Nicolas, S; Altmann, T; Brunel, D; Revilla, P; Rodríguez, V M; Moreno-Gonzalez, J; Melchinger, A; Bauer, E; Schoen, C-C; Meyer, N; Giauffret, C; Bauland, C; Jamin, P; Laborde, J; Monod, H; Flament, P; Charcosset, A; Moreau, L

    2012-10-01

    Genomic selection refers to the use of genotypic information for predicting breeding values of selection candidates. A prediction formula is calibrated with the genotypes and phenotypes of reference individuals constituting the calibration set. The size and the composition of this set are essential parameters affecting the prediction reliabilities. The objective of this study was to maximize reliabilities by optimizing the calibration set. Different criteria based on the diversity or on the prediction error variance (PEV) derived from the realized additive relationship matrix-best linear unbiased predictions model (RA-BLUP) were used to select the reference individuals. For the latter, we considered the mean of the PEV of the contrasts between each selection candidate and the mean of the population (PEVmean) and the mean of the expected reliabilities of the same contrasts (CDmean). These criteria were tested with phenotypic data collected on two diversity panels of maize (Zea mays L.) genotyped with a 50k SNPs array. In the two panels, samples chosen based on CDmean gave higher reliabilities than random samples for various calibration set sizes. CDmean also appeared superior to PEVmean, which can be explained by the fact that it takes into account the reduction of variance due to the relatedness between individuals. Selected samples were close to optimality for a wide range of trait heritabilities, which suggests that the strategy presented here can efficiently sample subsets in panels of inbred lines. A script to optimize reference samples based on CDmean is available on request.

  20. Accurate Identification of MCI Patients via Enriched White-Matter Connectivity Network

    NASA Astrophysics Data System (ADS)

    Wee, Chong-Yaw; Yap, Pew-Thian; Brownyke, Jeffery N.; Potter, Guy G.; Steffens, David C.; Welsh-Bohmer, Kathleen; Wang, Lihong; Shen, Dinggang

    Mild cognitive impairment (MCI), often a prodromal phase of Alzheimer's disease (AD), is frequently considered to be a good target for early diagnosis and therapeutic interventions of AD. The recent emergence of reliable network characterization techniques has made it possible to understand neurological disorders at a whole-brain connectivity level. Accordingly, we propose a network-based multivariate classification algorithm, using a collection of measures derived from white-matter (WM) connectivity networks, to accurately identify MCI patients from normal controls. An enriched description of WM connections, utilizing six physiological parameters, i.e., fiber penetration count, fractional anisotropy (FA), mean diffusivity (MD), and principal diffusivities (λ1, λ2, λ3), results in six connectivity networks for each subject to account for the connection topology and the biophysical properties of the connections. Upon parcellating the brain into 90 regions-of-interest (ROIs), the average statistics of each ROI in relation to the remaining ROIs are extracted as features for classification. These features are then sieved to select the most discriminant subset for building an MCI classifier via support vector machines (SVMs). Cross-validation results indicate better diagnostic power for the proposed enriched WM connection description than for a simple description with any single physiological parameter.

  1. Uncertainty Analysis via Failure Domain Characterization: Polynomial Requirement Functions

    NASA Technical Reports Server (NTRS)

    Crespo, Luis G.; Munoz, Cesar A.; Narkawicz, Anthony J.; Kenny, Sean P.; Giesy, Daniel P.

    2011-01-01

    This paper proposes an uncertainty analysis framework based on the characterization of the uncertain parameter space. This characterization enables the identification of worst-case uncertainty combinations and the approximation of the failure and safe domains with a high level of accuracy. Because these approximations are composed of subsets whose probability is readily computable, they enable the calculation of arbitrarily tight upper and lower bounds on the failure probability. A Bernstein expansion approach is used to size hyper-rectangular subsets, while a sum-of-squares programming approach is used to size quasi-ellipsoidal subsets. These methods are applicable to requirement functions whose functional dependency on the uncertainty is a known polynomial. Some of the most prominent features of the methodology are the substantial desensitization of the calculations to the assumed uncertainty model (i.e., the probability distribution describing the uncertainty), as well as the accommodation of changes in such a model with a practically insignificant amount of computational effort.

  2. Satellite Level 3 & 4 Data Subsetting at NASA GES DISC

    NASA Technical Reports Server (NTRS)

    Huwe, Paul; Su, Jian; Loeser, Carlee; Ostrenga, Dana; Rui, Hualan; Vollmer, Bruce

    2017-01-01

    Earth Science data are available in many file formats (NetCDF, HDF, GRB, etc.) and in a wide range of sizes, from kilobytes to gigabytes. These properties have become a challenge to users if they are not familiar with these formats or only want a small region of interest (ROI) from a specific dataset. At NASA Goddard Earth Sciences Data and Information Services Center (GES DISC), we have developed and implemented a multipurpose subset service to ease user access to Earth Science data. Our Level 3 & 4 Regridder is capable of subsetting across multiple parameters (spatially, temporally, by level, and by variable) as well as having additional beneficial features (temporal means, regridding to target grids, and file conversion to other data formats). In this presentation, we will demonstrate how users can use this service to better access only the data they need in the form they require.

  4. Toward optimal feature and time segment selection by divergence method for EEG signals classification.

    PubMed

    Wang, Jie; Feng, Zuren; Lu, Na; Luo, Jing

    2018-06-01

    Feature selection plays an important role in EEG-signal-based motor imagery classification. It is a process that aims to select an optimal feature subset from the original set. Two significant advantages are involved: lowering the computational burden so as to speed up the learning procedure, and removing redundant and irrelevant features so as to improve the classification performance. Therefore, feature selection is widely employed in the classification of EEG signals in practical brain-computer interface systems. In this paper, we present a novel statistical model to select the optimal feature subset based on the Kullback-Leibler divergence measure, and to automatically select the optimal subject-specific time segment. The proposed method comprises four successive stages: broad frequency band filtering and common spatial pattern enhancement as preprocessing; feature extraction by an autoregressive model and log-variance; Kullback-Leibler divergence based optimal feature and time segment selection; and linear discriminant analysis classification. More importantly, this paper provides a potential framework for combining other feature extraction models and classification algorithms with the proposed method for EEG signal classification. Experiments on single-trial EEG signals from two public competition datasets not only demonstrate that the proposed method is effective in selecting discriminative features and time segments, but also show that it yields relatively better classification results in comparison with other competitive methods. Copyright © 2018 Elsevier Ltd. All rights reserved.
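
    A minimal sketch of Kullback-Leibler-divergence-based feature scoring (the general idea named above, not the paper's exact statistical model) estimates each feature's class-conditional distributions with histograms and ranks features by divergence:

    ```python
    # Hedged sketch: score each feature by KL(P(feature|class 0) || P(feature|class 1))
    # estimated from shared-range histograms; higher scores = more discriminative.
    import numpy as np
    from scipy.stats import entropy

    def kl_feature_scores(X, y, bins=20, eps=1e-10):
        scores = []
        for j in range(X.shape[1]):
            lo, hi = X[:, j].min(), X[:, j].max()
            p, _ = np.histogram(X[y == 0, j], bins=bins, range=(lo, hi), density=True)
            q, _ = np.histogram(X[y == 1, j], bins=bins, range=(lo, hi), density=True)
            scores.append(entropy(p + eps, q + eps))  # scipy's entropy(p, q) is KL(p||q)
        return np.array(scores)

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 10))
    y = rng.integers(0, 2, 200)
    X[y == 1, 3] += 2.0                      # make feature 3 discriminative
    print(kl_feature_scores(X, y).argmax())  # -> 3
    ```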

  5. Enantioselectivity in Candida antarctica lipase B: A molecular dynamics study

    PubMed Central

    Raza, Sami; Fransson, Linda; Hult, Karl

    2001-01-01

    A major problem in predicting the enantioselectivity of an enzyme toward substrate molecules is that even high selectivity toward one substrate enantiomer over the other corresponds to a very small difference in free energy. However, total free energies in enzyme-substrate systems are very large and fluctuate significantly because of general protein motion. Candida antarctica lipase B (CALB), a serine hydrolase, displays enantioselectivity toward secondary alcohols. Here, we present a modeling study where the aim has been to develop a molecular dynamics-based methodology for the prediction of enantioselectivity in CALB. The substrates modeled (seven in total) were 3-methyl-2-butanol with various aliphatic carboxylic acids and also 2-butanol, as well as 3,3-dimethyl-2-butanol with octanoic acid. The tetrahedral reaction intermediate was used as a model of the transition state. Investigative analyses were performed on ensembles of nonminimized structures and focused on the potential energies of a number of subsets within the modeled systems to determine which specific regions are important for the prediction of enantioselectivity. One category of subset was based on atoms that make up the core structural elements of the transition state. We considered that a more favorable energetic conformation of such a subset should relate to a greater likelihood for catalysis to occur, thus reflecting higher selectivity. The results of this study conveyed that the use of this type of subset was viable for the analysis of structural ensembles and yielded good predictions of enantioselectivity. PMID:11266619

  6. Identifying Depressed Older Adults in Primary Care: A Secondary Analysis of a Multisite Randomized Controlled Trial

    PubMed Central

    Voils, Corrine I.; Olsen, Maren K.; Williams, John W.; for the IMPACT Study Investigators

    2008-01-01

    Objective: To determine whether a subset of depressive symptoms could be identified to facilitate diagnosis of depression in older adults in primary care. Method: Secondary analysis was conducted on 898 participants aged 60 years or older with major depressive disorder and/or dysthymic disorder (according to DSM-IV criteria) who participated in the Improving Mood–Promoting Access to Collaborative Treatment (IMPACT) study, a multisite, randomized trial of collaborative care for depression (recruitment from July 1999 to August 2001). Linear regression was used to identify a core subset of depressive symptoms associated with decreased social, physical, and mental functioning. The sensitivity and specificity, adjusting for selection bias, were evaluated for these symptoms. The sensitivity and specificity of a second subset of 4 depressive symptoms previously validated in a midlife sample was also evaluated. Results: Psychomotor changes, fatigue, and suicidal ideation were associated with decreased functioning and served as the core set of symptoms. Adjusting for selection bias, the sensitivity of these 3 symptoms was 0.012 and specificity 0.994. The sensitivity of the 4 symptoms previously validated in a midlife sample was 0.019 and specificity was 0.997. Conclusion: We identified 3 depression symptoms that were highly specific for major depressive disorder in older adults. However, these symptoms and a previously identified subset were too insensitive for accurate diagnosis. Therefore, we recommend a full assessment of DSM-IV depression criteria for accurate diagnosis. PMID:18311416

  7. Atlas ranking and selection for automatic segmentation of the esophagus from CT scans

    NASA Astrophysics Data System (ADS)

    Yang, Jinzhong; Haas, Benjamin; Fang, Raymond; Beadle, Beth M.; Garden, Adam S.; Liao, Zhongxing; Zhang, Lifei; Balter, Peter; Court, Laurence

    2017-12-01

    In radiation treatment planning, the esophagus is an important organ-at-risk that should be spared in patients with head and neck cancer or thoracic cancer who undergo intensity-modulated radiation therapy. However, automatic segmentation of the esophagus from CT scans is extremely challenging because of the structure's inconsistent intensity, low contrast against the surrounding tissues, complex and variable shape and location, and random air bubbles. The goal of this study is to develop an online atlas selection approach that chooses a subset of optimal atlases for multi-atlas segmentation to delineate the esophagus automatically. We performed atlas selection in two phases. In the first phase, we used the correlation coefficient of the image content in a cubic region between each atlas and the new image to evaluate their similarity and to rank the atlases in an atlas pool. A subset of atlases based on this ranking was selected, and deformable image registration was performed to generate deformed contours and deformed images in the new image space. In the second phase of atlas selection, we used Kullback-Leibler divergence to measure the similarity of local-intensity histograms between the new image and each of the deformed images, and the measurements were used to rank the previously selected atlases. Deformed contours were overlapped sequentially, from the most to the least similar, and the overlap ratio was examined. We further identified a subset of optimal atlases by analyzing the variation of the overlap ratio versus the number of atlases. The deformed contours from these optimal atlases were fused together using a modified simultaneous truth and performance level estimation algorithm to produce the final segmentation. The approach was validated with promising results using both internal data sets (21 head and neck cancer patients and 15 thoracic cancer patients) and external data sets (30 thoracic patients).
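
    The first-phase ranking described above reduces to scoring each atlas by the correlation of image content with the new image inside a cubic region. A minimal array-based sketch follows; the function and volumes are illustrative stand-ins for registered CT data, not the study's implementation.

    ```python
    # Sketch of phase-one atlas ranking: Pearson correlation within a cubic ROI.
    import numpy as np

    def rank_atlases(new_img, atlas_imgs, cube, top_k=5):
        """cube is ((z0, z1), (y0, y1), (x0, x1)); returns indices of top_k atlases."""
        sl = tuple(slice(lo, hi) for lo, hi in cube)
        target = new_img[sl].ravel()
        scores = [np.corrcoef(target, a[sl].ravel())[0, 1] for a in atlas_imgs]
        return np.argsort(scores)[::-1][:top_k]  # most similar first

    rng = np.random.default_rng(0)
    new = rng.normal(size=(40, 64, 64))
    # Synthetic "atlases" with increasing noise, so atlas 0 is most similar.
    atlases = [new + rng.normal(scale=s, size=new.shape) for s in (0.5, 1.0, 2.0, 4.0)]
    print(rank_atlases(new, atlases, ((10, 30), (16, 48), (16, 48)), top_k=2))  # -> [0 1]
    ```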

  8. Radiographic estimation in seropositive and seronegative rheumatoid arthritis

    PubMed Central

    Sahatçiu-Meka, Vjollca; Rexhepi, Sylejman; Manxhuka-Kërliu, Suzana; Rexhepi, Mjellma

    2011-01-01

    It has long been suggested that the subpopulation of patients with rheumatoid arthritis who test negative for rheumatoid factor represents a clinical entity quite distinct from seropositive rheumatoid arthritis (RA). Our aim was to establish a scientific comparative analysis of seronegative and seropositive rheumatoid arthritis regarding selected radiological and clinical parameters, applied for the first time to patients from Kosovo. Two hundred fifty patients with rheumatoid arthritis according to the American College of Rheumatology criteria were retrospectively studied by analyzing radiographic damage and clinical parameters of the disease, using a database. All examinees were between 25-60 years of age (Xb = 49.96, SD = 10.37), with disease duration between 1-27 years (Xb = 6.41, SD = 6.47). All patients underwent standardised radiographic evaluation. Baseline standardised posteroanterior radiographs of the hands and feet, and radiographs of other joints depending on indications, were assessed. Erythrocyte sedimentation rate values correlated with the radiological damage, and a statistically significant difference was found for the seronegative subset (r = 0.24, p < 0.01). Longer duration of the disease resulted in an increase of radiological changes in both subsets (seronegative r = 0.66, p < 0.01; seropositive r = 0.49, p < 0.01). Anatomic changes of the IInd and IIIrd level were nearly equally distributed in both subsets: 76 (60.8%) seronegative, 75 (60%) seropositive. Radiological damage is nearly equal in both subsets, increases with the duration of the disease, and correlates with ESR values. Regarding sero-status, differences between the sexes, with some exceptions, are not relevant. Although there are some definite quantitative and qualitative differences regarding sero-status, there is obviously a great deal of overlap between the two groups. PMID:21875421

  10. 77 FR 43879 - Self-Regulatory Organizations; NYSE Arca, Inc.; Notice of Designation of a Longer Period for...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-07-26

    ... Proposed Rule Change Amending NYSE Arca Equities Rule 7.31(h) To Add a PL Select Order Type July 20, 2012...(h) to add a PL Select Order type. The proposed rule change was published for comment in the Federal... security at a specified, undisplayed price. The PL Select Order would be a subset of the PL Order that...

  11. Gene selection for microarray data classification via subspace learning and manifold regularization.

    PubMed

    Tang, Chang; Cao, Lijuan; Zheng, Xiao; Wang, Minhui

    2017-12-19

    With the rapid development of DNA microarray technology, large amounts of genomic data have been generated. Classification of these microarray data is a challenging task, since gene expression data often comprise thousands of genes but only a small number of samples. In this paper, an effective gene selection method is proposed to select the best subset of genes for microarray data, with the irrelevant and redundant genes removed. Compared with the original data, the selected gene subset can benefit the classification task. We formulate the gene selection task as a manifold-regularized subspace learning problem. In detail, a projection matrix is used to project the original high-dimensional microarray data into a lower-dimensional subspace, with the constraint that the original genes can be well represented by the selected genes. Meanwhile, the local manifold structure of the original data is preserved by a Laplacian graph regularization term on the low-dimensional data space. The projection matrix can serve as an importance indicator for the different genes. An iterative update algorithm is developed for solving the problem. Experimental results on six publicly available microarray datasets and one clinical dataset demonstrate that the proposed method performs better than other state-of-the-art methods in terms of microarray data classification.
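
    The paper's exact objective is not reproduced here, but a representative formulation for this family of methods (self-representation plus a graph-Laplacian manifold term and a row-sparse penalty on the projection matrix) reads:

    ```latex
    % Generic manifold-regularized subspace-learning objective for gene selection;
    % a representative form of this family, not necessarily the paper's exact model.
    \min_{W \in \mathbb{R}^{d \times k}} \;
      \|X - X W W^{\top}\|_F^2
      \;+\; \alpha\, \operatorname{tr}\!\big( W^{\top} X^{\top} L X W \big)
      \;+\; \beta\, \|W\|_{2,1}
    ```

    Here X (n samples by d genes) is the data matrix, L is the Laplacian of a sample similarity graph enforcing the local manifold structure on the embeddings XW, and the row-wise l2,1 norm drives entire rows of W to zero, so that row norms serve as the gene-importance indicator mentioned above.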

  12. A small number of candidate gene SNPs reveal continental ancestry in African Americans

    PubMed Central

    Kodaman, Nuri; Aldrich, Melinda C.; Smith, Jeffrey R.; Signorello, Lisa B.; Bradley, Kevin; Breyer, Joan; Cohen, Sarah S.; Long, Jirong; Cai, Qiuyin; Giles, Justin; Bush, William S.; Blot, William J.; Matthews, Charles E.; Williams, Scott M.

    2013-01-01

    Using genetic data from an obesity candidate gene study of self-reported African Americans and European Americans, we investigated the number of Ancestry Informative Markers (AIMs) and candidate gene SNPs necessary to infer continental ancestry. Proportions of African and European ancestry were assessed with STRUCTURE (K=2), using 276 AIMs. These reference values were compared to estimates derived using 120, 60, 30, and 15 SNP subsets randomly chosen from the 276 AIMs and from 1144 SNPs in 44 candidate genes. All subsets generated estimates of ancestry consistent with the reference estimates, with mean correlations greater than 0.99 for all subsets of AIMs, and mean correlations of 0.99 ± 0.003, 0.98 ± 0.01, 0.93 ± 0.03, and 0.81 ± 0.11 for subsets of 120, 60, 30, and 15 candidate gene SNPs, respectively. Among African Americans, the median absolute difference from reference African ancestry values ranged from 0.01 to 0.03 for the four AIMs subsets and from 0.03 to 0.09 for the four candidate gene SNP subsets. Furthermore, YRI/CEU Fst values provided a metric to predict the performance of candidate gene SNPs. Our results demonstrate that a small number of SNPs randomly selected from candidate genes can be used to reliably estimate admixture proportions in African Americans. PMID:23278390

  13. A hybrid gene selection approach for microarray data classification using cellular learning automata and ant colony optimization.

    PubMed

    Vafaee Sharbaf, Fatemeh; Mosafer, Sara; Moattar, Mohammad Hossein

    2016-06-01

    This paper proposes an approach for gene selection in microarray data. The proposed approach consists of a primary filter approach using Fisher criterion which reduces the initial genes and hence the search space and time complexity. Then, a wrapper approach which is based on cellular learning automata (CLA) optimized with ant colony method (ACO) is used to find the set of features which improve the classification accuracy. CLA is applied due to its capability to learn and model complicated relationships. The selected features from the last phase are evaluated using ROC curve and the most effective while smallest feature subset is determined. The classifiers which are evaluated in the proposed framework are K-nearest neighbor; support vector machine and naïve Bayes. The proposed approach is evaluated on 4 microarray datasets. The evaluations confirm that the proposed approach can find the smallest subset of genes while approaching the maximum accuracy. Copyright © 2016 Elsevier Inc. All rights reserved.

  14. Modelling shock to detonation transition in PETN using HERMES and CREST

    NASA Astrophysics Data System (ADS)

    Maheswaran, Mary-Ann; Curtis, John; Reaugh, Jack

    2013-06-01

    The High Explosive Response to MEchanical Stimulus (HERMES) model has been developed to address High Explosive Violent Response (HEVR). It is a material model for use in both the LS-DYNA finite element and ALE3D hydrocodes that enables the modelling of both shock to detonation transition (SDT) and deflagration to detonation transition (DDT). As part of its ongoing development and application, model parameters for the explosive PETN were found using experimental data for PETN at different densities. PETN was selected because of the availability of both SDT and DDT data. To model SDT and DDT, HERMES uses a subset of the CREST reactive burn model with the Mie-Gruneisen equation of state (EOS) for the unreacted explosive and a look-up table for the gas EOS as generated by Cheetah. The unreacted EOS parameters were found first by calculating the principal isentrope of unreacted PETN at TMD from PETN shock Hugoniot data. Then Pop-plot data for PETN were used to fit the CREST parameters at each density. The resulting new PETN HERMES material model provides a platform for further investigations of SDT and DDT in low density PETN powder. JER's activity was performed under the auspices of the US DOE by LLNL under Contract DE-AC52-07NA27344, and partially funded by the Joint US DoD/DOE Munitions Technology Development Program.
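
    For orientation, a toy evaluation of a Mie-Gruneisen EOS referenced to a linear Us-up shock Hugoniot (Us = c0 + s*up), a common form for unreacted explosives; the constants below are placeholders, not the calibrated PETN parameters from this work.

        import numpy as np

        def mie_gruneisen_pressure(rho, e, rho0, c0, s, gamma0):
            """Pressure from a Mie-Gruneisen EOS referenced to the shock Hugoniot.

            Assumes a linear Us-up Hugoniot and Gamma*rho = gamma0*rho0 constant.
            rho: current density; e: specific internal energy above ambient.
            """
            eta = 1.0 - rho0 / rho                          # compression, eta = up/Us
            p_h = rho0 * c0**2 * eta / (1.0 - s * eta)**2   # Hugoniot pressure
            return p_h * (1.0 - 0.5 * gamma0 * eta) + gamma0 * rho0 * e

        # placeholder constants in SI units (illustrative only)
        rho0, c0, s, gamma0 = 1000.0, 2500.0, 1.8, 1.1
        print(mie_gruneisen_pressure(rho=1.2 * rho0, e=0.0,
                                     rho0=rho0, c0=c0, s=s, gamma0=gamma0))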

  15. Automatic design of basin-specific drought indexes for highly regulated water systems

    NASA Astrophysics Data System (ADS)

    Zaniolo, Marta; Giuliani, Matteo; Castelletti, Andrea Francesco; Pulido-Velazquez, Manuel

    2018-04-01

    Socio-economic costs of drought are progressively increasing worldwide due to ongoing alterations of hydro-meteorological regimes induced by climate change. Although drought management is widely studied in the literature, traditional drought indexes often fail to detect critical events in highly regulated systems, where natural water availability is conditioned by the operation of water infrastructure such as dams, diversions, and pumping wells. Here, ad hoc index formulations are usually adopted, based on empirical combinations of several supposedly significant hydro-meteorological variables. These customized formulations, however, while effective in the basin for which they were designed, can hardly be generalized and transferred to different contexts. In this study, we contribute FRIDA (FRamework for Index-based Drought Analysis), a novel framework for the automatic design of basin-customized drought indexes. In contrast to ad hoc empirical approaches, FRIDA is fully automated, generalizable, and portable across different basins. FRIDA builds an index representing a surrogate of the drought conditions of the basin, computed by combining all the relevant available information about the water circulating in the system, identified by means of a feature extraction algorithm. We used the Wrapper for Quasi-Equally Informative Subset Selection (W-QEISS), which employs a multi-objective evolutionary algorithm to find Pareto-efficient subsets of variables by maximizing the wrapper accuracy, minimizing the number of selected variables, and optimizing the relevance and redundancy of the subset. The preferred variable subset is selected among the efficient solutions and used to formulate the final index according to alternative model structures. We apply FRIDA to the case study of the Jucar river basin (Spain), a drought-prone and highly regulated Mediterranean water resource system, where an advanced drought management plan relying on an ad hoc state index is used to trigger drought management measures. The state index was constructed empirically through a trial-and-error process begun in the 1980s and finalized in 2007, guided by experts from the Confederación Hidrográfica del Júcar (CHJ). Our results show that the automated variable selection aligns with CHJ's 25-year-long empirical refinement. In addition, the resulting FRIDA index outperforms the official state index in both accuracy in reproducing the target variable and cardinality of the selected input set.
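
    A toy stand-in for this wrapper-style, multi-objective subset selection (W-QEISS itself uses an evolutionary search and information-theoretic relevance/redundancy objectives, omitted here): small candidate subsets are scored by cross-validated accuracy, and the accuracy-versus-cardinality Pareto front is retained. All names and sizes are illustrative.

        import itertools
        import numpy as np
        from sklearn.datasets import make_classification
        from sklearn.model_selection import cross_val_score
        from sklearn.neighbors import KNeighborsClassifier

        def pareto_subsets(X, y, max_size=3):
            """Enumerate subsets up to max_size; keep Pareto-efficient
            (higher accuracy, fewer variables) candidates."""
            results = []
            for k in range(1, max_size + 1):
                for cols in itertools.combinations(range(X.shape[1]), k):
                    acc = cross_val_score(KNeighborsClassifier(3),
                                          X[:, list(cols)], y, cv=5).mean()
                    results.append((cols, k, acc))
            # keep candidates not dominated on (accuracy up, cardinality down)
            front = [r for r in results
                     if not any((o[2] >= r[2] and o[1] <= r[1]) and
                                (o[2] > r[2] or o[1] < r[1]) for o in results)]
            return front

        X, y = make_classification(n_samples=120, n_features=6, n_informative=2,
                                   random_state=0)
        for cols, k, acc in sorted(pareto_subsets(X, y), key=lambda r: r[1]):
            print(cols, round(acc, 3))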

  16. Study on experimental characterization of carbon fiber reinforced polymer panel using digital image correlation: A sensitivity analysis

    NASA Astrophysics Data System (ADS)

    Kashfuddoja, Mohammad; Prasath, R. G. R.; Ramji, M.

    2014-11-01

    In this work, the experimental characterization of a polymer matrix and a polymer-based carbon fiber reinforced composite laminate using the whole-field, non-contact digital image correlation (DIC) technique is presented. The properties are evaluated from full-field DIC measurements obtained by performing a series of tests as per ASTM standards. The evaluated properties are compared with results obtained from conventional testing and analytical models and are found to match closely. Further, the sensitivity of material properties to DIC parameters is investigated and optimum parameter values are identified. It is found that the subset size has more influence on material properties than the step size, and the predicted optimum values for the matrix and composite materials are consistent with each other. The aspect ratio of the region of interest (ROI) chosen for correlation should be the same as the aspect ratio of the camera resolution for better correlation. Also, an open cutout panel made of the same composite laminate is considered to demonstrate the sensitivity of DIC parameters in predicting the complex strain field surrounding the hole. It is observed that the strain field surrounding the hole is much more sensitive to step size than to subset size. A lower step size produces a highly pixelated strain field, capturing local strain at the expense of computational time but with randomly scattered noise, whereas a higher step size mitigates the noise at the expense of losing detail in the data and can even alter the natural trend of the strain field, leading to erroneous maximum strain locations. Varying the subset size mainly produces a smoothing effect, eliminating noise from the strain field while maintaining the detail in the data without altering its natural trend. However, increasing the subset size significantly reduces the strain data at the hole edge due to discontinuity in correlation. Also, the DIC results are compared with FEA predictions to ascertain suitable values of the DIC parameters for better accuracy.

  17. Biowaste home composting: experimental process monitoring and quality control.

    PubMed

    Tatàno, Fabio; Pagliaro, Giacomo; Di Giovanni, Paolo; Floriani, Enrico; Mangani, Filippo

    2015-04-01

    Because home composting is a prevention option for managing biowaste at the local level, the objective of the present study was to contribute to the knowledge of the process evolution and compost quality that can be expected and obtained, respectively, with this decentralized option. In this study, organized as the research portion of a provincial project on home composting in the territory of Pesaro-Urbino (Central Italy), four experimental composters were first initiated and monitored over time. Second, two small subsets of selected provincial composters (directly operated by households involved in the project) underwent quality control of their compost products at two different temporal steps. The monitored experimental composters showed overall decreasing profiles versus composting time for moisture, organic carbon, and C/N, as well as overall increasing profiles for electrical conductivity and total nitrogen, which represent qualitative indications of progress in the process. Comparative evaluations of the monitored experimental composters also suggested some interactions in home composting, i.e., high C/N ratios limiting organic matter decomposition rates and final humification levels; high moisture contents restricting the internal temperature regime; nearly flat phosphorus and potassium evolutions contributing to limit the rate of increase in electrical conductivity; and prolonged biowaste additions contributing to limit the rate of decrease in moisture. The measures of parametric data variability in the two subsets of controlled provincial composters showed decreased variability in moisture, organic carbon, and C/N from the seventh to the fifteenth month of home composting, as well as increased variability in electrical conductivity, total nitrogen, and humification rate, which is compatible with the respective nature of decreasing and increasing parameters during composting. The modeled parametric kinetics in the monitored experimental composters, together with the parametric central tendencies in the subsets of controlled provincial composters, indicate that 12-15 months is a suitable duration for the appropriate development of home composting in final and simultaneous compliance with typical reference limits. Copyright © 2014 Elsevier Ltd. All rights reserved.

  18. Parameter Estimation of Computationally Expensive Watershed Models Through Efficient Multi-objective Optimization and Interactive Decision Analytics

    NASA Astrophysics Data System (ADS)

    Akhtar, Taimoor; Shoemaker, Christine

    2016-04-01

    Watershed model calibration is inherently a multi-criteria problem. Conflicting trade-offs exist between different quantifiable calibration criteria, indicating that no single optimal parameterization exists. Hence, many experts prefer a manual approach to calibration, where the inherent multi-objective nature of the calibration problem is addressed through an interactive, subjective, time-intensive, and complex decision-making process. Multi-objective optimization can be used to efficiently identify multiple plausible calibration alternatives and assist calibration experts during the parameter estimation process. However, there are key challenges to the use of multi-objective optimization in parameter estimation: 1) multi-objective optimization usually requires many model simulations, which is difficult for complex simulation models that are computationally expensive; and 2) selecting one from the numerous calibration alternatives provided by multi-objective optimization is non-trivial. This study proposes a "Hybrid Automatic Manual Strategy" (HAMS) for watershed model calibration to specifically address these challenges. HAMS employs a 3-stage framework for parameter estimation. Stage 1 incorporates an efficient surrogate multi-objective algorithm, GOMORS, for identification of numerous calibration alternatives within a limited simulation evaluation budget. The novelty of HAMS is embedded in Stages 2 and 3, where an interactive visual and metric-based analytics framework serves as a decision support tool to choose a single calibration from the numerous alternatives identified in Stage 1. Stage 2 of HAMS provides a goodness-of-fit, metric-based interactive framework for identification of a small (typically fewer than 10), meaningful, and diverse set of calibration alternatives from those obtained in Stage 1. Stage 3 incorporates an interactive visual analytics framework for decision support in selecting one parameter combination from the alternatives identified in Stage 2. HAMS is applied to the calibration of flow parameters of a SWAT (Soil and Water Assessment Tool) model designed to simulate flow in the Cannonsville watershed in upstate New York. Results from the application of HAMS to Cannonsville indicate that efficient multi-objective optimization and interactive visual and metric-based analytics can bridge the gap between the effective use of automatic and manual strategies for parameter estimation of computationally expensive watershed models.
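
    Stage 2-style down-selection can be approximated by clustering the Stage 1 alternatives in objective space and keeping the member of each cluster closest to its centroid; this is an illustrative stand-in for the interactive metric-based analytics described above, not the HAMS implementation.

        import numpy as np
        from sklearn.cluster import KMeans

        def representative_alternatives(objectives, n_keep=8, seed=0):
            """Pick a small, diverse subset of calibration alternatives.

            objectives : (n_alternatives, n_metrics) array of goodness-of-fit
            metrics (lower = better) for the Stage 1 Pareto set.
            Returns indices of one representative per cluster.
            """
            km = KMeans(n_clusters=n_keep, n_init=10, random_state=seed).fit(objectives)
            reps = []
            for c in range(n_keep):
                members = np.flatnonzero(km.labels_ == c)
                d = np.linalg.norm(objectives[members] - km.cluster_centers_[c], axis=1)
                reps.append(members[np.argmin(d)])
            return sorted(reps)

        # toy Pareto set: 200 alternatives scored on 3 calibration metrics
        rng = np.random.default_rng(0)
        print(representative_alternatives(rng.random((200, 3)), n_keep=6))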

  19. An overview of the Columbia Habitat Monitoring Program's (CHaMP) spatial-temporal design framework

    EPA Science Inventory

    We briefly review the concept of a master sample applied to stream networks in which a randomized set of stream sites is selected across a broad region to serve as a list of sites from which a subset of sites is selected to achieve multiple objectives of specific designs. The Col...

  20. Selecting and Validating Tasks from a Kindergarten Screening Battery that Best Predict Third Grade Educational Placement

    ERIC Educational Resources Information Center

    Scott, Marcia Strong; Delgado, Christine F.; Tu, Shihfen; Fletcher, Kathryn L.

    2005-01-01

    In this study, predictive classification accuracy was used to select those tasks from a kindergarten screening battery that best identified children who, three years later, were classified as educable mentally handicapped or as having a specific learning disability. A subset of measures enabled correct classification of 91% of the children in…

  1. Selecting climate change scenarios using impact-relevant sensitivities

    Treesearch

    Julie A. Vano; John B. Kim; David E. Rupp; Philip W. Mote

    2015-01-01

    Climate impact studies often require the selection of a small number of climate scenarios. Ideally, a subset would have simulations that both (1) appropriately represent the range of possible futures for the variable/s most important to the impact under investigation and (2) come from global climate models (GCMs) that provide plausible results for future climate in the...

  2. Effect of high-dose pitavastatin on glucose homeostasis in patients at elevated risk of new-onset diabetes: insights from the CAPITAIN and PREVAIL-US studies.

    PubMed

    Chapman, M J; Orsoni, A; Robillard, P; Hounslow, N; Sponseller, C A; Giral, P

    2014-05-01

    Statin treatment may impair glucose homeostasis and increase the risk of new-onset diabetes mellitus, although this may depend on the statin, dose, and patient population. We evaluated the effects of pitavastatin 4 mg/day on glucose homeostasis in patients with metabolic syndrome in the CAPITAIN trial. Findings were validated in a subset of patients enrolled in PREVAIL-US. Participants with a well-defined metabolic syndrome phenotype were recruited to CAPITAIN to reduce the influence of confounding factors. Validation and comparison datasets were selected comprising phenotypically similar subsets of individuals enrolled in PREVAIL-US and treated with pitavastatin or pravastatin, respectively. Mean changes from baseline in parameters of glucose homeostasis (fasting plasma glucose [FPG], glycated hemoglobin [HbA1c], insulin, quantitative insulin-sensitivity check index [QUICKI] and homeostasis model of assessment-insulin resistance [HOMA-IR]) and plasma lipid profile were assessed at 6 months (CAPITAIN) and 3 months (PREVAIL-US) after initiating treatment. In CAPITAIN (n = 12), no significant differences from baseline in HbA1c, insulin, HOMA-IR and QUICKI were observed at day 180 in patients treated with pitavastatin. A small (4%) increase in FPG from baseline to day 180 was observed (P < 0.05). In the validation dataset (n = 9), no significant differences from baseline in glycemic parameters were observed at day 84 (all comparisons P > 0.05). Similar results were observed for pravastatin in the comparison dataset (n = 14). Other than a small change in FPG in the CAPITAIN study, neutral effects of pitavastatin on glucose homeostasis were observed in two cohorts of patients with metabolic syndrome, independent of its efficacy in reducing levels of atherogenic lipoproteins. The small number of patients and relatively short follow-up period represent limitations of the study. Nevertheless, these data suggest that statin-induced diabetogenesis may not represent a class effect.

  3. In silico prediction of toxicity of phenols to Tetrahymena pyriformis by using genetic algorithm and decision tree-based modeling approach.

    PubMed

    Abbasitabar, Fatemeh; Zare-Shahabadi, Vahid

    2017-04-01

    Risk assessment of chemicals is an important issue in environmental protection; however, there is a huge lack of experimental data for a large number of end-points. The experimental determination of chemical toxicity is a costly and time-consuming process. In silico tools such as quantitative structure-toxicity relationship (QSTR) models, which are constructed on the basis of computational molecular descriptors, can predict missing toxicity end-point data for existing or even not-yet-synthesized chemicals. Phenol derivatives are known aquatic pollutants. With this background, we aimed to develop an accurate and reliable QSTR model for predicting the toxicity of 206 phenols to Tetrahymena pyriformis. A multiple linear regression (MLR)-based QSTR was obtained using a powerful descriptor selection tool named the Memorized_ACO algorithm, giving R² values of 0.72 and 0.68 for the training and test sets, respectively. To develop a higher-quality QSTR model, classification and regression trees (CART) were employed. Two approaches were considered: (1) phenols were classified into different modes of action using CART, and (2) the phenols in the training set were partitioned into several subsets by a tree such that a high-quality MLR could be developed in each subset. For the first approach, the statistical parameters of the resultant QSTR model improved to R² values of 0.83 and 0.75 for the training and test sets, respectively. A genetic algorithm was employed in the second approach to obtain an optimal tree, and the final QSTR model provided excellent prediction accuracy for the training and test sets (R² of 0.91 and 0.93, respectively). The mean absolute error for the test set was 0.1615. Copyright © 2016 Elsevier Ltd. All rights reserved.
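
    A minimal sketch of the core idea behind the second approach, assuming a plain CART partition followed by a separate linear model in each leaf (the genetic-algorithm optimization of the tree is not reproduced); the dataset and tree depth are placeholders.

        import numpy as np
        from sklearn.tree import DecisionTreeRegressor
        from sklearn.linear_model import LinearRegression

        class TreePartitionedMLR:
            """Fit a shallow CART, then one MLR per leaf subset."""
            def __init__(self, max_depth=2, min_leaf=20):
                self.tree = DecisionTreeRegressor(max_depth=max_depth,
                                                  min_samples_leaf=min_leaf)
                self.models = {}

            def fit(self, X, y):
                self.tree.fit(X, y)
                leaves = self.tree.apply(X)
                for leaf in np.unique(leaves):
                    idx = leaves == leaf
                    self.models[leaf] = LinearRegression().fit(X[idx], y[idx])
                return self

            def predict(self, X):
                leaves = self.tree.apply(X)
                return np.array([self.models[leaf].predict(x[None, :])[0]
                                 for leaf, x in zip(leaves, X)])

        rng = np.random.default_rng(0)
        X = rng.standard_normal((206, 5))            # e.g. 5 molecular descriptors
        y = np.where(X[:, 0] > 0, 2.0, -1.0) * X[:, 1] + 0.1 * rng.standard_normal(206)
        model = TreePartitionedMLR().fit(X, y)
        print(np.corrcoef(y, model.predict(X))[0, 1] ** 2)   # in-sample fit check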

  4. CD127 and CD25 expression defines CD4+ T cell subsets that are differentially depleted during HIV infection.

    PubMed

    Dunham, Richard M; Cervasi, Barbara; Brenchley, Jason M; Albrecht, Helmut; Weintrob, Amy; Sumpter, Beth; Engram, Jessica; Gordon, Shari; Klatt, Nichole R; Frank, Ian; Sodora, Donald L; Douek, Daniel C; Paiardini, Mirko; Silvestri, Guido

    2008-04-15

    Decreased CD4(+) T cell counts are the best marker of disease progression during HIV infection. However, CD4(+) T cells are heterogeneous in phenotype and function, and it is unknown how preferential depletion of specific CD4(+) T cell subsets influences disease severity. CD4(+) T cells can be classified into three subsets by the expression of receptors for two T cell-tropic cytokines, IL-2 (CD25) and IL-7 (CD127). The CD127(+)CD25(low/-) subset includes IL-2-producing naive and central memory T cells; the CD127(-)CD25(-) subset includes mainly effector T cells expressing perforin and IFN-gamma; and the CD127(low)CD25(high) subset includes FoxP3-expressing regulatory T cells. Herein we investigated how the proportions of these T cell subsets change during HIV infection. When compared with healthy controls, HIV-infected patients show a relative increase in CD4(+)CD127(-)CD25(-) T cells that is related to an absolute decline of CD4(+)CD127(+)CD25(low/-) T cells. Interestingly, this expansion of CD4(+)CD127(-) T cells was not observed in naturally SIV-infected sooty mangabeys. The relative expansion of CD4(+)CD127(-)CD25(-) T cells correlated directly with the levels of total CD4(+) T cell depletion and immune activation. CD4(+)CD127(-)CD25(-) T cells were not selectively resistant to HIV infection, as levels of cell-associated virus were similar in all non-naive CD4(+) T cell subsets. These data indicate that, during HIV infection, specific changes in the fraction of CD4(+) T cells expressing CD25 and/or CD127 are associated with disease progression. Further studies will determine whether monitoring the three subsets of CD4(+) T cells defined by the expression of CD25 and CD127 should be used in the clinical management of HIV-infected individuals.

  5. Self-Organizing-Map Program for Analyzing Multivariate Data

    NASA Technical Reports Server (NTRS)

    Li, P. Peggy; Jacob, Joseph C.; Block, Gary L.; Braverman, Amy J.

    2005-01-01

    SOM_VIS is a computer program for analysis and display of multidimensional sets of Earth-image data, typified by the data acquired by the Multi-angle Imaging SpectroRadiometer (MISR, a spaceborne instrument). In SOM_VIS, an enhanced self-organizing-map (SOM) algorithm is first used to project a multidimensional set of data into a nonuniform three-dimensional lattice structure. The lattice structure is mapped to a color space to obtain a color map for an image. The Voronoi cell-refinement algorithm is used to map the SOM lattice structure to various levels of color resolution. The final result is a false-color image in which similar colors represent similar characteristics across all data dimensions. SOM_VIS provides a control panel for selecting a subset of suitably preprocessed MISR radiance data, and a control panel for choosing the parameters used to run SOM training. SOM_VIS also includes a component for displaying the false-color SOM image, a color map for the trained SOM lattice, a plot showing the original 36-dimensional input vector of a pixel selected from the SOM image, the SOM vector that represents that input vector, and the Euclidean distance between the two vectors.
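
    A compact sketch of a basic 2-D SOM training loop (not the enhanced three-dimensional-lattice variant or the Voronoi color refinement used by SOM_VIS): each training vector pulls its best-matching unit and that unit's lattice neighbours toward it, with the learning rate and neighbourhood radius decaying over time. Shapes and schedules are illustrative.

        import numpy as np

        def train_som(data, grid=(8, 8), iters=2000, lr0=0.5, sigma0=3.0, seed=0):
            """Train a 2-D self-organizing map; returns (grid_h, grid_w, dim) weights."""
            rng = np.random.default_rng(seed)
            h, w = grid
            weights = rng.standard_normal((h, w, data.shape[1]))
            coords = np.stack(np.meshgrid(np.arange(h), np.arange(w),
                                          indexing='ij'), axis=-1)
            for t in range(iters):
                x = data[rng.integers(len(data))]
                # best-matching unit = lattice node closest to the sample
                bmu = np.unravel_index(
                    np.argmin(((weights - x) ** 2).sum(axis=-1)), (h, w))
                frac = t / iters
                lr = lr0 * (1 - frac)
                sigma = sigma0 * (1 - frac) + 0.5
                d2 = ((coords - np.array(bmu)) ** 2).sum(axis=-1)
                nbh = np.exp(-d2 / (2 * sigma**2))[..., None]   # neighbourhood kernel
                weights += lr * nbh * (x - weights)
            return weights

        # toy usage: 36-dimensional vectors, as for MISR radiances
        rng = np.random.default_rng(1)
        som = train_som(rng.random((500, 36)))
        print(som.shape)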

  6. VizieR Online Data Catalog: Solar analogs and twins rotation by Kepler (do Nascimento+, 2014)

    NASA Astrophysics Data System (ADS)

    Do Nascimento, J.-D. Jr; Garcia, R. A.; Mathur, S.; Anthony, F.; Barnes, S. A.; Meibom, S.; da Costa, J. S.; Castro, M.; Salabert, D.; Ceillier, T.

    2017-03-01

    Our sample of 75 stars consists of a seismic sample of 38 stars from Chaplin et al. (2014, J/ApJS/210/1), 35 additional stars selected from the Kepler Input Catalog (KIC), and 16 Cyg A and B. We selected 38 well-studied stars from the asteroseismic data with fundamental properties, including ages, estimated by Chaplin et al. (2014, J/ApJS/210/1), and with Teff and log g as close as possible to the Sun's values (5200 K < Teff < 6060 K and 3.63 < log g < 4.40). This seismic sample allows a direct comparison between gyro-ages and seismic ages for a subset of eight stars. These seismic samples were observed in short cadence for one month each in survey mode. Stellar properties for these stars have been estimated using two global asteroseismic parameters and complementary photometric and spectroscopic observations, as described by Chaplin et al. (2014, J/ApJS/210/1). The median final quoted uncertainties for the full Chaplin et al. (2014, J/ApJS/210/1) sample were approximately 0.020 dex in log g and 150 K in Teff. (1 data file).

  7. Item selection via Bayesian IRT models.

    PubMed

    Arima, Serena

    2015-02-10

    With reference to a questionnaire that aimed to assess the quality of life of dysarthric speakers, we investigate the usefulness of a model-based procedure for reducing the number of items. We propose a mixed cumulative logit model, known in the psychometrics literature as the graded response model: responses to different items are modelled as a function of individual latent traits and of item characteristics, such as their difficulty and their discrimination power. We jointly model the discrimination and difficulty parameters by using a k-component mixture of normal distributions. Mixture components correspond to disjoint groups of items; items that belong to the same group can be considered equivalent in terms of both difficulty and discrimination power. According to decision criteria, we select a subset of items such that the reduced questionnaire provides the same information as the complete questionnaire. The model is estimated using a Bayesian approach, and the choice of the number of mixture components is justified according to information criteria. We illustrate the proposed approach using data collected for 104 dysarthric patients by local health authorities in Lecce and in Milan. Copyright © 2014 John Wiley & Sons, Ltd.
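
    For concreteness, a small sketch of the graded response model's category probabilities, where P(Y >= k | theta) = logistic(a*(theta - b_k)) and the probability of responding in exactly category k is the difference of adjacent cumulative curves; the item parameters below are invented.

        import numpy as np

        def grm_category_probs(theta, a, b):
            """Graded response model: P(Y = k | theta) for ordered categories.

            a : item discrimination; b : increasing category thresholds (length K-1).
            Returns an array of K category probabilities.
            """
            cum = 1.0 / (1.0 + np.exp(-a * (theta - np.asarray(b))))  # P(Y >= k)
            cum = np.concatenate(([1.0], cum, [0.0]))
            return cum[:-1] - cum[1:]

        # hypothetical 4-category item: discrimination 1.5, thresholds -1, 0, 1.2
        probs = grm_category_probs(theta=0.3, a=1.5, b=[-1.0, 0.0, 1.2])
        print(probs, probs.sum())   # category probabilities sum to 1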

  8. Adaptive estimation of hand movement trajectory in an EEG based brain-computer interface system

    NASA Astrophysics Data System (ADS)

    Robinson, Neethu; Guan, Cuntai; Vinod, A. P.

    2015-12-01

    Objective. The various parameters that define a hand movement, such as its trajectory, speed, etc., are encoded in distinct brain activities. Decoding this information from neurophysiological recordings is a less explored area of brain-computer interface (BCI) research. Applying non-invasive recordings such as electroencephalography (EEG) for decoding makes the problem more challenging, as the encoding is assumed to lie deep within the brain and not be easily accessible by scalp recordings. Approach. EEG based BCI systems can be developed to identify the neural features underlying movement parameters, which can then be used to provide a detailed and well-defined control command set to a BCI output device. Real-time continuous control is better suited for practical BCI systems, and can be achieved by continuous adaptive reconstruction of the movement trajectory rather than by discrete classification of brain activity. In this work, we adaptively reconstruct/estimate the parameters of two-dimensional hand movement trajectory, namely movement speed and position, from multi-channel EEG recordings. The data for analysis were collected in an experiment that involved center-out right-hand movement tasks in four different directions at two different speeds in random order. We estimate the movement trajectory using a Kalman filter that models the relation between brain activity and the recorded parameters through a set of defined predictors. We propose a method to define these predictor variables, which includes spatially, spectrally, and temporally localized neural information, and to select optimally informative variables. Main results. The proposed method yielded a correlation of 0.60 ± 0.07 between recorded and estimated data. Further, incorporating the proposed predictor subset selection, the achieved correlation is 0.57 ± 0.07 (p < 0.004), with a significant gain in the stability of the system as well as a dramatic reduction in the number of predictors (76%), saving computational time. Significance. The proposed system provides real-time movement control using EEG-BCI, with control over movement speed and position. These results are higher and statistically significant compared to existing EEG based techniques, and thus demonstrate the applicability of the proposed method for efficient estimation of movement parameters and for continuous motor control.
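
    A bare-bones sketch of the decoding step, assuming a standard linear-Gaussian Kalman filter whose state holds the hand's 2-D position and velocity and whose observations are the selected EEG-derived predictors; the matrices below are placeholders that would normally be fit from training trials.

        import numpy as np

        def kalman_decode(Z, A, H, Q, R, x0, P0):
            """Sequentially estimate movement state from neural observations Z.

            Z : (T, m) observations; A : state transition; H : observation model;
            Q, R : process / observation noise covariances.
            """
            x, P = x0.copy(), P0.copy()
            out = []
            for z in Z:
                x, P = A @ x, A @ P @ A.T + Q            # predict
                S = H @ P @ H.T + R
                K = P @ H.T @ np.linalg.inv(S)           # Kalman gain
                x = x + K @ (z - H @ x)                  # update
                P = (np.eye(len(x)) - K @ H) @ P
                out.append(x.copy())
            return np.array(out)

        # toy setup: state = [px, py, vx, vy], 8 hypothetical EEG predictors
        dt, n, m = 0.1, 4, 8
        A = np.eye(n); A[0, 2] = A[1, 3] = dt            # constant-velocity model
        rng = np.random.default_rng(0)
        H = rng.standard_normal((m, n))                  # fit by regression in practice
        Q, R = 1e-3 * np.eye(n), 0.1 * np.eye(m)
        Z = rng.standard_normal((50, m))
        states = kalman_decode(Z, A, H, Q, R, np.zeros(n), np.eye(n))
        print(states.shape)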

  9. AOIPS data base management systems support for GARP data sets

    NASA Technical Reports Server (NTRS)

    Gary, J. P.

    1977-01-01

    A data base management system, developed to provide flexible access to data sets produced by GARP during its data systems tests, is described. The content and coverage of the data base are defined, and a computer-aided, interactive information storage and retrieval system, implemented to facilitate access to user-specified data subsets, is described. The computer programs developed to provide this capability were implemented on the highly interactive, minicomputer-based AOIPS and are referred to as the data retrieval system (DRS). Implemented as a user-interactive, menu-guided system, the DRS permits users to inventory the data tape library and to create duplicate or subset data sets based on a user-selected window defined by time and latitude/longitude boundaries. The DRS also permits users to select, display, or produce formatted hard copy of individual data items contained within the data records.

  10. Using ICESat/GLAS Data Produced in a Self-Describing Format

    NASA Astrophysics Data System (ADS)

    Fowler, D. K.; Webster, D.; Fowler, C.; McAllister, M.; Haran, T. M.

    2015-12-01

    For the life of the ICESat mission and beyond, GLAS data have been distributed in binary format by NASA's National Snow and Ice Data Center Distributed Active Archive Center (NSIDC DAAC) at the University of Colorado in Boulder. These data have been extremely useful but, depending on the user, not always the easiest to work with. Recently, with releases 33 and 34, GLAS data have been produced in an HDF5 format. The NSIDC User Services Office has found that most users find this HDF5 format more user friendly than the original binary format. Among the advantages is the ability to view the actual data using HDFView or any of a number of freely available open source tools. Also, with this format, the NSIDC DAAC has been able to provide more selective and specific services, including spatial subsetting, file stitching, and the much sought-after parameter subsetting, through Reverb, the next-generation Earth science discovery tool. The final release of GLAS data in 2014 and the ongoing user questions, not just about the data but also about the mission, satellite platform, and instrument, have spurred NSIDC DAAC efforts to make all of the mission documents and information available to the public in one location. Thus was born the ICESat/GLAS Long Term Archive, now available online. The data and specifics from this mission are archived and made available to the public at NASA's NSIDC DAAC.
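
    Parameter subsetting of an HDF5 granule amounts to reading only the selected datasets; a sketch with h5py in which the granule name is hypothetical and the dataset paths should be checked against the actual GLAS product documentation.

        import h5py

        # hypothetical local GLAS-like granule; dataset paths follow the GLAH12
        # layout but should be verified against the real product specification
        granule = "GLAH12_example.h5"
        wanted = ["Data_40HZ/Elevation_Surfaces/d_elev",
                  "Data_40HZ/Geolocation/d_lat",
                  "Data_40HZ/Geolocation/d_lon"]

        with h5py.File(granule, "r") as f:
            subset = {path: f[path][:] for path in wanted}   # parameter subsetting

        # simple spatial subsetting layered on top of the parameter subset
        lat = subset["Data_40HZ/Geolocation/d_lat"]
        lon = subset["Data_40HZ/Geolocation/d_lon"]
        in_box = (lat > 60) & (lat < 80) & (lon > -60) & (lon < -20)
        elev = subset["Data_40HZ/Elevation_Surfaces/d_elev"][in_box]
        print(elev.shape)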

  11. An adaptive design for updating the threshold value of a continuous biomarker

    PubMed Central

    Spencer, Amy V.; Harbron, Chris; Mander, Adrian; Wason, James; Peers, Ian

    2017-01-01

    Potential predictive biomarkers are often measured on a continuous scale, but in practice, a threshold value that divides the patient population into biomarker ‘positive’ and ‘negative’ groups is desirable. Early phase clinical trials are increasingly using biomarkers for patient selection, but at this stage it is likely that little will be known about the relationship between the biomarker and the treatment outcome. We describe a single-arm trial design with adaptive enrichment, which can increase power to demonstrate efficacy within a patient subpopulation, the parameters of which are also estimated. Our design enables us to learn about the biomarker and optimally adjust the threshold during the study, using a combination of generalised linear modelling and Bayesian prediction. At the final analysis, a binomial exact test is carried out, allowing the hypothesis that ‘no population subset exists in which the novel treatment has a desirable response rate’ to be tested. Through extensive simulations, we show increased power over fixed-threshold methods in many situations without increasing the type-I error rate. We also show that estimates of the threshold, which defines the population subset, are unbiased and often more precise than those from fixed-threshold studies. We provide an example of the method applied (retrospectively) to publicly available data from a study of the use of tamoxifen after mastectomy by the German Breast Study Group, where progesterone receptor is the biomarker of interest. PMID:27417407
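
    The final analysis reduces to a one-sided binomial exact test of the response rate among biomarker-positive patients; a sketch using SciPy (>= 1.7 for binomtest), with the counts, null rate, and alpha invented for illustration.

        from scipy.stats import binomtest

        # hypothetical final data: 14 responders among 30 biomarker-positive
        # patients, tested against an undesirable null response rate of 20%
        result = binomtest(k=14, n=30, p=0.20, alternative="greater")
        print(f"observed rate = {14/30:.2f}, exact p-value = {result.pvalue:.4f}")
        if result.pvalue < 0.05:
            print("reject: some population subset has a desirable response rate")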

  12. Key Reliability Drivers of Liquid Propulsion Engines and A Reliability Model for Sensitivity Analysis

    NASA Technical Reports Server (NTRS)

    Huang, Zhao-Feng; Fint, Jeffry A.; Kuck, Frederick M.

    2005-01-01

    This paper addresses the in-flight reliability of a liquid propulsion engine system for a launch vehicle. We first establish a comprehensive list of system and sub-system reliability drivers for any liquid propulsion engine system. We then build a reliability model to parametrically analyze the impact of some reliability parameters, and present sensitivity analysis results for a selected subset of the key reliability drivers using the model. Reliability drivers identified include: number of engines for the liquid propulsion stage, single engine total reliability, engine operation duration, engine thrust size, reusability, engine de-rating or up-rating, engine-out design (including engine-out switching reliability, catastrophic fraction, preventable failure fraction, unnecessary shutdown fraction), propellant specific hazards, engine start and cutoff transient hazards, engine combustion cycles, vehicle and engine interface and interaction hazards, engine health management system, engine modification, engine ground start hold down with launch commit criteria, engine altitude start (1 in. start), multiple altitude restart (less than 1 restart), component, subsystem and system design, manufacturing, ground operation support, pre- and post-flight checkouts and inspection, and extensiveness of the development program. We present sensitivity analysis results for the following subset of these drivers: number of engines for the propulsion stage, single engine total reliability, engine operation duration, engine de-rating or up-rating requirements, engine-out design, catastrophic fraction, preventable failure fraction, unnecessary shutdown fraction, and engine health management system implementation (basic redlines and more advanced health management systems).

  13. Clinical-scale selection and viral transduction of human naïve and central memory CD8+ T cells for adoptive cell therapy of cancer patients.

    PubMed

    Casati, Anna; Varghaei-Nahvi, Azam; Feldman, Steven Alexander; Assenmacher, Mario; Rosenberg, Steven Aaron; Dudley, Mark Edward; Scheffold, Alexander

    2013-10-01

    The adoptive transfer of lymphocytes genetically engineered to express tumor-specific antigen receptors is a potent strategy to treat cancer patients. T lymphocyte subsets, such as naïve or central memory T cells, selected in vitro prior to genetic engineering have been extensively investigated in preclinical mouse models, where they demonstrated improved therapeutic efficacy. However, this has so far been challenging to realize in the clinical setting, since good manufacturing practice (GMP) procedures for complex cell sorting and genetic manipulation are limited. To directly compare the immunological attributes and therapeutic efficacy of naïve (T(N)) and central memory (T(CM)) CD8(+) T cells, we investigated clinical-scale procedures for their parallel selection and in vitro manipulation. We also evaluated currently available GMP-grade reagents for stimulation of T cell subsets, including a new type of anti-CD3/anti-CD28 nanomatrix. An optimized protocol was established for the isolation of both CD8(+) T(N) cells (CD4(-)CD62L(+)CD45RA(+)) and CD8(+) T(CM) cells (CD4(-)CD62L(+)CD45RA(-)) from a single patient. The highly enriched T cell subsets can be efficiently transduced and expanded to large cell numbers, sufficient for clinical applications and equivalent to or better than current cell and gene therapy approaches with unselected lymphocyte populations. The GMP protocols for selection of T(N) and T(CM) reported here will be the basis for clinical trials analyzing safety, in vivo persistence, and clinical efficacy in cancer patients, and will help to generate a more reliable and efficacious cellular product.

  14. The Influence of Host Galaxies in Type Ia Supernova Cosmology

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Uddin, Syed A.; Mould, Jeremy; Lidman, Chris

    We use a sample of 1338 spectroscopically confirmed and photometrically classified Type Ia supernovae (SNe Ia) sourced from Carnegie Supernova Project, Center for Astrophysics Supernova Survey, Sloan Digital Sky Survey-II, and SuperNova Legacy Survey SN samples to examine the relationships between SNe Ia and the galaxies that host them. Our results provide confirmation with improved statistical significance that SNe Ia, after standardization, are on average more luminous in massive hosts (significance >5σ), and decline more rapidly in massive hosts (significance >9σ) and in hosts with low specific star formation rates (significance >8σ). We study the variation of these relationships with redshift and detect no evolution. We split SNe Ia into pairs of subsets that are based on the properties of the hosts and fit cosmological models to each subset. Including both systematic and statistical uncertainties, we do not find any significant shift in the best-fit cosmological parameters between the subsets. Among different SN Ia subsets, we find that SNe Ia in hosts with high specific star formation rates have the least intrinsic scatter (σint = 0.08 ± 0.01) in luminosity after standardization.

  15. The Influence of Host Galaxies in Type Ia Supernova Cosmology

    NASA Astrophysics Data System (ADS)

    Uddin, Syed A.; Mould, Jeremy; Lidman, Chris; Ruhlmann-Kleider, Vanina; Zhang, Bonnie R.

    2017-10-01

    We use a sample of 1338 spectroscopically confirmed and photometrically classified Type Ia supernovae (SNe Ia) sourced from Carnegie Supernova Project, Center for Astrophysics Supernova Survey, Sloan Digital Sky Survey-II, and SuperNova Legacy Survey SN samples to examine the relationships between SNe Ia and the galaxies that host them. Our results provide confirmation with improved statistical significance that SNe Ia, after standardization, are on average more luminous in massive hosts (significance >5σ), and decline more rapidly in massive hosts (significance >9σ) and in hosts with low specific star formation rates (significance >8σ). We study the variation of these relationships with redshift and detect no evolution. We split SNe Ia into pairs of subsets that are based on the properties of the hosts and fit cosmological models to each subset. Including both systematic and statistical uncertainties, we do not find any significant shift in the best-fit cosmological parameters between the subsets. Among different SN Ia subsets, we find that SNe Ia in hosts with high specific star formation rates have the least intrinsic scatter (σint = 0.08 ± 0.01) in luminosity after standardization.

  16. Robust Likelihoods for Inflationary Gravitational Waves from Maps of Cosmic Microwave Background Polarization

    NASA Technical Reports Server (NTRS)

    Switzer, Eric Ryan; Watts, Duncan J.

    2016-01-01

    The B-mode polarization of the cosmic microwave background provides a unique window into tensor perturbations from inflationary gravitational waves. Survey effects complicate the estimation and description of the power spectrum on the largest angular scales. The pixel-space likelihood yields parameter distributions without the power spectrum as an intermediate step, but it lacks the large suite of tests available to power spectral methods. Searches for primordial B-modes must rigorously reject and rule out contamination. Many forms of contamination vary or are uncorrelated across epochs, frequencies, surveys, or other data treatment subsets. The cross-power and the power spectrum of the difference of subset maps provide approaches to reject and isolate excess variance. We develop an analogous joint pixel-space likelihood. Contamination not modeled in the likelihood produces parameter-dependent bias and complicates the interpretation of the difference map. We describe a null test that consistently weights the difference map. Excess variance should either be explicitly modeled in the covariance or be removed through reprocessing of the data.

  17. Selecting electrode configurations for image-guided cochlear implant programming using template matching

    NASA Astrophysics Data System (ADS)

    Zhang, Dongqing; Zhao, Yiyuan; Noble, Jack H.; Dawant, Benoit M.

    2017-03-01

    Cochlear implants (CIs) are used to treat patients with severe-to-profound hearing loss. In surgery, an electrode array is implanted in the cochlea. After implantation, the CI processor is programmed by an audiologist. One factor that negatively impacts outcomes and can be addressed by programming is cross-electrode neural stimulation overlap (NSO). In the recent past, we proposed a system, which we call Image-Guided CI Programming (IGCIP), to assist the audiologist in programming the CI. IGCIP uses CT images to detect NSO and recommend which subset of electrodes should be active to avoid it. In an ongoing clinical study, we have shown that IGCIP leads to significant improvement in hearing outcomes. Most of the IGCIP steps are robustly automated, but electrode configuration selection still sometimes requires expert intervention. With expertise, Distance-Versus-Frequency (DVF) curves, which visualize the spatial relationship learned from CT between the electrodes and the nerves they stimulate, can be used to select the electrode configuration. In this work, we propose an automated technique for electrode configuration selection. It relies on matching new patients' DVF curves to a library of DVF curves for which electrode configurations are known. We compare this approach to one we have previously proposed and show that, generally, our new method produces results as good as those obtained with our previous one while being generic and requiring fewer parameters.
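
    Template matching over a curve library can be sketched as nearest-neighbour search under a curve distance: resampled DVF curves are compared by Euclidean distance, and the known electrode configuration of the best match is returned. The library, sampling, and distance choice are illustrative, not the paper's exact procedure.

        import numpy as np

        def match_dvf(curve, library_curves, library_configs):
            """Return the electrode configuration of the closest library DVF curve.

            curve : (n_points,) new patient's resampled DVF curve.
            library_curves : (n_cases, n_points); library_configs : list of configs.
            """
            d = np.linalg.norm(library_curves - curve, axis=1)
            best = int(np.argmin(d))
            return library_configs[best], d[best]

        # toy library: 100 cases, curves sampled at 50 points, random configurations
        rng = np.random.default_rng(0)
        lib = rng.random((100, 50))
        configs = [rng.choice(22, size=rng.integers(8, 16), replace=False)
                   for _ in range(100)]
        cfg, dist = match_dvf(rng.random(50), lib, configs)
        print(sorted(cfg.tolist()), round(dist, 3))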

  18. Hemodynamic and clinical impact of ultrasound-derived venous reflux parameters.

    PubMed

    Neglén, Peter; Egger, John F; Olivier, Jake; Raju, Seshadri

    2004-08-01

    This study was undertaken to assess which ultrasound-derived parameter is superior for measuring venous reflux quantitatively and to evaluate the importance of popliteal vein valve reflux. A retrospective analysis was performed of 244 refluxive limbs in 182 patients who underwent ultrasound scanning, venous pressure measurement, air plethysmography, and clinical classification of severity according to the CEAP score. Reflux time (RT, s), peak reflux velocity (PRV, m/s), time of average rate of reflux (TAF, mL/min), and absolute retrograde displaced volume (ADV, mL) were compared to clinical class, ambulatory venous pressure (% drop), venous filling time (s), and venous filling index (mL/s) using nonparametric statistical tests. A P value of < .05 was considered significant. Limbs were divided into 3 groups: (A) axial great saphenous vein reflux only (n = 68); (B) axial deep reflux including popliteal vein incompetence with or without concomitant gastrocnemius or great or small saphenous vein reflux (all ultrasound reflux parameters of each refluxive vein added at the knee level) (n = 79); and (C) all limbs with popliteal vein reflux (the ultrasound data of the refluxive popliteal vein exclusively were used in the comparison, regardless of associated reflux) (n = 103). Limbs were also stratified into those with skin changes and ulcer (C-class 4-6) and those without (C-class 1-3) and subsequently compared. No meaningful significant correlation was found between RT and the clinical and hemodynamic results in groups A and B. The PRV and TAF correlated significantly with the hemodynamic parameters. The PRV and TAF trended toward correlation with clinical severity in group A (P = .0554 and P = .0998, respectively), and were significantly correlated in group B. The poor hemodynamic condition in the subset of C-class 4-6 limbs in groups A and B was reflected in greater PRV, TAF, and ADV in this subset as compared with the limbs in C-class 1-3. RT was not significantly different between the subsets of limbs, further suggesting that RT is not related to the hemodynamic or clinical state of the limbs. No meaningful correlations were found in group C. Although the hemodynamic data were significantly poorer in the subset of limbs with C-class 4-6 than in C-class 1-3, the ultrasound-derived parameters were not significantly different. The duration of valve reflux time (or valve closure time) cannot be used to quantify the severity of reflux and is purely a qualitative measurement. The PRV and the rate of reflux appear to better reflect the magnitude of venous incompetence. In the presence of axial reflux, it appears logical and physiologically correct to sum these reflux parameters for each venous segment crossing the knee. Popliteal valve reflux (the "gatekeeper" function) was not in itself an important determinant of venous hemodynamics and clinical severity. Additional reflux in other venous segments must be taken into account.

  19. Efficient one-cycle affinity selection of binding proteins or peptides specific for a small-molecule using a T7 phage display pool.

    PubMed

    Takakusagi, Yoichi; Kuramochi, Kouji; Takagi, Manami; Kusayanagi, Tomoe; Manita, Daisuke; Ozawa, Hiroko; Iwakiri, Kanako; Takakusagi, Kaori; Miyano, Yuka; Nakazaki, Atsuo; Kobayashi, Susumu; Sugawara, Fumio; Sakaguchi, Kengo

    2008-11-15

    Here, we report an efficient one-cycle affinity selection using a natural-protein or random-peptide T7 phage pool for the identification of binding proteins or peptides specific for small molecules. The screening procedure involved a cuvette-type 27-MHz quartz-crystal microbalance (QCM) apparatus with a self-assembled monolayer (SAM) introduced for immobilization of a specific small molecule on the gold electrode surface of a sensor chip. Using this apparatus, we attempted affinity selection of proteins or peptides against a synthetic ligand for FK506-binding protein (SLF) or irinotecan (Iri, CPT-11). Affinity selection using SLF-SAM and a natural-protein T7 phage pool successfully detected FK506-binding protein 12 (FKBP12)-displaying T7 phage after an interaction time of only 10 min. Extensive exploration of time-consuming wash and/or elution conditions, together with several rounds of selection, was not required. Furthermore, in the selection using a 15-mer random-peptide T7 phage pool and subsequent analysis with the receptor ligand contact (RELIC) software, a subset of SLF-selected peptides clearly pinpointed several amino-acid residues within the binding site of FKBP12. Likewise, a subset of Iri-selected peptides pinpointed part of the positive amino-acid region of the Iri-binding site of the well-known direct targets, acetylcholinesterase (AChE) and carboxylesterase (CE). Our findings demonstrate the effectiveness of this method and its general applicability to a wide range of small molecules.

  20. The Isolation and Enrichment of Large Numbers of Highly Purified Mouse Spleen Dendritic Cell Populations and Their In Vitro Equivalents.

    PubMed

    Vremec, David

    2016-01-01

    Dendritic cells (DCs) form a complex network of cells that initiate and orchestrate immune responses against a vast array of pathogenic challenges. Developmentally and functionally distinct DC subtypes differentially regulate T-cell function. Importantly, it is the ability of DCs to capture and process antigen, whether from pathogens, vaccines, or self-components, and present it to naive T cells that is the key to their ability to initiate an immune response. Our typical isolation procedure for DCs from murine spleen was designed to efficiently extract all DC subtypes, without bias and without alteration of their in vivo phenotype, and involves a short collagenase digestion of the tissue, followed by selection for cells of light density and finally negative selection for DCs. The isolation procedure can accommodate DC numbers that have been artificially increased via administration of fms-like tyrosine kinase 3 ligand (Flt3L), either directly through a series of subcutaneous injections or by seeding with an Flt3L-secreting murine melanoma. Flt3L may also be added to bone marrow cultures to produce large numbers of in vitro equivalents of the spleen DC subsets. Total DCs, or their subsets, may be further purified using immunofluorescent labeling and flow cytometric cell sorting. Cell sorting may be completely bypassed by separating DC subsets using a combination of fluorescent antibody labeling and anti-fluorochrome magnetic beads. Our procedure enables efficient separation of the distinct DC subsets, even when mouse numbers or flow cytometric cell sorting time are limiting.

  1. G-STRATEGY: Optimal Selection of Individuals for Sequencing in Genetic Association Studies

    PubMed Central

    Wang, Miaoyan; Jakobsdottir, Johanna; Smith, Albert V.; McPeek, Mary Sara

    2017-01-01

    In a large-scale genetic association study, the number of phenotyped individuals available for sequencing may, in some cases, be greater than the study's sequencing budget will allow. In that case, it can be important to prioritize individuals for sequencing in a way that optimizes power for association with the trait. Suppose a cohort of phenotyped individuals is available, some subset of them possibly already sequenced, and one wants to choose an additional fixed-size subset of individuals to sequence such that the power to detect association is maximized. When the phenotyped sample includes related individuals, power for association can be gained by including partial information, such as phenotype data of ungenotyped relatives, in the analysis, and this should be taken into account when assessing whom to sequence. We propose G-STRATEGY, which uses simulated annealing to choose a subset of individuals for sequencing that maximizes the expected power for association. In simulations, G-STRATEGY performs extremely well for a range of complex disease models and outperforms other strategies, in many cases with relative power increases of 20-40% over the next best strategy, while maintaining correct type 1 error. G-STRATEGY is computationally feasible even for large datasets and complex pedigrees. We apply G-STRATEGY to data on HDL and LDL from the AGES-Reykjavik and REFINE-Reykjavik studies, in which G-STRATEGY closely approximates the power of sequencing the full sample by selecting only a small subset of the individuals for sequencing. PMID:27256766
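
    A generic simulated-annealing skeleton for fixed-size subset selection, with a stand-in scoring function; G-STRATEGY's actual objective, the expected association power under the pedigree structure, is not reproduced here.

        import numpy as np

        def anneal_subset(score, n_total, n_pick, iters=3000, t0=1.0, seed=0):
            """Maximize score(subset) over fixed-size subsets via simulated annealing."""
            rng = np.random.default_rng(seed)
            current = set(rng.choice(n_total, n_pick, replace=False).tolist())
            cur_val = score(current)
            best, best_val = set(current), cur_val
            for t in range(iters):
                temp = t0 * (1.0 - t / iters) + 1e-6
                # propose a neighbour: swap one selected individual for an unselected one
                out = int(rng.choice(sorted(current)))
                pool = np.setdiff1d(np.arange(n_total), sorted(current))
                cand = (current - {out}) | {int(rng.choice(pool))}
                val = score(cand)
                # always accept improvements; accept worse moves with Boltzmann probability
                if val > cur_val or rng.random() < np.exp((val - cur_val) / temp):
                    current, cur_val = cand, val
                    if val > best_val:
                        best, best_val = set(cand), val
            return sorted(best), best_val

        # stand-in objective: favour individuals with extreme phenotypes, a crude
        # proxy for information content (illustrative only)
        rng = np.random.default_rng(1)
        pheno = rng.standard_normal(500)
        subset, val = anneal_subset(lambda s: float(np.abs(pheno[sorted(s)]).sum()),
                                    n_total=500, n_pick=50)
        print(len(subset), round(val, 2))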

  2. Equifinality and process-based modelling

    NASA Astrophysics Data System (ADS)

    Khatami, S.; Peel, M. C.; Peterson, T. J.; Western, A. W.

    2017-12-01

    Equifinality is understood as one of the fundamental difficulties in the study of open complex systems, including catchment hydrology. A review of the hydrologic literature reveals that the term equifinality has been widely used, but in many cases inconsistently and without coherent recognition of its various facets, which can lead to ambiguity as well as methodological fallacies. Therefore, in this study we first characterise the term equifinality within the context of hydrological modelling by reviewing the genesis of the concept and then presenting a theoretical framework. During past decades, equifinality has mainly been studied as a subset of aleatory uncertainty (arising due to randomness) and for the assessment of model parameter uncertainty. Although the connection between parameter uncertainty and equifinality is undeniable, we argue there is more to equifinality than aleatory parameter uncertainty. That is, the importance of equifinality and epistemic uncertainty (arising due to lack of knowledge), and their implications, is overlooked in current practice of model evaluation. Equifinality and epistemic uncertainty in studying, modelling, and evaluating hydrologic processes are treated as if they could simply be discussed in (or often reduced to) probabilistic terms, as for aleatory uncertainty. The deficiencies of this approach to conceptual rainfall-runoff modelling are demonstrated for selected Australian catchments by examination of parameter and internal flux distributions and interactions within SIMHYD. On this basis, we present a new approach that expands the concept of equifinality beyond model parameters to inform epistemic uncertainty. The new approach potentially facilitates the identification and development of more physically plausible models and model evaluation schemes, particularly within the multiple working hypotheses framework, and is generalisable to other fields of environmental modelling as well.

  3. Protein energy malnutrition in severe alcoholic hepatitis: diagnosis and response to treatment. The VA Cooperative Study Group #275.

    PubMed

    Mendenhall, C L; Moritz, T E; Roselle, G A; Morgan, T R; Nemchausky, B A; Tamburro, C H; Schiff, E R; McClain, C J; Marsano, L S; Allen, J I

    1995-01-01

    Active nutrition therapy and the anabolic steroid oxandrolone (OX), in selected patients with severe alcoholic hepatitis, significantly improved liver status and survival. We report here on the changes in their nutritional parameters. Protein energy malnutrition (PEM) was evaluated and expressed as percent of low normal in 271 patients initially, at 1 month, and at 3 months. Active therapy consisted of OX plus a high-caloric food supplement versus a matching placebo and a low-calorie supplement. PEM was present in every patient, with a mean PEM score of 60% of low normal. Most of the parameters improved significantly from baseline on standard care, with the largest improvement seen in visceral proteins and the smallest in fat stores (skinfold thickness). Total PEM score significantly correlated with 6-month mortality (p = .0012). Using logistic regression analysis, creatinine height index, hand grip strength, and total peripheral blood lymphocytes were the best risk factors for survival. When CD lymphocyte subsets replaced total lymphocyte counts in the equation, CD8 levels became a significant risk factor (p = .004). Active treatment produced significant improvements in those parameters related to total body and muscle mass (i.e., mid arm muscle area, p = .02; creatinine height index, p = .03; percent ideal body weight, p = .04). Deterioration in nutritional parameters is a significant risk factor for survival in patients with severe alcoholic hepatitis. This deterioration is reversible with standard hospital care. Active therapy further improves creatinine height index, mid arm muscle area, and total lymphocyte counts. Hence, these latter parameters appear to be the best indicators for follow-up assessments.

  4. Circulating B cells in type 1 diabetics exhibit fewer maturation-associated phenotypes.

    PubMed

    Hanley, Patrick; Sutter, Jennifer A; Goodman, Noah G; Du, Yangzhu; Sekiguchi, Debora R; Meng, Wenzhao; Rickels, Michael R; Naji, Ali; Luning Prak, Eline T

    2017-10-01

    Although autoantibodies have been used for decades as diagnostic and prognostic markers in type 1 diabetes (T1D), further analysis of developmental abnormalities in B cells could reveal tolerance checkpoint defects that could improve individualized therapy. To evaluate B cell developmental progression in T1D, immunophenotyping was used to classify circulating B cells into transitional, mature naïve, mature activated, and resting memory subsets. Each subset was then analyzed for the expression of additional maturation-associated markers. While the frequencies of B cell subsets did not differ significantly between patients and controls, some T1D subjects exhibited reduced proportions of B cells expressing transmembrane activator and CAML interactor (TACI) and Fas receptor (FasR). Furthermore, some T1D subjects had B cell subsets with lower frequencies of class switching. These results suggest that circulating B cells exhibit variable maturation phenotypes in T1D. These phenotypic variations may correlate with differences in B cell selection in individual T1D patients. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.

  5. Feature Selection and Pedestrian Detection Based on Sparse Representation.

    PubMed

    Yao, Shihong; Wang, Tao; Shen, Weiming; Pan, Shaoming; Chong, Yanwen; Ding, Fei

    2015-01-01

    Pedestrian detection research has been devoted to the extraction of effective pedestrian features, which has become one of the obstacles to pedestrian detection applications, given the variety of pedestrian features and their large dimension. Based on a theoretical analysis of six frequently used features (SIFT, SURF, Haar, HOG, LBP, and LSS) and a comparison of experimental results, this paper screens out sparse feature subsets via sparse representation to investigate whether the sparse subsets retain the same descriptive ability and which features are the most stable. When any two of the six features are fused, the fused feature is sparsely represented to obtain its important components. Sparse subsets of the fused features can be generated rapidly, avoiding computation over the full index of feature dimensions; thus, the speed of feature dimension reduction is improved and pedestrian detection time is reduced. Experimental results show that sparse feature subsets are capable of keeping the important components of these six feature descriptors. The sparse features of HOG and LSS possess the same descriptive ability as, and consume less time than, their full features. The ratios of the sparse feature subsets of HOG and LSS to their full sets are the highest among the six, and thus these two features best describe pedestrian characteristics; moreover, the sparse feature subsets of the HOG-LSS combination show better distinguishing ability and parsimony.
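
    One common way to realize this kind of sparse selection is to fit an l1-penalized model over the fused feature dimensions and keep those with nonzero coefficients; a small sketch with scikit-learn's Lasso, where the data and sparsity level are illustrative.

        import numpy as np
        from sklearn.linear_model import Lasso

        def sparse_feature_subset(F, y, alpha=0.05):
            """Select feature dimensions with nonzero Lasso coefficients.

            F : (n_samples, n_dims) fused feature matrix (e.g. HOG + LSS stacked);
            y : labels (+1 pedestrian / -1 background).
            """
            coef = Lasso(alpha=alpha, max_iter=10000).fit(F, y).coef_
            return np.flatnonzero(coef)      # indices of the sparse subset

        rng = np.random.default_rng(0)
        F = rng.standard_normal((300, 400))            # toy fused descriptor
        w = np.zeros(400); w[:12] = rng.standard_normal(12)
        y = np.sign(F @ w + 0.1 * rng.standard_normal(300))
        subset = sparse_feature_subset(F, y)
        print(len(subset), subset[:10])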

  6. VizieR Online Data Catalog: RR Lyraes in SDSS stripe 82 (Watkins+, 2009)

    NASA Astrophysics Data System (ADS)

    Watkins, L. L.; Evans, N. W.; Belokurov, V.; Smith, M. C.; Hewett, P. C.; Bramich, D. M.; Gilmore, G. F.; Irwin, M. J.; Vidrih, S.; Wyrzykowski, L.; Zucker, D. B.

    2015-10-01

    In this paper, we first select the variable objects in Stripe 82 and then the subset of RR Lyraes, using the Bramich et al. (2008MNRAS.386..887B, Cat. V/141) light-motion curve catalogue (LMCC) and HLC. We make a selection of the variable objects and an identification of RR Lyrae stars. (2 data files).

  7. Effect of a Surprising Downward Shift in Reinforcer Value on Stimulus Over-Selectivity in a Simultaneous Discrimination Procedure

    ERIC Educational Resources Information Center

    Reynolds, Gemma; Reed, Phil

    2013-01-01

    Stimulus over-selectivity refers to the phenomenon whereby behavior is controlled by a subset of elements in the environment at the expense of other equally salient aspects of the environment. The experiments explored whether this cue interference effect was reduced following a surprising downward shift in reinforcer value. Experiment 1 revealed…

  8. Wind Plant Power Optimization and Control under Uncertainty

    NASA Astrophysics Data System (ADS)

    Jha, Pankaj; Ulker, Demet; Hutchings, Kyle; Oxley, Gregory

    2017-11-01

    The development of optimized cooperative wind plant control involves the coordinated operation of individual turbines co-located within a wind plant to improve the overall power production. This is typically achieved by manipulating the trajectory and intensity of wake interactions between nearby turbines, thereby reducing wake losses. However, there are various types of uncertainties involved, such as turbulent inflow and microscale and turbine model input parameters. In a recent NREL-Envision collaboration, a controller that performs wake steering was designed and implemented for the Longyuan Rudong offshore wind plant in Jiangsu, China. The Rudong site contains 25 Envision EN136-4 MW turbines, of which a subset, consisting of the front two rows for the northeasterly wind direction, was selected for the field test campaign. In the first row, a turbine was selected as the reference turbine, providing comparison power data, while another was selected as the controlled turbine. This controlled turbine wakes three different turbines in the second row depending on the wind direction. A yaw misalignment strategy was designed using Envision's GWCFD, a multi-fidelity plant-scale CFD tool based on SOWFA with a generalized actuator disc (GAD) turbine model, which, in turn, was used to tune NREL's FLORIS model for wake steering and yaw control optimization. The presentation will account for some associated uncertainties, such as those in atmospheric turbulence and wake profile.

  9. Dark matter interpretations of ATLAS searches for the electroweak production of supersymmetric particles in √s = 8 TeV proton-proton collisions

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Aaboud, M.; Aad, G.; Abbott, B.

    2016-09-01

    A selection of searches by the ATLAS experiment at the LHC for the electroweak production of SUSY particles is used to study their impact on the constraints on dark matter candidates. The searches use 20 fb⁻¹ of proton-proton collision data at √s = 8 TeV. A likelihood-driven scan of a five-dimensional effective model focusing on the gaugino-higgsino and Higgs sector of the phenomenological minimal supersymmetric Standard Model is performed. This scan uses data from direct dark matter detection experiments, the relic dark matter density and precision flavour physics results. Further constraints from the ATLAS Higgs mass measurement and SUSY searches at LEP are also applied. A subset of models selected from this scan is used to assess the impact of the selected ATLAS searches in this five-dimensional parameter space. These ATLAS searches substantially impact those models for which the mass m(χ̃₁⁰) of the lightest neutralino is less than 65 GeV, excluding 86% of such models. The searches have limited impact on models with larger m(χ̃₁⁰) due to either heavy electroweakinos or compressed mass spectra where the mass splittings between the produced particles and the lightest supersymmetric particle are small.

  10. Dark matter interpretations of ATLAS searches for the electroweak production of supersymmetric particles in √s = 8 TeV proton-proton collisions

    DOE PAGES

    Aaboud, M.; Aad, G.; Abbott, B.; ...

    2016-09-30

    A selection of searches by the ATLAS experiment at the LHC for the electroweak production of SUSY particles is used to study their impact on the constraints on dark matter candidates. The searches use 20 fb⁻¹ of proton-proton collision data at √s = 8 TeV. A likelihood-driven scan of a five-dimensional effective model focusing on the gaugino-higgsino and Higgs sector of the phenomenological minimal supersymmetric Standard Model is performed. This scan uses data from direct dark matter detection experiments, the relic dark matter density and precision flavour physics results. Further constraints from the ATLAS Higgs mass measurement and SUSY searches at LEP are also applied. A subset of models selected from this scan is used to assess the impact of the selected ATLAS searches in this five-dimensional parameter space. These ATLAS searches substantially impact those models for which the mass m(χ̃₁⁰) of the lightest neutralino is less than 65 GeV, excluding 86% of such models. The searches have limited impact on models with larger m(χ̃₁⁰) due to either heavy electroweakinos or compressed mass spectra where the mass splittings between the produced particles and the lightest supersymmetric particle are small.

  11. Associations between blood BTEXS concentrations and hematologic parameters among adult residents of the U.S. Gulf States.

    PubMed

    Doherty, Brett T; Kwok, Richard K; Curry, Matthew D; Ekenga, Christine; Chambers, David; Sandler, Dale P; Engel, Lawrence S

    2017-07-01

    Studies of workers exposed to benzene at average air concentrations below one part per million suggest that benzene, a known hematotoxin, causes hematopoietic damage even at low exposure levels. However, evidence of such effects outside of occupational settings and for other volatile organic compounds (VOCs) is limited. Our aim was to investigate associations between ambient exposures to five VOCs, including benzene, and hematologic parameters among adult residents of the U.S. Gulf Coast. Blood concentrations of selected VOCs were measured in a sample of adult participants in the Gulf Long-term Follow-up Study (GuLF STUDY) during 2012 and 2013. Complete blood counts with differentials were also performed on a subset of participants (n=406). We used these data together with detailed questionnaire data to estimate adjusted associations between blood BTEXS (benzene, toluene, ethylbenzene, o-xylene, m/p-xylene, and styrene) concentrations and hematologic parameters using generalized linear models. We observed inverse associations between blood benzene concentrations and hemoglobin concentration and mean corpuscular hemoglobin concentration, and a positive association with red cell distribution width among tobacco smoke-unexposed participants (n=146). Among tobacco smoke-exposed participants (n=247), we observed positive associations between blood VOC concentrations and several hematologic parameters, including increased white blood cell and platelet counts, suggestive of hematopoietic stimulation typically associated with tobacco smoke exposure. Most associations were stronger for benzene than for the other VOCs. Our results suggest that ambient exposure to BTEXS, particularly benzene, may be associated with hematologic effects, including decreased hemoglobin concentration, decreased mean corpuscular hemoglobin concentration, and increased red cell distribution width. Published by Elsevier Inc.

  12. Peripheral leukocyte populations and oxidative stress biomarkers in aged dogs showing impaired cognitive abilities.

    PubMed

    Mongillo, Paolo; Bertotto, Daniela; Pitteri, Elisa; Stefani, Annalisa; Marinelli, Lieta; Gabai, Gianfranco

    2015-06-01

    In the present study, the peripheral blood leukocyte phenotypes, lymphocyte subset populations, and oxidative stress parameters were studied in cognitively characterized adult and aged dogs, in order to assess possible relationships between age, cognitive decline, and immune status. Adult (N = 16, 2-7 years old) and aged (N = 29, older than 8 years) dogs underwent two testing procedures, for the assessment of spatial reversal learning and selective social attention abilities, which were shown to be sensitive to aging in pet dogs. Based on age and performance in cognitive testing, dogs were classified as adult not cognitively impaired (ADNI, N = 12), aged not cognitively impaired (AGNI, N = 19) and aged cognitively impaired (AGCI, N = 10). Immunological and oxidative stress parameters were compared across groups with the Kruskal-Wallis test. AGCI dogs displayed lower absolute CD4 cell count (p < 0.05) than ADNI and higher monocyte absolute count and percentage (p < 0.05) than AGNI, whereas these parameters did not differ between AGNI and ADNI. AGNI dogs had higher CD8 cell percentage than ADNI (p < 0.05). Both AGNI and AGCI dogs showed lower CD4/CD8 and CD21 count and percentage and higher neutrophil/lymphocyte and CD3/CD21 ratios (p < 0.05). None of the oxidative parameters showed any statistically significant difference among groups. These observations suggest that alterations in peripheral leukocyte populations may reflect age-related changes occurring within the central nervous system and open interesting perspectives on the dog as a model for studying the functional relationship between the nervous and immune systems during aging.

  13. Sperm deoxyribonucleic acid damage in normozoospermic men is related to age and sperm progressive motility.

    PubMed

    Belloc, Stephanie; Benkhalifa, Moncef; Cohen-Bacrie, Martine; Dalleac, Alain; Amar, Edouard; Zini, Armand

    2014-06-01

    To evaluate sperm DNA fragmentation in normozoospermic male partners of couples undergoing infertility evaluation. Retrospective cohort study. Clinical andrology laboratory. A total of 1,974 consecutive normozoospermic men selected from a larger cohort of 4,345 consecutive, nonazoospermic men presenting for infertility evaluation. None. Clinical parameters, conventional semen parameters, and sperm DNA fragmentation assessed by flow cytometry-based TUNEL assay and reported as percent sperm DNA fragmentation (%SDF). The mean (± SD) %SDF and the proportion of men with high %SDF (>30%) were significantly lower in the normozoospermic compared with the entire cohort of 4,345 evaluable infertile men (17.6% ± 10.1% vs. 20.7% ± 12.4% and 11% vs. 20%, respectively). In the group of 1,974 normozoospermic men, %SDF was positively correlated with paternal age (r = 0.17) and inversely correlated with progressive motility (r = -0.26). In the subset of normozoospermic men with sperm parameters above the 50th percentile (≥ 73 × 10⁶ sperm/mL, ≥ 55% progressive motility, and ≥ 14% normal forms, World Health Organization 2010 guidelines), 5% (4 of 83) had elevated %SDF (>30%). In this large cohort of normozoospermic men presenting for infertility evaluation, DNA fragmentation level is related to sperm motility and paternal age, and 11% of these men have high levels of sperm DNA fragmentation. Furthermore, the data indicate that a nonnegligible proportion (5%) of normozoospermic men with high-normal sperm parameters may also have significant sperm DNA fragmentation. Copyright © 2014 American Society for Reproductive Medicine. Published by Elsevier Inc. All rights reserved.

  14. Supplementary data of “Impacts of mesic and xeric urban vegetation on outdoor thermal comfort and microclimate in Phoenix, AZ”

    PubMed Central

    Song, Jiyun; Wang, Zhi-Hua

    2015-01-01

    An advanced Markov chain Monte Carlo approach called Subset Simulation, described in Au and Beck (2001) [1], was used to quantify parameter uncertainty and model sensitivity of the urban land-atmospheric framework, viz. the coupled urban canopy model-single column model (UCM-SCM). The results show that the atmospheric dynamics are sensitive to land surface conditions. The most sensitive parameters are the dimensional parameters, i.e. roof width, aspect ratio, and the roughness lengths for heat and momentum, since these parameters control the magnitude of the sensible heat flux. The relatively insensitive parameters are the hydrological parameters, since the lawns or green roofs in urban areas are regularly irrigated, so that water availability for evaporation is never constrained. PMID:26702421

  15. Using Parental Profiles to Predict Membership in a Subset of College Students Experiencing Excessive Alcohol Consequences: Findings From a Longitudinal Study

    PubMed Central

    Varvil-Weld, Lindsey; Mallett, Kimberly A.; Turrisi, Rob; Abar, Caitlin C.

    2012-01-01

    Objective: Previous research identified a high-risk subset of college students experiencing a disproportionate number of alcohol-related consequences at the end of their first year. With the goal of identifying pre-college predictors of membership in this high-risk subset, the present study used a prospective design to identify latent profiles of student-reported maternal and paternal parenting styles and alcohol-specific behaviors and to determine whether these profiles were associated with membership in the high-risk consequences subset. Method: A sample of 370 randomly selected incoming first-year students at a large public university reported on their mothers’ and fathers’ communication quality, monitoring, approval of alcohol use, and modeling of drinking behaviors and on consequences experienced across the first year of college. Results: Students in the high-risk subset comprised 15.5% of the sample but accounted for almost half (46.6%) of the total consequences reported by the entire sample. Latent profile analyses identified four parental profiles: positive pro-alcohol, positive anti-alcohol, negative mother, and negative father. Logistic regression analyses revealed that students in the negative-father profile were at greatest odds of being in the high-risk consequences subset at a follow-up assessment 1 year later, even after drinking at baseline was controlled for. Students in the positive pro-alcohol profile also were at increased odds of being in the high-risk subset, although this association was attenuated after baseline drinking was controlled for. Conclusions: These findings have important implications for the improvement of existing parent- and individual-based college student drinking interventions designed to reduce alcohol-related consequences. PMID:22456248

  16. Using parental profiles to predict membership in a subset of college students experiencing excessive alcohol consequences: findings from a longitudinal study.

    PubMed

    Varvil-Weld, Lindsey; Mallett, Kimberly A; Turrisi, Rob; Abar, Caitlin C

    2012-05-01

    Previous research identified a high-risk subset of college students experiencing a disproportionate number of alcohol-related consequences at the end of their first year. With the goal of identifying pre-college predictors of membership in this high-risk subset, the present study used a prospective design to identify latent profiles of student-reported maternal and paternal parenting styles and alcohol-specific behaviors and to determine whether these profiles were associated with membership in the high-risk consequences subset. A sample of 370 randomly selected incoming first-year students at a large public university reported on their mothers' and fathers' communication quality, monitoring, approval of alcohol use, and modeling of drinking behaviors and on consequences experienced across the first year of college. Students in the high-risk subset comprised 15.5% of the sample but accounted for almost half (46.6%) of the total consequences reported by the entire sample. Latent profile analyses identified four parental profiles: positive pro-alcohol, positive anti-alcohol, negative mother, and negative father. Logistic regression analyses revealed that students in the negative-father profile were at greatest odds of being in the high-risk consequences subset at a follow-up assessment 1 year later, even after drinking at baseline was controlled for. Students in the positive pro-alcohol profile also were at increased odds of being in the high-risk subset, although this association was attenuated after baseline drinking was controlled for. These findings have important implications for the improvement of existing parent- and individual-based college student drinking interventions designed to reduce alcohol-related consequences.

  17. Automated retrieval of forest structure variables based on multi-scale texture analysis of VHR satellite imagery

    NASA Astrophysics Data System (ADS)

    Beguet, Benoit; Guyon, Dominique; Boukir, Samia; Chehata, Nesrine

    2014-10-01

    The main goal of this study is to design a method to describe the structure of forest stands from Very High Resolution satellite imagery, relying on some typical variables such as crown diameter, tree height, trunk diameter, tree density and tree spacing. The emphasis is placed on automating the identification of the most relevant image features for the forest structure retrieval task, exploiting both spectral and spatial information. Our approach is based on linear regressions between the forest structure variables to be estimated and various spectral and Haralick texture features. The main drawback of this well-known texture representation is its underlying parameters, which are extremely difficult to set due to the spatial complexity of the forest structure. To tackle this major issue, an automated feature selection process is proposed, based on statistical modeling and exploring a wide range of parameter values. It provides texture measures with diverse spatial parameters, hence implicitly inducing a multi-scale texture analysis. A new feature selection technique, which we call Random PRiF, is proposed. It relies on random sampling in the feature space and carefully addresses the multicollinearity issue in multiple linear regression while ensuring accurate prediction of forest variables. Our automated forest variable estimation scheme was tested on Quickbird and Pléiades panchromatic and multispectral images, acquired at different periods on the maritime pine stands of two sites in South-Western France. It outperforms two well-established variable subset selection techniques. It has been successfully applied to identify the best texture features in modeling the five considered forest structure variables. The RMSE of all predicted forest variables is improved by combining multispectral and panchromatic texture features, with various parameterizations, highlighting the potential of a multi-resolution approach for retrieving forest structure variables from VHR satellite images. Thus an average prediction error of ~1.1 m is expected on crown diameter, ~0.9 m on tree spacing, ~3 m on height and ~0.06 m on diameter at breast height.
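
    The abstract does not spell out the Random PRiF scoring rule, but the general random-sampling-in-feature-space idea can be sketched as follows. The condition-number guard against multicollinearity and the cross-validated R² criterion are illustrative assumptions, not the published algorithm.

        # Illustrative random-subset regression sketch (not the published
        # Random PRiF algorithm): draw random feature subsets, skip badly
        # collinear ones, and keep the best cross-validated linear model.
        import numpy as np
        from sklearn.linear_model import LinearRegression
        from sklearn.model_selection import cross_val_score

        rng = np.random.default_rng(1)
        X = rng.normal(size=(200, 50))                     # texture features
        y = X[:, 3] - 2 * X[:, 17] + rng.normal(size=200)  # synthetic target

        best_score, best_subset = -np.inf, None
        for _ in range(300):
            subset = rng.choice(X.shape[1], size=5, replace=False)
            Xs = X[:, subset]
            if np.linalg.cond(Xs.T @ Xs) > 1e6:   # multicollinearity guard
                continue
            score = cross_val_score(LinearRegression(), Xs, y, cv=5).mean()
            if score > best_score:
                best_score, best_subset = score, subset
        print(sorted(best_subset), round(best_score, 3))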

  18. Strategies to Improve Activity Recognition Based on Skeletal Tracking: Applying Restrictions Regarding Body Parts and Similarity Boundaries †

    PubMed Central

    Gutiérrez-López-Franca, Carlos; Hervás, Ramón; Johnson, Esperanza

    2018-01-01

    This paper aims to improve activity recognition systems based on skeletal tracking through the study of two different strategies (and their combination): (a) specialized body-part analysis and (b) stricter restrictions for the most easily detectable activities. The study was performed using the Extended Body-Angles Algorithm, which is able to analyze activities using only a single key sample. This system allows one to select, for each considered activity, its relevant joints, which makes it possible to monitor only a subset of the user's body. This feature of the system has both advantages and disadvantages; as a consequence, we previously had difficulties recognizing activities for which only a small subset of the body's joints is relevant. The goal of this work, therefore, is to analyze the effect produced by the application of several strategies on the results of an activity recognition system based on skeletal tracking with joint-oriented devices, strategies that we applied with the purpose of improving the recognition rates of activities with a small subset of relevant joints. Through the results of this work, we aim to give the scientific community some first indications about which of the considered strategies is better. PMID:29789478

  19. Quantitative impact of thymic selection on Foxp3+ and Foxp3- subsets of self-peptide/MHC class II-specific CD4+ T cells.

    PubMed

    Moon, James J; Dash, Pradyot; Oguin, Thomas H; McClaren, Jennifer L; Chu, H Hamlet; Thomas, Paul G; Jenkins, Marc K

    2011-08-30

    It is currently thought that T cells with specificity for self-peptide/MHC (pMHC) ligands are deleted during thymic development, thereby preventing autoimmunity. In the case of CD4(+) T cells, what is unclear is the extent to which self-peptide/MHC class II (pMHCII)-specific T cells are deleted or become Foxp3(+) regulatory T cells. We addressed this issue by characterizing a natural polyclonal pMHCII-specific CD4(+) T-cell population in mice that either lacked or expressed the relevant antigen in a ubiquitous pattern. Mice expressing the antigen contained one-third the number of pMHCII-specific T cells as mice lacking the antigen, and the remaining cells exhibited low TCR avidity. In mice lacking the antigen, the pMHCII-specific T-cell population was dominated by phenotypically naive Foxp3(-) cells, but also contained a subset of Foxp3(+) regulatory cells. Both Foxp3(-) and Foxp3(+) pMHCII-specific T-cell numbers were reduced in mice expressing the antigen, but the Foxp3(+) subset was more resistant to changes in number and TCR repertoire. Therefore, thymic selection of self-pMHCII-specific CD4(+) T cells results in incomplete deletion within the normal polyclonal repertoire, especially among regulatory T cells.

  20. Effect of Occupant and Impact Factors on Forces within Neck: II. Analysis of Specific Subsets

    NASA Astrophysics Data System (ADS)

    Shaibani, Saami J.

    2000-03-01

    The forces generated in the cervical spine were evaluated for a substantial number of motor-vehicle occupants in an associated study.[1] Correlation between these forces and various occupant- and impact-related parameters was generally not high for the broad groupings of the population considered at that time. In this research, smaller subsets with more elements in common were extracted from the data to try to detect any underlying relationships that might exist for the neck force. Although correlation coefficients for these subsets were higher than those for the previous groupings in more than three-quarters of the matches undertaken, the values still did not indicate consistently good fits. This suggests that there is no simple relationship for the force within the cervical spine and this, in turn, means that the potential for neck injury has to be evaluated on a case-by-case basis. 1. Effect of Occupant and Impact Factors on Forces within Neck: I. Overview of Large Population, Bull. Am. Phys. Soc. in press (2000).

  1. Long High Redshift GRB and Xrt/swift Lightcurves

    NASA Astrophysics Data System (ADS)

    Arkhangelskaja, Irene

    In February 2010, the Swift GRB subset with known redshift comprised more than 150 bursts. Analysis of the long GRB redshift distribution has shown that the confidence level of a single-peak approximation of this distribution is only ~60%. Moreover, more than 40% of GRBs lie in very heavy tails outside the 3σ level for this fit. More detailed analysis of the long GRB redshift distribution reveals that, at the 97% confidence level, at least two subgroups can be separated, with the following mean redshifts: ⟨z1⟩ = 0.9 ± 0.1 and ⟨z2⟩ = 2.7 ± 0.2. This allows the conclusion that the Swift long GRB source subset is not uniform. In the presented article, attention is paid to the measure of discrepancy between long GRBs with z > 3 and the subset of other long GRBs with known redshifts. XRT/Swift lightcurves for these two groups of GRBs were considered, and it was shown that at least 90% of the XRT/Swift lightcurves for GRBs with z > 3 are more complicated and have a number of maxima.
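
    The two-subgroup claim amounts to preferring a two-component fit of the redshift distribution over a single peak. A minimal illustration with synthetic redshifts (not the Swift sample):

        # Fit a two-component Gaussian mixture to a 1-D redshift sample;
        # the data here are synthetic stand-ins for the Swift redshifts.
        import numpy as np
        from sklearn.mixture import GaussianMixture

        rng = np.random.default_rng(2)
        z = np.concatenate([rng.normal(0.9, 0.4, 90),
                            rng.normal(2.7, 0.7, 60)])
        gmm = GaussianMixture(n_components=2, random_state=0)
        gmm.fit(z.reshape(-1, 1))
        print(gmm.means_.ravel())   # component means near 0.9 and 2.7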

  2. A tool for selecting SNPs for association studies based on observed linkage disequilibrium patterns.

    PubMed

    De La Vega, Francisco M; Isaac, Hadar I; Scafe, Charles R

    2006-01-01

    The design of genetic association studies using single-nucleotide polymorphisms (SNPs) requires the selection of subsets of the variants providing high statistical power at a reasonable cost. SNPs must be selected to maximize the probability that a causative mutation is in linkage disequilibrium (LD) with at least one marker genotyped in the study. The HapMap project performed a genome-wide survey of genetic variation with about a million SNPs typed in four populations, providing a rich resource to inform the design of association studies. A number of strategies have been proposed for the selection of SNPs based on observed LD, including construction of metric LD maps and the selection of haplotype tagging SNPs. Power calculations are important at the study design stage to ensure successful results. Integrating these methods and annotations can be challenging: the algorithms required to implement these methods are complex to deploy, and all the necessary data and annotations are deposited in disparate databases. Here, we present the SNPbrowser Software, a freely available tool to assist in the LD-based selection of markers for association studies. This stand-alone application provides fast query capabilities and swift visualization of SNPs, gene annotations, power, haplotype blocks, and LD map coordinates. Wizards implement several common SNP selection workflows including the selection of optimal subsets of SNPs (e.g. tagging SNPs). Selected SNPs are screened for their conversion potential to either TaqMan SNP Genotyping Assays or the SNPlex Genotyping System, two commercially available genotyping platforms, expediting the set-up of genetic studies with an increased probability of success.
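
    One common strategy behind such tagging workflows is greedy selection: repeatedly pick the SNP whose pairwise r² covers the most still-untagged SNPs. The sketch below uses a synthetic genotype matrix and an r² threshold of 0.8; it is a generic illustration of LD-based tagging, not SNPbrowser's internal algorithm.

        # Greedy tag-SNP selection sketch over a toy r^2 matrix.
        import numpy as np

        rng = np.random.default_rng(3)
        G = rng.integers(0, 3, size=(100, 40)).astype(float)  # genotypes
        r2 = np.corrcoef(G.T) ** 2                            # LD proxy

        untagged, tags = set(range(G.shape[1])), []
        while untagged:
            best = max(range(G.shape[1]),
                       key=lambda s: sum(r2[s, t] >= 0.8 for t in untagged))
            tags.append(best)
            untagged -= {t for t in untagged if r2[best, t] >= 0.8}
        print(f"{len(tags)} tag SNPs cover all {G.shape[1]}")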

  3. Channel and feature selection in multifunction myoelectric control.

    PubMed

    Khushaba, Rami N; Al-Jumaily, Adel

    2007-01-01

    Real-time control of devices based on myoelectric signals (MES) is a challenging research problem. This paper presents a new approach to reduce the computational cost of real-time systems driven by myoelectric signals (MES), also known as electromyography (EMG). The new approach evaluates the significance of feature/channel selection in MES pattern recognition. Particle Swarm Optimization (PSO), an evolutionary computational technique, is employed to search the feature/channel space for important subsets. These important subsets are then evaluated using a multilayer perceptron trained with the backpropagation algorithm (BPNN). Practical results acquired from tests on six subjects' datasets of MES signals, measured noninvasively using surface electrodes, are presented. It is shown that minimum error rates can be achieved by considering the correct combination of features/channels, thus providing a feasible system for practical implementation in the rehabilitation of patients.
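
    A binary PSO over feature/channel masks can be sketched as below; the k-NN fitness (standing in for the paper's BPNN), swarm size, and coefficients are illustrative assumptions, not the paper's settings.

        # Binary-PSO feature/channel selection sketch (toy data).
        import numpy as np
        from sklearn.neighbors import KNeighborsClassifier
        from sklearn.model_selection import cross_val_score

        rng = np.random.default_rng(4)
        X, y = rng.normal(size=(150, 24)), rng.integers(0, 4, size=150)

        def fitness(bits):
            mask = bits.astype(bool)
            if not mask.any():
                return 0.0
            clf = KNeighborsClassifier()
            return cross_val_score(clf, X[:, mask], y, cv=3).mean()

        n, dim = 10, X.shape[1]
        pos = (rng.random((n, dim)) < 0.5).astype(float)   # bit vectors
        vel = rng.normal(size=(n, dim))
        pbest = pos.copy()
        pbest_fit = np.array([fitness(p) for p in pos])
        gbest = pbest[pbest_fit.argmax()]
        for _ in range(20):
            r1, r2 = rng.random((2, n, dim))
            vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
            # sigmoid of velocity gives the probability of a 1-bit
            pos = (rng.random((n, dim)) < 1 / (1 + np.exp(-vel))).astype(float)
            fit = np.array([fitness(p) for p in pos])
            better = fit > pbest_fit
            pbest[better], pbest_fit[better] = pos[better], fit[better]
            gbest = pbest[pbest_fit.argmax()]
        print(int(gbest.sum()), "of", dim, "channels/features kept")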

  4. Input Decimated Ensembles

    NASA Technical Reports Server (NTRS)

    Tumer, Kagan; Oza, Nikunj C.; Clancy, Daniel (Technical Monitor)

    2001-01-01

    Using an ensemble of classifiers instead of a single classifier has been shown to improve generalization performance in many pattern recognition problems. However, the extent of such improvement depends greatly on the amount of correlation among the errors of the base classifiers. Therefore, reducing those correlations while keeping the classifiers' performance levels high is an important area of research. In this article, we explore input decimation (ID), a method which selects feature subsets for their ability to discriminate among the classes and uses them to decouple the base classifiers. We provide a summary of the theoretical benefits of correlation reduction, along with results of our method on two underwater sonar data sets, three benchmarks from the Proben1/UCI repositories, and two synthetic data sets. The results indicate that input decimated ensembles (IDEs) outperform ensembles whose base classifiers use all the input features; randomly selected subsets of features; and features created using principal components analysis, on a wide range of domains.
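
    A rough rendering of the idea, with class-wise correlation as the (assumed) relevance criterion and logistic base learners standing in for the article's classifiers:

        # Input-decimation sketch: one feature subset per class, one base
        # learner per subset, predictions averaged (toy data and criteria).
        import numpy as np
        from sklearn.linear_model import LogisticRegression

        rng = np.random.default_rng(5)
        X, y = rng.normal(size=(300, 40)), rng.integers(0, 3, size=300)

        models = []
        for c in range(3):
            ind = (y == c).astype(float)          # one-vs-rest indicator
            corr = np.abs([np.corrcoef(X[:, j], ind)[0, 1]
                           for j in range(X.shape[1])])
            subset = np.argsort(corr)[-10:]       # 10 most relevant inputs
            clf = LogisticRegression(max_iter=1000).fit(X[:, subset], y)
            models.append((subset, clf))

        proba = np.mean([clf.predict_proba(X[:, s]) for s, clf in models],
                        axis=0)
        print("ensemble accuracy:", (proba.argmax(axis=1) == y).mean())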

  5. Neural networks for vertical microcode compaction

    NASA Astrophysics Data System (ADS)

    Chu, Pong P.

    1992-09-01

    Neural networks provide an alternative way to solve complex optimization problems. Instead of performing a program of instructions sequentially as in a traditional computer, a neural network model explores many competing hypotheses simultaneously using its massively parallel net. The paper shows how to use the neural network approach to perform vertical microcode compaction for a micro-programmed control unit. The compaction procedure includes two basic steps: the first determines the compatibility classes, and the second selects a minimal subset to cover the control signals. Since the selection process is an NP-complete problem, finding an optimal solution is impractical. In this study, we employ a customized neural network to obtain the minimal subset. We first formalize the problem, then define an `energy function' and map it to a two-layer fully connected neural network. The modified network has two types of neurons and can always obtain a valid solution.
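
    The second step is an instance of minimum set cover. For reference (and not as the paper's neural-network formulation), the standard greedy baseline for that step looks like this:

        # Greedy set cover: repeatedly take the compatibility class that
        # covers the most still-uncovered control signals.
        def greedy_cover(signals, classes):
            uncovered, chosen = set(signals), []
            while uncovered:
                best = max(classes, key=lambda c: len(uncovered & c))
                if not uncovered & best:
                    raise ValueError("signals not coverable")
                chosen.append(best)
                uncovered -= best
            return chosen

        classes = [frozenset("abc"), frozenset("cd"),
                   frozenset("de"), frozenset("be")]
        print(greedy_cover("abcde", classes))   # e.g. abc then de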

  6. Genetic Algorithms Applied to Multi-Objective Aerodynamic Shape Optimization

    NASA Technical Reports Server (NTRS)

    Holst, Terry L.

    2004-01-01

    A genetic algorithm approach suitable for solving multi-objective optimization problems is described and evaluated using a series of aerodynamic shape optimization problems. Several new features including two variations of a binning selection algorithm and a gene-space transformation procedure are included. The genetic algorithm is suitable for finding pareto optimal solutions in search spaces that are defined by any number of genes and that contain any number of local extrema. A new masking array capability is included allowing any gene or gene subset to be eliminated as decision variables from the design space. This allows determination of the effect of a single gene or gene subset on the pareto optimal solution. Results indicate that the genetic algorithm optimization approach is flexible in application and reliable. The binning selection algorithms generally provide pareto front quality enhancements and moderate convergence efficiency improvements for most of the problems solved.

  7. Genetic Algorithms Applied to Multi-Objective Aerodynamic Shape Optimization

    NASA Technical Reports Server (NTRS)

    Holst, Terry L.

    2005-01-01

    A genetic algorithm approach suitable for solving multi-objective problems is described and evaluated using a series of aerodynamic shape optimization problems. Several new features including two variations of a binning selection algorithm and a gene-space transformation procedure are included. The genetic algorithm is suitable for finding Pareto optimal solutions in search spaces that are defined by any number of genes and that contain any number of local extrema. A new masking array capability is included allowing any gene or gene subset to be eliminated as decision variables from the design space. This allows determination of the effect of a single gene or gene subset on the Pareto optimal solution. Results indicate that the genetic algorithm optimization approach is flexible in application and reliable. The binning selection algorithms generally provide Pareto front quality enhancements and moderate convergence efficiency improvements for most of the problems solved.

  8. Scalable amplification of strand subsets from chip-synthesized oligonucleotide libraries

    PubMed Central

    Schmidt, Thorsten L.; Beliveau, Brian J.; Uca, Yavuz O.; Theilmann, Mark; Da Cruz, Felipe; Wu, Chao-Ting; Shih, William M.

    2015-01-01

    Synthetic oligonucleotides are the main cost factor for studies in DNA nanotechnology, genetics and synthetic biology, which all require thousands of these at high quality. Inexpensive chip-synthesized oligonucleotide libraries can contain hundreds of thousands of distinct sequences, however only at sub-femtomole quantities per strand. Here we present a selective oligonucleotide amplification method, based on three rounds of rolling-circle amplification, that produces nanomole amounts of single-stranded oligonucleotides per millilitre reaction. In a multistep one-pot procedure, subsets of hundreds or thousands of single-stranded DNAs with different lengths can selectively be amplified and purified together. These oligonucleotides are used to fold several DNA nanostructures and as primary fluorescence in situ hybridization probes. The amplification cost is lower than other reported methods (typically around US$ 20 per nanomole total oligonucleotides produced) and is dominated by the use of commercial enzymes. PMID:26567534

  9. Optimizing an Actuator Array for the Control of Multi-Frequency Noise in Aircraft Interiors

    NASA Technical Reports Server (NTRS)

    Palumbo, D. L.; Padula, S. L.

    1997-01-01

    Techniques developed for selecting an optimized actuator array for interior noise reduction at a single frequency are extended to the multi-frequency case. Transfer functions for 64 actuators were obtained at 5 frequencies from ground testing the rear section of a fully trimmed DC-9 fuselage. A single loudspeaker facing the left side of the aircraft was the primary source. A combinatorial search procedure (tabu search) was employed to find optimum actuator subsets of from 2 to 16 actuators. Noise reduction predictions derived from the transfer functions were used as a basis for evaluating actuator subsets during optimization. Results indicate that it is necessary to constrain actuator forces during optimization. Unconstrained optimizations selected actuators which require unrealistically large forces. Two methods of constraint are evaluated. It is shown that a fast, but approximate, method yields results equivalent to an accurate, but computationally expensive, method.
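
    A compact tabu-search sketch for the actuator-subset problem follows; the additive objective and force cap are synthetic stand-ins for the transfer-function-based noise-reduction predictions and the force constraint discussed above.

        # Tabu search over 8-of-64 actuator subsets (toy objective).
        import numpy as np

        rng = np.random.default_rng(6)
        gain = rng.random(64)        # per-actuator noise-reduction value
        force = rng.random(64) * 10  # per-actuator force requirement

        def objective(subset):
            idx = list(subset)
            return gain[idx].sum() - max(0.0, force[idx].sum() - 40.0)

        current = set(rng.choice(64, size=8, replace=False).tolist())
        best, best_val, tabu = set(current), objective(current), []
        for _ in range(200):
            # sample swap moves; take the best non-tabu one, even if it
            # worsens the objective (classic tabu behaviour)
            outside = sorted(set(range(64)) - current)
            moves = [(i, j) for i in sorted(current)
                     for j in rng.choice(outside, size=3, replace=False)]
            moves = [m for m in moves if m not in tabu]
            if not moves:
                continue
            i, j = max(moves,
                       key=lambda m: objective(current - {m[0]} | {m[1]}))
            current = current - {i} | {j}
            tabu = (tabu + [(i, j)])[-20:]   # short-term memory
            if objective(current) > best_val:
                best, best_val = set(current), objective(current)
        print(sorted(best), round(best_val, 2))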

  10. Relevance of 2D radiographic texture analysis for the assessment of 3D bone micro-architecture

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Apostol, Lian; Boudousq, Vincent; Basset, Oliver

    Although the diagnosis of osteoporosis is mainly based on dual x-ray absorptiometry, it has been shown that trabecular bone micro-architecture is also an important factor in regard to fracture risk. In vivo, techniques based on high-resolution x-ray radiography associated with texture analysis have been proposed to investigate bone micro-architecture, but their relevance for giving pertinent 3D information is unclear. Thirty-three calcaneus and femoral neck bone samples including the cortical shells (diameter: 14 mm, height: 30-40 mm) were imaged using 3D synchrotron x-ray micro-CT at the ESRF. The 3D reconstructed images with a cubic voxel size of 15 μm were further used for two purposes: (1) quantification of three-dimensional trabecular bone micro-architecture, and (2) simulation of realistic x-ray radiographs under different acquisition conditions. The simulated x-ray radiographs were then analyzed using a large variety of texture analysis methods (co-occurrence, spectral density, fractal, morphology, etc.). The range of micro-architecture parameters was in agreement with previous studies and rather large, suggesting that the population was representative. More than 350 texture parameters were tested. A small number of them were selected based on their correlation to micro-architectural morphometric parameters. Using this subset of texture parameters, multiple regression allowed one to predict up to 93% of the variance of micro-architecture parameters using three texture features. 2D texture features predicting 3D micro-architecture parameters other than BV/TV were identified. The methodology proposed for evaluating the relationships between 3D micro-architecture and 2D texture parameters may also be used for optimizing the conditions for radiographic imaging. Further work will include the application of the method to physical radiographs. In the future, this approach could be used in combination with DXA to refine osteoporosis diagnosis.

  11. B cell subset distribution is altered in patients with severe periodontitis.

    PubMed

    Demoersman, Julien; Pochard, Pierre; Framery, Camille; Simon, Quentin; Boisramé, Sylvie; Soueidan, Assem; Pers, Jacques-Olivier

    2018-01-01

    Several studies have recently highlighted the implication of B cells in the physiopathogenesis of periodontal disease by showing that a B cell deficiency leads to improved periodontal parameters. However, the detailed profiles of circulating B cell subsets have not yet been investigated in patients with severe periodontitis (SP). We hypothesised that an abnormal distribution of B cell subsets could be detected in the blood of patients with severe periodontal lesions, as already reported for patients with chronic inflammatory diseases such as systemic autoimmune diseases. Fifteen subjects with SP and 13 subjects without periodontitis, according to the definition proposed by the CDC periodontal disease surveillance work group, were enrolled in this pilot observational study. Two flow cytometry panels were designed to analyse the circulating B and B1 cell subset distribution in association with RANKL expression. A significantly higher percentage of CD27+ memory B cells was observed in patients with SP. Among these CD27+ B cells, the proportion of the switched memory subset was significantly higher. At the same time, human B1 cells, which were previously associated with a regulatory function (CD20+CD69-CD43+CD27+CD11b+), decreased in SP patients. RANKL expression increased in every B cell subset from the SP patients and was significantly greater in activated B cells than in the subjects without periodontitis. These preliminary results demonstrate the altered distribution of B cells in the context of severe periodontitis. Further investigations with a larger cohort of patients can elucidate whether the analysis of the B cell compartment distribution reflects periodontal disease activity and can be a reliable marker for its prognosis (clinical trial registration number: NCT02833285, B cell functions in periodontitis).

  12. B cell subset distribution is altered in patients with severe periodontitis

    PubMed Central

    Demoersman, Julien; Pochard, Pierre; Framery, Camille; Simon, Quentin; Boisramé, Sylvie; Soueidan, Assem

    2018-01-01

    Several studies have recently highlighted the implication of B cells in the physiopathogenesis of periodontal disease by showing that a B cell deficiency leads to improved periodontal parameters. However, the detailed profiles of circulating B cell subsets have not yet been investigated in patients with severe periodontitis (SP). We hypothesised that an abnormal distribution of B cell subsets could be detected in the blood of patients with severe periodontal lesions, as already reported for patients with chronic inflammatory diseases such as systemic autoimmune diseases. Fifteen subjects with SP and 13 subjects without periodontitis, according to the definition proposed by the CDC periodontal disease surveillance work group, were enrolled in this pilot observational study. Two flow cytometry panels were designed to analyse the circulating B and B1 cell subset distribution in association with RANKL expression. A significantly higher percentage of CD27+ memory B cells was observed in patients with SP. Among these CD27+ B cells, the proportion of the switched memory subset was significantly higher. At the same time, human B1 cells, which were previously associated with a regulatory function (CD20+CD69-CD43+CD27+CD11b+), decreased in SP patients. RANKL expression increased in every B cell subset from the SP patients and was significantly greater in activated B cells than in the subjects without periodontitis. These preliminary results demonstrate the altered distribution of B cells in the context of severe periodontitis. Further investigations with a larger cohort of patients can elucidate whether the analysis of the B cell compartment distribution reflects periodontal disease activity and can be a reliable marker for its prognosis (clinical trial registration number: NCT02833285, B cell functions in periodontitis). PMID:29447240

  13. Identity and Diversity of Human Peripheral Th and T Regulatory Cells Defined by Single-Cell Mass Cytometry.

    PubMed

    Kunicki, Matthew A; Amaya Hernandez, Laura C; Davis, Kara L; Bacchetta, Rosa; Roncarolo, Maria-Grazia

    2018-01-01

    Human CD3+CD4+ Th cells, FOXP3+ T regulatory (Treg) cells, and T regulatory type 1 (Tr1) cells are essential for ensuring peripheral immune response and tolerance, but the diversity of Th, Treg, and Tr1 cell subsets has not been fully characterized. Independent functional characterization of human Th1, Th2, Th17, T follicular helper (Tfh), Treg, and Tr1 cells has helped to define unique surface molecules, transcription factors, and signaling profiles for each subset. However, the adequacy of these markers to recapitulate the whole CD3+CD4+ T cell compartment remains questionable. In this study, we examined CD3+CD4+ T cell populations by single-cell mass cytometry. We characterize the CD3+CD4+ Th, Treg, and Tr1 cell populations simultaneously across 23 memory T cell-associated surface and intracellular molecules. High-dimensional analysis identified several new subsets, in addition to the already defined CD3+CD4+ Th, Treg, and Tr1 cell populations, for a total of 11 Th cell, 4 Treg, and 1 Tr1 cell subsets. Some of these subsets share markers previously thought to be selective for Treg, Th1, Th2, Th17, and Tfh cells, including CD194(CCR4)+FOXP3+ Treg and CD183(CXCR3)+T-bet+ Th17 cell subsets. Unsupervised clustering displayed a phenotypic organization of CD3+CD4+ T cells that confirmed their diversity but showed interrelation between the different subsets, including similarity between Th1-Th2-Tfh cell populations and Th17 cells, as well as similarity of Th2 cells with Treg cells. In conclusion, the use of single-cell mass cytometry provides a systems-level characterization of CD3+CD4+ T cells in healthy human blood, which represents an important baseline reference to investigate abnormalities of different subsets in immune-mediated pathologies. Copyright © 2017 by The American Association of Immunologists, Inc.

  14. Combinatorial-topological framework for the analysis of global dynamics.

    PubMed

    Bush, Justin; Gameiro, Marcio; Harker, Shaun; Kokubu, Hiroshi; Mischaikow, Konstantin; Obayashi, Ippei; Pilarczyk, Paweł

    2012-12-01

    We discuss an algorithmic framework based on efficient graph algorithms and algebraic-topological computational tools. The framework is aimed at automatic computation of a database of global dynamics of a given m-parameter semidynamical system with discrete time on a bounded subset of the n-dimensional phase space. We introduce the mathematical background, which is based upon Conley's topological approach to dynamics, describe the algorithms for the analysis of the dynamics using rectangular grids both in phase space and parameter space, and show two sample applications.

  15. Combinatorial-topological framework for the analysis of global dynamics

    NASA Astrophysics Data System (ADS)

    Bush, Justin; Gameiro, Marcio; Harker, Shaun; Kokubu, Hiroshi; Mischaikow, Konstantin; Obayashi, Ippei; Pilarczyk, Paweł

    2012-12-01

    We discuss an algorithmic framework based on efficient graph algorithms and algebraic-topological computational tools. The framework is aimed at automatic computation of a database of global dynamics of a given m-parameter semidynamical system with discrete time on a bounded subset of the n-dimensional phase space. We introduce the mathematical background, which is based upon Conley's topological approach to dynamics, describe the algorithms for the analysis of the dynamics using rectangular grids both in phase space and parameter space, and show two sample applications.
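
    The grid-based construction can be miniaturized to a toy example: cover an interval with boxes, over-approximate each box's image under a sample map, and read candidate recurrent sets off the strongly connected components of the resulting transition graph. This is only a sketch of the general idea; the logistic map, box count, and sampling-based image bounds are illustrative assumptions, not the paper's rigorous outer approximation.

        # Toy outer approximation of dynamics on a box grid over [0, 1].
        import numpy as np
        import networkx as nx

        f = lambda x: 3.5 * x * (1.0 - x)   # sample one-parameter dynamics
        n, edges = 64, []
        for k in range(n):
            img = f(np.linspace(k / n, (k + 1) / n, 20))  # crude box image
            lo, hi = int(img.min() * n), min(int(img.max() * n), n - 1)
            edges += [(k, m) for m in range(lo, hi + 1)]

        G = nx.DiGraph(edges)
        recurrent = [c for c in nx.strongly_connected_components(G)
                     if len(c) > 1 or G.has_edge(next(iter(c)), next(iter(c)))]
        print(len(recurrent), "candidate recurrent (Morse) sets")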

  16. Human attention filters for single colors.

    PubMed

    Sun, Peng; Chubb, Charles; Wright, Charles E; Sperling, George

    2016-10-25

    The visual images in the eyes contain much more information than the brain can process. An important selection mechanism is feature-based attention (FBA). FBA is best described by attention filters that specify precisely the extent to which items containing attended features are selectively processed and the extent to which items that do not contain the attended features are attenuated. The centroid-judgment paradigm enables quick, precise measurements of such human perceptual attention filters, analogous to transmission measurements of photographic color filters. Subjects use a mouse to locate the centroid (the center of gravity) of a briefly displayed cloud of dots and receive precise feedback. A subset of dots is distinguished by some characteristic, such as a different color, and subjects judge the centroid of only the distinguished subset (e.g., dots of a particular color). The analysis efficiently determines the precise weight in the judged centroid of dots of every color in the display (i.e., the attention filter for the particular attended color in that context). We report 32 attention filters for single colors. Attention filters that discriminate one saturated hue from among seven other equiluminant distractor hues are extraordinarily selective, achieving attended/unattended weight ratios >20:1. Attention filters for selecting a color that differs in saturation or lightness from distractors are much less selective than attention filters for hue (given equal discriminability of the colors), and their filter selectivities are proportional to the discriminability distance of neighboring colors, whereas in the same range hue attention-filter selectivity is virtually independent of discriminability.
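
    The weight-recovery analysis can be illustrated with a synthetic version of the paradigm: simulate dot clouds, generate judged centroids from a hidden per-color weight vector, and fit the weights by minimizing squared centroid error. The weighted-centroid model follows the abstract's description; the optimizer, trial counts, and noise level are assumptions.

        # Recover per-color attention-filter weights from judged centroids.
        import numpy as np
        from scipy.optimize import minimize

        rng = np.random.default_rng(7)
        n_trials, n_dots, n_colors = 200, 12, 4
        true_w = np.array([1.0, 0.8, 0.2, 0.05])   # target vs. distractors
        pos = rng.uniform(-1, 1, size=(n_trials, n_dots, 2))
        col = rng.integers(0, n_colors, size=(n_trials, n_dots))

        def predicted(w):
            wt = w[col]                            # weights, (trials, dots)
            return (wt[..., None] * pos).sum(1) / wt.sum(1, keepdims=True)

        judged = predicted(true_w) + rng.normal(0, 0.02, (n_trials, 2))
        loss = lambda w: ((predicted(w) - judged) ** 2).sum()
        est = minimize(loss, np.ones(n_colors), method="Nelder-Mead").x
        print(np.round(est / est.max(), 2))        # weights up to scale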

  17. AVC: Selecting discriminative features on basis of AUC by maximizing variable complementarity.

    PubMed

    Sun, Lei; Wang, Jun; Wei, Jinmao

    2017-03-14

    The Receiver Operator Characteristic (ROC) curve is well known for evaluating classification performance in the biomedical field. Owing to its superiority in dealing with imbalanced and cost-sensitive data, the ROC curve has been exploited as a popular metric to evaluate and identify disease-related genes (features). The existing ROC-based feature selection approaches are simple and effective in evaluating individual features. However, these approaches may fail to find the real target feature subset because they lack effective means to reduce the redundancy between features, which is essential in machine learning. In this paper, we propose to assess feature complementarity by a trick of measuring the distances between the misclassified instances and their nearest misses on the dimensions of pairwise features. If a misclassified instance and its nearest miss on one feature dimension are far apart on another feature dimension, the two features are regarded as complementary to each other. Subsequently, we propose a novel filter feature selection approach on the basis of the ROC analysis. The new approach employs an efficient heuristic search strategy to select optimal features with the highest complementarities. The experimental results on a broad range of microarray data sets validate that the classifiers built on the feature subset selected by our approach can achieve a minimal balanced error rate with a small number of significant features. Compared with other ROC-based feature selection approaches, our new approach can select fewer features and effectively improve the classification performance.
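
    The nearest-miss trick can be condensed as follows; for simplicity, this sketch scores every instance rather than only misclassified ones, a deliberate simplification of the paper's criterion, and the data are synthetic.

        # Complementarity of features j and k: how far an instance and its
        # nearest miss along j are separated along k (averaged, toy data).
        import numpy as np

        def complementarity(X, y, j, k):
            total = 0.0
            for i in range(len(y)):
                misses = np.flatnonzero(y != y[i])   # other-class instances
                nm = misses[np.argmin(np.abs(X[misses, j] - X[i, j]))]
                total += abs(X[nm, k] - X[i, k])
            return total / len(y)

        rng = np.random.default_rng(8)
        X, y = rng.normal(size=(120, 6)), rng.integers(0, 2, size=120)
        print(round(complementarity(X, y, 0, 1), 3))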

  18. [Quantitative variations of lymphocyte subsets in various kinds of cancer patients in the terminal stage].

    PubMed

    Nakajima, Y; Akimoto, M; Iwasaki, H; Matano, S; Hirakawa, H; Kimura, M

    1986-12-01

    Immunological studies of the peripheral blood were made in terminal breast cancer and terminal abdominal cancer patients. Two immunological parameters were studied: (1) lymphocyte subsets and (2) proliferative response to PHA. A decrease in the number of OKT-3(+) cells and an increase in that of OKT-8(+) cells were observed in abdominal cancer. It was suggested that the immunological status in abdominal cancer is more suppressive than in breast cancer. Increases in the number of OK-M1(+) cells and Leu-7(+) cells were observed in breast cancer. It is suggested that cytotoxic lymphocytes increase in number in breast cancer more than in abdominal cancer.

  19. A reassessment of IgM memory subsets in humans

    PubMed Central

    Bagnara, Davide; Squillario, Margherita; Kipling, David; Mora, Thierry; Walczak, Aleksandra M.; Da Silva, Lucie; Weller, Sandra; Dunn-Walters, Deborah K.; Weill, Jean-Claude; Reynaud, Claude-Agnès

    2015-01-01

    From paired blood and spleen samples from three adult donors we performed high-throughput VH sequencing of human B-cell subsets defined by IgD and CD27 expression: IgD+CD27+ (“MZ”), IgD−CD27+ (“memory”, including IgM (“IgM-only”), IgG and IgA) and IgD−CD27− cells (“double-negative”, including IgM, IgG and IgA). A total of 91,294 unique sequences clustered in 42,670 clones, revealing major clonal expansions in each of these subsets. Among these clones, we further analyzed those shared sequences from different subsets or tissues for VH-gene mutation, H-CDR3 length, and VH/JH usage, comparing these different characteristics with all sequences from their subset of origin, for which these parameters constitute a distinct signature. The IgM-only repertoire profile differed notably from that of MZ B cells by a higher mutation frequency, and lower VH4 and higher JH6 gene usage. Strikingly, IgM sequences from clones shared between the MZ and the memory IgG/IgA compartments showed a mutation and repertoire profile of IgM-only and not of MZ B cells. Similarly, all IgM clonal relationships (between MZ, IgM-only, and double-negative compartments) involved sequences with the characteristics of IgM-only B cells. Finally, clonal relationships between tissues suggested distinct recirculation characteristics between MZ and switched B cells. The “IgM-only” subset (including cells with its repertoire signature but higher IgD or lower CD27 expression levels) thus appear as the only subset showing precursor-product relationships with CD27+ switched memory B cells, indicating that they represent germinal center-derived IgM memory B cells, and that IgM memory and MZ B cells constitute two distinct entities. PMID:26355154

  20. A Reassessment of IgM Memory Subsets in Humans.

    PubMed

    Bagnara, Davide; Squillario, Margherita; Kipling, David; Mora, Thierry; Walczak, Aleksandra M; Da Silva, Lucie; Weller, Sandra; Dunn-Walters, Deborah K; Weill, Jean-Claude; Reynaud, Claude-Agnès

    2015-10-15

    From paired blood and spleen samples from three adult donors, we performed high-throughput VH sequencing of human B cell subsets defined by IgD and CD27 expression: IgD(+)CD27(+) ("marginal zone [MZ]"), IgD(-)CD27(+) ("memory," including IgM ["IgM-only"], IgG and IgA) and IgD(-)CD27(-) cells ("double-negative," including IgM, IgG, and IgA). A total of 91,294 unique sequences clustered in 42,670 clones, revealing major clonal expansions in each of these subsets. Among these clones, we further analyzed those shared sequences from different subsets or tissues for VH gene mutation, H-CDR3-length, and VH/JH usage, comparing these different characteristics with all sequences from their subset of origin for which these parameters constitute a distinct signature. The IgM-only repertoire profile differed notably from that of MZ B cells by a higher mutation frequency and lower VH4 and higher JH6 gene usage. Strikingly, IgM sequences from clones shared between the MZ and the memory IgG/IgA compartments showed a mutation and repertoire profile of IgM-only and not of MZ B cells. Similarly, all IgM clonal relationships (among MZ, IgM-only, and double-negative compartments) involved sequences with the characteristics of IgM-only B cells. Finally, clonal relationships between tissues suggested distinct recirculation characteristics between MZ and switched B cells. The "IgM-only" subset (including cells with its repertoire signature but higher IgD or lower CD27 expression levels) thus appear as the only subset showing precursor-product relationships with CD27(+) switched memory B cells, indicating that they represent germinal center-derived IgM memory B cells and that IgM memory and MZ B cells constitute two distinct entities. Copyright © 2015 by The American Association of Immunologists, Inc.

  1. Gene selection heuristic algorithm for nutrigenomics studies.

    PubMed

    Valour, D; Hue, I; Grimard, B; Valour, B

    2013-07-15

    Large datasets from -omics studies need to be deeply investigated. The aim of this paper is to provide a new method (the LEM method) for the search of transcriptome and metabolome connections. The heuristic algorithm described here extends the classical canonical correlation analysis (CCA) to a high number of variables (without regularization) and combines well-conditioning and fast computation in "R." Reduced CCA models are summarized in PageRank matrices, the product of which gives a stochastic matrix that summarizes the self-avoiding walk covered by the algorithm. Then, a homogeneous Markov process applied to this stochastic matrix converges to the probabilities of interconnection between genes, providing a selection of disjoint subsets of genes. This is an alternative to regularized generalized CCA for the determination of blocks within the structure matrix. Each gene subset is thus linked to the whole metabolic or clinical dataset that represents the biological phenotype of interest. Moreover, this selection process meets the aim of biologists, who often need small sets of genes for further validation or extended phenotyping. The algorithm is shown to work efficiently on three published datasets, resulting in meaningfully broadened gene networks.
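
    The final convergence step can be sketched in isolation: iterate the homogeneous Markov chain defined by a row-stochastic gene-gene matrix until the interconnection probabilities stabilize, then threshold them into a gene subset. The random matrix and the mean cutoff below are placeholders for the matrices the LEM method actually builds from reduced CCA models.

        # Iterate a homogeneous Markov chain to its limiting probabilities.
        import numpy as np

        rng = np.random.default_rng(9)
        M = rng.random((30, 30))
        M /= M.sum(axis=1, keepdims=True)      # row-stochastic matrix

        P = M.copy()
        for _ in range(500):
            P_next = P @ M
            if np.abs(P_next - P).max() < 1e-12:
                break
            P = P_next
        pi = P[0]                              # limiting distribution
        print(np.flatnonzero(pi > pi.mean()))  # crude gene-subset cut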

  2. An Adaptive Genetic Association Test Using Double Kernel Machines.

    PubMed

    Zhan, Xiang; Epstein, Michael P; Ghosh, Debashis

    2015-10-01

    Recently, gene set-based approaches have become very popular in gene expression profiling studies for assessing how genetic variants are related to disease outcomes. Since most genes are not differentially expressed, existing pathway tests considering all genes within a pathway suffer from considerable noise and power loss. Moreover, for a differentially expressed pathway, it is of interest to select important genes that drive the effect of the pathway. In this article, we propose an adaptive association test using double kernel machines (DKM), which can both select important genes within the pathway as well as test for the overall genetic pathway effect. This DKM procedure first uses the garrote kernel machines (GKM) test for the purposes of subset selection and then the least squares kernel machine (LSKM) test for testing the effect of the subset of genes. An appealing feature of the kernel machine framework is that it can provide a flexible and unified method for multi-dimensional modeling of the genetic pathway effect allowing for both parametric and nonparametric components. This DKM approach is illustrated with application to simulated data as well as to data from a neuroimaging genetics study.

  3. Bayesian Ensemble Trees (BET) for Clustering and Prediction in Heterogeneous Data

    PubMed Central

    Duan, Leo L.; Clancy, John P.; Szczesniak, Rhonda D.

    2016-01-01

    We propose a novel “tree-averaging” model that utilizes the ensemble of classification and regression trees (CART). Each constituent tree is estimated with a subset of similar data. We treat this grouping of subsets as Bayesian Ensemble Trees (BET) and model them as a Dirichlet process. We show that BET determines the optimal number of trees by adapting to the data heterogeneity. Compared with the other ensemble methods, BET requires far fewer trees and shows equivalent prediction accuracy using weighted averaging. Moreover, each tree in BET provides a variable selection criterion and interpretation for each subset. We developed an efficient estimating procedure with improved estimation strategies in both CART and mixture models. We demonstrate these advantages of BET with simulations and illustrate the approach with a real-world data example involving regression of lung function measurements obtained from patients with cystic fibrosis. Supplemental materials are available online. PMID:27524872

  4. Accessing Information in Working Memory: Can the Focus of Attention Grasp Two Elements at the Same Time?

    ERIC Educational Resources Information Center

    Oberauer, Klaus; Bialkova, Svetlana

    2009-01-01

    Processing information in working memory requires selective access to a subset of working-memory contents by a focus of attention. Complex cognition often requires joint access to 2 items in working memory. How does the focus select 2 items? Two experiments with an arithmetic task and 1 with a spatial task investigate time demands for successive…

  5. Complete set of homogeneous isotropic analytic solutions in scalar-tensor cosmology with radiation and curvature

    NASA Astrophysics Data System (ADS)

    Bars, Itzhak; Chen, Shih-Hung; Steinhardt, Paul J.; Turok, Neil

    2012-10-01

    We study a model of a scalar field minimally coupled to gravity, with a specific potential energy for the scalar field, and include curvature and radiation as two additional parameters. Our goal is to obtain analytically the complete set of configurations of a homogeneous and isotropic universe as a function of time. This leads to a geodesically complete description of the Universe, including the passage through the cosmological singularities, at the classical level. We give all the solutions analytically without any restrictions on the parameter space of the model or initial values of the fields. We find that for generic solutions the Universe goes through a singular (zero-size) bounce by entering a period of antigravity at each big crunch and exiting from it at the following big bang. This happens cyclically again and again without violating the null-energy condition. There is a special subset of geodesically complete nongeneric solutions which perform zero-size bounces without ever entering the antigravity regime in all cycles. For these, initial values of the fields are synchronized and quantized but the parameters of the model are not restricted. There is also a subset of spatial curvature-induced solutions that have finite-size bounces in the gravity regime and never enter the antigravity phase. These exist only within a small continuous domain of parameter space without fine-tuning the initial conditions. To obtain these results, we identified 25 regions of a 6-parameter space in which the complete set of analytic solutions are explicitly obtained.

  6. Significance of settling model structures and parameter subsets in modelling WWTPs under wet-weather flow and filamentous bulking conditions.

    PubMed

    Ramin, Elham; Sin, Gürkan; Mikkelsen, Peter Steen; Plósz, Benedek Gy

    2014-10-15

    Current research focuses on predicting and mitigating the impacts of high hydraulic loadings on centralized wastewater treatment plants (WWTPs) under wet-weather conditions. The maximum permissible inflow to WWTPs depends not only on the settleability of activated sludge in secondary settling tanks (SSTs) but also on the hydraulic behaviour of SSTs. The present study investigates the impacts of ideal and non-ideal flow (dry and wet weather) and settling (good settling and bulking) boundary conditions on the sensitivity of WWTP model outputs to uncertainties intrinsic to the one-dimensional (1-D) SST model structures and parameters. We identify the critical sources of uncertainty in WWTP models through global sensitivity analysis (GSA) using the Benchmark simulation model No. 1 in combination with first- and second-order 1-D SST models. The results obtained illustrate that the contribution of settling parameters to the total variance of the key WWTP process outputs significantly depends on the influent flow and settling conditions. The magnitude of the impact is found to vary, depending on which type of 1-D SST model is used. Therefore, we identify and recommend potential parameter subsets for WWTP model calibration, and propose an optimal choice of 1-D SST models under different flow and settling boundary conditions. Additionally, the hydraulic parameters in the second-order SST model are found significant under dynamic wet-weather flow conditions. These results highlight the importance of developing a more mechanistically based, flow-dependent hydraulic sub-model in second-order 1-D SST models in the future. Copyright © 2014 Elsevier Ltd. All rights reserved.
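
    For readers unfamiliar with variance-based GSA, the sketch below shows a standard Saltelli-sampling/Sobol-analysis workflow, assuming the SALib package is available. A toy three-parameter function stands in for the benchmark WWTP/SST model, and the parameter names and bounds are invented for illustration.

        import numpy as np
        from SALib.sample import saltelli
        from SALib.analyze import sobol

        # Toy stand-in for a WWTP model output as a function of three
        # settling parameters (names and bounds are illustrative only).
        problem = {
            "num_vars": 3,
            "names": ["v0", "r_h", "r_p"],
            "bounds": [[100, 500], [2e-4, 6e-4], [1e-3, 5e-3]],
        }

        def model(x):
            v0, r_h, r_p = x
            return v0 * np.exp(-r_h * 1e4) + 50 * np.sin(1e3 * r_p)

        X = saltelli.sample(problem, 1024)             # N * (2D + 2) parameter sets
        Y = np.apply_along_axis(model, 1, X)
        Si = sobol.analyze(problem, Y)
        print(dict(zip(problem["names"], np.round(Si["S1"], 3))))  # first-order indices
        print(dict(zip(problem["names"], np.round(Si["ST"], 3))))  # total-order indices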

  7. The XC chemokine receptor 1 is a conserved selective marker of mammalian cells homologous to mouse CD8α+ dendritic cells

    PubMed Central

    Crozat, Karine; Guiton, Rachel; Contreras, Vanessa; Feuillet, Vincent; Dutertre, Charles-Antoine; Ventre, Erwan; Vu Manh, Thien-Phong; Baranek, Thomas; Storset, Anne K.; Marvel, Jacqueline; Boudinot, Pierre; Hosmalin, Anne; Schwartz-Cornil, Isabelle

    2010-01-01

    Human BDCA3+ dendritic cells (DCs) were suggested to be homologous to mouse CD8α+ DCs. We demonstrate that human BDCA3+ DCs are more efficient than their BDCA1+ counterparts or plasmacytoid DCs (pDCs) in cross-presenting antigen and activating CD8+ T cells, which is similar to mouse CD8α+ DCs as compared with CD11b+ DCs or pDCs, although with more moderate differences between human DC subsets. Yet, no specific marker was known to be shared between homologous DC subsets across species. We found that XC chemokine receptor 1 (XCR1) is specifically expressed and active in mouse CD8α+, human BDCA3+, and sheep CD26+ DCs and is conserved across species. The mRNA encoding the XCR1 ligand chemokine (C motif) ligand 1 (XCL1) is selectively expressed in natural killer (NK) and CD8+ T lymphocytes at steady-state and is enhanced upon activation. Moreover, the Xcl1 mRNA is selectively expressed at high levels in central memory compared with naive CD8+ T lymphocytes. Finally, XCR1−/− mice have decreased early CD8+ T cell responses to Listeria monocytogenes infection, which is associated with higher bacterial loads early in infection. Therefore, XCR1 constitutes the first conserved specific marker for cell subsets homologous to mouse CD8α+ DCs in higher vertebrates and promotes their ability to activate early CD8+ T cell defenses against an intracellular pathogenic bacterium. PMID:20479118

  8. Association of bladder sensation measures and bladder diary in patients with urinary incontinence.

    PubMed

    King, Ashley B; Wolters, Jeff P; Klausner, Adam P; Rapp, David E

    2012-04-01

    Investigation suggests the involvement of afferent actions in the pathophysiology of urinary incontinence. Current diagnostic modalities do not allow for the accurate identification of sensory dysfunction. We previously reported urodynamic derivatives that may be useful in assessing bladder sensation. We sought to further investigate these derivatives by assessing for a relationship with a 3-day bladder diary. Subset analysis was performed in patients without stress urinary incontinence (SUI), attempting to isolate patients with urgency symptoms. No association was demonstrated between bladder diary parameters and urodynamic derivatives (r range: -0.06 to 0.08; p > 0.05). However, subset analysis demonstrated an association between detrusor overactivity (DO) and bladder urgency velocity (BUV), with a lower BUV identified in patients without DO. Subset analysis of patients with isolated urgency/urge incontinence identified weak associations between voiding frequency and FSR (r = 0.39) and between daily incontinence episodes and BUV (r = 0.35). However, these associations failed to reach statistical significance. No statistical association was seen between bladder diary and urodynamic derivatives. This is not unexpected, given that bladder diary parameters may reflect numerous pathologies, including not only sensory dysfunction but also SUI and DO. However, weak associations were identified in patients without SUI and, further, a statistical relationship between DO and BUV was seen. Additional research is needed to assess the utility of FSR/BUV in characterizing sensory dysfunction, especially in patients without concurrent pathology (e.g., SUI, DO).

  9. Pathways to Disease: The Biological Consequences of Social Adversity on Asthma in Minority Youth

    DTIC Science & Technology

    2016-10-01

    the microbiome, and telomere length and relate these biomarkers to the measured exposures to adversity and stress. The selection of and methods to... granted approval from the HRPO at the end of December 2015. After selecting a subset of our study population for evaluation, we experienced a second delay... of almost three months in setting up our account for clinical lab testing. Since we were able to prepare the selected samples for biomarker testing

  10. Model-on-Demand Predictive Control for Nonlinear Hybrid Systems With Application to Adaptive Behavioral Interventions

    PubMed Central

    Nandola, Naresh N.; Rivera, Daniel E.

    2011-01-01

    This paper presents a data-centric modeling and predictive control approach for nonlinear hybrid systems. System identification of hybrid systems represents a challenging problem because model parameters depend on the mode or operating point of the system. The proposed algorithm applies Model-on-Demand (MoD) estimation to generate a local linear approximation of the nonlinear hybrid system at each time step, using a small subset of data selected by an adaptive bandwidth selector. The appeal of the MoD approach lies in the fact that model parameters are estimated based on a current operating point; hence estimation of locations or modes governed by autonomous discrete events is achieved automatically. The local MoD model is then converted into a mixed logical dynamical (MLD) system representation which can be used directly in a model predictive control (MPC) law for hybrid systems using multiple-degree-of-freedom tuning. The effectiveness of the proposed MoD predictive control algorithm for nonlinear hybrid systems is demonstrated on a hypothetical adaptive behavioral intervention problem inspired by Fast Track, a real-life preventive intervention for improving parental function and reducing conduct disorder in at-risk children. Simulation results demonstrate that the proposed algorithm can be useful for adaptive intervention problems exhibiting both nonlinear and hybrid character. PMID:21874087
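
    The core of the Model-on-Demand step is a local linear fit around the current operating point, with nearby samples weighted by a kernel. Below is a minimal numpy sketch of that step; the adaptive bandwidth selector and the MLD/MPC layers described in the paper are omitted, and the data are synthetic.

        import numpy as np

        def model_on_demand(Xq, X, y, h=0.3):
            # Local linear prediction at each query point from a database (X, y):
            # weighted least squares with Gaussian weights of bandwidth h.
            preds = []
            Xa = np.column_stack([np.ones(len(X)), X])
            for xq in np.atleast_2d(Xq):
                w = np.exp(-np.sum((X - xq) ** 2, axis=1) / (2 * h ** 2))
                W = np.diag(w)
                beta = np.linalg.solve(Xa.T @ W @ Xa, Xa.T @ W @ y)
                preds.append(beta[0] + beta[1:] @ xq)
            return np.array(preds)

        rng = np.random.default_rng(0)
        X = rng.uniform(-2, 2, size=(200, 1))
        y = np.sign(X[:, 0]) * X[:, 0] ** 2 + 0.1 * rng.normal(size=200)  # mode-like nonlinearity
        print(model_on_demand(np.array([[0.5], [-1.5]]), X, y))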

  11. The MODIS reprojection tool

    USGS Publications Warehouse

    Dwyer, John L.; Schmidt, Gail L.; Qu, J.J.; Gao, W.; Kafatos, M.; Murphy , R.E.; Salomonson, V.V.

    2006-01-01

    The MODIS Reprojection Tool (MRT) is designed to help individuals work with MODIS Level-2G, Level-3, and Level-4 land data products. These products are referenced to a global tiling scheme in which each tile is approximately 10° latitude by 10° longitude and non-overlapping (Fig. 9.1). If desired, the user may reproject only selected portions of the product (spatial or parameter subsetting). The software may also be used to convert MODIS products to file formats (generic binary and GeoTIFF) that are more readily compatible with existing software packages. The MODIS land products distributed by the Land Processes Distributed Active Archive Center (LP DAAC) are in the Hierarchical Data Format - Earth Observing System (HDF-EOS), developed for the NASA EOS Program by the National Center for Supercomputing Applications at the University of Illinois at Urbana-Champaign. Each HDF-EOS file comprises one or more science data sets (SDSs) corresponding to geophysical or biophysical parameters. Metadata are embedded in the HDF file and also contained in a .met file that is associated with each HDF-EOS file. The MRT supports 8-bit, 16-bit, and 32-bit integer data (both signed and unsigned), as well as 32-bit float data. The data type of the output is the same as the data type of each corresponding input SDS.
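
    For users who prefer to script such subsetting rather than use the MRT, an HDF-EOS granule's SDSs can be listed and subsetted with the pyhdf package, as sketched below. The filename, SDS name, and scaling treatment here are placeholders; the actual names and conventions must be taken from the product documentation.

        from pyhdf.SD import SD, SDC

        # Open a MODIS land granule (filename is a placeholder).
        hdf = SD("MOD13A2.A2020001.h10v05.061.hdf", SDC.READ)
        print(list(hdf.datasets().keys()))         # available SDS names

        sds = hdf.select("1 km 16 days NDVI")      # parameter subsetting: one SDS
        block = sds[0:200, 0:200]                  # spatial subsetting: a tile corner
        attrs = sds.attributes()                   # e.g., scale_factor, _FillValue
        ndvi = block / attrs.get("scale_factor", 1.0)
        sds.endaccess()
        hdf.end()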

  12. An Improved Method for Measuring Chromatin-binding Dynamics Using Time-dependent Formaldehyde Crosslinking

    PubMed Central

    Hoffman, Elizabeth A.; Zaidi, Hussain; Shetty, Savera J.; Bekiranov, Stefan; Auble, David T.

    2018-01-01

    Formaldehyde crosslinking is widely used in combination with chromatin immunoprecipitation (ChIP) to measure the locations along DNA and relative levels of transcription factor (TF)-DNA interactions in vivo. However, the measurements that are typically made do not provide unambiguous information about the dynamic properties of these interactions. We have developed a method to estimate binding kinetic parameters from time-dependent formaldehyde crosslinking data, called crosslinking kinetics (CLK) analysis. Cultures of yeast cells are crosslinked with formaldehyde for various periods of time, yielding the relative ChIP signal at particular loci. We fit the data using the mass-action CLK model to extract kinetic parameters of the TF-chromatin interaction, including the on- and off-rates and crosslinking rate. From the on- and off-rate we obtain the occupancy and residence time. The following protocol is the second iteration of this method, CLKv2, updated with improved crosslinking and quenching conditions, more information about crosslinking rates, and systematic procedures for modeling the observed kinetic regimes. CLKv2 analysis has been applied to investigate the binding behavior of the TATA-binding protein (TBP), and a selected subset of other TFs. The protocol was developed using yeast cells, but may be applicable to cells from other organisms as well. PMID:29682595
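
    The fitting workflow behind a CLK-style analysis can be illustrated with a deliberately simplified model: a single-exponential approach to a plateau, fitted with scipy's curve_fit. This toy curve is not the mass-action CLK model (which couples on- and off-rates with a crosslinking rate), and the time points and signal below are simulated.

        import numpy as np
        from scipy.optimize import curve_fit

        def clk_curve(t, A, k):
            # Toy model: ChIP signal approaches plateau A with apparent rate k.
            return A * (1.0 - np.exp(-k * t))

        t = np.array([0.5, 1, 2, 5, 10, 20, 40, 60])   # crosslinking times (minutes)
        rng = np.random.default_rng(2)
        signal = clk_curve(t, 1.0, 0.15) + 0.03 * rng.normal(size=t.size)

        (A, k), pcov = curve_fit(clk_curve, t, signal, p0=[1.0, 0.1])
        print(f"plateau A = {A:.2f}, apparent rate k = {k:.3f} per minute")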

  13. CD127 and CD25 Expression Defines CD4+ T Cell Subsets That Are Differentially Depleted during HIV Infection1

    PubMed Central

    Dunham, Richard M.; Cervasi, Barbara; Brenchley, Jason M.; Albrecht, Helmut; Weintrob, Amy; Sumpter, Beth; Engram, Jessica; Gordon, Shari; Klatt, Nichole R.; Frank, Ian; Sodora, Donald L.; Douek, Daniel C.; Paiardini, Mirko; Silvestri, Guido

    2009-01-01

    Decreased CD4+ T cell counts are the best marker of disease progression during HIV infection. However, CD4+ T cells are heterogeneous in phenotype and function, and it is unknown how preferential depletion of specific CD4+ T cell subsets influences disease severity. CD4+ T cells can be classified into three subsets by the expression of receptors for two T cell-tropic cytokines, IL-2 (CD25) and IL-7 (CD127). The CD127+CD25low/− subset includes IL-2-producing naive and central memory T cells; the CD127−CD25− subset includes mainly effector T cells expressing perforin and IFN-γ; and the CD127lowCD25high subset includes FoxP3-expressing regulatory T cells. Herein we investigated how the proportions of these T cell subsets are changed during HIV infection. When compared with healthy controls, HIV-infected patients show a relative increase in CD4+CD127−CD25− T cells that is related to an absolute decline of CD4+CD127+CD25low/− T cells. Interestingly, this expansion of CD4+CD127− T cells was not observed in naturally SIV-infected sooty mangabeys. The relative expansion of CD4+CD127−CD25− T cells correlated directly with the levels of total CD4+ T cell depletion and immune activation. CD4+CD127−CD25− T cells were not selectively resistant to HIV infection as levels of cell-associated virus were similar in all non-naive CD4+ T cell subsets. These data indicate that, during HIV infection, specific changes in the fraction of CD4+ T cells expressing CD25 and/or CD127 are associated with disease progression. Further studies will determine whether monitoring the three subsets of CD4+ T cells defined based on the expression of CD25 and CD127 should be used in the clinical management of HIV-infected individuals. PMID:18390743

  14. Exploring NASA OMI Level 2 Data With Visualization

    NASA Technical Reports Server (NTRS)

    Wei, Jennifer; Yang, Wenli; Johnson, James; Zhao, Peisheng; Gerasimov, Irina; Pham, Long; Vicente, Gilberto

    2014-01-01

    Satellite data products are important for a wide variety of applications that can bring far-reaching benefits to the science community and the broader society. These benefits can best be achieved if the satellite data are well utilized and interpreted, such as model inputs from satellites, or extreme events (such as volcanic eruptions, dust storms, etc.). Unfortunately, this is not always the case, despite the abundance and relative maturity of numerous satellite data products provided by NASA and other organizations. Such obstacles may be avoided by allowing users to visualize satellite data as "images", with accurate pixel-level (Level-2) information, including pixel coverage area delineation and science team recommended quality screening for individual geophysical parameters. We present a prototype service from the Goddard Earth Sciences Data and Information Services Center (GES DISC) supporting Aura OMI Level-2 data with GIS-like capabilities. Functionality includes selecting data sources (e.g., multiple parameters under the same scene, like NO2 and SO2, or the same parameter with different aggregation methods, like NO2 in OMNO2G and OMNO2D products), user-defined area-of-interest and temporal extents, zooming, panning, overlaying, sliding, and data subsetting, reformatting, and reprojection. The system will allow any user-defined portal interface (front-end) to connect to our backend server with OGC standard-compliant Web Mapping Service (WMS) and Web Coverage Service (WCS) calls. This back-end service should greatly enhance its expandability to integrate additional outside data/map sources.

  15. Plasma properties of driver gas following interplanetary shocks observed by ISEE-3

    NASA Technical Reports Server (NTRS)

    Zwickl, R. D.; Asbridge, J. R.; Bame, S. J.; Feldman, W. C.; Gosling, J. T.; Smith, E. J.

    1983-01-01

    Plasma fluid parameters were calculated from solar wind and magnetic field data to determine the characteristic properties of driver gas following a select subset of interplanetary shocks. Of 54 shocks observed from August 1978 to February 1980, 9 contained a well-defined driver gas that was clearly identifiable by a discontinuous decrease in the average proton temperature. While helium enhancements were present downstream of the shock in all 9 of these events, only about half of them contained simultaneous changes in the two quantities. Simultaneously with the drop in proton temperature, the helium and electron temperatures decreased abruptly. In some cases the proton temperature depression was accompanied by a moderate increase in magnetic field magnitude with an unusually low variance, by a small decrease in the variance of the bulk velocity, and by an increase in the ratio of parallel to perpendicular temperature. The cold driver gas usually displayed a bidirectional flow of suprathermal solar wind electrons at higher energies.

  16. Trigger learning and ECG parameter customization for remote cardiac clinical care information system.

    PubMed

    Bashir, Mohamed Ezzeldin A; Lee, Dong Gyu; Li, Meijing; Bae, Jang-Whan; Shon, Ho Sun; Cho, Myung Chan; Ryu, Keun Ho

    2012-07-01

    Coronary heart disease has been identified as the largest single cause of death around the world. The aim of a cardiac clinical information system is to achieve the best possible diagnosis of cardiac arrhythmias by electronic data processing. A cardiac information system designed to offer remote monitoring of patients who need continuous follow-up is in high demand. However, intra- and interpatient electrocardiogram (ECG) morphological descriptors vary over time, and computational limits pose significant challenges for practical implementations. The former requires that the classification model be adjusted continuously, and the latter requires a reduction in the number and types of ECG features, and thus the computational burden, necessary to classify different arrhythmias. We propose the use of adaptive learning to automatically train the classifier on up-to-date ECG data, and employ adaptive feature selection to define unique feature subsets pertinent to different types of arrhythmia. Experimental results show that this hybrid technique outperforms conventional approaches and is, therefore, a promising new intelligent diagnostic tool.

  17. The Swift GRB Host Galaxy Legacy Survey

    NASA Astrophysics Data System (ADS)

    Perley, Daniel

    2015-08-01

    I will describe the Swift Host Galaxy Legacy Survey (SHOALS), a comprehensive multiwavelength program to characterize the demographics of the GRB host population and its redshift evolution from z=0 to z=7. Using unbiased selection criteria we have designated a subset of 119 Swift gamma-ray bursts which are now being targeted with intensive observational follow-up. Deep Spitzer imaging of every field has already been obtained and analyzed, with major programs ongoing at Keck, GTC, Gemini, VLT, and Magellan to obtain complementary optical/NIR photometry and spectroscopy to enable full SED modeling and derivation of fundamental physical parameters such as mass, extinction, and star-formation rate. Using these data I will present an unbiased measurement of the GRB host-galaxy luminosity and mass distributions and their evolution with redshift, compare GRB hosts to other star-forming galaxy populations, and discuss implications for the nature of the GRB progenitor and the ability of GRBs to serve as tools for measuring and studying cosmic star-formation in the distant universe.

  18. Maclisp extensions

    NASA Technical Reports Server (NTRS)

    Bawden, A.; Burke, G. S.; Hoffman, C. W.

    1981-01-01

    A common subset of selected facilities available in Maclisp and its derivatives (PDP-10 and Multics Maclisp, Lisp Machine Lisp (Zetalisp), and NIL) is described. The object is to aid in writing code which can run compatibly in more than one of these environments.

  19. A New Direction of Cancer Classification: Positive Effect of Low-Ranking MicroRNAs.

    PubMed

    Li, Feifei; Piao, Minghao; Piao, Yongjun; Li, Meijing; Ryu, Keun Ho

    2014-10-01

    Many studies based on microRNA (miRNA) expression profiles have shown a new aspect of cancer classification. Because one characteristic of miRNA expression data is its high dimensionality, feature selection methods have been used to facilitate dimensionality reduction. Existing feature selection methods share one shortcoming: they consider only feature-to-class relationships that are 1:1 or n:1. However, because one miRNA may influence more than one type of cancer, such miRNAs tend to be ranked low by traditional feature selection methods and are usually removed. Given the limited number of miRNAs, low-ranking miRNAs are also important to cancer classification. We considered both high- and low-ranking features to cover all relationships (1:1, n:1, 1:n, and m:n) in cancer classification. First, we used the correlation-based feature selection method to select the high-ranking miRNAs, and chose support vector machine, Bayes network, decision tree, k-nearest-neighbor, and logistic classifiers to construct cancer classifiers. Then, we chose the chi-square test, information gain, gain ratio, and Pearson's correlation feature selection methods to build the m:n feature subset, and used the selected miRNAs for cancer classification. The low-ranking miRNA expression profiles achieved higher classification accuracy than using only the high-ranking miRNAs from traditional feature selection methods. Our results demonstrate the positive effect of the m:n feature subset, and of low-ranking miRNAs, in cancer classification.
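
    A loose sketch of the multi-filter idea follows: several filter criteria each nominate a top-k subset, and the union is kept so that a feature ranked low by one criterion can still survive via another. It uses scikit-learn filters on synthetic data and is only an approximation of the paper's m:n construction.

        import numpy as np
        from sklearn.datasets import make_classification
        from sklearn.feature_selection import SelectKBest, chi2, f_classif, mutual_info_classif
        from sklearn.preprocessing import MinMaxScaler

        X, y = make_classification(n_samples=120, n_features=50, n_informative=8,
                                   random_state=0)     # stand-in expression profiles
        Xpos = MinMaxScaler().fit_transform(X)         # chi2 requires non-negative input

        selectors = {
            "chi2": SelectKBest(chi2, k=10).fit(Xpos, y),
            "anova_f": SelectKBest(f_classif, k=10).fit(X, y),
            "mutual_info": SelectKBest(mutual_info_classif, k=10).fit(X, y),
        }
        # Union subset: features any criterion ranks highly are retained, so
        # features relevant to more than one relationship are not discarded.
        union = sorted(set().union(*(np.flatnonzero(s.get_support())
                                     for s in selectors.values())))
        print(len(union), "features kept:", union)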

  1. Immune Reactions against Gene Gun Vaccines Are Differentially Modulated by Distinct Dendritic Cell Subsets in the Skin

    PubMed Central

    Deressa, Tekalign; Strandt, Helen; Florindo Pinheiro, Douglas; Mittermair, Roberta; Pizarro Pesado, Jennifer; Thalhamer, Josef; Hammerl, Peter; Stoecklinger, Angelika

    2015-01-01

    The skin accommodates multiple dendritic cell (DC) subsets with remarkable functional diversity. Immune reactions are initiated and modulated by the triggering of DC by pathogen-associated or endogenous danger signals. In contrast to these processes, the influence of intrinsic features of protein antigens on the strength and type of immune responses is much less understood. Therefore, we investigated the involvement of distinct DC subsets in immune reactions against two structurally different model antigens, E. coli beta-galactosidase (betaGal) and chicken ovalbumin (OVA) under otherwise identical conditions. After epicutaneous administration of the respective DNA vaccines with a gene gun, wild type mice induced robust immune responses against both antigens. However, ablation of langerin+ DC almost abolished IgG1 and cytotoxic T lymphocytes against betaGal but enhanced T cell and antibody responses against OVA. We identified epidermal Langerhans cells (LC) as the subset responsible for the suppression of anti-OVA reactions and found regulatory T cells critically involved in this process. In contrast, reactions against betaGal were not affected by the selective elimination of LC, indicating that this antigen required a different langerin+ DC subset. The opposing findings obtained with OVA and betaGal vaccines were not due to immune-modulating activities of either the plasmid DNA or the antigen gene products, nor did the differential cellular localization, size or dose of the two proteins account for the opposite effects. Thus, skin-borne protein antigens may be differentially handled by distinct DC subsets, and, in this way, intrinsic features of the antigen can participate in immune modulation. PMID:26030383

  2. Existence of CD8α-like dendritic cells with a conserved functional specialization and a common molecular signature in distant mammalian species.

    PubMed

    Contreras, Vanessa; Urien, Céline; Guiton, Rachel; Alexandre, Yannick; Vu Manh, Thien-Phong; Andrieu, Thibault; Crozat, Karine; Jouneau, Luc; Bertho, Nicolas; Epardaud, Mathieu; Hope, Jayne; Savina, Ariel; Amigorena, Sebastian; Bonneau, Michel; Dalod, Marc; Schwartz-Cornil, Isabelle

    2010-09-15

    The mouse lymphoid organ-resident CD8alpha(+) dendritic cell (DC) subset is specialized in Ag presentation to CD8(+) T cells. Recent evidence shows that mouse nonlymphoid tissue CD103(+) DCs and human blood DC Ag 3(+) DCs share similarities with CD8alpha(+) DCs. We address here whether the organization of DC subsets is conserved across mammals in terms of gene expression signatures, phenotypic characteristics, and functional specialization, independently of the tissue of origin. We study the DC subsets that migrate from the skin in the ovine species that, like all domestic animals, belongs to the Laurasiatheria, a distinct phylogenetic clade from the supraprimates (human/mouse). We demonstrate that the minor sheep CD26(+) skin lymph DC subset shares significant transcriptomic similarities with mouse CD8alpha(+) and human blood DC Ag 3(+) DCs. This allowed the identification of a common set of phenotypic characteristics for CD8alpha-like DCs in the three mammalian species (i.e., SIRP(lo), CADM1(hi), CLEC9A(hi), CD205(hi), XCR1(hi)). Compared to CD26(-) DCs, the sheep CD26(+) DCs show 1) potent stimulation of allogeneic naive CD8(+) T cells with high selective induction of the Ifngamma and Il22 genes; 2) dominant efficacy in activating specific CD8(+) T cells against exogenous soluble Ag; and 3) selective expression of functional pathways associated with high capacity for Ag cross-presentation. Our results unravel a unifying definition of the CD8alpha(+)-like DCs across mammalian species and identify molecular candidates that could be used for the design of vaccines applying to mammals in general.

  3. Minimizing the average distance to a closest leaf in a phylogenetic tree.

    PubMed

    Matsen, Frederick A; Gallagher, Aaron; McCoy, Connor O

    2013-11-01

    When performing an analysis on a collection of molecular sequences, it can be convenient to reduce the number of sequences under consideration while maintaining some characteristic of a larger collection of sequences. For example, one may wish to select a subset of high-quality sequences that represent the diversity of a larger collection of sequences. One may also wish to specialize a large database of characterized "reference sequences" to a smaller subset that is as close as possible on average to a collection of "query sequences" of interest. Such a representative subset can be useful whenever one wishes to find a set of reference sequences that is appropriate to use for comparative analysis of environmentally derived sequences, such as for selecting "reference tree" sequences for phylogenetic placement of metagenomic reads. In this article, we formalize these problems in terms of the minimization of the Average Distance to the Closest Leaf (ADCL) and investigate algorithms to perform the relevant minimization. We show that the greedy algorithm is not effective, show that a variant of the Partitioning Around Medoids (PAM) heuristic gets stuck in local minima, and develop an exact dynamic programming approach. Using this exact program we note that the performance of PAM appears to be good for simulated trees, and is faster than the exact algorithm for small trees. On the other hand, the exact program gives solutions for all numbers of leaves less than or equal to the given desired number of leaves, whereas PAM only gives a solution for the prespecified number of leaves. Via application to real data, we show that the ADCL criterion chooses chimeric sequences less often than random subsets, whereas the maximization of phylogenetic diversity chooses them more often than random. These algorithms have been implemented in publicly available software.
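
    Given a leaf-to-leaf distance matrix, the ADCL criterion itself is simple to state, as the sketch below shows; for very small trees it can even be minimized by exhaustive search. The authors' exact dynamic program and PAM-style heuristic are not reproduced here, and the "tree" distances are simulated from random points.

        import numpy as np
        from itertools import combinations

        def adcl(D, refs, queries):
            # Average, over query leaves, of the distance to the closest
            # selected reference leaf.
            return D[np.ix_(queries, refs)].min(axis=1).mean()

        def best_subset(D, queries, k):
            # Exhaustive minimization; feasible only for small trees. The paper
            # develops an exact dynamic program and a PAM variant instead.
            leaves = range(D.shape[0])
            return min(combinations(leaves, k), key=lambda s: adcl(D, list(s), queries))

        rng = np.random.default_rng(3)
        pts = rng.uniform(size=(10, 2))
        D = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)   # toy distances
        queries = list(range(10))
        sel = best_subset(D, queries, k=3)
        print("selected leaves:", sel, " ADCL:", round(adcl(D, list(sel), queries), 3))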

  4. Langerin+ dermal dendritic cells are critical for CD8+ T cell activation and IgH γ-1 class switching in response to gene gun vaccines.

    PubMed

    Stoecklinger, Angelika; Eticha, Tekalign D; Mesdaghi, Mehrnaz; Kissenpfennig, Adrien; Malissen, Bernard; Thalhamer, Josef; Hammerl, Peter

    2011-02-01

    The C-type lectin langerin/CD207 was originally discovered as a specific marker for epidermal Langerhans cells (LC). Recently, additional and distinct subsets of langerin(+) dendritic cells (DC) have been identified in lymph nodes and peripheral tissues of mice. Although the role of LC for immune activation or modulation is now being discussed controversially, other langerin(+) DC appear crucial for protective immunity in a growing set of infection and vaccination models. In knock-in mice that express the human diphtheria toxin receptor under control of the langerin promoter, injection of diphtheria toxin ablates LC for several weeks whereas other langerin(+) DC subsets are replenished within just a few days. Thus, by careful timing of diphtheria toxin injections selective states of deficiency in either LC only or all langerin(+) cells can be established. Taking advantage of this system, we found that, unlike selective LC deficiency, ablation of all langerin(+) DC abrogated the activation of IFN-γ-producing and cytolytic CD8(+) T cells after gene gun vaccination. Moreover, we identified migratory langerin(+) dermal DC as the subset that directly activated CD8(+) T cells in lymph nodes. Langerin(+) DC were also critical for IgG1 but not IgG2a Ab induction, suggesting differential polarization of CD4(+) T helper cells by langerin(+) or langerin-negative DC, respectively. In contrast, protein vaccines administered with various adjuvants induced IgG1 independently of langerin(+) DC. Taken together, these findings reflect a highly specialized division of labor between different DC subsets both with respect to Ag encounter as well as downstream processes of immune activation.

  5. Cerebellins are differentially expressed in selective subsets of neurons throughout the brain.

    PubMed

    Seigneur, Erica; Südhof, Thomas C

    2017-10-15

    Cerebellins are secreted hexameric proteins that form tripartite complexes with the presynaptic cell-adhesion molecules neurexins or 'deleted-in-colorectal-cancer', and the postsynaptic glutamate-receptor-related proteins GluD1 and GluD2. These tripartite complexes are thought to regulate synapses. However, cerebellins are expressed in multiple isoforms whose relative distributions and overall functions are not understood. Three of the four cerebellins, Cbln1, Cbln2, and Cbln4, autonomously assemble into homohexamers, whereas Cbln3 requires Cbln1 for assembly and secretion. Here, we show that Cbln1, Cbln2, and Cbln4 are abundantly expressed in nearly all brain regions, but exhibit strikingly different expression patterns and developmental dynamics. Using newly generated knockin reporter mice for Cbln2 and Cbln4, we find that Cbln2 and Cbln4 are not universally expressed in all neurons, but only in specific subsets of neurons. For example, Cbln2 and Cbln4 are broadly expressed in largely non-overlapping subpopulations of excitatory cortical neurons, but only sparse expression was observed in excitatory hippocampal neurons of the CA1 or CA3 region. Similarly, Cbln2 and Cbln4 are selectively expressed, respectively, in inhibitory interneurons and excitatory mitral projection neurons of the main olfactory bulb; here, these two classes of neurons form dendrodendritic reciprocal synapses with each other. A few brain regions, such as the nucleus of the lateral olfactory tract, exhibit astoundingly high Cbln2 expression levels. Viewed together, our data show that cerebellins are abundantly expressed in relatively small subsets of neurons, suggesting specific roles restricted to subsets of synapses. © 2017 Wiley Periodicals, Inc.

  6. Ultracool dwarf benchmarks with Gaia primaries

    NASA Astrophysics Data System (ADS)

    Marocco, F.; Pinfield, D. J.; Cook, N. J.; Zapatero Osorio, M. R.; Montes, D.; Caballero, J. A.; Gálvez-Ortiz, M. C.; Gromadzki, M.; Jones, H. R. A.; Kurtev, R.; Smart, R. L.; Zhang, Z.; Cabrera Lavers, A. L.; García Álvarez, D.; Qi, Z. X.; Rickard, M. J.; Dover, L.

    2017-10-01

    We explore the potential of Gaia for the field of benchmark ultracool/brown dwarf companions, and present the results of an initial search for metal-rich/metal-poor systems. A simulated population of resolved ultracool dwarf companions to Gaia primary stars is generated and assessed. Of the order of ˜24 000 companions should be identifiable outside of the Galactic plane (|b| > 10 deg) with large-scale ground- and space-based surveys including late M, L, T and Y types. Our simulated companion parameter space covers 0.02 ≤ M/M⊙ ≤ 0.1, 0.1 ≤ age/Gyr ≤ 14 and -2.5 ≤ [Fe/H] ≤ 0.5, with systems required to have a false alarm probability <10-4, based on projected separation and expected constraints on common distance, common proper motion and/or common radial velocity. Within this bulk population, we identify smaller target subsets of rarer systems whose collective properties still span the full parameter space of the population, as well as systems containing primary stars that are good age calibrators. Our simulation analysis leads to a series of recommendations for candidate selection and observational follow-up that could identify ˜500 diverse Gaia benchmarks. As a test of the veracity of our methodology and simulations, our initial search uses UKIRT Infrared Deep Sky Survey and Sloan Digital Sky Survey to select secondaries, with the parameters of primaries taken from Tycho-2, Radial Velocity Experiment, Large sky Area Multi-Object fibre Spectroscopic Telescope and Tycho-Gaia Astrometric Solution. We identify and follow up 13 new benchmarks. These include M8-L2 companions, with metallicity constraints ranging in quality, but robust in the range -0.39 ≤ [Fe/H] ≤ +0.36, and with projected physical separation in the range 0.6 < s/kau < 76. Going forward, Gaia offers a very high yield of benchmark systems, from which diverse subsamples may be able to calibrate a range of foundational ultracool/sub-stellar theory and observation.

  7. SU-F-T-618: Evaluation of a Mono-Isocentric Treatment Planning Software for Stereotactic Radiosurgery of Multiple Brain Metastases

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sham, E; Sattarivand, M; Mulroy, L

    Purpose: To evaluate planning performance of an automated treatment planning software (BrainLAB Elements) for stereotactic radiosurgery (SRS) of multiple brain metastases. Methods: Brainlab’s Multiple Metastases Elements (MME) uses a single-isocenter technique to treat up to 10 cranial planning target volumes (PTVs). The planning algorithm of the MME accounts for multiple PTVs overlapping with one another in the beam's eye view (BEV) and automatically selects a subset of all overlapping PTVs on each arc to spare normal brain tissue. The algorithm also optimizes collimator angles, margins between multi-leaf collimators (MLCs) and PTVs, and monitor units (MUs) using minimization of the conformity index (CI) for all targets. Planning performance was evaluated by comparing the MME-calculated treatment plan parameters with the same parameters calculated with Volumetric Modulated Arc Therapy (VMAT) optimization on Varian’s Eclipse platform. Results: Figures 1 to 3 compare several treatment plan outcomes calculated with the MME and VMAT for 5 clinical multi-target SRS patient plans. Prescribed target dose was volume-dependent and defined based on the RTOG recommendation. For a total of 18 PTVs, mean values of the CI, PITV, and GI were comparable between the MME and VMAT within one standard deviation (σ). However, the MME-calculated MDPD was larger than the same VMAT-calculated parameter. While both techniques delivered similar maximum point doses to the critical cranial structures and total MUs for the 5 patient plans, the MME required less treatment planning time by an order of magnitude compared to VMAT. Conclusion: The MME and VMAT produce similar plan qualities in terms of MUs, target dose conformation, and OAR dose sparing. While the selective use of PTVs for arc optimization with the MME significantly reduces the total planning time in comparison to VMAT, the target dose homogeneity is compromised due to its simplified inverse planning algorithm.

  8. Prediction of chemotherapeutic response of colorectal liver metastases with dynamic gadolinium-DTPA-enhanced MRI and localized 19F MRS pharmacokinetic studies of 5-fluorouracil.

    PubMed

    van Laarhoven, H W M; Klomp, D W J; Rijpkema, M; Kamm, Y L M; Wagener, D J Th; Barentsz, J O; Punt, C J A; Heerschap, A

    2007-04-01

    Systemic chemotherapy is effective in only a subset of patients with metastasized colorectal cancer. Therefore, early selection of patients who are most likely to benefit from chemotherapy is desirable. Response to treatment may be determined by the delivery of the drug to the tumor, retention of the drug in the tumor and by the amount of intracellular uptake, metabolic activation and catabolism, as well as other factors. The first aim of this study was to investigate the predictive value of DCE-MRI with the contrast agent Gd-DTPA for tumor response to first-line chemotherapy in patients with liver metastases of colorectal cancer. The second aim was to investigate the predictive value of 5-fluorouracil (FU) uptake, retention and catabolism as measured by localized (19)F MRS for tumor response to FU therapy. Since FU uptake, retention and metabolism may depend on tumor vascularization, the relationship between (19)F MRS and the DCE-MRI parameters k(ep), K(trans) and v(e) was also examined (1). In this study, 37 patients were included. The kinetic parameters of DCE-MRI, k(ep), K(trans) and v(e), before start of treatment did not predict tumor response after 2 months, suggesting that the delivery of chemotherapy by tumor vasculature is not a major factor determining response in first-line treatment. No evident correlations between (19)F MRS parameters and tumor response were found. This suggests that in liver metastases that are not selected on the basis of their tumor diameter, FU uptake and catabolism are not limiting factors for response. The transfer constant K(trans), as measured by DCE-MRI before start of treatment, was negatively correlated with FU half-life in the liver metastases, which suggests that, in metastases with a larger tumor blood flow or permeability surface area product, FU is rapidly washed out from the tumor. Copyright © 2006 John Wiley & Sons, Ltd.

  9. Spectroscopy Made Easy: A New Tool for Fitting Observations with Synthetic Spectra

    NASA Technical Reports Server (NTRS)

    Valenti, J. A.; Piskunov, N.

    1996-01-01

    We describe a new software package that may be used to determine stellar and atomic parameters by matching observed spectra with synthetic spectra generated from parameterized atmospheres. A nonlinear least squares algorithm is used to solve for any subset of allowed parameters, which include atomic data (log gf and van der Waals damping constants), model atmosphere specifications (T_eff, log g), elemental abundances, and radial, turbulent, and rotational velocities. LTE synthesis software handles discontiguous spectral intervals and complex atomic blends. As a demonstration, we fit 26 Fe I lines in the NSO Solar Atlas (Kurucz et al.), determining various solar and atomic parameters.
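
    A toy analogue of this fitting loop, using scipy's least_squares: synthesize a spectrum from a few Gaussian absorption lines and solve for a subset of parameters (line depths and a shared width). Real stellar synthesis involves model atmospheres and radiative transfer; the wavelengths, line list, and noise level below are all illustrative.

        import numpy as np
        from scipy.optimize import least_squares

        wav = np.linspace(6540.0, 6580.0, 400)           # wavelength grid (Angstrom)
        centers = np.array([6546.0, 6562.8, 6572.0])     # fixed line positions

        def synth(params):
            # Continuum-normalized spectrum: unity minus Gaussian absorption lines.
            depths, width = params[:3], params[3]
            spec = np.ones_like(wav)
            for c, d in zip(centers, depths):
                spec -= d * np.exp(-0.5 * ((wav - c) / width) ** 2)
            return spec

        rng = np.random.default_rng(4)
        observed = synth([0.3, 0.7, 0.2, 0.8]) + 0.01 * rng.normal(size=wav.size)

        fit = least_squares(lambda p: synth(p) - observed, x0=[0.5, 0.5, 0.5, 1.0])
        print("fitted depths and width:", np.round(fit.x, 3))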

  10. Spatial and temporal study of nitrate concentration in groundwater by means of coregionalization

    USGS Publications Warehouse

    D'Agostino, V.; Greene, E.A.; Passarella, G.; Vurro, M.

    1998-01-01

    Spatial and temporal behavior of hydrochemical parameters in groundwater can be studied using tools provided by geostatistics. The cross-variogram can be used to measure the spatial increments between observations at two given times as a function of distance (spatial structure). Taking into account the existence of such a spatial structure, two different data sets (sampled at two different times), representing concentrations of the same hydrochemical parameter, can be analyzed by cokriging in order to reduce the uncertainty of the estimation. In particular, if one of the two data sets is a subset of the other (that is, an undersampled set), cokriging allows us to study the spatial distribution of the hydrochemical parameter at that time, while also considering the statistical characteristics of the full data set established at a different time. This paper presents an application of cokriging by using temporal subsets to study the spatial distribution of nitrate concentration in the aquifer of the Lucca Plain, central Italy. Three data sets of nitrate concentration in groundwater were collected during three different periods in 1991. The first set was from 47 wells, but the second and the third are undersampled and represent 28 and 27 wells, respectively. Comparing the result of cokriging with ordinary kriging showed an improvement of the uncertainty in terms of reducing the estimation variance. The application of cokriging to the undersampled data sets reduced the uncertainty in estimating nitrate concentration and at the same time decreased the cost of the field sampling and laboratory analysis.
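
    The empirical cross-variogram at the heart of this approach can be estimated directly from paired observations at the same stations, as in the numpy sketch below. The well coordinates and the two sampling campaigns are simulated; a real cokriging study would then fit a permissible model to these estimates.

        import numpy as np

        def cross_variogram(coords, z1, z2, bins):
            # gamma_12(h) = 0.5 * E[(z1(x) - z1(x+h)) * (z2(x) - z2(x+h))],
            # estimated within distance bins over all station pairs.
            d = np.linalg.norm(coords[:, None] - coords[None, :], axis=-1)
            i, j = np.triu_indices(len(coords), k=1)
            h = d[i, j]
            cross = 0.5 * (z1[i] - z1[j]) * (z2[i] - z2[j])
            idx = np.digitize(h, bins)
            return np.array([cross[idx == b].mean() if np.any(idx == b) else np.nan
                             for b in range(1, len(bins))])

        rng = np.random.default_rng(5)
        coords = rng.uniform(0, 10, size=(47, 2))        # e.g., 47 wells
        trend = coords @ np.array([0.3, 0.2])            # shared spatial structure
        z_t1 = trend + 0.2 * rng.normal(size=47)         # campaign 1
        z_t2 = trend + 0.2 * rng.normal(size=47)         # campaign 2
        print(cross_variogram(coords, z_t1, z_t2, bins=np.linspace(0, 8, 9)))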

  11. MISR Regional SAMUM Imagery Overview

    Atmospheric Science Data Center

    2016-08-24

    Visualizations of select MISR Level 3 data for a special regional version used in support of the SAMUM Campaign. More information about the Level 1 and Level 2 products subsetted for the SAMUM ...

  12. MISR Regional VBBE Imagery Overview

    Atmospheric Science Data Center

    2016-08-24

    Visualizations of select MISR Level 3 data for a special regional version used in support of the VBBE Campaign. More information about the Level 1 and Level 2 products subsetted for the VBBE ...

  13. A comparative analysis of biclustering algorithms for gene expression data

    PubMed Central

    Eren, Kemal; Deveci, Mehmet; Küçüktunç, Onur; Çatalyürek, Ümit V.

    2013-01-01

    The need to analyze high-dimension biological data is driving the development of new data mining methods. Biclustering algorithms have been successfully applied to gene expression data to discover local patterns, in which a subset of genes exhibit similar expression levels over a subset of conditions. However, it is not clear which algorithms are best suited for this task. Many algorithms have been published in the past decade, most of which have been compared only to a small number of algorithms. Surveys and comparisons exist in the literature, but because of the large number and variety of biclustering algorithms, they are quickly outdated. In this article we partially address this problem of evaluating the strengths and weaknesses of existing biclustering methods. We used the BiBench package to compare 12 algorithms, many of which were recently published or have not been extensively studied. The algorithms were tested on a suite of synthetic data sets to measure their performance on data with varying conditions, such as different bicluster models, varying noise, varying numbers of biclusters and overlapping biclusters. The algorithms were also tested on eight large gene expression data sets obtained from the Gene Expression Omnibus. Gene Ontology enrichment analysis was performed on the resulting biclusters, and the best enrichment terms are reported. Our analyses show that the biclustering method and its parameters should be selected based on the desired model, whether that model allows overlapping biclusters, and its robustness to noise. In addition, we observe that the biclustering algorithms capable of finding more than one model are more successful at capturing biologically relevant clusters. PMID:22772837
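
    As one concrete instance of the algorithm family compared in the article, the sketch below runs scikit-learn's SpectralCoclustering on a synthetic matrix with planted biclusters and scores recovery with the Jaccard-based consensus measure. This is a single simple algorithm and metric, not the BiBench suite used by the authors.

        import numpy as np
        from sklearn.cluster import SpectralCoclustering
        from sklearn.datasets import make_biclusters
        from sklearn.metrics import consensus_score

        # Expression-like matrix with four planted biclusters.
        data, rows, cols = make_biclusters(shape=(120, 80), n_clusters=4,
                                           noise=5, random_state=0)

        model = SpectralCoclustering(n_clusters=4, random_state=0).fit(data)
        score = consensus_score((model.rows_, model.columns_), (rows, cols))
        print(f"consensus score: {score:.2f}")   # 1.0 means perfect recovery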

  14. An adaptive design for updating the threshold value of a continuous biomarker.

    PubMed

    Spencer, Amy V; Harbron, Chris; Mander, Adrian; Wason, James; Peers, Ian

    2016-11-30

    Potential predictive biomarkers are often measured on a continuous scale, but in practice, a threshold value to divide the patient population into biomarker 'positive' and 'negative' is desirable. Early phase clinical trials are increasingly using biomarkers for patient selection, but at this stage, it is likely that little will be known about the relationship between the biomarker and the treatment outcome. We describe a single-arm trial design with adaptive enrichment, which can increase power to demonstrate efficacy within a patient subpopulation, the parameters of which are also estimated. Our design enables us to learn about the biomarker and optimally adjust the threshold during the study, using a combination of generalised linear modelling and Bayesian prediction. At the final analysis, a binomial exact test is carried out, allowing the hypothesis that 'no population subset exists in which the novel treatment has a desirable response rate' to be tested. Through extensive simulations, we are able to show increased power over fixed threshold methods in many situations without increasing the type-I error rate. We also show that estimates of the threshold, which defines the population subset, are unbiased and often more precise than those from fixed threshold studies. We provide an example of the method applied (retrospectively) to publicly available data from a study of the use of tamoxifen after mastectomy by the German Breast Study Group, where progesterone receptor is the biomarker of interest. © 2016 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.
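
    A much-reduced sketch of the threshold-updating idea: model response probability as a smooth function of the biomarker, then place the threshold where the predicted response rate clears the target. The trial's Bayesian prediction, enrichment rules, and final exact binomial test are omitted, and the data and target rate below are simulated assumptions.

        import numpy as np
        from sklearn.linear_model import LogisticRegression

        rng = np.random.default_rng(6)
        biomarker = rng.uniform(0, 10, size=200)
        p_true = 1 / (1 + np.exp(-(biomarker - 6)))        # response rises with biomarker
        response = rng.binomial(1, p_true)

        glm = LogisticRegression().fit(biomarker.reshape(-1, 1), response)

        target_rate = 0.60                                  # desirable response rate
        grid = np.linspace(0, 10, 501)
        prob = glm.predict_proba(grid.reshape(-1, 1))[:, 1]
        threshold = grid[np.argmax(prob >= target_rate)]    # first point clearing target
        print(f"updated biomarker threshold: {threshold:.2f}")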

  15. On Least Squares Fitting Nonlinear Submodels.

    ERIC Educational Resources Information Center

    Bechtel, Gordon G.

    Three simplifying conditions are given for obtaining least squares (LS) estimates for a nonlinear submodel of a linear model. If these are satisfied, and if the subset of nonlinear parameters may be LS fit to the corresponding LS estimates of the linear model, then one attains the desired LS estimates for the entire submodel. Two illustrative…

  16. 14 CFR Appendix M to Part 25 - Fuel Tank System Flammability Reduction Means

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ..., ground testing, and flight testing, or any combination of these, that: (1) Validate the parameters used... either ground or takeoff/climb phases of flight during warm days. The analysis must consider the following conditions. (1) The analysis must use the subset of those flights that begin with a sea level...

  17. 14 CFR Appendix M to Part 25 - Fuel Tank System Flammability Reduction Means

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ..., ground testing, and flight testing, or any combination of these, that: (1) Validate the parameters used... either ground or takeoff/climb phases of flight during warm days. The analysis must consider the following conditions. (1) The analysis must use the subset of those flights that begin with a sea level...

  18. 14 CFR Appendix M to Part 25 - Fuel Tank System Flammability Reduction Means

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ..., ground testing, and flight testing, or any combination of these, that: (1) Validate the parameters used... either ground or takeoff/climb phases of flight during warm days. The analysis must consider the following conditions. (1) The analysis must use the subset of those flights that begin with a sea level...

  19. Non-robust dynamic inferences from macroeconometric models: Bifurcation stratification of confidence regions

    NASA Astrophysics Data System (ADS)

    Barnett, William A.; Duzhak, Evgeniya Aleksandrovna

    2008-06-01

    Grandmont [J.M. Grandmont, On endogenous competitive business cycles, Econometrica 53 (1985) 995-1045] found that the parameter space of the most classical dynamic models is stratified into an infinite number of subsets supporting an infinite number of different kinds of dynamics, from monotonic stability at one extreme to chaos at the other extreme, and with many forms of multiperiodic dynamics in between. The econometric implications of Grandmont’s findings are particularly important, if bifurcation boundaries cross the confidence regions surrounding parameter estimates in policy-relevant models. Stratification of a confidence region into bifurcated subsets seriously damages robustness of dynamical inferences. Recently, interest in policy in some circles has moved to New-Keynesian models. As a result, in this paper we explore bifurcation within the class of New-Keynesian models. We develop the econometric theory needed to locate bifurcation boundaries in log-linearized New-Keynesian models with Taylor policy rules or inflation-targeting policy rules. Central results needed in this research are our theorems on the existence and location of Hopf bifurcation boundaries in each of the cases that we consider.

  20. Inversion of parameters for semiarid regions by a neural network

    NASA Technical Reports Server (NTRS)

    Zurk, Lisa M.; Davis, Daniel; Njoku, Eni G.; Tsang, Leung; Hwang, Jenq-Neng

    1992-01-01

    Microwave brightness temperatures obtained from a passive radiative transfer model are inverted through use of a neural network. The model is applicable to semiarid regions and produces dual-polarized brightness temperatures for 6.6-, 10.7-, and 37-GHz frequencies. A range of temperatures is generated by varying three geophysical parameters over acceptable ranges: soil moisture, vegetation moisture, and soil temperature. A multilayered perceptron (MLP) neural network is trained with a subset of the generated temperatures, and the remaining temperatures are inverted using a backpropagation method. Several synthetic terrains are devised and inverted by the network under local constraints. All the inversions show good agreement with the original geophysical parameters, falling within 5 percent of the actual value of the parameter range.
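
    The inversion recipe (train a network on forward-model simulations, then map brightness temperatures back to geophysical parameters) can be sketched as below with scikit-learn's MLPRegressor. A smooth toy function stands in for the passive radiative transfer model, and the channel structure, parameter ranges, and network size are invented for illustration.

        import numpy as np
        from sklearn.neural_network import MLPRegressor
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import StandardScaler

        rng = np.random.default_rng(7)
        n = 2000
        # Soil moisture, vegetation moisture, soil temperature (toy ranges).
        params = rng.uniform([0.05, 0.1, 270.0], [0.35, 3.0, 310.0], size=(n, 3))

        def forward(p):
            # Toy stand-in for the radiative transfer model: six "channels"
            # (dual-polarized 6.6, 10.7, 37 GHz) as smooth functions of p.
            sm, vw, ts = p[:, 0], p[:, 1], p[:, 2]
            base = ts[:, None] * (1 - 0.5 * sm[:, None])
            atten = np.exp(-0.1 * vw[:, None] * np.arange(1, 7))
            return base * atten + rng.normal(scale=0.5, size=(len(p), 6))

        Tb = forward(params)
        net = make_pipeline(StandardScaler(),
                            MLPRegressor(hidden_layer_sizes=(64, 64),
                                         max_iter=2000, random_state=0))
        net.fit(Tb[:1500], params[:1500])                 # train on a subset
        err = np.abs(net.predict(Tb[1500:]) - params[1500:]).mean(axis=0)
        print("mean absolute inversion error per parameter:", np.round(err, 3))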

  1. LS Bound based gene selection for DNA microarray data.

    PubMed

    Zhou, Xin; Mao, K Z

    2005-04-15

    One problem with discriminant analysis of DNA microarray data is that each sample is represented by quite a large number of genes, and many of them are irrelevant, insignificant or redundant to the discriminant problem at hand. Methods for selecting important genes are, therefore, of much significance in microarray data analysis. In the present study, a new criterion, called LS Bound measure, is proposed to address the gene selection problem. The LS Bound measure is derived from the leave-one-out procedure of LS-SVMs (least squares support vector machines), and, as an upper bound on leave-one-out classification results, it reflects to some extent the generalization performance of gene subsets. We applied this LS Bound measure for gene selection on two benchmark microarray datasets: colon cancer and leukemia. We also compared the LS Bound measure with other evaluation criteria, including the well-known Fisher's ratio and Mahalanobis class separability measure, and other published gene selection algorithms, including Weighting factor and SVM Recursive Feature Elimination. The strength of the LS Bound measure is that it provides gene subsets leading to more accurate classification results than the filter method while its computational complexity is at the level of the filter method. A companion website can be accessed at http://www.ntu.edu.sg/home5/pg02776030/lsbound/. The website contains: (1) the source code of the gene selection algorithm; (2) the complete set of tables and figures regarding the experimental study; (3) proof of the inequality (9).
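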

  2. Issues Relating to Selective Reporting When Including Non-Randomized Studies in Systematic Reviews on the Effects of Healthcare Interventions

    ERIC Educational Resources Information Center

    Norris, Susan L.; Moher, David; Reeves, Barnaby C.; Shea, Beverley; Loke, Yoon; Garner, Sarah; Anderson, Laurie; Tugwell, Peter; Wells, George

    2013-01-01

    Background: Selective outcome and analysis reporting (SOR and SAR) occur when only a subset of outcomes measured and analyzed in a study is fully reported, and are an important source of potential bias. Key methodological issues: We describe what is known about the prevalence and effects of SOR and SAR in both randomized controlled trials (RCTs)…

  3. A Novel Inhibitor Of Topoisomerase I is Selectively Toxic For A Subset of Non-Small Cell Lung Cancer Cell Lines | Office of Cancer Genomics

    Cancer.gov

    SW044248, identified through a screen for chemicals that are selectively toxic for NSCLC cell lines, was found to rapidly inhibit macromolecular synthesis in sensitive, but not in insensitive cells. SW044248 killed approximately 15% of a panel of 74 NSCLC cell lines and was non-toxic to immortalized human bronchial cell lines.

  4. A method for simplifying the analysis of traffic accidents injury severity on two-lane highways using Bayesian networks.

    PubMed

    Mujalli, Randa Oqab; de Oña, Juan

    2011-10-01

    This study describes a method for reducing the number of variables frequently considered in modeling the severity of traffic accidents. The method's efficiency is assessed by constructing Bayesian networks (BN). It is based on a two-stage selection process. Several variable selection algorithms, commonly used in data mining, are applied in order to select subsets of variables. BNs are built using the selected subsets and their performance is compared with the original BN (with all the variables) using five indicators. The BNs that improve the indicators' values are further analyzed for identifying the most significant variables (accident type, age, atmospheric factors, gender, lighting, number of injured, and occupant involved). A new BN is built using these variables, where the results of the indicators indicate, in most of the cases, a statistically significant improvement with respect to the original BN. It is possible to reduce the number of variables used to model traffic accidents injury severity through BNs without reducing the performance of the model. The study provides safety analysts with a methodology for minimizing the number of variables used, so that the injury severity of traffic accidents can be determined efficiently without reducing the performance of the model. Copyright © 2011 Elsevier Ltd. All rights reserved.

  5. Feature selection in feature network models: finding predictive subsets of features with the Positive Lasso.

    PubMed

    Frank, Laurence E; Heiser, Willem J

    2008-05-01

    A set of features is the basis for the network representation of proximity data achieved by feature network models (FNMs). Features are binary variables that characterize the objects in an experiment, with some measure of proximity as response variable. Sometimes features are provided by theory and play an important role in the construction of the experimental conditions. In some research settings, the features are not known a priori. This paper shows how to generate features in this situation and how to select an adequate subset of features that takes into account a good compromise between model fit and model complexity, using a new version of least angle regression that restricts coefficients to be non-negative, called the Positive Lasso. It will be shown that features can be generated efficiently with Gray codes that are naturally linked to the FNMs. The model selection strategy makes use of the fact that FNM can be considered as univariate multiple regression model. A simulation study shows that the proposed strategy leads to satisfactory results if the number of objects is less than or equal to 22. If the number of objects is larger than 22, the number of features selected by our method exceeds the true number of features in some conditions.
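
    The Positive Lasso itself is not packaged in standard libraries, but scikit-learn's Lasso supports the same non-negativity restriction on coefficients, which allows a minimal sketch of the feature-selection idea. The binary feature matrix below is random stand-in data rather than genuine FNM features; sweeping the regularization strength trades model fit against model complexity.

        import numpy as np
        from sklearn.linear_model import Lasso

        rng = np.random.default_rng(1)

        # Stand-in data: 40 binary features for 200 objects, with the response
        # driven by a sparse non-negative subset of them (as in an FNM).
        X = rng.integers(0, 2, size=(200, 40)).astype(float)
        true_w = np.zeros(40)
        true_w[[3, 7, 19]] = [1.5, 0.8, 2.0]
        y = X @ true_w + rng.normal(0, 0.1, 200)

        # Lasso with coefficients restricted to be non-negative; larger alpha
        # means a sparser (less complex) feature subset.
        for alpha in [0.01, 0.05, 0.2]:
            model = Lasso(alpha=alpha, positive=True).fit(X, y)
            selected = np.flatnonzero(model.coef_ > 1e-8)
            print(f"alpha={alpha}: selected features {selected.tolist()}")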

  6. Toward a model for lexical access based on acoustic landmarks and distinctive features

    NASA Astrophysics Data System (ADS)

    Stevens, Kenneth N.

    2002-04-01

    This article describes a model in which the acoustic speech signal is processed to yield a discrete representation of the speech stream in terms of a sequence of segments, each of which is described by a set (or bundle) of binary distinctive features. These distinctive features specify the phonemic contrasts that are used in the language, such that a change in the value of a feature can potentially generate a new word. This model is a part of a more general model that derives a word sequence from this feature representation, the words being represented in a lexicon by sequences of feature bundles. The processing of the signal proceeds in three steps: (1) Detection of peaks, valleys, and discontinuities in particular frequency ranges of the signal leads to identification of acoustic landmarks. The type of landmark provides evidence for a subset of distinctive features called articulator-free features (e.g., [vowel], [consonant], [continuant]). (2) Acoustic parameters are derived from the signal near the landmarks to provide evidence for the actions of particular articulators, and acoustic cues are extracted by sampling selected attributes of these parameters in these regions. The selection of cues that are extracted depends on the type of landmark and on the environment in which it occurs. (3) The cues obtained in step (2) are combined, taking context into account, to provide estimates of "articulator-bound" features associated with each landmark (e.g., [lips], [high], [nasal]). These articulator-bound features, combined with the articulator-free features in (1), constitute the sequence of feature bundles that forms the output of the model. Examples of cues that are used, and justification for this selection, are given, as well as examples of the process of inferring the underlying features for a segment when there is variability in the signal due to enhancement gestures (recruited by a speaker to make a contrast more salient) or due to overlap of gestures from neighboring segments.

  7. Optimal selection of markers for validation or replication from genome-wide association studies.

    PubMed

    Greenwood, Celia M T; Rangrej, Jagadish; Sun, Lei

    2007-07-01

    With reductions in genotyping costs and the fast pace of improvements in genotyping technology, it is not uncommon for the individuals in a single study to undergo genotyping using several different platforms, where each platform may contain different numbers of markers selected via different criteria. For example, a set of cases and controls may be genotyped at markers in a small set of carefully selected candidate genes, and shortly thereafter, the same cases and controls may be used for a genome-wide single nucleotide polymorphism (SNP) association study. After such initial investigations, often, a subset of "interesting" markers is selected for validation or replication. Specifically, by validation, we refer to the investigation of associations between the selected subset of markers and the disease in independent data. However, it is not obvious how to choose the best set of markers for this validation. There may be a prior expectation that some sets of genotyping data are more likely to contain real associations. For example, it may be more likely for markers in plausible candidate genes to show disease associations than markers in a genome-wide scan. Hence, it would be desirable to select proportionally more markers from the candidate gene set. When a fixed number of markers are selected for validation, we propose an approach for identifying an optimal marker-selection configuration by basing the approach on minimizing the stratified false discovery rate. We illustrate this approach using a case-control study of colorectal cancer from Ontario, Canada, and we show that this approach leads to substantial reductions in the estimated false discovery rates in the Ontario dataset for the selected markers, as well as reductions in the expected false discovery rates for the proposed validation dataset. Copyright 2007 Wiley-Liss, Inc.
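
    One way to read "minimizing the stratified false discovery rate" is as an allocation problem: given a fixed validation budget of m markers and per-stratum p-values, choose how many top markers to take from each stratum so the combined estimated FDR is smallest. The sketch below implements that reading with the usual N*p_(k)-style estimate of false discoveries; it is an illustration under assumed data, not the authors' exact procedure.

        import numpy as np

        def est_false_discoveries(p_sorted, k):
            # Taking the k smallest of N p-values, the expected number of
            # false discoveries under the null is roughly N * p_(k).
            return 0.0 if k == 0 else len(p_sorted) * p_sorted[k - 1]

        def best_allocation(p_candidate, p_genome, m):
            """Split a budget of m markers between a candidate-gene stratum and
            a genome-wide stratum to minimize the combined estimated FDR."""
            pc, pg = np.sort(p_candidate), np.sort(p_genome)
            best = None
            for k1 in range(0, m + 1):
                k2 = m - k1
                if k1 > len(pc) or k2 > len(pg):
                    continue
                fdr = (est_false_discoveries(pc, k1) +
                       est_false_discoveries(pg, k2)) / m
                if best is None or fdr < best[0]:
                    best = (fdr, k1, k2)
            return best

        rng = np.random.default_rng(2)
        # Stand-in p-values: the candidate-gene stratum is enriched for signal.
        p_cand = np.concatenate([rng.uniform(0, 1e-3, 20), rng.uniform(0, 1, 480)])
        p_gwas = np.concatenate([rng.uniform(0, 1e-3, 10), rng.uniform(0, 1, 99990)])
        fdr, k1, k2 = best_allocation(p_cand, p_gwas, 30)
        print(f"take {k1} candidate-gene and {k2} genome-wide markers "
              f"(estimated FDR {fdr:.3f})")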

  8. Liver Transplantation for Hepatocellular Carcinoma beyond Milan Criteria: Multidisciplinary Approach to Improve Outcome

    PubMed Central

    Kornberg, A.

    2014-01-01

    The implementation of the Milan criteria (MC) in 1996 has dramatically improved prognosis after liver transplantation (LT) in patients with hepatocellular carcinoma (HCC). Liver transplantation has, thereby, become the standard therapy for patients with “early-stage” HCC on liver cirrhosis. The MC were consequently adopted by United Network of Organ Sharing (UNOS) and Eurotransplant for prioritization of patients with HCC. Recent advancements in the knowledge about tumor biology, radiographic imaging techniques, locoregional interventional treatments, and immunosuppressive medications have raised a critical discussion, if the MC might be too restrictive and unjustified keeping away many patients from potentially curative LT. Numerous transplant groups have, therefore, increasingly focussed on a stepwise expansion of selection criteria, mainly based on tumor macromorphology, such as size and number of HCC nodules. Against the background of a dramatic shortage of donor organs, however, simple expansion of tumor macromorphology may not be appropriate to create a safe extended criteria system. In contrast, rather the implementation of reliable prognostic parameters of tumor biology into selection process prior to LT is mandatory. Furthermore, a multidisciplinary approach of pre-, peri-, and posttransplant modulating of the tumor and/or the patient has to be established for improving prognosis in this special subset of patients. PMID:27335840

  9. Creating potentiometric surfaces from combined water well and oil well data in the midcontinent of the United States

    USGS Publications Warehouse

    Gianoutsos, Nicholas J.; Nelson, Philip H.

    2013-01-01

    For years, hydrologists have defined potentiometric surfaces using measured hydraulic-head values in water wells from aquifers. Down-dip, the oil and gas industry is also interested in the formation pressures of many of the same geologic formations for the purpose of hydrocarbon recovery. In oil and gas exploration, drillstem tests (DSTs) provide the formation pressure for a given depth interval in a well. These DST measurements can be used to calculate hydraulic-head values in deep hydrocarbon-bearing formations in areas where water wells do not exist. Unlike hydraulic-head measurements in water wells, which have a low number of problematic data points (outliers), only a small subset of the DST data measures true formation pressures. Using 3D imaging capabilities to view and clean the data, we have developed a process to estimate potentiometric surfaces from erratic DST data sets of hydrocarbon-bearing formations in the midcontinent of the U.S. The analysis indicates that the potentiometric surface is more readily defined through human interpretation of the chaotic DST data sets rather than through the application of filtering and geostatistical analysis. The data are viewed as a series of narrow, 400-mile-long swaths and a 2D viewer is used to select a subset of hydraulic-head values that represent the potentiometric surface. The user-selected subsets for each swath are then combined into one data set for each formation. These data are then joined with the hydraulic-head values from water wells to define the 3D potentiometric surfaces. The final product is an interactive, 3D digital display containing: (1) the subsurface structure of the formation, (2) the cluster of DST-derived hydraulic head values, (3) the user-selected subset of hydraulic-head values that define the potentiometric surface, (4) the hydraulic-head measurements from the corresponding shallow aquifer, (5) the resulting potentiometric surface encompassing both oil and gas wells and water wells, and (6) the land surface elevation of the region. Examples from the midcontinent of the United States, specifically Kansas, Oklahoma, and parts of adjacent states, illustrate the process.

  10. Core Hunter 3: flexible core subset selection.

    PubMed

    De Beukelaer, Herman; Davenport, Guy F; Fack, Veerle

    2018-05-31

    Core collections provide genebank curators and plant breeders with a way to reduce the size of their collections and populations, while minimizing the impact on genetic diversity and allele frequency. Many methods have been proposed to generate core collections, often using distance metrics to quantify the similarity of two accessions, based on genetic marker data or phenotypic traits. Core Hunter is a multi-purpose core subset selection tool that uses local search algorithms to generate subsets relying on one or more metrics, including several distance metrics and allelic richness. In version 3 of Core Hunter (CH3) we have incorporated two new, improved methods for summarizing distances to quantify diversity or representativeness of the core collection. A comparison of CH3 and Core Hunter 2 (CH2) showed that these new metrics can be effectively optimized with less complex algorithms, as compared to those used in CH2. CH3 is more effective at maximizing the improved diversity metric than CH2, still ensures a high average and minimum distance, and is faster for large datasets. Using CH3, a simple stochastic hill-climber is able to find highly diverse core collections, and the more advanced parallel tempering algorithm further increases the quality of the core and further reduces variability across independent samples. We also evaluate the ability of CH3 to simultaneously maximize diversity, and either representativeness or allelic richness, and compare the results with those of the GDOpt and SimEli methods. CH3 can sample cores that are as representative as those of GDOpt, which was specifically designed for this purpose, and is able to construct cores that are simultaneously more diverse, and either are more representative or have higher allelic richness, than those obtained by SimEli. In version 3, Core Hunter has been updated to include two new core subset selection metrics that construct cores for representativeness or diversity, with improved performance. It combines the strengths of other methods and outperforms them, as it (simultaneously) optimizes a variety of metrics. In addition, CH3 is an improvement over CH2, with the option to use genetic marker data or phenotypic traits, or both, and improved speed. Core Hunter 3 is freely available on http://www.corehunter.org.

  11. TCR tuning of T cell subsets.

    PubMed

    Cho, Jae-Ho; Sprent, Jonathan

    2018-05-01

    After selection in the thymus, the post-thymic T cell compartments comprise heterogenous subsets of naive and memory T cells that make continuous T cell receptor (TCR) contact with self-ligands bound to major histocompatibility complex (MHC) molecules. T cell recognition of self-MHC ligands elicits covert TCR signaling and is particularly important for controlling survival of naive T cells. Such tonic TCR signaling is tightly controlled and maintains the cells in a quiescent state to avoid autoimmunity. Here, we review how naive and memory T cells are differentially tuned and wired for TCR sensitivity to self and foreign ligands. © 2018 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  12. Method and apparatus for wavefront sensing

    DOEpatents

    Bahk, Seung-Whan

    2016-08-23

    A method of measuring characteristics of a wavefront of an incident beam includes obtaining an interferogram associated with the incident beam passing through a transmission mask and Fourier transforming the interferogram to provide a frequency domain interferogram. The method also includes selecting a subset of harmonics from the frequency domain interferogram, individually inverse Fourier transforming each of the subset of harmonics to provide a set of spatial domain harmonics, and extracting a phase profile from each of the set of spatial domain harmonics. The method further includes removing phase discontinuities in the phase profile, rotating the phase profile, and reconstructing a phase front of the wavefront of the incident beam.
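
    A compact numpy sketch of the Fourier steps named in the abstract, applied to a synthetic two-beam interferogram: transform, select one harmonic, inverse-transform, and extract an unwrapped phase profile. The carrier frequency, mask size, and quadratic test wavefront are arbitrary choices for the illustration, not values from the patent.

        import numpy as np

        # Synthetic interferogram: a tilt-carrier fringe pattern modulated by
        # an unknown smooth phase (here a quadratic wavefront).
        N = 256
        y, x = np.mgrid[0:N, 0:N] / N
        phase = 6 * ((x - 0.5) ** 2 + (y - 0.5) ** 2)   # wavefront to recover
        carrier = 2 * np.pi * 32 * x                    # tilt carrier, 32 cycles
        igram = 1 + np.cos(carrier + phase)

        # Fourier transform and select the +1st-order harmonic near (0, +32).
        F = np.fft.fftshift(np.fft.fft2(igram))
        cy, cx = N // 2, N // 2 + 32
        mask = np.zeros_like(F)
        mask[cy - 12:cy + 12, cx - 12:cx + 12] = 1.0
        harmonic = F * mask

        # Inverse transform, remove the carrier, and extract the phase profile.
        field = np.fft.ifft2(np.fft.ifftshift(harmonic))
        wrapped = np.angle(field * np.exp(-1j * carrier))
        row = np.unwrap(wrapped[N // 2])   # remove 2*pi discontinuities (1-D)
        row -= row.min()
        print("recovered peak-to-valley phase (rad):", round(np.ptp(row), 2))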

  13. New error calibration tests for gravity models using subset solutions and independent data - Applied to GEM-T3

    NASA Technical Reports Server (NTRS)

    Lerch, F. J.; Nerem, R. S.; Chinn, D. S.; Chan, J. C.; Patel, G. B.; Klosko, S. M.

    1993-01-01

    A new method has been developed to provide a direct test of the error calibrations of gravity models based on actual satellite observations. The basic approach projects the error estimates of the gravity model parameters onto satellite observations, and the results of these projections are then compared with data residuals computed from the orbital fits. To allow specific testing of the gravity error calibrations, subset solutions are computed based on the data set and data weighting of the gravity model. The approach is demonstrated using GEM-T3 to show that the gravity error estimates are well calibrated and that reliable predictions of orbit accuracies can be achieved for independent orbits.

  14. Human attention filters for single colors

    PubMed Central

    Sun, Peng; Chubb, Charles; Wright, Charles E.; Sperling, George

    2016-01-01

    The visual images in the eyes contain much more information than the brain can process. An important selection mechanism is feature-based attention (FBA). FBA is best described by attention filters that specify precisely the extent to which items containing attended features are selectively processed and the extent to which items that do not contain the attended features are attenuated. The centroid-judgment paradigm enables quick, precise measurements of such human perceptual attention filters, analogous to transmission measurements of photographic color filters. Subjects use a mouse to locate the centroid—the center of gravity—of a briefly displayed cloud of dots and receive precise feedback. A subset of dots is distinguished by some characteristic, such as a different color, and subjects judge the centroid of only the distinguished subset (e.g., dots of a particular color). The analysis efficiently determines the precise weight in the judged centroid of dots of every color in the display (i.e., the attention filter for the particular attended color in that context). We report 32 attention filters for single colors. Attention filters that discriminate one saturated hue from among seven other equiluminant distractor hues are extraordinarily selective, achieving attended/unattended weight ratios >20:1. Attention filters for selecting a color that differs in saturation or lightness from distractors are much less selective than attention filters for hue (given equal discriminability of the colors), and their filter selectivities are proportional to the discriminability distance of neighboring colors, whereas in the same range hue attention-filter selectivity is virtually independent of discriminability. PMID:27791040

  15. Variable Selection through Correlation Sifting

    NASA Astrophysics Data System (ADS)

    Huang, Jim C.; Jojic, Nebojsa

    Many applications of computational biology require a variable selection procedure to sift through a large number of input variables and select some smaller number that influence a target variable of interest. For example, in virology, only some small number of viral protein fragments influence the nature of the immune response during viral infection. Due to the large number of variables to be considered, a brute-force search for the subset of variables is in general intractable. To approximate this, methods based on ℓ1-regularized linear regression have been proposed and have been found to be particularly successful. It is well understood however that such methods fail to choose the correct subset of variables if these are highly correlated with other "decoy" variables. We present a method for sifting through sets of highly correlated variables which leads to higher accuracy in selecting the correct variables. The main innovation is a filtering step that reduces correlations among variables to be selected, making the ℓ1-regularization effective for datasets on which many methods for variable selection fail. The filtering step changes both the values of the predictor variables and output values by projections onto components obtained through a computationally-inexpensive principal components analysis. In this paper we demonstrate the usefulness of our method on synthetic datasets and on novel applications in virology. These include HIV viral load analysis based on patients' HIV sequences and immune types, as well as the analysis of seasonal variation in influenza death rates based on the regions of the influenza genome that undergo diversifying selection in the previous season.
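
    The filtering idea can be sketched under an assumed interpretation of the description above: remove the leading principal components (which carry the shared, correlation-inducing variation) from both the predictors and the response, then run l1-regularized regression on the filtered data. This is an illustrative reading on invented data, not the authors' exact algorithm.

        import numpy as np
        from sklearn.decomposition import PCA
        from sklearn.linear_model import Lasso

        rng = np.random.default_rng(3)

        # Stand-in data: a latent factor makes many "decoy" variables highly
        # correlated with the two truly influential ones.
        n, p = 200, 50
        latent = rng.normal(size=(n, 1))
        X = 0.9 * latent + 0.3 * rng.normal(size=(n, p))
        beta = np.zeros(p)
        beta[[4, 17]] = [1.0, 0.8]
        y = X @ beta + 0.1 * rng.normal(size=n)

        # Filtering step: project out the top principal component(s) from both
        # the predictor matrix and the output values.
        k = 1
        pca = PCA(n_components=k).fit(X)
        scores = pca.transform(X)                        # (n, k) PC scores
        X_filt = (X - pca.mean_) - scores @ pca.components_
        y_c = y - y.mean()
        y_filt = y_c - scores @ np.linalg.lstsq(scores, y_c, rcond=None)[0]

        # l1-regularization is now effective on the decorrelated data.
        coef = Lasso(alpha=0.02).fit(X_filt, y_filt).coef_
        print("selected variables:", np.flatnonzero(np.abs(coef) > 1e-6).tolist())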

  16. Systematic Improvement of Potential-Derived Atomic Multipoles and Redundancy of the Electrostatic Parameter Space.

    PubMed

    Jakobsen, Sofie; Jensen, Frank

    2014-12-09

    We assess the accuracy of force field (FF) electrostatics at several levels of approximation from the standard model using fixed partial charges to conformation-specific multipole fits including up to quadrupole moments. Potential-derived point charges and multipoles are calculated using least-squares methods for a total of ∼1000 different conformations of the 20 natural amino acids. As opposed to standard charge-fitting schemes, the procedure presented in the current work employs fitting points placed on a single isodensity surface, since the electrostatic potential (ESP) on such a surface determines the ESP at all points outside this surface. We find that the effect of multipoles beyond partial atomic charges is of the same magnitude as the effect due to neglecting conformational dependency (i.e., polarizability), suggesting that the two effects should be included at the same level in FF development. The redundancy at both the partial charge and multipole levels of approximation is quantified. We present an algorithm which stepwise reduces or increases the dimensionality of the charge or multipole parameter space and provides an upper limit of the ESP error that can be obtained at a given truncation level. Thereby, we can identify a reduced set of multipole moments corresponding to ∼40% of the total number of multipoles. This subset of parameters provides a significant improvement in the representation of the ESP compared to the simple point charge model and close to the accuracy obtained using the complete multipole parameter space. The selection of the ∼40% most important multipole sites is highly transferable among different conformations, and we find that quadrupoles are of high importance for atoms involved in π-bonding, since the anisotropic electric field generated in such regions requires a large degree of flexibility.
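
    The least-squares fit behind potential-derived charges can be shown in a few lines: build the Coulomb design matrix between fitting points on an isodensity-like surface and the atomic sites, then solve for the charges under a total-charge constraint. The toy geometry, the spherical "surface", and the reference ESP are all invented for the illustration (atomic units throughout); the paper's multipole terms are omitted.

        import numpy as np

        rng = np.random.default_rng(4)

        # Toy molecule: three atomic sites (a bent triatomic), atomic units.
        sites = np.array([[0.0, 0.0, 0.0], [1.8, 0.0, 0.0], [-0.5, 1.7, 0.0]])
        true_q = np.array([-0.8, 0.4, 0.4])   # charges generating the ESP

        # Fitting points on a sphere standing in for an isodensity surface.
        npts = 500
        v = rng.normal(size=(npts, 3))
        points = 4.0 * v / np.linalg.norm(v, axis=1, keepdims=True)

        # Design matrix A[i, j] = 1 / |r_i - s_j|; reference ESP plus noise.
        A = 1.0 / np.linalg.norm(points[:, None, :] - sites[None, :, :], axis=2)
        esp = A @ true_q + rng.normal(0, 1e-4, npts)

        # Constrained least squares: total charge fixed via a Lagrange block.
        K = np.zeros((4, 4))
        K[:3, :3] = A.T @ A
        K[3, :3] = K[:3, 3] = 1.0
        rhs = np.append(A.T @ esp, 0.0)        # last entry = total charge
        q = np.linalg.solve(K, rhs)[:3]
        print("fitted charges:", q.round(3))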

  17. Classification of toxicity effects of biotransformed hepatic drugs using whale optimized support vector machines.

    PubMed

    Tharwat, Alaa; Moemen, Yasmine S; Hassanien, Aboul Ella

    2017-04-01

    Measuring toxicity is an important step in drug development. Nevertheless, the current experimental methods used to estimate the drug toxicity are expensive and time-consuming, indicating that they are not suitable for large-scale evaluation of drug toxicity in the early stage of drug development. Hence, there is a high demand to develop computational models that can predict the drug toxicity risks. In this study, we used a dataset that consists of 553 drugs that are biotransformed in the liver. The toxic effects were calculated for the current data, namely mutagenic, tumorigenic, irritant, and reproductive effects. Each drug is represented by 31 chemical descriptors (features). The proposed model consists of three phases. In the first phase, the most discriminative subset of features is selected using rough set-based methods to reduce the classification time while improving the classification performance. In the second phase, different sampling methods such as Random Under-Sampling, Random Over-Sampling and Synthetic Minority Oversampling Technique (SMOTE), BorderLine SMOTE and Safe Level SMOTE are used to solve the problem of an imbalanced dataset. In the third phase, the Support Vector Machines (SVM) classifier is used to classify an unknown drug as toxic or non-toxic. SVM parameters such as the penalty parameter and kernel parameter have a great impact on the classification accuracy of the model. In this paper, the Whale Optimization Algorithm (WOA) has been proposed to optimize the parameters of the SVM, so that the classification error can be reduced. The experimental results proved that the proposed model achieved high sensitivity to all toxic effects. Overall, the high sensitivity of the WOA+SVM model indicates that it could be used for the prediction of drug toxicity in the early stage of drug development. Copyright © 2017 Elsevier Inc. All rights reserved.
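
    A compact sketch of the third-phase idea: tuning the SVM penalty parameter C and RBF kernel width gamma with a bare-bones whale-style optimizer (encircling and spiral updates only, none of the paper's problem-specific refinements). The dataset, search bounds, and swarm settings are illustrative assumptions, not the paper's configuration.

        import numpy as np
        from sklearn.datasets import load_breast_cancer
        from sklearn.model_selection import cross_val_score
        from sklearn.svm import SVC

        rng = np.random.default_rng(5)
        X, y = load_breast_cancer(return_X_y=True)

        # Search bounds in log10 space for (C, gamma).
        lo, hi = np.array([-2.0, -6.0]), np.array([4.0, 0.0])

        def fitness(pos):
            C, gamma = 10.0 ** pos
            return cross_val_score(SVC(C=C, gamma=gamma), X, y, cv=3).mean()

        # Bare-bones Whale Optimization Algorithm over the 2-D space.
        n_whales, n_iter = 8, 15
        pos = rng.uniform(lo, hi, size=(n_whales, 2))
        fit = np.array([fitness(p) for p in pos])
        best, best_fit = pos[fit.argmax()].copy(), fit.max()

        for t in range(n_iter):
            a = 2.0 * (1 - t / n_iter)       # decreases linearly from 2 to 0
            for i in range(n_whales):
                r1, r2, p_choice, u = rng.random(4)
                A, C_coef = 2 * a * r1 - a, 2 * r2
                if p_choice < 0.5:
                    # Encircle the best whale, or a random one when |A| >= 1.
                    ref = best if abs(A) < 1 else pos[rng.integers(n_whales)]
                    new = ref - A * np.abs(C_coef * ref - pos[i])
                else:
                    # Logarithmic spiral around the best whale, l in [-1, 1].
                    l = 2 * u - 1
                    D = np.abs(best - pos[i])
                    new = D * np.exp(l) * np.cos(2 * np.pi * l) + best
                pos[i] = np.clip(new, lo, hi)
                f = fitness(pos[i])
                if f > best_fit:
                    best_fit, best = f, pos[i].copy()

        print(f"best CV accuracy {best_fit:.3f} at C=10^{best[0]:.2f}, "
              f"gamma=10^{best[1]:.2f}")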

  18. REGIONAL PRIORITIZATIONS OF BIODIVERSITY AND SEDIMENT RETENTION FUNCTIONS: FINDINGS AND MANAGEMENT RELEVANCE

    EPA Science Inventory

    The synoptic approach is a landscape-level assessment tool for geographic prioritization of wetland protection and restoration efforts. Prioritization becomes necessary when effort, including time and money, is limited, forcing managers to select a subset of locations. The ap...

  19. AN ARCGIS TOOL FOR CREATING POPULATIONS OF WATERSHEDS

    EPA Science Inventory

    For the Landscape Investigations for Pesticides Study in the Midwest, the goal is to sample a representative subset of watersheds selected statistically from a target population of watersheds within the glaciated corn belt. This area stretches from Ohio to Iowa and includes parts...

  20. MISR Regional GoMACCS Imagery Overview

    Atmospheric Science Data Center

    2016-08-24

    ... Visualizations of select MISR Level 3 data for special regional ... version used in support of the GoMACCS Campaign. More information about the Level 1 and Level 2 products subsetted for the GoMACCS ...

  1. An evaluation of the genetic-matched pair study design using genome-wide SNP data from the European population.

    PubMed

    Lu, Timothy Tehua; Lao, Oscar; Nothnagel, Michael; Junge, Olaf; Freitag-Wolf, Sandra; Caliebe, Amke; Balascakova, Miroslava; Bertranpetit, Jaume; Bindoff, Laurence Albert; Comas, David; Holmlund, Gunilla; Kouvatsi, Anastasia; Macek, Milan; Mollet, Isabelle; Nielsen, Finn; Parson, Walther; Palo, Jukka; Ploski, Rafal; Sajantila, Antti; Tagliabracci, Adriano; Gether, Ulrik; Werge, Thomas; Rivadeneira, Fernando; Hofman, Albert; Uitterlinden, André Gerardus; Gieger, Christian; Wichmann, Heinz-Erich; Ruether, Andreas; Schreiber, Stefan; Becker, Christian; Nürnberg, Peter; Nelson, Matthew Roberts; Kayser, Manfred; Krawczak, Michael

    2009-07-01

    Genetic matching potentially provides a means to alleviate the effects of incomplete Mendelian randomization in population-based gene-disease association studies. We therefore evaluated the genetic-matched pair study design on the basis of genome-wide SNP data (309,790 markers; Affymetrix GeneChip Human Mapping 500K Array) from 2457 individuals, sampled at 23 different recruitment sites across Europe. Using pair-wise identity-by-state (IBS) as a matching criterion, we tried to derive a subset of markers that would allow identification of the best overall matching (BOM) partner for a given individual, based on the IBS status for the subset alone. However, our results suggest that, by following this approach, the prediction accuracy is only notably improved by the first 20 markers selected, and increases proportionally to the marker number thereafter. Furthermore, in a considerable proportion of cases (76.0%), the BOM of a given individual, based on the complete marker set, came from a different recruitment site than the individual itself. A second marker set, specifically selected for ancestry sensitivity using singular value decomposition, performed even more poorly and was no more capable of predicting the BOM than randomly chosen subsets. This leads us to conclude that, at least in Europe, the utility of the genetic-matched pair study design depends critically on the availability of comprehensive genotype information for both cases and controls.
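
    Pairwise identity-by-state matching, as used above, reduces to a mean allele-sharing score over markers. A minimal sketch with a random 0/1/2 genotype matrix (a stand-in for the Affymetrix 500K data) is shown below; real data would also need missing-genotype handling.

        import numpy as np

        rng = np.random.default_rng(6)

        # Stand-in genotypes: n individuals x m SNPs, coded 0/1/2
        # (minor-allele counts).
        n, m = 12, 2000
        G = rng.integers(0, 3, size=(n, m)).astype(float)

        # Pairwise IBS: per-SNP sharing is (2 - |g_i - g_j|) / 2, averaged
        # over all SNPs.
        diff = np.abs(G[:, None, :] - G[None, :, :])   # shape (n, n, m)
        ibs = (2.0 - diff).mean(axis=2) / 2.0

        # Best overall matching (BOM) partner: highest IBS, excluding self.
        np.fill_diagonal(ibs, -np.inf)
        bom = ibs.argmax(axis=1)
        for i in range(n):
            print(f"individual {i}: BOM partner {bom[i]} "
                  f"(IBS {ibs[i, bom[i]]:.3f})")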

  2. The cytotoxic action of the CD56+ fraction of cytokine-induced killer cells against a K562 cell line is mainly restricted to the natural killer cell subset.

    PubMed

    Chieregato, Katia; Zanon, Cristina; Castegnaro, Silvia; Bernardi, Martina; Amati, Eliana; Sella, Sabrina; Rodeghiero, Francesco; Astori, Giuseppe

    2017-01-01

    Cytokine-induced killer cells are polyclonal T cells generated ex vivo and comprise two main subsets: the CD56- fraction, possessing an alloreactive potential caused by T cells (CD3+CD56-), and the CD56+ fraction, characterised by a strong antitumour capacity induced by natural killer-like T cells (NK-like T, CD3+CD56+) and natural killer cells (NK, CD3-CD56+ bright). We investigated the cytotoxic action of selected CD56+ cell subpopulations against a human chronic myeloid leukaemia (K562) cell line. After immunomagnetic selection of the CD56+ cell fraction, NK bright cells (CD3-CD56+ bright) and two subsets of NK-like T cells (CD3+CD56+), called NK-like T CD56 dim and NK-like T CD56 bright, could be identified. The cytotoxic effect against K562 cells was mainly exerted by the NK bright subpopulation and was inversely correlated with the percentage of NK-like T CD56 dim cells in the culture. The lytic action appeared to be independent of cell degranulation as suggested by the lack of change in the expression of CD107a. We conclude that the cytotoxic action of CD56+ cells against a K562 cell line is mainly due to the NK cells.

  3. Predictive equations for the estimation of body size in seals and sea lions (Carnivora: Pinnipedia)

    PubMed Central

    Churchill, Morgan; Clementz, Mark T; Kohno, Naoki

    2014-01-01

    Body size plays an important role in pinniped ecology and life history. However, body size data is often absent for historical, archaeological, and fossil specimens. To estimate the body size of pinnipeds (seals, sea lions, and walruses) for today and the past, we used 14 commonly preserved cranial measurements to develop sets of single variable and multivariate predictive equations for pinniped body mass and total length. Principal components analysis (PCA) was used to test whether separate family specific regressions were more appropriate than single predictive equations for Pinnipedia. The influence of phylogeny was tested with phylogenetic independent contrasts (PIC). The accuracy of these regressions was then assessed using a combination of coefficient of determination, percent prediction error, and standard error of estimation. Three different methods of multivariate analysis were examined: bidirectional stepwise model selection using Akaike information criteria; all-subsets model selection using Bayesian information criteria (BIC); and partial least squares regression. The PCA showed clear discrimination between Otariidae (fur seals and sea lions) and Phocidae (earless seals) for the 14 measurements, indicating the need for family-specific regression equations. The PIC analysis found that phylogeny had a minor influence on relationship between morphological variables and body size. The regressions for total length were more accurate than those for body mass, and equations specific to Otariidae were more accurate than those for Phocidae. Of the three multivariate methods, the all-subsets approach required the fewest number of variables to estimate body size accurately. We then used the single variable predictive equations and the all-subsets approach to estimate the body size of two recently extinct pinniped taxa, the Caribbean monk seal (Monachus tropicalis) and the Japanese sea lion (Zalophus japonicus). Body size estimates using single variable regressions generally under or over-estimated body size; however, the all-subset regression produced body size estimates that were close to historically recorded body length for these two species. This indicates that the all-subset regression equations developed in this study can estimate body size accurately. PMID:24916814
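
    The all-subsets/BIC step generalizes directly: enumerate every subset of candidate measurements, fit ordinary least squares, and keep the subset with the lowest BIC. The sketch below uses random stand-in data in place of the pinniped cranial measurements.

        import itertools
        import numpy as np

        rng = np.random.default_rng(7)

        # Stand-in data: 6 candidate (log) cranial measurements and a
        # (log) body length driven by two of them.
        n, p = 80, 6
        X = rng.normal(size=(n, p))
        y = 1.2 * X[:, 0] + 0.7 * X[:, 3] + rng.normal(0, 0.3, n)

        def bic(cols):
            # OLS with intercept on the chosen columns;
            # BIC = n log(RSS / n) + k log(n).
            A = np.column_stack([np.ones(n), X[:, cols]])
            beta, *_ = np.linalg.lstsq(A, y, rcond=None)
            rss = np.sum((y - A @ beta) ** 2)
            return n * np.log(rss / n) + A.shape[1] * np.log(n)

        best = min((bic(list(c)), c)
                   for k in range(1, p + 1)
                   for c in itertools.combinations(range(p), k))
        print(f"best subset {best[1]} with BIC {best[0]:.1f}")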

  4. Effect of Cytomegalovirus Co-Infection on Normalization of Selected T-Cell Subsets in Children with Perinatally Acquired HIV Infection Treated with Combination Antiretroviral Therapy

    PubMed Central

    Kapetanovic, Suad; Aaron, Lisa; Montepiedra, Grace; Anthony, Patricia; Thuvamontolrat, Kasalyn; Pahwa, Savita; Burchett, Sandra; Weinberg, Adriana; Kovacs, Andrea

    2015-01-01

    Background We examined the effect of cytomegalovirus (CMV) co-infection and viremia on reconstitution of selected CD4+ and CD8+ T-cell subsets in perinatally HIV-infected (PHIV+) children ≥ 1-year old who participated in a partially randomized, open-label, 96-week combination antiretroviral therapy (cART)-algorithm study. Methods Participants were categorized as CMV-naïve, CMV-positive (CMV+) viremic, and CMV+ aviremic, based on blood, urine, or throat culture, CMV IgG and DNA polymerase chain reaction measured at baseline. At weeks 0, 12, 20 and 40, T-cell subsets including naïve (CD62L+CD45RA+; CD95-CD28+), activated (CD38+HLA-DR+) and terminally differentiated (CD62L-CD45RA+; CD95+CD28-) CD4+ and CD8+ T-cells were measured by flow cytometry. Results Of the 107 participants included in the analysis, 14% were CMV+ viremic; 49% CMV+ aviremic; 37% CMV-naïve. In longitudinal adjusted models, compared with CMV+ status, baseline CMV-naïve status was significantly associated with faster recovery of CD8+CD62L+CD45RA+% and CD8+CD95-CD28+% and faster decrease of CD8+CD95+CD28-%, independent of HIV VL response to treatment, cART regimen and baseline CD4%. Surprisingly, CMV status did not have a significant impact on longitudinal trends in CD8+CD38+HLA-DR+%. CMV status did not have a significant impact on any CD4+ T-cell subsets. Conclusions In this cohort of PHIV+ children, the normalization of naïve and terminally differentiated CD8+ T-cell subsets in response to cART was detrimentally affected by the presence of CMV co-infection. These findings may have implications for adjunctive treatment strategies targeting CMV co-infection in PHIV+ children, especially those that are now adults or reaching young adulthood and may have accelerated immunologic aging, increased opportunistic infections and aging diseases of the immune system. PMID:25794163

  5. Selective accumulation of langerhans-type dendritic cells in small airways of patients with COPD

    PubMed Central

    2010-01-01

    Background Dendritic cells (DC) linking innate and adaptive immune responses are present in human lungs, but the characterization of different subsets and their role in COPD pathogenesis remain to be elucidated. The aim of this study is to characterize and quantify pulmonary myeloid DC subsets in small airways of current and ex-smokers with or without COPD. Methods Myeloid DC were characterized using flowcytometry on single cell suspensions of digested human lung tissue. Immunohistochemical staining for langerin, BDCA-1, CD1a and DC-SIGN was performed on surgical resection specimens from 85 patients. Expression of factors inducing Langerhans-type DC (LDC) differentiation was evaluated by RT-PCR on total lung RNA. Results Two segregated subsets of tissue resident pulmonary myeloid DC were identified in single cell suspensions by flowcytometry: the langerin+ LDC and the DC-SIGN+ interstitial-type DC (intDC). LDC partially expressed the markers CD1a and BDCA-1, which are also present on their known blood precursors. In contrast, intDC did not express langerin, CD1a or BDCA-1, but were more closely related to monocytes. Quantification of DC in the small airways by immunohistochemistry revealed a higher number of LDC in current smokers without COPD and in COPD patients compared to never smokers and ex-smokers without COPD. Importantly, there was no difference in the number of LDC between current and ex-smoking COPD patients. In contrast, the number of intDC did not differ between study groups. Interestingly, the number of BDCA-1+ DC was significantly lower in COPD patients compared to never smokers and further decreased with the severity of the disease. In addition, the accumulation of LDC in the small airways significantly correlated with the expression of the LDC inducing differentiation factor activin-A. Conclusions Myeloid DC differentiation is altered in small airways of current smokers and COPD patients resulting in a selective accumulation of the LDC subset which correlates with the pulmonary expression of the LDC-inducing differentiation factor activin-A. This study identified the LDC subset as an interesting focus for future research in COPD pathogenesis. PMID:20307269

  6. The evolutionary sequence of post-starburst galaxies

    NASA Astrophysics Data System (ADS)

    Wilkinson, C. L.; Pimbblet, K. A.; Stott, J. P.

    2017-12-01

    There are multiple ways in which to select post-starburst galaxies in the literature. In this work, we present a study into how two well-used selection techniques have consequences on observable post-starburst galaxy parameters, such as colour, morphology and environment, and how this affects interpretations of their role in the galaxy duty cycle. We identify a master sample of Hδ-strong (EW(Hδ) > 3 Å) post-starburst galaxies from the value-added catalogue in the seventh data release of the Sloan Digital Sky Survey (SDSS DR7) over a redshift range 0.01 < z < 0.1. From this sample we select two E+A subsets, both having very little [O II] emission (EW([O II]) > -2.5 Å), with one having an additional cut on EW(Hα) (> -3 Å). We examine the differences in observables and AGN fractions to see what effect the Hα cut has on the properties of post-starburst galaxies and what these differing samples can tell us about the duty cycle of post-starburst galaxies. We find that Hδ-strong galaxies peak in the 'blue cloud', E+As in the 'green valley' and pure E+As in the 'red sequence'. We also find that pure E+As have a more early-type morphology and a higher fraction in denser environments compared with the Hδ-strong and E+A galaxies. These results suggest that there is an evolutionary sequence in the post-starburst phase from blue discy galaxies with residual star formation to passive red early-types.

  7. A genetic programming approach to oral cancer prognosis.

    PubMed

    Tan, Mei Sze; Tan, Jing Wei; Chang, Siow-Wee; Yap, Hwa Jen; Abdul Kareem, Sameem; Zain, Rosnah Binti

    2016-01-01

    The potential of genetic programming (GP) has been demonstrated in various fields in recent years. In the biomedical field, much GP research has focused on the recognition of cancerous cells and on gene expression profiling data. In this research, the aim is to study the performance of GP on survival prediction using a small oral cancer prognosis dataset, the first such study in the field of oral cancer prognosis. GP is applied to an oral cancer dataset that contains 31 cases collected from the Malaysia Oral Cancer Database and Tissue Bank System (MOCDTBS). The feature subsets that were automatically selected through GP were noted, and the influence of these subsets on the results of GP was recorded. In addition, a comparison between the performance of GP and that of the Support Vector Machine (SVM) and logistic regression (LR) is made in order to verify the predictive capabilities of GP. The result shows that GP performed best (average accuracy of 83.87% and average AUROC of 0.8341) when the features selected were smoking, drinking, chewing, histological differentiation of SCC, and oncogene p63. In addition, based on the comparison results, we found that GP outperformed the SVM and LR in oral cancer prognosis. Some of the features in the dataset are found to be statistically correlated: the accuracy of the GP prediction drops when one of the features in the best feature subset is excluded. Thus, GP provides an automatic feature selection function, which chooses features that are highly correlated with the prognosis of oral cancer. This makes GP an ideal prediction model for cancer clinical and genomic data that can be used to aid physicians in their decision-making stage of diagnosis or prognosis.

  8. CD3-T cell receptor modulation is selectively induced in CD8 but not CD4 lymphocytes cultured in agar.

    PubMed Central

    Oudrhiri, N; Farcet, J P; Gourdin, M F; M'Bemba, E; Gaulard, P; Katz, A; Divine, M; Galazka, A; Reyes, F

    1990-01-01

    The CD3-T cell receptor (TcR) complex is central to the immune response. Upon binding by specific ligands, internalized CD3-TcR molecules increase, and either T cell response or unresponsiveness may ensue depending on the triggering conditions. Using semi-solid agar culture, we have shown previously that quiescent CD4 but not CD8 lymphocytes generate clonal colonies under phytohaemagglutinin stimulation. Here we have demonstrated that the agar induces selective CD3-TcR modulation in the CD8 and not in the CD4 subset. CD8 lymphocytes preactivated in liquid culture and recultured in agar with exogenous recombinant interleukin-2 generate colonies with a modulated CD3-TcR surface expression. The peptides composing the CD3-TcR complex are synthesized in CD8 colonies as well as in CD4; however, the CD3 gamma chain is phosphorylated at a higher level in CD8 colonies. A component of the agar polymer, absent in agarose, appears to be the ligand that induces differential CD3-TcR modulation in the CD8 subset. In contrast to agar culture, CD8 colonies can be derived from quiescent CD8 lymphocytes in agarose. These CD8 colonies express unmodulated CD3-TcR. CD3-TcR modulation with anti-CD3 monoclonal antibody prior to culturing in agarose inhibits the colony formation. We conclude that given triggering conditions can result in both CD3-TcR modulation and inhibition of the proliferative response selectively in the CD8 lymphocyte subset and not in the CD4. PMID:2146997

  9. Mutagenicity and in vivo toxicity of combined particulate and semivolatile organic fractions of gasoline and diesel engine emissions.

    PubMed

    Seagrave, JeanClare; McDonald, Jacob D; Gigliotti, Andrew P; Nikula, Kristen J; Seilkop, Steven K; Gurevich, Michael; Mauderly, Joe L

    2002-12-01

    Exposure to engine emissions is associated with adverse health effects. However, little is known about the relative effects of emissions produced by different operating conditions, fuels, or technologies. Rapid screening techniques are needed to compare the biological effects of emissions with different characteristics. Here, we examined a set of engine emission samples using conventional bioassays. The samples included combined particulate material and semivolatile organic compound fractions of emissions collected from normal- and high-emitter gasoline and diesel vehicles collected at 72 degrees F, and from normal-emitter groups collected at 30 degrees F. The relative potency of the samples was determined by statistical analysis of the dose-response curves. All samples induced bacterial mutagenicity, with a 10-fold range of potency among the samples. Responses to intratracheal instillation in rats indicated generally parallel rankings of the samples by multiple endpoints reflecting cytotoxic, inflammatory, and lung parenchymal changes, allowing selection of a more limited set of parameters for future studies. The parameters selected to assess oxidative stress and macrophage function yielded little useful information. Responses to instillation indicated little difference in potency per unit of combined particulate material and semivolatile organic compound mass between normal-emitter gasoline and diesel vehicles, or between emissions collected at different temperatures. However, equivalent masses of emissions from high-emitter vehicles of both types were more potent than those from normal-emitters. While preliminary in terms of assessing contributions of different emissions to health hazards, the results indicate that a subset of this panel of assays will be useful in providing rapid, cost-effective feedback on the biological impact of modified technology.

  10. Improving Classification of Cancer and Mining Biomarkers from Gene Expression Profiles Using Hybrid Optimization Algorithms and Fuzzy Support Vector Machine

    PubMed Central

    Moteghaed, Niloofar Yousefi; Maghooli, Keivan; Garshasbi, Masoud

    2018-01-01

    Background: Gene expression data are characteristically high dimensional, with a small sample size relative to the number of features, and the variability inherent in biological processes adds to the difficulty of analysis. Selection of highly discriminative features decreases the computational cost and complexity of the classifier and improves its reliability for prediction of a new class of samples. Methods: The present study used hybrid particle swarm optimization and genetic algorithms for gene selection and a fuzzy support vector machine (SVM) as the classifier. Fuzzy logic is used to infer the importance of each sample in the training phase and decrease the outlier sensitivity of the system to increase the ability to generalize the classifier. A decision-tree algorithm was applied to the most frequent genes to develop a set of rules for each type of cancer. This improved the abilities of the algorithm by finding the best parameters for the classifier during the training phase without the need for trial-and-error by the user. The proposed approach was tested on four benchmark gene expression profiles. Results: Good results have been demonstrated for the proposed algorithm. The classification accuracy for leukemia data is 100%, for colon cancer is 96.67% and for breast cancer is 98%. The results show that the best kernel used in training the SVM classifier is the radial basis function. Conclusions: The experimental results show that the proposed algorithm can decrease the dimensionality of the dataset, determine the most informative gene subset, and improve classification accuracy using the optimal parameters of the classifier with no user intervention. PMID:29535919

  11. Targeted depletion of a MDSC subset unmasks pancreatic ductal adenocarcinoma to adaptive immunity

    PubMed Central

    Stromnes, Ingunn M.; Brockenbrough, Scott; Izeradjene, Kamel; Carlson, Markus A.; Cuevas, Carlos; Simmons, Randi M.; Greenberg, Philip D.; Hingorani, Sunil R.

    2015-01-01

    Objective Pancreatic ductal adenocarcinoma (PDA) is characterized by a robust desmoplasia, including the notable accumulation of immunosuppressive cells that shield neoplastic cells from immune detection. Immune evasion may be further enhanced if the malignant cells fail to express high levels of antigens that are sufficiently immunogenic to engender an effector T cell response. In this report, we investigate the predominant subsets of immunosuppressive cancer-conditioned myeloid cells that chronicle and shape pancreas cancer progression. We show that selective depletion of one subset of myeloid-derived suppressor cells (MDSC) in an autochthonous, genetically engineered mouse model (GEMM) of PDA unmasks the ability of the adaptive immune response to engage and target tumor epithelial cells. Methods A combination of in vivo and in vitro studies was performed employing a GEMM that faithfully recapitulates the cardinal features of human PDA. The predominant cancer-conditioned myeloid cell subpopulation was specifically targeted in vivo and the biological outcomes determined. Results PDA orchestrates the induction of distinct subsets of cancer-associated myeloid cells through the production of factors known to influence myelopoiesis. These immature myeloid cells inhibit the proliferation and induce apoptosis of activated T cells. Targeted depletion of granulocytic MDSC (Gr-MDSC) in autochthonous PDA increases the intratumoral accumulation of activated CD8 T cells and apoptosis of tumor epithelial cells, and also remodels the tumor stroma. Conclusions Neoplastic ductal cells of the pancreas induce distinct myeloid cell subsets that promote tumor cell survival and accumulation. Targeted depletion of a single myeloid subset, the Gr-MDSC, can unmask an endogenous T cell response, revealing an unexpected latent immunity and invoking targeting of Gr-MDSC as a potential strategy to exploit for treating this highly lethal disease. PMID:24555999

  12. RSV Vaccine-Enhanced Disease Is Orchestrated by the Combined Actions of Distinct CD4 T Cell Subsets

    PubMed Central

    Knudson, Cory J.; Hartwig, Stacey M.; Meyerholz, David K.; Varga, Steven M.

    2015-01-01

    There is no currently licensed vaccine for respiratory syncytial virus (RSV) despite being the leading cause of lower respiratory tract infections in children. Children previously immunized with a formalin-inactivated RSV (FI-RSV) vaccine exhibited enhanced respiratory disease following natural RSV infection. Subsequent studies in animal models have implicated roles for CD4 T cells, eosinophils and non-neutralizing antibodies in mediating enhanced respiratory disease. However, the underlying immunological mechanisms responsible for the enhanced respiratory disease and other disease manifestations associated with FI-RSV vaccine-enhanced disease remain unclear. We demonstrate for the first time that while CD4 T cells mediate all aspects of vaccine-enhanced disease, distinct CD4 T cell subsets orchestrate discrete and specific disease parameters. A Th2-biased immune response, but not eosinophils specifically, was required for airway hyperreactivity and mucus hypersecretion. In contrast, the Th1-associated cytokine TNF-α was necessary to mediate airway obstruction and weight loss. Our data demonstrate that individual disease manifestations associated with FI-RSV vaccine-enhanced disease are mediated by distinct subsets of CD4 T cells. PMID:25769044

  13. Restoration of STORM images from sparse subset of localizations (Conference Presentation)

    NASA Astrophysics Data System (ADS)

    Moiseev, Alexander A.; Gelikonov, Grigory V.; Gelikonov, Valentine M.

    2016-02-01

    To construct a Stochastic Optical Reconstruction Microscopy (STORM) image, one should collect a sufficient number of localized fluorophores to satisfy the Nyquist criterion. This requirement limits the time resolution of the method. In this work we propose a probabilistic approach to construct STORM images from a subset of localized fluorophores 3-4 times sparser than required by the Nyquist criterion. Using a set of STORM images constructed from a number of localizations sufficient for the Nyquist criterion, we derive a model that allows us to predict the probability for every location to be occupied by a fluorophore at the end of a hypothetical acquisition, taking as input the distribution of already localized fluorophores in the proximity of this location. We show that the probability map obtained from a number of fluorophores 3-4 times smaller than required by the Nyquist criterion may be used as a superresolution image itself. Thus we are able to construct a STORM image from a subset of localized fluorophores 3-4 times sparser than required by the Nyquist criterion, proportionally decreasing the STORM data acquisition time. This method may be used in combination with other approaches designed to increase STORM time resolution.

  14. Information Fusion for High Level Situation Assessment and Prediction

    DTIC Science & Technology

    2007-03-01

    procedure includes deciding a sensor set that achieves the optimal trade-off between its cost and benefit, activating the identified sensors, integrating...and effective decision can be made by dynamic inference based on selecting a subset of sensors with the optimal trade-off between their cost and...first step is achieved by designing a sensor selection criterion that represents the trade-off between the sensor benefit and sensor cost. This is then

  15. A Deficit in Older Adults' Effortful Selection of Cued Responses

    PubMed Central

    Proctor, Robert W.; Vu, Kim-Phuong L.; Pick, David F.

    2007-01-01

    J. J. Adam et al. (1998) provided evidence for an “age-related deficit in preparing 2 fingers on 2 hands, but not on 1 hand” (p. 870). Instead of having an anatomical basis, the deficit could result from the effortful processing required for individuals to select cued subsets of responses that do not coincide with left and right subgroups. The deficit also could involve either the ultimate benefit that can be attained or the time required to attain that benefit. The authors report 3 experiments (Ns = 40, 48, and 32 participants, respectively) in which they tested those distinctions by using an overlapped hand placement (participants alternated the index and middle fingers of the hands), a normal hand placement, and longer precuing intervals than were used in previous studies. The older adults were able to achieve the full precuing benefit shown by younger adults but required longer to achieve the maximal benefit for most pairs of responses. The deficit did not depend on whether the responses were from different hands, suggesting that it lies primarily in the effortful processing required for those subsets of cued responses that are not selected easily. PMID:16801319

  16. Analysis of the single-vehicle cyclic inventory routing problem

    NASA Astrophysics Data System (ADS)

    Aghezzaf, El-Houssaine; Zhong, Yiqing; Raa, Birger; Mateo, Manel

    2012-11-01

    The single-vehicle cyclic inventory routing problem (SV-CIRP) consists of a repetitive distribution of a product from a single depot to a selected subset of customers. For each customer, selected for replenishments, the supplier collects a corresponding fixed reward. The objective is to determine the subset of customers to replenish, the quantity of the product to be delivered to each and to design the vehicle route so that the resulting profit (difference between the total reward and the total logistical cost) is maximised while preventing stockouts at each of the selected customers. This problem often appears as a sub-problem in many logistical problems. In this article, the SV-CIRP is formulated as a mixed-integer program with a nonlinear objective function. After a thorough analysis of the structure of the problem and its features, an exact algorithm for its solution is proposed. This exact algorithm requires only solutions of linear mixed-integer programs. Values of a savings-based heuristic for this problem are compared to the optimal values obtained for a set of test problems. In general, the gap may get as large as 25%, which justifies the effort to continue exploring and developing exact and approximation algorithms for the SV-CIRP.
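
    The profit trade-off at the heart of the SV-CIRP (a fixed reward for each selected customer minus the routing cost) can be illustrated with a brute-force toy: enumerate customer subsets, cost each one by its best route from the depot, and keep the most profitable selection. Inventory dynamics and cycle times are omitted, and the coordinates, rewards, and unit cost are invented, so this is an illustration of the objective rather than the paper's exact algorithm.

        import itertools
        import numpy as np

        rng = np.random.default_rng(8)

        depot = np.zeros(2)
        customers = rng.uniform(-10, 10, size=(7, 2))
        rewards = rng.uniform(8, 25, size=7)
        cost_per_unit = 1.0

        def route_cost(subset):
            # Exact best tour over a small subset: depot -> perm -> depot.
            if not subset:
                return 0.0
            best = np.inf
            for perm in itertools.permutations(subset):
                pts = np.vstack([depot, customers[list(perm)], depot])
                legs = np.linalg.norm(np.diff(pts, axis=0), axis=1)
                best = min(best, legs.sum())
            return best

        best_profit, best_subset = 0.0, ()
        for k in range(1, len(customers) + 1):
            for subset in itertools.combinations(range(len(customers)), k):
                profit = (rewards[list(subset)].sum()
                          - cost_per_unit * route_cost(subset))
                if profit > best_profit:
                    best_profit, best_subset = profit, subset

        print(f"serve customers {best_subset} for profit {best_profit:.1f}")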

  17. An Adaptive Genetic Association Test Using Double Kernel Machines

    PubMed Central

    Zhan, Xiang; Epstein, Michael P.; Ghosh, Debashis

    2014-01-01

    Recently, gene set-based approaches have become very popular in gene expression profiling studies for assessing how genetic variants are related to disease outcomes. Since most genes are not differentially expressed, existing pathway tests considering all genes within a pathway suffer from considerable noise and power loss. Moreover, for a differentially expressed pathway, it is of interest to select important genes that drive the effect of the pathway. In this article, we propose an adaptive association test using double kernel machines (DKM), which can both select important genes within the pathway as well as test for the overall genetic pathway effect. This DKM procedure first uses the garrote kernel machines (GKM) test for the purposes of subset selection and then the least squares kernel machine (LSKM) test for testing the effect of the subset of genes. An appealing feature of the kernel machine framework is that it can provide a flexible and unified method for multi-dimensional modeling of the genetic pathway effect allowing for both parametric and nonparametric components. This DKM approach is illustrated with application to simulated data as well as to data from a neuroimaging genetics study. PMID:26640602
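
    Below is a rough two-stage analogue of the DKM idea, with simple stand-ins rather than the authors' machinery: a univariate screen replaces the garrote kernel (GKM) selection step, and a permutation-calibrated kernel score statistic Q = y'Ky replaces the LSKM test. For a strictly valid p-value the selection step would be repeated inside each permutation; this sketch is purely illustrative.

```python
# Illustrative two-stage pathway test: screen genes, then test the selected
# subset with a kernel score statistic calibrated by permutation.
import numpy as np

rng = np.random.default_rng(1)
n, p = 80, 30
X = rng.standard_normal((n, p))                       # p genes in one pathway
y = X[:, 0] - 0.8 * X[:, 1] + rng.standard_normal(n)  # only 2 genes matter

# Stage 1: select candidate genes (stand-in for GKM-based selection).
score = np.abs(np.corrcoef(X.T, y)[:-1, -1])
keep = np.argsort(score)[-5:]

# Stage 2: test the pathway effect of the subset (stand-in for the LSKM test),
# using Q = y' K y with a linear kernel and a permutation null.
K = X[:, keep] @ X[:, keep].T
yc = y - y.mean()
q_obs = yc @ K @ yc
q_null = [(yp := rng.permutation(yc)) @ K @ yp for _ in range(999)]
p_value = (1 + sum(q >= q_obs for q in q_null)) / 1000
print(keep, p_value)
```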

  18. RELAX: detecting relaxed selection in a phylogenetic framework.

    PubMed

    Wertheim, Joel O; Murrell, Ben; Smith, Martin D; Kosakovsky Pond, Sergei L; Scheffler, Konrad

    2015-03-01

    Relaxation of selective strength, manifested as a reduction in the efficiency or intensity of natural selection, can drive evolutionary innovation and presage lineage extinction or loss of function. Mechanisms through which selection can be relaxed range from the removal of an existing selective constraint to a reduction in effective population size. Standard methods for estimating the strength and extent of purifying or positive selection from molecular sequence data are not suitable for detecting relaxed selection, because they lack power and can mistake an increase in the intensity of positive selection for relaxation of both purifying and positive selection. Here, we present a general hypothesis testing framework (RELAX) for detecting relaxed selection in a codon-based phylogenetic framework. Given two subsets of branches in a phylogeny, RELAX can determine whether selective strength was relaxed or intensified in one of these subsets relative to the other. We establish the validity of our test via simulations and show that it can distinguish between increased positive selection and a relaxation of selective strength. We also demonstrate the power of RELAX in a variety of biological scenarios where relaxation of selection has been hypothesized or demonstrated previously. We find that obligate and facultative γ-proteobacteria endosymbionts of insects are under relaxed selection compared with their free-living relatives and obligate endosymbionts are under relaxed selection compared with facultative endosymbionts. Selective strength is also relaxed in asexual Daphnia pulex lineages, compared with sexual lineages. Endogenous, nonfunctional, bornavirus-like elements are found to be under relaxed selection compared with exogenous Borna viruses. Finally, selection on the short-wavelength sensitive, SWS1, opsin genes in echolocating and nonecholocating bats is relaxed only in lineages in which this gene underwent pseudogenization; however, selection on the functional medium/long-wavelength sensitive opsin, M/LWS1, is found to be relaxed in all echolocating bats compared with nonecholocating bats. © The Author 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
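
    The testing logic reduces to a likelihood-ratio comparison of a model with the selection-intensity parameter K fixed at 1 against one in which K is free (K < 1 indicating relaxation). The log-likelihood numbers below are placeholders standing in for the output of a codon-model fit; RELAX itself is distributed with HyPhy.

```python
# Sketch of the likelihood-ratio arithmetic behind a RELAX-style test.
from scipy.stats import chi2

ll_null = -12345.67  # placeholder: fit with K fixed at 1 on the test branches
ll_alt  = -12340.12  # placeholder: fit with K estimated freely
lrt = 2 * (ll_alt - ll_null)
p_value = chi2.sf(lrt, df=1)  # one extra free parameter
print(round(lrt, 2), p_value)
```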

  19. Summary of the DREAM8 Parameter Estimation Challenge: Toward Parameter Identification for Whole-Cell Models.

    PubMed

    Karr, Jonathan R; Williams, Alex H; Zucker, Jeremy D; Raue, Andreas; Steiert, Bernhard; Timmer, Jens; Kreutz, Clemens; Wilkinson, Simon; Allgood, Brandon A; Bot, Brian M; Hoff, Bruce R; Kellen, Michael R; Covert, Markus W; Stolovitzky, Gustavo A; Meyer, Pablo

    2015-05-01

    Whole-cell models that explicitly represent all cellular components at the molecular level have the potential to predict phenotype from genotype. However, even for simple bacteria, whole-cell models will contain thousands of parameters, many of which are poorly characterized or unknown. New algorithms are needed to estimate these parameters and enable researchers to build increasingly comprehensive models. We organized the Dialogue for Reverse Engineering Assessments and Methods (DREAM) 8 Whole-Cell Parameter Estimation Challenge to develop new parameter estimation algorithms for whole-cell models. We asked participants to identify a subset of parameters of a whole-cell model given the model's structure and in silico "experimental" data. Here we describe the challenge, the best performing methods, and new insights into the identifiability of whole-cell models. We also describe several valuable lessons we learned toward improving future challenges. Going forward, we believe that collaborative efforts supported by inexpensive cloud computing have the potential to solve whole-cell model parameter estimation.

  20. Summary of the DREAM8 Parameter Estimation Challenge: Toward Parameter Identification for Whole-Cell Models

    PubMed Central

    Karr, Jonathan R.; Williams, Alex H.; Zucker, Jeremy D.; Raue, Andreas; Steiert, Bernhard; Timmer, Jens; Kreutz, Clemens; Wilkinson, Simon; Allgood, Brandon A.; Bot, Brian M.; Hoff, Bruce R.; Kellen, Michael R.; Covert, Markus W.; Stolovitzky, Gustavo A.; Meyer, Pablo

    2015-01-01

    Whole-cell models that explicitly represent all cellular components at the molecular level have the potential to predict phenotype from genotype. However, even for simple bacteria, whole-cell models will contain thousands of parameters, many of which are poorly characterized or unknown. New algorithms are needed to estimate these parameters and enable researchers to build increasingly comprehensive models. We organized the Dialogue for Reverse Engineering Assessments and Methods (DREAM) 8 Whole-Cell Parameter Estimation Challenge to develop new parameter estimation algorithms for whole-cell models. We asked participants to identify a subset of parameters of a whole-cell model given the model’s structure and in silico “experimental” data. Here we describe the challenge, the best performing methods, and new insights into the identifiability of whole-cell models. We also describe several valuable lessons we learned toward improving future challenges. Going forward, we believe that collaborative efforts supported by inexpensive cloud computing have the potential to solve whole-cell model parameter estimation. PMID:26020786
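
    As a toy illustration of the task posed to participants (not, of course, the whole-cell model itself), the sketch below recovers a two-parameter decay model from noisy in silico data by least squares.

```python
# Minimal stand-in for the challenge task: estimate a parameter subset by
# fitting model output to in silico "experimental" data.
import numpy as np
from scipy.optimize import least_squares

t = np.linspace(0, 10, 50)
true_k, true_a = 0.7, 3.0
rng = np.random.default_rng(8)
data = true_a * np.exp(-true_k * t) + 0.05 * rng.standard_normal(t.size)

residuals = lambda theta: theta[1] * np.exp(-theta[0] * t) - data
fit = least_squares(residuals, x0=[0.1, 1.0], bounds=([0, 0], [10, 10]))
print(fit.x)  # estimated (k, a), ideally near (0.7, 3.0)
```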

  1. Optimization of breast mass classification using sequential forward floating selection (SFFS) and a support vector machine (SVM) model

    PubMed Central

    Tan, Maxine; Pu, Jiantao; Zheng, Bin

    2014-01-01

    Purpose: Improving radiologists’ performance in classification between malignant and benign breast lesions is important to increase cancer detection sensitivity and reduce false-positive recalls. For this purpose, developing computer-aided diagnosis (CAD) schemes has been attracting research interest in recent years. In this study, we investigated a new feature selection method for the task of breast mass classification. Methods: We initially computed 181 image features based on mass shape, spiculation, contrast, presence of fat or calcifications, texture, isodensity, and other morphological features. From this large image feature pool, we used a sequential forward floating selection (SFFS)-based feature selection method to select relevant features, and analyzed their performance using a support vector machine (SVM) model trained for the classification task. On a database of 600 benign and 600 malignant mass regions of interest (ROIs), we performed the study using a ten-fold cross-validation method. Feature selection and optimization of the SVM parameters were conducted on the training subsets only. Results: An area under the receiver operating characteristic curve (AUC) of 0.805±0.012 was obtained for the classification task. The results also showed that the features most frequently selected by the SFFS-based algorithm over the ten folds were those related to mass shape, isodensity, and presence of fat, which are consistent with the image features frequently used by radiologists in the clinical environment for mass classification. The study also indicated that accurately computing mass spiculation features from the projection mammograms was difficult, and that these features performed poorly for the mass classification task due to tissue overlap within the benign mass regions. Conclusions: This comprehensive feature analysis study provided new and valuable information for optimizing computerized mass classification schemes that may have the potential to be useful as a “second reader” in future clinical practice. PMID:24664267
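
    A sketch of SFFS wrapped around an SVM follows, assuming the mlxtend package for the floating search (scikit-learn's built-in sequential selector does not backtrack); synthetic data stands in for the 181 mammographic features.

```python
# SFFS + SVM sketch: floating forward selection scored by cross-validated AUC.
from mlxtend.feature_selection import SequentialFeatureSelector as SFS
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=50, n_informative=8,
                           random_state=0)
sffs = SFS(SVC(kernel="rbf", C=1.0, gamma="scale"),
           k_features=10,                 # target subset size
           forward=True, floating=True,   # floating = conditional backtracking
           scoring="roc_auc", cv=10)
sffs = sffs.fit(X, y)
print(sffs.k_feature_idx_, round(sffs.k_score_, 3))
```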

  2. DETERMINATION OF NEW CARBONYL-CONTAINING DISINFECTION BY-PRODUCTS IN DRINKING WATER

    EPA Science Inventory

    Only a subset of all disinfection by-products were targeted for an intense occurrence study during the Information Collection Rule. Among 50 additional compounds selected for study because of their potential for significant toxicity, a group of carbonyl-containing compounds is be...

  3. Individual Differences in Working Memory Capacity Predict Retrieval-Induced Forgetting

    ERIC Educational Resources Information Center

    Aslan, Alp; Bauml, Karl-Heinz T.

    2011-01-01

    Selectively retrieving a subset of previously studied information enhances memory for the retrieved information but causes forgetting of related, nonretrieved information. Such retrieval-induced forgetting (RIF) has often been attributed to inhibitory executive-control processes that supposedly suppress the nonretrieved items' memory…

  4. Data Mining Feature Subset Weighting and Selection Using Genetic Algorithms

    DTIC Science & Technology

    2002-03-01

    seed-stain, anthracnose, phyllosticta-leaf-spot, alternarialeaf-spot, frog-eye-leaf-spot, diaporthe-pod-&-stem-blight, cyst-nematode, 2-4-d-injury...seed-discolor: absent,present,?. 33. seed-size: norm,lt-norm,?. 34. shriveling: absent,present,?. 35. roots: norm,rotted,galls-cysts

  5. Genotyping variability of computationally categorized peach microsatellite markers

    USDA-ARS?s Scientific Manuscript database

    Numerous expressed sequence tag (EST) simple sequence repeat (SSR) primers can easily be mined out. The obstacle to developing them into usable markers is how to optimally select downsized subsets of the primers for genotyping, which accordingly reduces amplification failure and monomorphism often occu...

  6. 12-Chemokine Gene Signature Identifies Lymph Node-like Structures in Melanoma: Potential for Patient Selection for Immunotherapy?

    NASA Astrophysics Data System (ADS)

    Messina, Jane L.; Fenstermacher, David A.; Eschrich, Steven; Qu, Xiaotao; Berglund, Anders E.; Lloyd, Mark C.; Schell, Michael J.; Sondak, Vernon K.; Weber, Jeffrey S.; Mulé, James J.

    2012-10-01

    We have interrogated a 12-chemokine gene expression signature (GES) on genomic arrays of 14,492 distinct solid tumors and show broad distribution across different histologies. We hypothesized that this 12-chemokine GES might accurately predict a unique intratumoral immune reaction in stage IV (non-locoregional) melanoma metastases. The 12-chemokine GES predicted the presence of unique, lymph node-like structures, containing CD20+ B cell follicles with prominent areas of CD3+ T cells (both CD4+ and CD8+ subsets). CD86+, but not FoxP3+, cells were present within these unique structures as well. The direct correlation between the 12-chemokine GES score and the presence of unique, lymph nodal structures was also associated with better overall survival of the subset of melanoma patients. The use of this novel 12-chemokine GES may reveal basic information on in situ mechanisms of the anti-tumor immune response, potentially leading to improvements in the identification and selection of melanoma patients most suitable for immunotherapy.

  7. Functional identification of a neurocircuit regulating blood glucose

    PubMed Central

    Meek, Thomas H.; Nelson, Jarrell T.; Matsen, Miles E.; Dorfman, Mauricio D.; Guyenet, Stephan J.; Damian, Vincent; Allison, Margaret B.; Scarlett, Jarrad M.; Nguyen, Hong T.; Thaler, Joshua P.; Olson, David P.; Myers, Martin G.; Schwartz, Michael W.; Morton, Gregory J.

    2016-01-01

    Previous studies implicate the hypothalamic ventromedial nucleus (VMN) in glycemic control. Here, we report that selective inhibition of the subset of VMN neurons that express the transcription factor steroidogenic-factor 1 (VMNSF1 neurons) blocks recovery from insulin-induced hypoglycemia whereas, conversely, activation of VMNSF1 neurons causes diabetes-range hyperglycemia. Moreover, this hyperglycemic response is reproduced by selective activation of VMNSF1 fibers projecting to the anterior bed nucleus of the stria terminalis (aBNST), but not to other brain areas innervated by VMNSF1 neurons. We also report that neurons in the lateral parabrachial nucleus (LPBN), a brain area that is also implicated in the response to hypoglycemia, make synaptic connections with the specific subset of glucoregulatory VMNSF1 neurons that project to the aBNST. These results collectively establish a physiological role in glucose homeostasis for VMNSF1 neurons and suggest that these neurons are part of an ascending glucoregulatory LPBN→VMNSF1→aBNST neurocircuit. PMID:27001850

  8. An Optimal Basketball Free Throw

    ERIC Educational Resources Information Center

    Seppala-Holtzman, David

    2012-01-01

    A basketball player attempting a free throw has two parameters under his or her control: the angle of elevation and the force with which the ball is thrown. We compute upper and lower bounds for the initial velocity for suitable values of the angle of elevation, generating a subset of the configuration space of all successful free throws. A…
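
    The configuration-space computation rests on standard projectile motion: setting the trajectory height at the hoop's horizontal distance x equal to the rim-minus-release height h gives v² = gx² / (2 cos²θ (x tanθ − h)). Below is a sketch with illustrative regulation-style numbers; the paper's exact setup may differ.

```python
# Launch speed required to reach the rim for a given elevation angle,
# from y = x*tan(th) - g*x**2 / (2*v**2*cos(th)**2) solved for v at y = h.
from math import radians, tan, cos, sqrt

g = 9.81        # m/s^2
x = 4.19        # horizontal distance from release point to hoop center, m (assumed)
h = 3.05 - 2.0  # rim height minus release height, m (assumed)

def launch_speed(angle_deg):
    th = radians(angle_deg)
    denom = 2 * cos(th) ** 2 * (x * tan(th) - h)
    if denom <= 0:  # angle too flat for the ball ever to reach rim height
        return None
    return sqrt(g * x ** 2 / denom)

for a in (45, 50, 55, 60):
    print(a, launch_speed(a))
```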

  9. Phenotypic, molecular, and functional characterization of human peripheral blood CD34+/THY1+ cells.

    PubMed

    Humeau, L; Bardin, F; Maroc, C; Alario, T; Galindo, R; Mannoni, P; Chabannon, C

    1996-02-01

    A subset of mobilized CD34+ cells present in patient aphereses expresses Thy1 (CDw90). This population contains most long-term culture initiating cells, as assayed with a murine stromal cell line. It also contains a significant proportion of colony-forming unit granulocyte macrophage, but very few burst-forming unit erythroid. The limited differentiation towards the erythroid lineage is further confirmed by the absence of GATA-1 mRNA in the CD34+/Thy1+ subset, and by the low level of c-kit expression. The CD34+/Thy1+ subset appears phenotypically and functionally heterogeneous, a finding consistent with its high representation, compared to phenotypes such as CD34+/CD38-. Therefore, while at least some of CD34+/Thy1+ cells may be infectable by retroviral vectors, as shown by the presence of a transcript for the receptor for murine amphotropic retroviruses, the use of this selection strategy to specifically target human stem cells appears questionable.

  10. Inline Measurement of Particle Concentrations in Multicomponent Suspensions using Ultrasonic Sensor and Least Squares Support Vector Machines.

    PubMed

    Zhan, Xiaobin; Jiang, Shulan; Yang, Yili; Liang, Jian; Shi, Tielin; Li, Xiwen

    2015-09-18

    This paper proposes an ultrasonic measurement system based on least squares support vector machines (LS-SVM) for inline measurement of particle concentrations in multicomponent suspensions. Firstly, the ultrasonic signals are analyzed and processed, and the optimal feature subset that contributes to the best model performance is selected based on the importance of features. Secondly, the LS-SVM model is tuned, trained and tested with different feature subsets to obtain the optimal model. In addition, a comparison is made between the partial least squares (PLS) model and the LS-SVM model. Finally, the optimal LS-SVM model with the optimal feature subset is applied to inline measurement of particle concentrations in the mixing process. The results show that the proposed method is reliable and accurate for inline measurement of particle concentrations in multicomponent suspensions, and the measurement accuracy is sufficiently high for industrial application. Furthermore, the proposed method is applicable to dynamic modeling of nonlinear systems and provides a feasible way to monitor industrial processes.
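
    LS-SVM regression solves a regularized least-squares problem in a kernel feature space, which makes scikit-learn's KernelRidge a close stand-in for a sketch; the features and target below are synthetic placeholders for the ultrasonic signal features and particle concentration.

```python
# Kernel ridge regression as an LS-SVM-style stand-in for concentration
# prediction from a feature subset; data is synthetic.
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import GridSearchCV, train_test_split

rng = np.random.default_rng(3)
X = rng.standard_normal((200, 12))               # candidate ultrasonic features
w = np.zeros(12); w[:4] = [1.0, -0.5, 0.8, 0.3]  # only a subset is informative
y = np.tanh(X @ w) + 0.05 * rng.standard_normal(200)  # particle concentration

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = GridSearchCV(KernelRidge(kernel="rbf"),
                     {"alpha": [0.01, 0.1, 1.0], "gamma": [0.01, 0.1, 1.0]},
                     cv=5)
model.fit(X_tr, y_tr)
print(model.best_params_, round(model.score(X_te, y_te), 3))  # held-out R^2
```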

  11. Serum C-X-C motif chemokine 13 is elevated in early and established rheumatoid arthritis and correlates with rheumatoid factor levels

    PubMed Central

    2014-01-01

    Introduction We hypothesized that serum levels of C-X-C motif chemokine 13 (CXCL13), a B-cell chemokine, would delineate a subset of rheumatoid arthritis (RA) patients characterized by increased humoral immunity. Methods Serum from patients with established RA (the Dartmouth RA Cohort) was analyzed for CXCL13, rheumatoid factor (RF) levels, anticitrullinated peptide/protein antibody (ACPA) and total immunoglobulin G (IgG); other parameters were obtained by chart review. A confirmatory analysis was performed using samples from the Sherbrooke Early Undifferentiated PolyArthritis (EUPA) Cohort. The Wilcoxon rank-sum test, a t-test and Spearman’s correlation analysis were utilized to determine relationships between variables. Results In both the Dartmouth and Sherbrooke cohorts, CXCL13 levels were selectively increased in seropositive relative to seronegative RA patients (P = 0.0002 and P < 0.0001 for the respective cohorts), with a strong correlation to both immunoglobulin M (IgM) and IgA RF levels (P < 0.0001). There was a weaker relationship to ACPA titers (P = 0.03 and P = 0.006, respectively) and total IgG (P = 0.02 and P = 0.14, respectively). No relationship was seen with age, sex, shared epitope status, or high-sensitivity C-reactive protein (hsCRP) in either cohort, or with the presence of baseline erosions in the Sherbrooke Cohort, whereas a modest relationship with the Disease Activity Score in 28 joints (DAS28-CRP) was seen in the Dartmouth cohort but not the Sherbrooke cohort. Conclusion Using both established and early RA cohorts, marked elevations of serum CXCL13 levels resided nearly completely within the seropositive population. CXCL13 levels exhibited a strong relationship with RF, whereas the association with clinical parameters (age, sex, DAS28-CRP and erosions) or other serologic markers (ACPA and IgG) was either much weaker or absent. Elevated serum CXCL13 levels may identify a subset of seropositive RA patients whose disease is shaped by or responsive to RF production. PMID:24766912

  12. Possible involvement of the glucocorticoid receptor (NR3C1) and selected NR3C1 gene variants in regulation of human testicular function.

    PubMed

    Nordkap, L; Almstrup, K; Nielsen, J E; Bang, A K; Priskorn, L; Krause, M; Holmboe, S A; Winge, S B; Egeberg Palme, D L; Mørup, N; Petersen, J H; Juul, A; Skakkebaek, N E; Rajpert-De Meyts, E; Jørgensen, N

    2017-11-01

    Perceived stress has been associated with decreased semen quality but the mechanisms have not been elucidated. It is not known whether cortisol, the major stress hormone in humans, can act directly via receptors in the testis, and whether variants in the gene encoding the glucocorticoid receptor (NR3C1) can possibly modulate the effect. To address these questions, we investigated the expression of the glucocorticoid receptor in human testicular tissue, including adult and fetal samples (n = 20), by immunohistochemical staining and in silico analysis of publicly available datasets. In the adult testis NR3C1 protein was detected in peritubular cells, a subset of Leydig cells, Sertoli cells (weak), and spermatogonia, but not in spermatids. The NR3C1 expression pattern in fetal testis samples differed by a notably stronger reaction in Sertoli cells, lack of staining in gonocytes but presence in a subset of pro-spermatogonia, and an almost absent reaction in nascent peritubular cells. In parallel, we explored the association between adult testicular function and three NR3C1 single-nucleotide polymorphisms (BclI [rs41423247], 9β [rs6198], and Tth111I [rs10052957]) affecting glucocorticoid sensitivity. Testicular function was determined by semen analysis and reproductive hormone profiling in 893 men from the general population. The NR3C1 SNP BclI was associated with semen quality in an over-dominant manner, with heterozygotes having better semen parameters compared to both homozygote constellations, and with sperm motility showing the strongest association. This association was supported by a higher inhibin B and inhibin B/FSH ratio, as well as a lower FSH, in BclI heterozygotes. The SNPs 9β and Tth111I were not associated with semen parameters. Although the clinical impact of the findings is limited, the results substantiate a suggested link between stress and testicular function. Hence this investigation should be regarded as a discovery study generating hypotheses for future studies. © 2017 American Society of Andrology and European Academy of Andrology.

  13. Characterization of Machine Variability and Progressive Heat Treatment in Selective Laser Melting of Inconel 718

    NASA Technical Reports Server (NTRS)

    Prater, T.; Tilson, W.; Jones, Z.

    2015-01-01

    The absence of an economy of scale in spaceflight hardware makes additive manufacturing an immensely attractive option for propulsion components. As additive manufacturing techniques are increasingly adopted by government and industry to produce propulsion hardware in human-rated systems, significant development efforts are needed to establish these methods as reliable alternatives to conventional subtractive manufacturing. One of the critical challenges facing powder bed fusion techniques in this application is variability between the machines used to perform builds. Even with robust process controls in place, it is possible for two machines operating at identical parameters with equivalent base materials to produce specimens with slightly different material properties. The machine variability study presented here evaluates 60 specimens of identical geometry built using the same parameters: 30 samples were produced on machine 1 (M1) and the other 30 on machine 2 (M2). Each 30-sample set was further subdivided into three subsets (10 specimens each) to assess the effect of progressive heat treatment on machine variability. The three post-processing categories were: stress relief; stress relief followed by hot isostatic press (HIP); and stress relief followed by HIP followed by heat treatment per AMS 5664. Each specimen (a round, smooth tensile specimen) was mechanically tested per ASTM E8. Two formal statistical techniques, hypothesis testing for equivalency of means and one-way analysis of variance (ANOVA), were applied to characterize the impact of machine variability and heat treatment on five material properties: tensile stress, yield stress, modulus of elasticity, fracture elongation, and reduction of area. This work represents the type of development effort that is critical as NASA, academia, and the industrial base work collaboratively to establish a path to certification for additively manufactured parts. For future flight programs, NASA and its commercial partners will procure parts from vendors using a diverse range of machines and, as such, it is essential that the AM community develop a sound understanding of the degree to which machine variability impacts material properties.
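
    Below is a sketch of the two named analyses on synthetic placeholder data (the study's actual measurements are not reproduced here): a two-sample t-test comparing machine means and a one-way ANOVA across the three heat-treatment subsets. A formal equivalency claim would use a two-one-sided-tests (TOST) procedure rather than the plain t-test shown.

```python
# Machine-variability comparison (t-test) and heat-treatment effect (ANOVA)
# on synthetic stand-in data.
import numpy as np
from scipy.stats import f_oneway, ttest_ind

rng = np.random.default_rng(718)
m1 = rng.normal(1400, 20, 30)  # e.g., tensile strength in MPa, machine 1
m2 = rng.normal(1405, 20, 30)  # machine 2

t, p_machine = ttest_ind(m1, m2)
print("machine effect p =", round(p_machine, 3))  # large p: no detected difference

stress_relief = rng.normal(1350, 25, 10)
hip           = rng.normal(1380, 25, 10)
hip_plus_ht   = rng.normal(1430, 25, 10)
f, p_heat = f_oneway(stress_relief, hip, hip_plus_ht)
print("heat-treatment effect p =", round(p_heat, 4))
```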

  14. Maps on statistical manifolds exactly reduced from the Perron-Frobenius equations for solvable chaotic maps

    NASA Astrophysics Data System (ADS)

    Goto, Shin-itiro; Umeno, Ken

    2018-03-01

    Maps on a parameter space for expressing distribution functions are exactly derived from the Perron-Frobenius equations for a generalized Boole transform family. Here the generalized Boole transform family is a one-parameter family of maps defined on a subset of the real line, whose invariant probability distribution is a Cauchy distribution with parameters depending on the map. With this reduction, some relations between the statistical picture and the orbital one are shown. From the viewpoint of information geometry, the parameter space can be identified with a statistical manifold, and it is shown that the derived maps can be characterized accordingly. Also, with a symplectic structure induced from the statistical structure, symplectic and information-geometric aspects of the derived maps are discussed.
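
    As a reminder of the mechanism (a classical fact, not a result quoted from the abstract): Boole's transform, suitably normalized, leaves the standard Cauchy distribution invariant, which is why the Perron-Frobenius dynamics close on the Cauchy family and reduce to a map on its (location, scale) parameters.

```latex
% One-parameter Boole transform family (alpha > 0):
\[
  T_\alpha(x) = \alpha\left(x - \frac{1}{x}\right), \qquad x \in \mathbb{R}\setminus\{0\}.
\]
% Classical invariance at alpha = 1/2: if X has the standard Cauchy density
% p(x) = 1 / (pi (1 + x^2)), then T_{1/2}(X) has the same density, so the
% Perron--Frobenius action maps Cauchy densities to Cauchy densities.
\[
  X \sim \mathrm{Cauchy}(0,1)
  \;\Longrightarrow\;
  \tfrac{1}{2}\left(X - \tfrac{1}{X}\right) \sim \mathrm{Cauchy}(0,1).
\]
```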

  15. Implementation of the Global Parameters Determination in Gaia's Astrometric Solution (AGIS)

    NASA Astrophysics Data System (ADS)

    Raison, F.; Olias, A.; Hobbs, D.; Lindegren, L.

    2010-12-01

    Gaia is ESA’s space astrometry mission with a foreseen launch date in early 2012. Its main objective is to perform a stellar census of the 1000 million brightest objects in our galaxy (completeness to V=20 mag), from which an astrometric catalog of micro-arcsecond-level accuracy will be constructed. A key element in this endeavor is the Astrometric Global Iterative Solution (AGIS). A core part of AGIS is to determine the accurate spacecraft attitude, the geometric instrument calibration, and the astrometric model parameters for a well-behaved subset of all the objects (the ‘primary stars’). In addition, a small number of global parameters will be estimated, one of these being PPN γ. We present here the implementation of the algorithms dedicated to the determination of the global parameters.
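
    A schematic of the block-iterative idea behind such a solution follows; this is emphatically not the actual AGIS implementation, and the calibration block is folded away for brevity. Each parameter block is re-fit by least squares with the other blocks held fixed, and the outer loop iterates to convergence.

```python
# Block Gauss-Seidel least squares: alternately solve source, attitude, and
# global parameter blocks against the residuals of the others (schematic only).
import numpy as np

rng = np.random.default_rng(42)
n_obs = 1000
A_src, A_att, A_glob = (rng.standard_normal((n_obs, k)) for k in (5, 4, 1))
truth = rng.standard_normal(10)
obs = np.hstack([A_src, A_att, A_glob]) @ truth + 1e-3 * rng.standard_normal(n_obs)

blocks = {"source": A_src, "attitude": A_att, "global": A_glob}
est = {k: np.zeros(v.shape[1]) for k, v in blocks.items()}
for _ in range(50):  # outer AGIS-style iterations
    for name, A in blocks.items():
        resid = obs - sum(blocks[o] @ est[o] for o in blocks if o != name)
        est[name], *_ = np.linalg.lstsq(A, resid, rcond=None)
print(est["global"])  # e.g., a PPN-gamma-like global parameter
```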

  16. [Characteristics of peripheral blood lymphocyte immune subsets in patients with chronic active Epstein-Barr virus infection].

    PubMed

    Xing, Yan; Song, Hong-mei; Li, Tai-sheng; Qiu, Zhi-feng; Wu, Xiao-yan; Wang, Wei; Wei, Min

    2009-06-01

    To study the characteristics of the peripheral blood lymphocyte subsets in pediatric patients with chronic active EBV (CAEBV) infection, flow cytometry was used to detect the peripheral blood NK, B, and T lymphocyte subsets and the functional, regulatory, naïve, memory and activatory subsets of T lymphocytes in 10 pediatric patients with CAEBV infection, 13 pediatric patients with acute Epstein-Barr virus infection (AEBV) and 12 healthy children in our hospital between March 2004 and April 2008. Compared with the AEBV group, the numbers of white blood cells [3325 x 10(6)/L (median, as for all values below)], lymphocytes (1078 x 10(6)/L), NK cells (68 x 10(6)/L), B cells (84 x 10(6)/L), total T cells (684 x 10(6)/L), CD4+ T cells (406 x 10(6)/L) and CD8+ T cells (295 x 10(6)/L) in CAEBV patients were lower (P<0.05). The functional subset of the CD4+ T cells in the CAEBV group (94.5%) was lower than that of the healthy control group (98.7%) (P<0.05), but still higher than that of the AEBV group (74.0%) (P<0.05). The functional subset of the CD8+ T cells in CAEBV (40.7%) was not dramatically different from that of the healthy control group (48.3%), but was still higher than that of the AEBV group (21.0%) (P<0.05). The regulatory subset in the CAEBV group (5.0%) was higher than in the healthy control group (4.6%) (P<0.05), but lower than in the AEBV group (5.8%) (P<0.05). In CAEBV, the proportion of CD4+/CD8+ naïve T cells (32.3%/37.5%) was lower than that of the normal group (58.3%/56.6%) (P<0.05), the proportion of CD4+/CD8+ effective memory T cells in the CAEBV group (23.9%/15.1%) was lower than that in the AEBV group (36.5%/69.8%) (P<0.05), and the proportion of CD8+ fake naïve T cells in CAEBV (17.5%) was higher than in the other 2 groups (P<0.05). The CD8+ activatory subsets in the CAEBV group (84.4%/34.0%) were higher than those of the healthy control group (44.1%/16.7%) (P<0.05), but still lower than in the AEBV group (96%/95%) (P<0.05). There is an imbalance in lymphocyte subsets and a disturbance in cellular immunity in CAEBV patients, which may be associated with chronic active EBV infection. Detecting the peripheral haematologic parameters and lymphocyte subsets may be helpful in the diagnosis and differential diagnosis of CAEBV.

  17. Adaptive 4d Psi-Based Change Detection

    NASA Astrophysics Data System (ADS)

    Yang, Chia-Hsiang; Soergel, Uwe

    2018-04-01

    In a previous work, we proposed a PSI-based 4D change detection method to detect disappearing and emerging PS points (3D) along with their occurrence dates (1D). Such change points are usually caused by anthropic events, e.g., building construction in cities. The method first divides an entire SAR image stack into several subsets by a set of break dates. The PS points, which are selected based on their temporal coherences before or after a break date, are regarded as change candidates. Change points are then extracted from these candidates according to their change indices, which are modelled from the temporal coherences of the divided image subsets. Finally, we check the evolution of the change indices for each change point to detect the break date on which the change occurred. The experiment validated both the feasibility and the applicability of our method. However, two questions remain. First, the selection of the temporal coherence threshold involves a trade-off between quality and quantity of PS points; this selection also governs the number of change points in a more complex way. Second, heuristic selection of change-index thresholds is fragile and causes loss of change points. In this study, we adapt our approach to identify change points based on statistical characteristics of the change indices rather than thresholding. The experiment validates this adaptive approach and shows an increase in detected change points compared with the previous version. In addition, we also explore and discuss the optimal selection of the temporal coherence threshold.
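
    Below is a hypothetical sketch of the break-date search described above; the change-index formula (a simple before/after coherence contrast) is our placeholder, not the paper's model.

```python
# Toy break-date detection: split a PS point's coherence time series at each
# candidate date and flag the date maximizing the before/after contrast.
import numpy as np

def change_index(coh, break_idx):
    before, after = coh[:break_idx], coh[break_idx:]
    return abs(after.mean() - before.mean())  # simple contrast measure

rng = np.random.default_rng(7)
coh = np.concatenate([rng.normal(0.20, 0.05, 40),   # incoherent: not yet built
                      rng.normal(0.85, 0.05, 60)])  # coherent: PS point emerged
candidates = range(10, 90)
best = max(candidates, key=lambda b: change_index(coh, b))
print(best)  # estimated occurrence-date index (near 40 here)
```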

  18. Discrete Biogeography Based Optimization for Feature Selection in Molecular Signatures.

    PubMed

    Liu, Bo; Tian, Meihong; Zhang, Chunhua; Li, Xiangtao

    2015-04-01

    Biomarker discovery from high-dimensional data is a complex task in the development of efficient cancer diagnosis and classification. However, these data are usually redundant and noisy, and only a subset of them presents distinct profiles for different classes of samples. Thus, selecting highly discriminative genes from gene expression data has become increasingly interesting in the field of bioinformatics. In this paper, a discrete biogeography based optimization is proposed to select a good subset of informative genes relevant to the classification. In the proposed algorithm, first, the Fisher-Markov selector is used to choose a fixed number of genes. Second, to make biogeography based optimization suitable for the feature selection problem, a discrete migration model and a discrete mutation model are proposed to balance the exploration and exploitation ability. Discrete biogeography based optimization, which we call DBBO, is then obtained by integrating the discrete migration and mutation models. Finally, DBBO is used for feature selection, with three classifiers evaluated under 10-fold cross-validation. To show the effectiveness and efficiency of the algorithm, it is tested on four breast cancer benchmark datasets. Compared with a genetic algorithm, particle swarm optimization, a differential evolution algorithm, and hybrid biogeography based optimization, experimental results demonstrate that the proposed method is better than, or at least comparable with, previous methods from the literature in terms of the quality of the solutions obtained. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
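
    Below is a compact, generic discrete-BBO sketch over binary feature masks, not the authors' exact DBBO or their Fisher-Markov preselection: rank-based emigration rates drive migration of bits from fitter habitats, a small flip probability implements discrete mutation, and fitness is cross-validated accuracy of a simple classifier.

```python
# Generic discrete biogeography-based optimization for feature selection.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=200, n_features=40, n_informative=6,
                           random_state=0)

def fitness(mask):
    if not mask.any():
        return 0.0
    clf = KNeighborsClassifier(n_neighbors=5)
    return cross_val_score(clf, X[:, mask], y, cv=5).mean()

pop = rng.random((20, 40)) < 0.3  # 20 habitats = candidate feature masks
for _ in range(30):
    fit = np.array([fitness(m) for m in pop])
    order = np.argsort(-fit)      # best habitats emigrate most
    pop = pop[order]
    mu = np.linspace(0.9, 0.1, len(pop))   # emigration rates by rank
    for i in range(1, len(pop)):           # elitism: keep the best habitat
        for j in range(pop.shape[1]):
            if rng.random() < 1 - mu[i]:   # immigration: copy bit from a donor
                donor = rng.choice(len(pop), p=mu / mu.sum())
                pop[i, j] = pop[donor, j]
            if rng.random() < 0.02:        # discrete mutation: flip the bit
                pop[i, j] = ~pop[i, j]
best = pop[np.argmax([fitness(m) for m in pop])]
print(best.nonzero()[0])  # indices of the selected genes
```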

  19. Selective CD28 Antagonist Blunts Memory Immune Responses and Promotes Long-Term Control of Skin Inflammation in Nonhuman Primates.

    PubMed

    Poirier, Nicolas; Chevalier, Melanie; Mary, Caroline; Hervouet, Jeremy; Minault, David; Baker, Paul; Ville, Simon; Le Bas-Bernardet, Stephanie; Dilek, Nahzli; Belarif, Lyssia; Cassagnau, Elisabeth; Scobie, Linda; Blancho, Gilles; Vanhove, Bernard

    2016-01-01

    Novel therapies that specifically target activation and expansion of pathogenic immune cell subsets responsible for autoimmune attacks are needed to confer long-term remission. Pathogenic cells in autoimmunity include memory T lymphocytes that are long-lived and present rapid recall effector functions with reduced activation requirements. Whereas the CD28 costimulation pathway predominantly controls priming of naive T cells and hence generation of adaptive memory cells, the roles of CD28 costimulation on established memory T lymphocytes and the recall of memory responses remain controversial. In contrast to CD80/86 antagonists (CTLA4-Ig), selective CD28 antagonists blunt T cell costimulation while sparing CTLA-4 and PD-L1-dependent coinhibitory signals. Using a new selective CD28 antagonist, we showed that Ag-specific reactivation of human memory T lymphocytes was prevented. Selective CD28 blockade controlled both cellular and humoral memory recall in nonhuman primates and induced long-term Ag-specific unresponsiveness in a memory T cell-mediated inflammatory skin model. No modification of memory T lymphocytes subsets or numbers was observed in the periphery, and importantly no significant reactivation of quiescent viruses was noticed. These findings indicate that pathogenic memory T cell responses are controlled by both CD28 and CTLA-4/PD-L1 cosignals in vivo and that selectively targeting CD28 would help to promote remission of autoimmune diseases and control chronic inflammation. Copyright © 2015 by The American Association of Immunologists, Inc.

  20. An Ensemble Framework Coping with Instability in the Gene Selection Process.

    PubMed

    Castellanos-Garzón, José A; Ramos, Juan; López-Sánchez, Daniel; de Paz, Juan F; Corchado, Juan M

    2018-03-01

    This paper proposes an ensemble framework for gene selection, aimed at addressing the instability problems that arise in the gene filtering task. The complex process of gene selection from gene expression data faces instability problems because different filter methods find different informative gene subsets, making the identification of significant genes by experts difficult. The instability of results can come from filter methods, gene classifier methods, different datasets of the same disease, and multiple valid groups of biomarkers. Even though there is a wide range of proposals, the complexity imposed by this problem remains a challenge today. This work proposes a framework involving five stages of gene filtering to discover biomarkers for diagnosis and classification tasks. The framework performs a process of stable feature selection, facing the problems above and thus providing a more suitable and reliable solution for clinical and research purposes. Our proposal involves a process of multistage gene filtering, in which several ensemble strategies for gene selection are combined so that different classifiers simultaneously assess gene subsets to counter instability. First, we apply an ensemble of recent gene selection methods to obtain diversity in the genes found (stability with respect to filter methods). Next, we apply an ensemble of known classifiers to retain genes relevant to all classifiers at once (stability with respect to classification methods). The achieved results were evaluated on two different datasets of the same disease (pancreatic ductal adenocarcinoma), in search of stability with respect to the disease, and promising results were achieved.
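
    A minimal sketch of one ensemble-filtering stage in the spirit described above follows (the five-stage framework itself is not reproduced): run several filter methods and keep the genes that rank consistently well across them.

```python
# Ensemble gene filtering by mean-rank aggregation across several filters.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import chi2, f_classif, mutual_info_classif
from sklearn.preprocessing import MinMaxScaler

X, y = make_classification(n_samples=120, n_features=500, n_informative=10,
                           random_state=0)
Xp = MinMaxScaler().fit_transform(X)  # chi2 requires non-negative input

filters = [f_classif(Xp, y)[0], chi2(Xp, y)[0],
           mutual_info_classif(Xp, y, random_state=0)]
# Aggregate by mean rank across filters (rank 0 = most informative).
ranks = np.mean([np.argsort(np.argsort(-s)) for s in filters], axis=0)
stable_genes = np.argsort(ranks)[:20]
print(stable_genes)  # genes scoring consistently well across all filters
```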
