multivariate classification based: Topics by Science.gov

Sample records for multivariate classification based

Drunk driving detection based on classification of multivariate time series.

PubMed

Li, Zhenlong; Jin, Xue; Zhao, Xiaohua

2015-09-01

This paper addresses the problem of detecting drunk driving based on classification of multivariate time series. First, driving performance measures were collected from a test in a driving simulator located in the Traffic Research Center, Beijing University of Technology. Lateral position and steering angle were used to detect drunk driving. Second, multivariate time series analysis was performed to extract the features. A piecewise linear representation was used to represent the multivariate time series. A bottom-up algorithm was then employed to segment the multivariate time series. The slope and time interval of each segment were extracted as the features for classification. Third, a support vector machine classifier was used to classify the driver's state into two classes (normal or drunk) according to the extracted features. The proposed approach achieved an accuracy of 80.0%. Drunk driving detection based on the analysis of multivariate time series is feasible and effective. The approach has implications for drunk driving detection. Copyright © 2015 Elsevier Ltd and National Safety Council. All rights reserved.
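
Illustrative sketch (not taken from the paper): the abstract describes a piecewise linear representation built by a bottom-up algorithm, with the slope and time interval of each segment used as features. The Python below shows one common form of bottom-up segmentation on a single channel. The function names, the squared-error merge cost, the `max_error` threshold, and the toy signal are all assumptions made for illustration; the authors' handling of the two channels (lateral position and steering angle) and their stopping rule are not given in the abstract.

```python
def fit_line(ts, ys):
    """Least-squares line through the points; returns (slope, intercept, squared error)."""
    n = len(ts)
    t_mean = sum(ts) / n
    y_mean = sum(ys) / n
    sxx = sum((t - t_mean) ** 2 for t in ts)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, ys))
    slope = sxy / sxx if sxx else 0.0
    intercept = y_mean - slope * t_mean
    err = sum((y - (slope * t + intercept)) ** 2 for t, y in zip(ts, ys))
    return slope, intercept, err


def bottom_up_segments(ts, ys, max_error):
    """Greedily merge adjacent segments while the cheapest merge costs at most max_error.

    Segments are half-open index ranges [start, end) over the samples.
    """
    # Finest segmentation: two samples per segment.
    bounds = [(i, min(i + 2, len(ts))) for i in range(0, len(ts), 2)]
    # With an odd number of samples, fold the trailing single sample into its neighbour.
    if len(bounds) > 1 and bounds[-1][1] - bounds[-1][0] < 2:
        last = bounds.pop()
        bounds[-1] = (bounds[-1][0], last[1])
    while len(bounds) > 1:
        # Cost of merging each adjacent pair = squared error of one line over both.
        costs = [fit_line(ts[a:d], ys[a:d])[2]
                 for (a, _), (_, d) in zip(bounds, bounds[1:])]
        best = min(range(len(costs)), key=costs.__getitem__)
        if costs[best] > max_error:
            break
        bounds[best:best + 2] = [(bounds[best][0], bounds[best + 1][1])]
    return bounds


def segment_features(ts, ys, bounds):
    """One (slope, time interval) pair per segment."""
    feats = []
    for a, b in bounds:
        slope, _, _ = fit_line(ts[a:b], ys[a:b])
        feats.append((slope, ts[b - 1] - ts[a]))
    return feats


# Toy signal (hypothetical): rises for four samples, then falls for four.
ts = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
ys = [0.0, 1.0, 2.0, 3.0, 3.0, 2.0, 1.0, 0.0]
bounds = bottom_up_segments(ts, ys, max_error=0.1)
features = segment_features(ts, ys, bounds)
print(bounds, features)
```

In a pipeline like the one the abstract outlines, the per-segment pairs from each channel would be assembled into a feature vector and passed to a classifier such as an SVM; that step is omitted here.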
An efficient swarm intelligence approach to feature selection based on invasive weed optimization: Application to multivariate calibration and classification using spectroscopic data

NASA Astrophysics Data System (ADS)

Sheykhizadeh, Saheleh; Naseri, Abdolhossein

2018-04-01

Variable selection plays a key role in classification and multivariate calibration. Variable selection methods aim to choose, from a large pool of available predictors, a set of variables relevant to estimating analyte concentrations or to achieving better classification results. Many variable selection techniques have been introduced, among which those based on swarm intelligence optimization have attracted particular attention over the last few decades, since they are mainly inspired by nature. In this work, a simple and new variable selection algorithm is proposed according to the invasive weed optimization (IWO) concept. IWO is considered a bio-inspired metaheuristic mimicking the ecological behavior of weeds in colonizing as well as finding an appropriate place for growth and reproduction; it has been shown to be powerful and very adaptive to environmental changes. In this paper, the first application of IWO, as a very simple and powerful method, to variable selection is reported using different experimental datasets including FTIR and NIR data, so as to undertake classification and multivariate calibration tasks. Accordingly, invasive weed optimization-linear discriminant analysis (IWO-LDA) and invasive weed optimization-partial least squares (IWO-PLS) are introduced for multivariate classification and calibration, respectively.
An efficient swarm intelligence approach to feature selection based on invasive weed optimization: Application to multivariate calibration and classification using spectroscopic data.

PubMed

Sheykhizadeh, Saheleh; Naseri, Abdolhossein

2018-04-05

Variable selection plays a key role in classification and multivariate calibration. Variable selection methods aim to choose, from a large pool of available predictors, a set of variables relevant to estimating analyte concentrations or to achieving better classification results. Many variable selection techniques have been introduced, among which those based on swarm intelligence optimization have attracted particular attention over the last few decades, since they are mainly inspired by nature. In this work, a simple and new variable selection algorithm is proposed according to the invasive weed optimization (IWO) concept. IWO is considered a bio-inspired metaheuristic mimicking the ecological behavior of weeds in colonizing as well as finding an appropriate place for growth and reproduction; it has been shown to be powerful and very adaptive to environmental changes. In this paper, the first application of IWO, as a very simple and powerful method, to variable selection is reported using different experimental datasets including FTIR and NIR data, so as to undertake classification and multivariate calibration tasks. Accordingly, invasive weed optimization-linear discriminant analysis (IWO-LDA) and invasive weed optimization-partial least squares (IWO-PLS) are introduced for multivariate classification and calibration, respectively. Copyright © 2018 Elsevier B.V. All rights reserved.
An information-based network approach for protein classification

PubMed Central

Wan, Xiaogeng; Zhao, Xin; Yau, Stephen S. T.

2017-01-01

Protein classification is one of the critical problems in bioinformatics. Early studies used geometric distances and phylogenetic trees to classify proteins. These methods use binary trees to present protein classification. In this paper, we propose a new protein classification method, whereby theories of information and networks are used to classify the multivariate relationships of proteins. In this study, the protein universe is modeled as an undirected network, where proteins are classified according to their connections. Our method is unsupervised, multivariate, and alignment-free. It can be applied to the classification of both protein sequences and structures. Nine examples are used to demonstrate the efficiency of our new method. PMID:28350835
Use of collateral information to improve LANDSAT classification accuracies

NASA Technical Reports Server (NTRS)

Strahler, A. H. (Principal Investigator)

1981-01-01

Methods to improve LANDSAT classification accuracies were investigated including: (1) the use of prior probabilities in maximum likelihood classification as a methodology to integrate discrete collateral data with continuously measured image density variables; (2) the use of the logit classifier as an alternative to multivariate normal classification that permits mixing both continuous and categorical variables in a single model and fits empirical distributions of observations more closely than the multivariate normal density function; and (3) the use of collateral data in a geographic information system as exercised to model a desired output information layer as a function of input layers of raster format collateral and image data base layers.
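
Illustrative sketch (not taken from the report): item (1) above combines maximum likelihood classification with prior probabilities derived from discrete collateral data. A minimal way to express this is to add the log prior for the pixel's collateral stratum to the class log-likelihood and pick the largest score. The single-band Gaussian model, the class statistics, the stratum names, and the prior values below are hypothetical; the report's actual classes, bands, and collateral layers are not given in the abstract.

```python
import math


def log_gaussian(x, mean, var):
    """Log density of a univariate normal distribution."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def classify(x, class_stats, priors):
    """Pick the class maximizing log prior + log likelihood.

    class_stats: class name -> (mean, variance) of the image variable
    priors:      class name -> prior probability for this pixel's collateral stratum
    """
    scores = {
        name: math.log(priors[name]) + log_gaussian(x, mean, var)
        for name, (mean, var) in class_stats.items()
    }
    return max(scores, key=scores.get)


# Hypothetical class statistics for one image band.
class_stats = {"forest": (40.0, 25.0), "water": (60.0, 25.0)}

# Hypothetical priors conditioned on a collateral layer (elevation stratum).
priors_by_stratum = {
    "low": {"forest": 0.2, "water": 0.8},
    "high": {"forest": 0.9, "water": 0.1},
}

# A pixel value of 50 is equally likely under both classes,
# so the collateral prior alone decides the label.
label_low = classify(50.0, class_stats, priors_by_stratum["low"])
label_high = classify(50.0, class_stats, priors_by_stratum["high"])
print(label_low, label_high)
```

When the spectral evidence is strong, it outweighs the prior: with the same hypothetical numbers, a pixel value of 44 in the "low" stratum is still labelled forest.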
Interpreting support vector machine models for multivariate group wise analysis in neuroimaging

PubMed Central

Gaonkar, Bilwaj; Shinohara, Russell T; Davatzikos, Christos

2015-01-01

Machine learning based classification algorithms like support vector machines (SVMs) have shown great promise for turning high dimensional neuroimaging data into clinically useful decision criteria. However, tracing imaging based patterns that contribute significantly to classifier decisions remains an open problem. This is an issue of critical importance in imaging studies seeking to determine which anatomical or physiological imaging features contribute to the classifier’s decision, thereby allowing users to critically evaluate the findings of such machine learning methods and to understand disease mechanisms. The majority of published work addresses the question of statistical inference for support vector classification using permutation tests based on SVM weight vectors. Such permutation testing ignores the SVM margin, which is critical in SVM theory. In this work we emphasize the use of a statistic that explicitly accounts for the SVM margin and show that the null distributions associated with this statistic are asymptotically normal. Further, our experiments show that this statistic is much less conservative than weight based permutation tests and yet specific enough to tease out multivariate patterns in the data. Thus, we can better understand the multivariate patterns that the SVM uses for neuroimaging based classification. PMID:26210913
Microcomputer-based classification of environmental data in municipal areas

NASA Astrophysics Data System (ADS)

Thiergärtner, H.

1995-10-01

Multivariate data-processing methods used in mineral resource identification can be used to classify urban regions. Using elements of expert systems, geographical information systems, as well as known classification and prognosis systems, it is possible to outline a single model that consists of resistant and of temporary parts of a knowledge base including graphical input and output treatment and of resistant and temporary elements of a bank of methods and algorithms. Whereas decision rules created by experts will be stored in expert systems directly, powerful classification rules in form of resistant but latent (implicit) decision algorithms may be implemented in the suggested model. The latent functions will be transformed into temporary explicit decision rules by learning processes depending on the actual task(s), parameter set(s), pixels selection(s), and expert control(s). This takes place both at supervised and nonsupervised classification of multivariately described pixel sets representing municipal subareas. The model is outlined briefly and illustrated by results obtained in a target area covering a part of the city of Berlin (Germany).
Multivariate statistical analysis software technologies for astrophysical research involving large data bases

NASA Technical Reports Server (NTRS)

Djorgovski, George

1993-01-01

The existing and forthcoming data bases from NASA missions contain an abundance of information whose complexity cannot be efficiently tapped with simple statistical techniques. Powerful multivariate statistical methods already exist which can be used to harness much of the richness of these data. Automatic classification techniques have been developed to solve the problem of identifying known types of objects in multiparameter data sets, in addition to leading to the discovery of new physical phenomena and classes of objects. We propose an exploratory study and integration of promising techniques in the development of a general and modular classification/analysis system for very large data bases, which would enhance and optimize data management and the use of human research resource.
Multivariate statistical analysis software technologies for astrophysical research involving large data bases

NASA Technical Reports Server (NTRS)

Djorgovski, Stanislav

1992-01-01

The existing and forthcoming data bases from NASA missions contain an abundance of information whose complexity cannot be efficiently tapped with simple statistical techniques. Powerful multivariate statistical methods already exist which can be used to harness much of the richness of these data. Automatic classification techniques have been developed to solve the problem of identifying known types of objects in multiparameter data sets, in addition to leading to the discovery of new physical phenomena and classes of objects. We propose an exploratory study and integration of promising techniques in the development of a general and modular classification/analysis system for very large data bases, which would enhance and optimize data management and the use of human research resources.
Estimating the Classification Efficiency of a Test Battery.

ERIC Educational Resources Information Center

De Corte, Wilfried

2000-01-01

Shows how a theorem proven by H. Brogden (1951, 1959) can be used to estimate the allocation average (a predictor based classification of a test battery) assuming that the predictor intercorrelations and validities are known and that the predictor variables have a joint multivariate normal distribution. (SLD)
Lameness detection in dairy cattle: single predictor v. multivariate analysis of image-based posture processing and behaviour and performance sensing.

PubMed

Van Hertem, T; Bahr, C; Schlageter Tello, A; Viazzi, S; Steensels, M; Romanini, C E B; Lokhorst, C; Maltz, E; Halachmi, I; Berckmans, D

2016-09-01

The objective of this study was to evaluate if a multi-sensor system (milk, activity, body posture) was a better classifier for lameness than the single-sensor-based detection models. Between September 2013 and August 2014, 3629 cow observations were collected on a commercial dairy farm in Belgium. Human locomotion scoring was used as reference for the model development and evaluation. Cow behaviour and performance were measured with existing sensors that were already present at the farm. A prototype three-dimensional video recording system was used to automatically quantify the back posture of a cow. For the single predictor comparisons, a receiver operating characteristics curve was made. For the multivariate detection models, logistic regression and generalized linear mixed models (GLMM) were developed. The best lameness classification model was obtained by the multi-sensor analysis (area under the receiver operating characteristics curve (AUC)=0.757±0.029), containing a combination of milk and milking variables, activity and gait and posture variables from videos. Second, the multivariate video-based system (AUC=0.732±0.011) performed better than the multivariate milk sensors (AUC=0.604±0.026) and the multivariate behaviour sensors (AUC=0.633±0.018). The video-based system performed better than the combined behaviour and performance-based detection model (AUC=0.669±0.028), indicating that it is worthwhile to consider a video-based lameness detection system, regardless of the presence of other existing sensors in the farm. The results suggest that Θ2, the feature variable for the back curvature around the hip joints, with an AUC of 0.719 is the best single predictor variable for lameness detection based on locomotion scoring. In general, this study showed that the video-based back posture monitoring system outperforms the behaviour and performance sensing techniques for locomotion scoring-based lameness detection.
A GLMM with seven specific variables (walking speed, back posture measurement, daytime activity, milk yield, lactation stage, milk peak flow rate and milk peak conductivity) is the best combination of variables for lameness classification. The accuracy on four-level lameness classification was 60.3%. The accuracy improved to 79.8% for binary lameness classification. The binary GLMM obtained a sensitivity of 68.5% and a specificity of 87.6%, which both exceed the sensitivity (52.1%±4.7%) and specificity (83.2%±2.3%) of the multi-sensor logistic regression model. This shows that the repeated measures analysis in the GLMM, taking into account the individual history of the animal, outperforms the classification when thresholds based on herd level (a statistical population) are used.
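
Illustrative sketch (not taken from the study): the record above reports AUC, sensitivity, and specificity. The Python below shows how these quantities are defined for a binary classifier. The labels, scores, and the 0.5 decision threshold are hypothetical toy values and are unrelated to the cow observations; AUC is computed here as the fraction of (positive, negative) pairs in which the positive case receives the higher score, with ties counted as one half.

```python
def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = lame, 0 = not lame.
y_true = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

sens, spec = sensitivity_specificity(y_true, y_pred)
area = auc(y_true, scores)
print(sens, spec, area)
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the abstract can report both for the same model.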
Testing Multivariate Adaptive Regression Splines (MARS) as a Method of Land Cover Classification of TERRA-ASTER Satellite Images.

PubMed

Quirós, Elia; Felicísimo, Angel M; Cuartero, Aurora

2009-01-01

This work proposes a new method to classify multi-spectral satellite images based on multivariate adaptive regression splines (MARS) and compares this classification system with the more common parallelepiped and maximum likelihood (ML) methods. We apply the classification methods to the land cover classification of a test zone located in southwestern Spain. The basis of the MARS method and its associated procedures are explained in detail, and the area under the ROC curve (AUC) is compared for the three methods. The results show that the MARS method provides better results than the parallelepiped method in all cases, and it provides better results than the maximum likelihood method in 13 cases out of 17. These results demonstrate that the MARS method can be used in isolation or in combination with other methods to improve the accuracy of soil cover classification. The improvement is statistically significant according to the Wilcoxon signed rank test.
External Validation of the European Hernia Society Classification for Postoperative Complications after Incisional Hernia Repair: A Cohort Study of 2,191 Patients.

PubMed

Kroese, Leonard F; Kleinrensink, Gert-Jan; Lange, Johan F; Gillion, Jean-Francois

2018-03-01

Incisional hernia is a frequent complication after midline laparotomy. Surgical hernia repair is associated with complications, but no clear predictive risk factors have been identified. The European Hernia Society (EHS) classification offers a structured framework to describe hernias and to analyze postoperative complications. Because of its structured nature, it might prove to be useful for preoperative patient or treatment classification. The objective of this study was to investigate the EHS classification as a predictor for postoperative complications after incisional hernia surgery. An analysis was performed using a registry-based, large-scale, prospective cohort study, including all patients undergoing incisional hernia surgery between September 1, 2011 and February 29, 2016. Univariate analyses and multivariable logistic regression analysis were performed to identify risk factors for postoperative complications. A total of 2,191 patients were included, of whom 323 (15%) had 1 or more complications. Factors associated with complications in univariate analyses (p < 0.20) and clinically relevant factors were included in the multivariable analysis. In the multivariable analysis, EHS width class, incarceration, open surgery, duration of surgery, Altemeier wound class, and therapeutic antibiotic treatment were independent risk factors for postoperative complications. Third recurrence and emergency surgery were associated with fewer complications. Incisional hernia repair is associated with a 15% complication rate. The EHS width classification is associated with postoperative complications. To identify patients at risk for complications, the EHS classification is useful. Copyright © 2017. Published by Elsevier Inc.
Combination of laser-induced breakdown spectroscopy and Raman spectroscopy for multivariate classification of bacteria

NASA Astrophysics Data System (ADS)

Prochazka, D.; Mazura, M.; Samek, O.; Rebrošová, K.; Pořízka, P.; Klus, J.; Prochazková, P.; Novotný, J.; Novotný, K.; Kaiser, J.

2018-01-01

In this work, we investigate the impact of data provided by complementary laser-based spectroscopic methods on multivariate classification accuracy. Discrimination and classification of five Staphylococcus bacterial strains and one strain of Escherichia coli is presented. The technique that we used for measurements is a combination of Raman spectroscopy and Laser-Induced Breakdown Spectroscopy (LIBS). Obtained spectroscopic data were then processed using Multivariate Data Analysis algorithms. Principal Components Analysis (PCA) was selected as the most suitable technique for visualization of bacterial strains data. To classify the bacterial strains, we used Neural Networks, namely a supervised version of Kohonen's self-organizing maps (SOM). We processed the results in three different ways: from LIBS measurements alone, from Raman measurements alone, and from the merged data of both methods. The three types of results were then compared. By applying the PCA to Raman spectroscopy data, we observed that two bacterial strains were fully distinguished from the rest of the data set. In the case of LIBS data, three bacterial strains were fully discriminated. Using a combination of data from both methods, we achieved the complete discrimination of all bacterial strains. All the data were classified with a high success rate using SOM algorithm. The most accurate classification was obtained using a combination of data from both techniques. The classification accuracy varied, depending on specific samples and techniques. For LIBS, the classification accuracy ranged from 45% to 100%; for Raman spectroscopy, from 50% to 100%; and in the case of merged data, all samples were classified correctly. Based on the results of the experiments presented in this work, we can assume that the combination of Raman spectroscopy and LIBS significantly enhances discrimination and classification accuracy of bacterial species and strains. The reason is the complementarity in obtained chemical information while using these two methods.
Multivariate classification of infrared spectra of cell and tissue samples

DOEpatents

Haaland, David M.; Jones, Howland D. T.; Thomas, Edward V.

1997-01-01

Multivariate classification techniques are applied to spectra from cell and tissue samples irradiated with infrared radiation to determine if the samples are normal or abnormal (cancerous). Mid and near infrared radiation can be used for in vivo and in vitro classifications using at least different wavelengths.
Particle analysis using laser ablation mass spectroscopy

DOEpatents

Parker, Eric P.; Rosenthal, Stephen E.; Trahan, Michael W.; Wagner, John S.

2003-09-09

The present invention provides a method of quickly identifying bioaerosols by class, even if the subject bioaerosol has not been previously encountered. The method begins by collecting laser ablation mass spectra from known particles. The spectra are correlated with the known particles, including the species of particle and the classification (e.g., bacteria). The spectra can then be used to train a neural network, for example using genetic algorithm-based training, to recognize each spectrum and to recognize characteristics of the classifications. The spectra can also be used in a multivariate patch algorithm. Laser ablation mass spectra from unknown particles can be presented as inputs to the trained neural net for identification as to classification. The description below first describes suitable intelligent algorithms and multivariate patch algorithms, then presents an example of the present invention including results.
A Deep Learning Architecture for Temporal Sleep Stage Classification Using Multivariate and Multimodal Time Series.

PubMed

Chambon, Stanislas; Galtier, Mathieu N; Arnal, Pierrick J; Wainrib, Gilles; Gramfort, Alexandre

2018-04-01

Sleep stage classification constitutes an important preliminary exam in the diagnosis of sleep disorders. It is traditionally performed by a sleep expert who assigns a sleep stage to each 30 s of signal, based on the visual inspection of signals such as electroencephalograms (EEGs), electrooculograms (EOGs), electrocardiograms, and electromyograms (EMGs). We introduce here the first deep learning approach for sleep stage classification that learns end-to-end without computing spectrograms or extracting handcrafted features, that exploits all multivariate and multimodal polysomnography (PSG) signals (EEG, EMG, and EOG), and that can exploit the temporal context of each 30-s window of data. For each modality, the first layer learns linear spatial filters that exploit the array of sensors to increase the signal-to-noise ratio, and the last layer feeds the learnt representation to a softmax classifier. Our model is compared to alternative automatic approaches based on convolutional networks or decision trees. Results obtained on 61 publicly available PSG records with up to 20 EEG channels demonstrate that our network architecture yields state-of-the-art performance. Our study reveals a number of insights on the spatiotemporal distribution of the signal of interest: a good tradeoff for optimal classification performance measured with balanced accuracy is to use 6 EEG with 2 EOG (left and right) and 3 EMG chin channels. Also, exploiting 1 min of data before and after each data segment offers the strongest improvement when a limited number of channels are available. Like sleep experts, our system exploits the multivariate and multimodal nature of PSG signals in order to deliver state-of-the-art classification performance with a small computational cost.
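
Illustrative sketch (not taken from the paper): the abstract measures performance with balanced accuracy, which is commonly defined as the mean of the per-class recalls, so that frequent classes do not dominate the score. The stage labels and predictions below are hypothetical toy values, not results from the PSG records.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    return sum(correct[c] / total[c] for c in total) / len(total)


def plain_accuracy(y_true, y_pred):
    """Fraction of windows labelled correctly, regardless of class."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Hypothetical labels for eight 30-s windows; N2 is the majority class.
y_true = ["W", "W", "N2", "N2", "N2", "N2", "REM", "REM"]
y_pred = ["W", "N2", "N2", "N2", "N2", "N2", "REM", "N2"]

bal = balanced_accuracy(y_true, y_pred)
acc = plain_accuracy(y_true, y_pred)
print(bal, acc)
```

In this toy case the classifier over-predicts the majority stage, so plain accuracy (0.75) looks better than balanced accuracy (about 0.67), which is the kind of imbalance effect the metric is meant to expose.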
R-parametrization and its role in classification of linear multivariable feedback systems

NASA Technical Reports Server (NTRS)

Chen, Robert T. N.

1988-01-01

A classification of all the compensators that stabilize a given general plant in a linear, time-invariant multi-input, multi-output feedback system is developed. This classification, along with the associated necessary and sufficient conditions for stability of the feedback system, is achieved through the introduction of a new parameterization, referred to as R-Parameterization, which is a dual of the familiar Q-Parameterization. The classification is made to the stability conditions of the compensators and the plant by themselves; and necessary and sufficient conditions are based on the stability of Q and R themselves.
Fast classification of hazelnut cultivars through portable infrared spectroscopy and chemometrics

NASA Astrophysics Data System (ADS)

Manfredi, Marcello; Robotti, Elisa; Quasso, Fabio; Mazzucco, Eleonora; Calabrese, Giorgio; Marengo, Emilio

2018-01-01

The authentication and traceability of hazelnuts are very important for both the consumer and the food industry, to safeguard the protected varieties and the food quality. This study investigates the use of a portable FTIR spectrometer coupled to multivariate statistical analysis for the classification of raw hazelnuts. The method discriminates hazelnuts from different origins/cultivars based on differences in the signal intensities of their IR spectra. The multivariate classification methods, namely principal component analysis (PCA) followed by linear discriminant analysis (LDA) and partial least squares discriminant analysis (PLS-DA), with or without variable selection, allowed a very good discrimination among the groups, with PLS-DA coupled to variable selection providing the best results. Owing to its fast analysis, high sensitivity, simplicity, and lack of sample preparation, the proposed analytical methodology could be successfully used to verify the cultivar of hazelnuts, and the analysis can be performed quickly and directly on site.
Control-group feature normalization for multivariate pattern analysis of structural MRI data using the support vector machine.

PubMed

Linn, Kristin A; Gaonkar, Bilwaj; Satterthwaite, Theodore D; Doshi, Jimit; Davatzikos, Christos; Shinohara, Russell T

2016-05-15

Normalization of feature vector values is a common practice in machine learning. Generally, each feature value is standardized to the unit hypercube or by normalizing to zero mean and unit variance. Classification decisions based on support vector machines (SVMs) or by other methods are sensitive to the specific normalization used on the features. In the context of multivariate pattern analysis using neuroimaging data, standardization effectively up- and down-weights features based on their individual variability. Since the standard approach uses the entire data set to guide the normalization, it utilizes the total variability of these features. This total variation is inevitably dependent on the amount of marginal separation between groups. Thus, such a normalization may attenuate the separability of the data in high dimensional space. In this work we propose an alternate approach that uses an estimate of the control-group standard deviation to normalize features before training. We study our proposed approach in the context of group classification using structural MRI data. We show that control-based normalization leads to better reproducibility of estimated multivariate disease patterns and improves the classifier performance in many cases. Copyright © 2016 Elsevier Inc. All rights reserved.
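
Illustrative sketch (not taken from the paper): the abstract proposes scaling features by an estimate of the control-group standard deviation instead of the standard deviation of the whole sample. The Python below contrasts the two on a toy matrix. The function names and data are hypothetical, and only scaling is shown; whether and how the authors also centre the features is not stated in the abstract.

```python
import statistics


def column_sds(rows):
    """Sample standard deviation of each feature column."""
    n_features = len(rows[0])
    return [statistics.stdev([row[j] for row in rows]) for j in range(n_features)]


def scale_by(rows, sds):
    """Divide every feature value by the matching column scale."""
    return [[x / sd for x, sd in zip(row, sds)] for row in rows]


def control_normalize(features, is_control):
    """Scale all subjects using standard deviations estimated from controls only."""
    controls = [row for row, c in zip(features, is_control) if c]
    return scale_by(features, column_sds(controls))


# Hypothetical data: rows are subjects, columns are two imaging features.
# Feature 0 separates the groups strongly; feature 1 does not.
features = [
    [1.0, 10.0],
    [3.0, 10.5],
    [2.0, 11.0],
    [8.0, 10.2],
    [9.0, 10.8],
]
is_control = [True, True, True, False, False]

control_scaled = control_normalize(features, is_control)
pooled_scaled = scale_by(features, column_sds(features))
print(control_scaled[3], pooled_scaled[3])
```

Because the whole-sample standard deviation of feature 0 is inflated by the group difference itself, pooled scaling shrinks that feature more than control-based scaling does, which is the attenuation effect the abstract describes.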

Laser-induced breakdown spectroscopy-based investigation and classification of pharmaceutical tablets using multivariate chemometric analysis

PubMed Central

Myakalwar, Ashwin Kumar; Sreedhar, S.; Barman, Ishan; Dingari, Narahara Chari; Rao, S. Venugopal; Kiran, P. Prem; Tewari, Surya P.; Kumar, G. Manoj

2012-01-01

We report the effectiveness of laser-induced breakdown spectroscopy (LIBS) in probing the content of pharmaceutical tablets and also investigate its feasibility for routine classification. This method is particularly beneficial in applications where its exquisite chemical specificity and suitability for remote and on site characterization significantly improves the speed and accuracy of quality control and assurance process. Our experiments reveal that in addition to the presence of carbon, hydrogen, nitrogen and oxygen, which can be primarily attributed to the active pharmaceutical ingredients, specific inorganic atoms were also present in all the tablets. Initial attempts at classification by a ratiometric approach using oxygen to nitrogen compositional values yielded an optimal value (at 746.83 nm) with the least relative standard deviation but nevertheless failed to provide an acceptable classification. To overcome this bottleneck in the detection process, two chemometric algorithms, i.e. principal component analysis (PCA) and soft independent modeling of class analogy (SIMCA), were implemented to exploit the multivariate nature of the LIBS data demonstrating that LIBS has the potential to differentiate and discriminate among pharmaceutical tablets. We report excellent prospective classification accuracy using supervised classification via the SIMCA algorithm, demonstrating its potential for future applications in process analytical technology, especially for fast on-line process control monitoring applications in the pharmaceutical industry. PMID:22099648
Statistical methods and neural network approaches for classification of data from multiple sources

NASA Technical Reports Server (NTRS)

Benediktsson, Jon Atli; Swain, Philip H.

1990-01-01

Statistical methods for classification of data from multiple data sources are investigated and compared to neural network models. A problem with using conventional multivariate statistical approaches for classification of data of multiple types is in general that a multivariate distribution cannot be assumed for the classes in the data sources. Another common problem with statistical classification methods is that the data sources are not equally reliable. This means that the data sources need to be weighted according to their reliability but most statistical classification methods do not have a mechanism for this. This research focuses on statistical methods which can overcome these problems: a method of statistical multisource analysis and consensus theory. Reliability measures for weighting the data sources in these methods are suggested and investigated. Secondly, this research focuses on neural network models. The neural networks are distribution free since no prior knowledge of the statistical distribution of the data is needed. This is an obvious advantage over most statistical classification methods. The neural networks also automatically take care of the problem involving how much weight each data source should have. On the other hand, their training process is iterative and can take a very long time. Methods to speed up the training procedure are introduced and investigated. Experimental results of classification using both neural network models and statistical methods are given, and the approaches are compared based on these results.
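
Illustrative sketch (not taken from the report): the abstract says data sources should be weighted by reliability and names consensus theory as one way to do so. One widely used consensus rule is the linear opinion pool, a weighted average of the per-source class probabilities. Whether this exact rule was used is not stated in the abstract, and the class names, probabilities, and weights below are hypothetical.

```python
def linear_opinion_pool(source_posteriors, weights):
    """Weighted average of per-source class probabilities.

    source_posteriors: list of dicts, each mapping class name -> probability
    weights:           one non-negative reliability weight per source
    """
    total = sum(weights)
    classes = source_posteriors[0].keys()
    return {
        c: sum(w * p[c] for w, p in zip(weights, source_posteriors)) / total
        for c in classes
    }


def decide(pooled):
    """Label with the highest pooled probability."""
    return max(pooled, key=pooled.get)


# Hypothetical posteriors from two data sources for one pixel.
spectral = {"urban": 0.7, "forest": 0.3}
elevation = {"urban": 0.2, "forest": 0.8}

# Trusting the spectral source more favours "urban" ...
weighted = linear_opinion_pool([spectral, elevation], [0.8, 0.2])
# ... while equal trust favours "forest".
equal = linear_opinion_pool([spectral, elevation], [0.5, 0.5])
print(decide(weighted), decide(equal))
```

The example shows why the choice of reliability weights matters: the same two sources yield different labels depending on how much each is trusted.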
Various forms of indexing HDMR for modelling multivariate classification problems

DOE Office of Scientific and Technical Information (OSTI.GOV)

Aksu, Çağrı; Tunga, M. Alper

2014-12-10

The Indexing HDMR method was recently developed for modelling multivariate interpolation problems. The method uses the Plain HDMR philosophy in partitioning the given multivariate data set into less variate data sets and then constructing an analytical structure through these partitioned data sets to represent the given multidimensional problem. Indexing HDMR makes HDMR applicable to classification problems having real world data. Mostly, we do not know all possible class values in the domain of the given problem, that is, we have a non-orthogonal data structure. However, Plain HDMR needs an orthogonal data structure in the given problem to be modelled. In this sense, the main idea of this work is to offer various forms of Indexing HDMR to successfully model these real life classification problems. To test these different forms, several well-known multivariate classification problems given in UCI Machine Learning Repository were used and it was observed that the accuracy results lie between 80% and 95% which are very satisfactory.
Classification of Ilex species based on metabolomic fingerprinting using nuclear magnetic resonance and multivariate data analysis.

PubMed

Choi, Young Hae; Sertic, Sarah; Kim, Hye Kyong; Wilson, Erica G; Michopoulos, Filippos; Lefeber, Alfons W M; Erkelens, Cornelis; Prat Kricun, Sergio D; Verpoorte, Robert

2005-02-23

The metabolomic analysis of 11 Ilex species, I. argentina, I. brasiliensis, I. brevicuspis, I. dumosa var. dumosa, I. dumosa var. guaranina, I. integerrima, I. microdonta, I. paraguariensis var. paraguariensis, I. pseudobuxus, I. taubertiana, and I. theezans, was carried out by NMR spectroscopy and multivariate data analysis. The analysis using principal component analysis and classification of the ¹H NMR spectra showed a clear discrimination of those samples based on the metabolites present in the organic and aqueous fractions. The major metabolites that contribute to the discrimination are arbutin, caffeine, phenylpropanoids, and theobromine. Among those metabolites, arbutin, which has not been reported yet as a constituent of Ilex species, was found to be a biomarker for I. argentina, I. brasiliensis, I. brevicuspis, I. integerrima, I. microdonta, I. pseudobuxus, I. taubertiana, and I. theezans. This reliable method based on the determination of a large number of metabolites makes the chemotaxonomical analysis of Ilex species possible.
Multivariate Analysis As a Support for Diagnostic Flowcharts in Allergic Bronchopulmonary Aspergillosis: A Proof-of-Concept Study.

PubMed

Vitte, Joana; Ranque, Stéphane; Carsin, Ania; Gomez, Carine; Romain, Thomas; Cassagne, Carole; Gouitaa, Marion; Baravalle-Einaudi, Mélisande; Bel, Nathalie Stremler-Le; Reynaud-Gaubert, Martine; Dubus, Jean-Christophe; Mège, Jean-Louis; Gaudart, Jean

2017-01-01

Molecular-based allergy diagnosis yields multiple biomarker datasets. The classical diagnostic score for allergic bronchopulmonary aspergillosis (ABPA), a severe disease usually occurring in asthmatic patients and people with cystic fibrosis, comprises succinct immunological criteria formulated in 1977: total IgE, anti-Aspergillus fumigatus (Af) IgE, anti-Af "precipitins," and anti-Af IgG. Progress achieved over the last four decades has made multiple IgE and IgG4 Af biomarkers available with quantitative, standardized, molecular-level reports. These newly available biomarkers have not been included in the current diagnostic criteria, either individually or in algorithms, despite persistent underdiagnosis of ABPA. Large numbers of individual biomarkers may hinder their use in clinical practice. Conversely, multivariate analysis using new tools may offer a better chance of fewer diagnostic mistakes. We report here a proof-of-concept work consisting of a three-step multivariate analysis of Af IgE, IgG, and IgG4 biomarkers through a combination of principal component analysis, hierarchical ascendant classification, and classification and regression tree multivariate analysis. The resulting diagnostic algorithms might show the way for novel criteria and improved diagnostic efficiency in Af-sensitized patients at risk for ABPA.
Learning a Mahalanobis Distance-Based Dynamic Time Warping Measure for Multivariate Time Series Classification.

PubMed

Mei, Jiangyuan; Liu, Meizhu; Wang, Yuan-Fang; Gao, Huijun

2016-06-01

Multivariate time series (MTS) datasets exist in numerous fields, including health care, multimedia, finance, and biometrics. How to classify MTS accurately has become a hot research topic since it is an important element in many computer vision and pattern recognition applications. In this paper, we propose a Mahalanobis distance-based dynamic time warping (DTW) measure for MTS classification. The Mahalanobis distance builds an accurate relationship between each variable and its corresponding category. It is utilized to calculate the local distance between vectors in MTS. Then we use DTW to align those MTS which are out of synchronization or have different lengths. After that, how to learn an accurate Mahalanobis distance function becomes another key problem. This paper establishes a LogDet divergence-based metric learning model with triplet constraints, which can learn a Mahalanobis matrix with high precision and robustness. Furthermore, the proposed method is applied to nine MTS datasets selected from the University of California, Irvine machine learning repository and Robert T. Olszewski's homepage, and the results demonstrate the improved performance of the proposed approach.
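The combination described above, a Mahalanobis local distance inside a DTW alignment, can be sketched as follows. This is an illustrative sketch only: the matrix `M` is supplied by hand (here the identity) rather than learned by the LogDet metric-learning step, and the names `mahalanobis` and `dtw_distance` as well as the toy series are assumptions, not taken from the paper.

```python
import math


def mahalanobis(u, v, M):
    """Local distance sqrt((u - v)^T M (u - v)) between two vectors."""
    d = [a - b for a, b in zip(u, v)]
    q = sum(d[i] * M[i][j] * d[j] for i in range(len(d)) for j in range(len(d)))
    return math.sqrt(max(q, 0.0))


def dtw_distance(X, Y, M):
    """Classic DTW over two multivariate series using the local distance above."""
    n, m = len(X), len(Y)
    inf = float("inf")
    # D[i][j] holds the cheapest alignment cost of X[:i] with Y[:j].
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = mahalanobis(X[i - 1], Y[j - 1], M)
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]


# Assumed Mahalanobis matrix: identity, i.e. plain Euclidean local distance.
M = [[1.0, 0.0], [0.0, 1.0]]

# Toy two-variable series; Y repeats one sample, Z differs in its last sample.
X = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
Y = [[0.0, 0.0], [1.0, 1.0], [1.0, 1.0], [2.0, 2.0]]
Z = [[0.0, 0.0], [1.0, 1.0], [2.0, 3.0]]

d_xy = dtw_distance(X, Y, M)
d_xz = dtw_distance(X, Z, M)
```

Because DTW may match one sample to several, the series of different lengths `X` and `Y` align at zero cost, while `Z` is penalised only for its final sample.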
A detailed comparison of analysis processes for MCC-IMS data in disease classification—Automated methods can replace manual peak annotations

PubMed Central

Horsch, Salome; Kopczynski, Dominik; Kuthe, Elias; Baumbach, Jörg Ingo; Rahmann, Sven

2017-01-01

Motivation: Disease classification from molecular measurements typically requires an analysis pipeline from raw noisy measurements to final classification results. Multi-capillary column ion mobility spectrometry (MCC-IMS) is a promising technology for the detection of volatile organic compounds in the air of exhaled breath. From raw measurements, the peak regions representing the compounds have to be identified, quantified, and clustered across different experiments. Currently, several steps of this analysis process require manual intervention of human experts. Our goal is to identify a fully automatic pipeline that yields competitive disease classification results compared to an established but subjective and tedious semi-manual process. Method: We combine a large number of modern methods for peak detection, peak clustering, and multivariate classification into analysis pipelines for raw MCC-IMS data. We evaluate all combinations on three different real datasets in an unbiased cross-validation setting. We determine which specific algorithmic combinations lead to high AUC values in disease classifications across the different medical application scenarios. Results: The best fully automated analysis process achieves even better classification results than the established manual process. The best algorithms for the three analysis steps are (i) SGLTR (Savitzky-Golay Laplace-operator filter thresholding regions) and LM (Local Maxima) for automated peak identification, (ii) EM clustering (Expectation Maximization) and DBSCAN (Density-Based Spatial Clustering of Applications with Noise) for the clustering step and (iii) RF (Random Forest) for multivariate classification. Thus, automated methods can replace the manual steps in the analysis process to enable an unbiased high throughput use of the technology. PMID:28910313
Chemometric and multivariate statistical analysis of time-of-flight secondary ion mass spectrometry spectra from complex Cu-Fe sulfides.

PubMed

Kalegowda, Yogesh; Harmer, Sarah L

2012-03-20

Time-of-flight secondary ion mass spectrometry (TOF-SIMS) spectra of mineral samples are complex, comprised of large mass ranges and many peaks. Consequently, characterization and classification analysis of these systems is challenging. In this study, different chemometric and statistical data evaluation methods, based on monolayer sensitive TOF-SIMS data, have been tested for the characterization and classification of copper-iron sulfide minerals (chalcopyrite, chalcocite, bornite, and pyrite) at different flotation pulp conditions (feed, conditioned feed, and Eh modified). The complex mass spectral data sets were analyzed using the following chemometric and statistical techniques: principal component analysis (PCA); principal component-discriminant functional analysis (PC-DFA); soft independent modeling of class analogy (SIMCA); and k-Nearest Neighbor (k-NN) classification. PCA was found to be an important first step in multivariate analysis, providing insight into both the relative grouping of samples and the elemental/molecular basis for those groupings. For samples exposed to oxidative conditions (at Eh ~430 mV), each technique (PCA, PC-DFA, SIMCA, and k-NN) was found to produce excellent classification. For samples at reductive conditions (at Eh ~ -200 mV SHE), k-NN and SIMCA produced the most accurate classification. Phase identification of particles that contain the same elements but a different crystal structure in a mixed multimetal mineral system has been achieved.
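Of the techniques listed above, k-NN classification is the simplest to illustrate. The sketch below assumes each sample has already been reduced to a short feature vector (for example, a pair of PCA scores); the coordinates, the choice of k = 3, and the function name `knn_predict` are hypothetical and are not taken from the study.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Assign x the majority label among its k nearest training samples."""
    # Rank training samples by Euclidean distance to the query point.
    neighbors = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], x))
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]


# Hypothetical two-dimensional scores for two mineral classes.
train_X = [(0.1, 0.2), (0.0, 0.3), (0.2, 0.1), (0.9, 0.8), (1.0, 0.9), (0.8, 1.0)]
train_y = ["pyrite", "pyrite", "pyrite", "bornite", "bornite", "bornite"]

label_a = knn_predict(train_X, train_y, (0.15, 0.15))
label_b = knn_predict(train_X, train_y, (0.9, 0.9))
```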
A multivariate pattern analysis study of the HIV-related white matter anatomical structural connections alterations

NASA Astrophysics Data System (ADS)

Tang, Zhenchao; Liu, Zhenyu; Li, Ruili; Cui, Xinwei; Li, Hongjun; Dong, Enqing; Tian, Jie

2017-03-01

It is widely known that HIV infection impairs white matter integrity. Nevertheless, it is still unclear how the white matter anatomical structural connections are affected by HIV infection. In the current study, we employed a multivariate pattern analysis to explore HIV-related alterations of white matter connections. Forty antiretroviral-therapy-naïve HIV patients and thirty healthy controls were enrolled. First, an Automated Anatomical Labeling (AAL) atlas-based white matter structural network, a 90 × 90 FA-weighted matrix, was constructed for each subject. Then, the white matter connections derived from the structural network were entered into a lasso-logistic regression model to perform HIV-control group classification. Using leave-one-out cross-validation, the classification model obtained a classification accuracy (ACC) of 90% (P = 0.002) and an area under the receiver operating characteristic curve (AUC) of 0.96. This result indicated that the white matter anatomical structural connections contributed greatly to HIV-control group classification, providing solid evidence that the white matter connections were affected by HIV infection. Specifically, 11 white matter connections were selected in the classification model, mainly crossing the frontal lobe, cingulum, hippocampus, and thalamus, regions reported to be damaged in previous HIV studies. This might suggest that the white matter connections adjacent to the HIV-related impaired regions were prone to damage.
Fast-HPLC Fingerprinting to Discriminate Olive Oil from Other Edible Vegetable Oils by Multivariate Classification Methods.

PubMed

Jiménez-Carvelo, Ana M; González-Casado, Antonio; Pérez-Castaño, Estefanía; Cuadros-Rodríguez, Luis

2017-03-01

A new analytical method for the differentiation of olive oil from other vegetable oils using reversed-phase LC and applying chemometric techniques was developed. A 3 cm short column was used to obtain the chromatographic fingerprint of the methyl-transesterified fraction of each vegetable oil. The chromatographic analysis took only 4 min. The multivariate classification methods used were k-nearest neighbors, partial least-squares (PLS) discriminant analysis, one-class PLS, support vector machine classification, and soft independent modeling of class analogies. The discrimination of olive oil from other vegetable edible oils was evaluated by several classification quality metrics. Several strategies for the classification of the olive oil were used: one input-class, two input-class, and pseudo two input-class.
Sparse Multivariate Autoregressive Modeling for Mild Cognitive Impairment Classification

PubMed Central

Li, Yang; Wee, Chong-Yaw; Jie, Biao; Peng, Ziwen

2014-01-01

Brain connectivity networks derived from functional magnetic resonance imaging (fMRI) are becoming increasingly prevalent in research on cognitive and perceptual processes. The capability to detect causal or effective connectivity is highly desirable for understanding the cooperative nature of brain networks, particularly when the ultimate goal is to obtain good performance of control-patient classification with biologically meaningful interpretations. Understanding directed functional interactions between brain regions via a brain connectivity network is a challenging task. Since many genetic and biomedical networks are intrinsically sparse, incorporating a sparsity property into connectivity modeling can make the derived models more biologically plausible. Accordingly, we propose an effective connectivity modeling of resting-state fMRI data based on the multivariate autoregressive (MAR) modeling technique, which is widely used to characterize temporal information of dynamic systems. This MAR modeling technique allows for the identification of effective connectivity using the Granger causality concept and reduces spurious causal connectivity in the assessment of directed functional interaction from fMRI data. A forward orthogonal least squares (OLS) regression algorithm is further used to construct a sparse MAR model. By applying the proposed modeling to mild cognitive impairment (MCI) classification, we identify several of the most discriminative regions, including middle cingulate gyrus, posterior cingulate gyrus, lingual gyrus and caudate regions, in line with previously reported findings. A relatively high classification accuracy of 91.89% is also achieved, an increase of 5.4% compared to the fully-connected, non-directional Pearson-correlation-based functional connectivity approach. PMID:24595922
Arm structure in normal spiral galaxies, 1: Multivariate data for 492 galaxies

NASA Technical Reports Server (NTRS)

Magri, Christopher

1994-01-01

Multivariate data have been collected as part of an effort to develop a new classification system for spiral galaxies, one which is not necessarily based on subjective morphological properties. A sample of 492 moderately bright northern Sa and Sc spirals was chosen for future statistical analysis. New observations were made at 20 and 21 cm; the latter data are described in detail here. Infrared Astronomical Satellite (IRAS) fluxes were obtained from archival data. Finally, new estimates of arm pattern randomness and of local environmental harshness were compiled for most sample objects.
Grouping Parturients by Parity, Previous-Cesarean, and Mode of Delivery (P-C-MoD Classification) Better Identifies Groups at Risk for Postpartum Hemorrhage.

PubMed

Reichman, Orna; Gal, Micahel; Sela, Hen Y; Khayyat, Izzat; Emanuel, Michael; Samueloff, Arnon

2016-10-01

Objective: We aimed to create a clinical classification to better identify parturients at risk for postpartum hemorrhage (PPH). Method: A retrospective cohort, including all women who delivered at a single tertiary care medical center between 2006 and 2014. Parturients were grouped by parity and history of cesarean delivery (CD): primiparas, multiparas, and multiparas with previous CD. Each group was further subgrouped by mode of delivery (spontaneous vaginal delivery [SVD], operative vaginal delivery [OVD], emergency or elective CD). In all, 12 subgroups, based on parity, previous cesarean, and mode of delivery, formed the P-C-MoD classification. PPH was defined as a decrease of ≥3 gram% hemoglobin from admission and/or transfusion of blood products. Univariate analysis followed by multivariate analysis was performed to assess risk for PPH, controlling for confounders. Results: The crude rate of PPH among 126,693 parturients was 7%. The prevalence differed significantly among independent risk factors: primiparity, 14%; multiparity, 4%; OVD, 22%; and CD, 15%. The P-C-MoD classification segregated better between parturients at risk for PPH. The prevalence of PPH was highest for primiparas undergoing OVD (27%) compared with multiparas with SVD (3%), odds ratio [OR] = 12.8 (95% confidence interval [CI], 11.9-13.9). These findings were consistent in the multivariate analysis, OR = 13.1 (95% CI, 12.1-14.3). Conclusion: Employing the P-C-MoD classification more readily identifies parturients at risk for PPH and is superior to estimations based on single risk factors. Thieme Medical Publishers 333 Seventh Avenue, New York, NY 10001, USA.
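For readers unfamiliar with how an odds ratio and its confidence interval are derived from subgroup counts, the calculation can be sketched as below. The abstract does not state how its intervals were computed; the log-based (Woolf) interval used here is an assumption, and the 2×2 counts are hypothetical numbers chosen only to mirror the 27% and 3% prevalences, so they do not reproduce the reported OR of 12.8.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf (log-based) confidence interval for a 2x2 table.

    a, b: events and non-events in the exposed group.
    c, d: events and non-events in the reference group.
    """
    or_value = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_value) - z * se_log)
    upper = math.exp(math.log(or_value) + z * se_log)
    return or_value, lower, upper


# Hypothetical counts per 100 women: 27 PPH cases in one subgroup, 3 in the other.
or_value, lower, upper = odds_ratio_ci(a=27, b=73, c=3, d=97)
```

With such small hypothetical groups the interval is far wider than the one in the abstract, which was estimated from more than 126,000 deliveries.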
Predicting trauma patient mortality: ICD [or ICD-10-AM] versus AIS based approaches.

PubMed

Willis, Cameron D; Gabbe, Belinda J; Jolley, Damien; Harrison, James E; Cameron, Peter A

2010-11-01

The International Classification of Diseases Injury Severity Score (ICISS) has been proposed as an International Classification of Diseases (ICD)-10-based alternative to mortality prediction tools that use Abbreviated Injury Scale (AIS) data, including the Trauma and Injury Severity Score (TRISS). To date, studies have not examined the performance of ICISS using Australian trauma registry data. This study aimed to compare the performance of ICISS with other mortality prediction tools in an Australian trauma registry. This was a retrospective review of prospectively collected data from the Victorian State Trauma Registry. A training dataset was created for model development and a validation dataset for evaluation. The multiplicative ICISS model was compared with a worst injury ICISS approach, Victorian TRISS (V-TRISS, using local coefficients), maximum AIS severity and a multivariable model including ICD-10-AM codes as predictors. Models were investigated for discrimination (C-statistic) and calibration (Hosmer-Lemeshow statistic). The multivariable approach had the highest level of discrimination (C-statistic 0.90) and calibration (H-L 7.65, P = 0.468). Worst injury ICISS, V-TRISS and maximum AIS had similar performance. The multiplicative ICISS produced the lowest level of discrimination (C-statistic 0.80) and poorest calibration (H-L 50.23, P < 0.001). The performance of ICISS may be affected by the data used to develop estimates, the ICD version employed, the methods for deriving estimates and the inclusion of covariates. In this analysis, a multivariable approach using ICD-10-AM codes was the best-performing method. A multivariable ICISS approach may therefore be a useful alternative to AIS-based methods and may have comparable predictive performance to locally derived TRISS models. © 2010 The Authors. ANZ Journal of Surgery © 2010 Royal Australasian College of Surgeons.
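The difference between the multiplicative and worst-injury ICISS variants compared above can be shown in a few lines. ICISS is conventionally built from survival risk ratios (SRRs), one per injury diagnosis code; the SRR values below are hypothetical and the function names are illustrative, not taken from the study.

```python
from math import prod


def iciss_multiplicative(srrs):
    """Multiplicative ICISS: product of the SRRs of all of a patient's injuries."""
    return prod(srrs)


def iciss_worst_injury(srrs):
    """Worst-injury ICISS: the single lowest SRR among a patient's injuries."""
    return min(srrs)


# Hypothetical SRRs for one patient with three coded injuries.
patient_srrs = [0.95, 0.80, 0.99]

score_multiplicative = iciss_multiplicative(patient_srrs)
score_worst = iciss_worst_injury(patient_srrs)
```

The multiplicative score can only fall as more injuries are coded, whereas the worst-injury score depends on the most severe injury alone, which is one way the two variants can diverge in discrimination and calibration.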
Multivariate assessment of event-related potentials with the t-CWT method.

PubMed

Bostanov, Vladimir

2015-11-05

Event-related brain potentials (ERPs) are usually assessed with univariate statistical tests although they are essentially multivariate objects. Brain-computer interface applications are a notable exception to this practice, because they are based on multivariate classification of single-trial ERPs. Multivariate ERP assessment can be facilitated by feature extraction methods. One such method is t-CWT, a mathematical-statistical algorithm based on the continuous wavelet transform (CWT) and Student's t-test. This article begins with a geometric primer on some basic concepts of multivariate statistics as applied to ERP assessment in general and to the t-CWT method in particular. Further, it presents for the first time a detailed, step-by-step, formal mathematical description of the t-CWT algorithm. A new multivariate outlier rejection procedure based on principal component analysis in the frequency domain is presented as an important pre-processing step. The MATLAB and GNU Octave implementation of t-CWT is also made publicly available for the first time as free and open source code. The method is demonstrated on some example ERP data obtained in a passive oddball paradigm. Finally, some conceptually novel applications of the multivariate approach in general and of the t-CWT method in particular are suggested and discussed. Hopefully, the publication of both the t-CWT source code and its underlying mathematical algorithm along with a didactic geometric introduction to some basic concepts of multivariate statistics would make t-CWT more accessible to both users and developers in the field of neuroscience research.
Comparisons of severity classification systems for oropharyngeal dysfunction in children with cerebral palsy: Relations with other functional profiles.

PubMed

Goh, Yu-Ra; Choi, Ja Young; Kim, Seon Ah; Park, Jieun; Park, Eun Sook

2018-01-01

This study aimed to investigate the relationships between various classification systems assessing the severity of oropharyngeal dysphagia and communication function and other functional profiles in children with cerebral palsy (CP). This was a prospective, cross-sectional study in a university-affiliated, tertiary-care hospital. We recruited 151 children with CP (mean age 6.11 years, SD 3.42, range 3-18 years). The Eating and Drinking Ability Classification System (EDACS) and the dysphagia scales of Functional Oral Intake Scale (FOIS), Swallow Function Scales (SFS), and Food Intake Level Scale (FILS) were used. The Communication Function Classification System (CFCS) and Viking Speech Scale (VSS) were employed to classify communication function and speech intelligibility, respectively. The Pediatric Evaluation of Disability Inventory (PEDI), the Gross Motor Function Classification System (GMFCS) level, and the Manual Ability Classification System (MACS) level were also assessed. Spearman correlation analysis was used to investigate the associations between measures, and univariate and multivariate logistic regression models were used to identify significant factors. The median GMFCS level of participants was III (interquartile range II-IV). Significant dysphagia based on EDACS level III-V was noted in 23 children (15.2%). There were strong to very strong relationships between the EDACS level and the dysphagia scales. The EDACS presented strong associations with MACS, CFCS, and VSS, a moderate association with GMFCS level, and a moderate to strong association with each domain of the PEDI. In multivariate analysis, poor functioning in EDACS was associated with poor functioning in gross motor and communication functions. Copyright © 2017. Published by Elsevier Ltd.
How many taxa can be recognized within the complex Tillandsia capillaris (Bromeliaceae, Tillandsioideae)? Analysis of the available classifications using a multivariate approach.

PubMed

Castello, Lucía V; Galetto, Leonardo

2013-01-01

Tillandsia capillaris Ruiz & Pav., which belongs to the subgenus Diaphoranthema, is distributed in Ecuador, Peru, Bolivia, northern and central Argentina, and Chile, and includes forms that are difficult to circumscribe, thus considered to form a complex. The entities of this complex are predominantly small-sized epiphytes, adapted to xeric environments. The most widely used classification defines 5 forms for this complex based on a few morphological reproductive traits: Tillandsia capillaris Ruiz & Pav. f. capillaris, Tillandsia capillaris f. incana (Mez) L.B. Sm., Tillandsia capillaris f. cordobensis (Hieron.) L.B. Sm., Tillandsia capillaris f. hieronymi (Mez) L.B. Sm. and Tillandsia capillaris f. virescens (Ruiz & Pav.) L.B. Sm. In this study, 35 floral and vegetative characters were analyzed with a multivariate approach in order to assess and discuss different proposals for classification of the Tillandsia capillaris complex, which presents morphotypes that co-occur in central and northern Argentina. To accomplish this, data of quantitative and categorical morphological characters of flowers and leaves were collected from herbarium specimens and field collections and were analyzed with statistical multivariate techniques. The results suggest that the last classification for the complex seems more comprehensive and three taxa were delimited: Tillandsia capillaris (=Tillandsia capillaris f. incana-hieronymi), Tillandsia virescens s. str. (=Tillandsia capillaris f. cordobensis) and Tillandsia virescens s. l. (=Tillandsia capillaris f. virescens). While Tillandsia capillaris and Tillandsia virescens s. str. co-occur, Tillandsia virescens s. l. is restricted to altitudes above 2000 m in Argentina. Characters previously used for taxa delimitation showed continuous variation and therefore were not useful. New diagnostic characters are proposed and a key is provided for delimiting these three taxa within the complex.
How many taxa can be recognized within the complex Tillandsia capillaris (Bromeliaceae, Tillandsioideae)? Analysis of the available classifications using a multivariate approach

PubMed Central

Castello, Lucía V.; Galetto, Leonardo

2013-01-01

Tillandsia capillaris Ruiz & Pav., which belongs to the subgenus Diaphoranthema, is distributed in Ecuador, Peru, Bolivia, northern and central Argentina, and Chile, and includes forms that are difficult to circumscribe, thus considered to form a complex. The entities of this complex are predominantly small-sized epiphytes, adapted to xeric environments. The most widely used classification defines 5 forms for this complex based on a few morphological reproductive traits: Tillandsia capillaris Ruiz & Pav. f. capillaris, Tillandsia capillaris f. incana (Mez) L.B. Sm., Tillandsia capillaris f. cordobensis (Hieron.) L.B. Sm., Tillandsia capillaris f. hieronymi (Mez) L.B. Sm. and Tillandsia capillaris f. virescens (Ruiz & Pav.) L.B. Sm. In this study, 35 floral and vegetative characters were analyzed with a multivariate approach in order to assess and discuss different proposals for classification of the Tillandsia capillaris complex, which presents morphotypes that co-occur in central and northern Argentina. To accomplish this, data of quantitative and categorical morphological characters of flowers and leaves were collected from herbarium specimens and field collections and were analyzed with statistical multivariate techniques. The results suggest that the last classification for the complex seems more comprehensive and three taxa were delimited: Tillandsia capillaris (=Tillandsia capillaris f. incana-hieronymi), Tillandsia virescens s. str. (=Tillandsia capillaris f. cordobensis) and Tillandsia virescens s. l. (=Tillandsia capillaris f. virescens). While Tillandsia capillaris and Tillandsia virescens s. str. co-occur, Tillandsia virescens s. l. is restricted to altitudes above 2000 m in Argentina. Characters previously used for taxa delimitation showed continuous variation and therefore were not useful. New diagnostic characters are proposed and a key is provided for delimiting these three taxa within the complex. PMID:23805053
The Japanese Histologic Classification and T-score in the Oxford Classification system could predict renal outcome in Japanese IgA nephropathy patients.

PubMed

Kaihan, Ahmad Baseer; Yasuda, Yoshinari; Katsuno, Takayuki; Kato, Sawako; Imaizumi, Takahiro; Ozeki, Takaya; Hishida, Manabu; Nagata, Takanobu; Ando, Masahiko; Tsuboi, Naotake; Maruyama, Shoichi

2017-12-01

The Oxford Classification is utilized globally, but has not been fully validated. In this study, we conducted a comparative analysis between the Oxford Classification and the Japanese Histologic Classification (JHC) to predict renal outcome in Japanese patients with IgA nephropathy (IgAN). A retrospective cohort study including 86 adult IgAN patients was conducted. The Oxford Classification and the JHC were evaluated by 7 independent specialists. The JHC, MEST score in the Oxford Classification, and crescents were analyzed in association with renal outcome, defined as a 50% increase in serum creatinine. In multivariate analysis without the JHC, only the T score was significantly associated with renal outcome. In multivariate analysis including the JHC, in contrast, only the JHC showed a significant association. The JHC and T score in the Oxford Classification were associated with renal outcome among Japanese patients with IgAN. Superiority of the JHC as a predictive index should be validated with a larger study population and in cohort studies of different ethnicities.
Simultaneous fecal microbial and metabolite profiling enables accurate classification of pediatric irritable bowel syndrome.

PubMed

Shankar, Vijay; Reo, Nicholas V; Paliy, Oleg

2015-12-09

We previously showed that stool samples of pre-adolescent and adolescent US children diagnosed with diarrhea-predominant IBS (IBS-D) had different compositions of microbiota and metabolites compared to healthy age-matched controls. Here we explored whether observed fecal microbiota and metabolite differences between these two adolescent populations can be used to discriminate between IBS and health. We constructed individual microbiota- and metabolite-based sample classification models based on the partial least squares multivariate analysis and then applied a Bayesian approach to integrate individual models into a single classifier. The resulting combined classification achieved 84% accuracy of correct sample group assignment and 86% prediction for IBS-D in cross-validation tests. The performance of the cumulative classification model was further validated by the de novo analysis of stool samples from a small independent IBS-D cohort. High-throughput microbial and metabolite profiling of subject stool samples can be used to facilitate IBS diagnosis.
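The abstract does not say which Bayesian rule was used to merge the microbiota-based and metabolite-based models. One simple possibility, shown here purely as an assumption, is a naive Bayes combination that treats the two model outputs as conditionally independent given the true class. The function name `combine_posteriors`, the prior, and the example probabilities are hypothetical.

```python
def combine_posteriors(model_probs, prior=0.5):
    """Fuse per-model P(IBS-D | data) values under a conditional-independence assumption.

    Each probability must lie strictly between 0 and 1 and must have been
    produced under the same class prior.
    """
    pos = prior
    neg = 1.0 - prior
    for p in model_probs:
        # Dividing by the prior turns each posterior into a likelihood ratio term.
        pos *= p / prior
        neg *= (1.0 - p) / (1.0 - prior)
    return pos / (pos + neg)


# Hypothetical outputs of the microbiota-based and metabolite-based models.
combined = combine_posteriors([0.8, 0.7])
uninformative = combine_posteriors([0.5, 0.5])
```

Two models that each lean moderately towards IBS-D yield a combined probability higher than either alone, which is the usual motivation for integrating independent evidence sources.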

A Partial Least Squares Based Procedure for Upstream Sequence Classification in Prokaryotes.

PubMed

Mehmood, Tahir; Bohlin, Jon; Snipen, Lars

2015-01-01

The upstream region of coding genes is important for several reasons, for instance locating transcription factor binding sites and start site initiation in genomic DNA. Motivated by a recently conducted study, where a multivariate approach was successfully applied to coding sequence modeling, we have introduced a partial least squares (PLS) based procedure for the classification of true upstream prokaryotic sequence from background upstream sequence. The upstream sequences of conserved coding genes over genomes were considered in the analysis, where conserved coding genes were found by using the pan-genomics concept for each considered prokaryotic species. PLS uses a position specific scoring matrix (PSSM) to study the characteristics of the upstream region. Results obtained by the PLS based method were compared with Gini importance of random forest (RF) and support vector machine (SVM), which are widely used methods for sequence classification. The upstream sequence classification performance was evaluated by using cross validation, and the suggested approach identifies the prokaryotic upstream region significantly better than RF (p-value < 0.01) and SVM (p-value < 0.01). Further, the proposed method also produced results that concurred with known biological characteristics of the upstream region.
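A position specific scoring matrix of the kind mentioned above can be built from aligned sequences as sketched below. This covers only PSSM construction and scoring; how the study feeds the PSSM into PLS is not described in the abstract. The pseudocount, the uniform background frequency of 0.25, the function names, and the toy sequences are all assumptions made for illustration.

```python
import math

ALPHABET = "ACGT"


def build_pssm(sequences, pseudocount=1.0, background=0.25):
    """Log-odds PSSM (one dict per position) from equal-length DNA sequences."""
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to equal length")
    pssm = []
    for pos in range(length):
        # Start from the pseudocount so unseen bases keep a finite score.
        counts = {base: pseudocount for base in ALPHABET}
        for seq in sequences:
            counts[seq[pos]] += 1
        total = sum(counts.values())
        pssm.append(
            {base: math.log2((counts[base] / total) / background) for base in ALPHABET}
        )
    return pssm


def score_sequence(pssm, seq):
    """Sum of per-position log-odds scores for a candidate sequence."""
    return sum(column[base] for column, base in zip(pssm, seq))


# Toy aligned upstream fragments (hypothetical, not from the study).
training = ["TATAAT", "TATAAT", "TACAAT", "TATGAT"]
pssm = build_pssm(training)

consensus_score = score_sequence(pssm, "TATAAT")
background_score = score_sequence(pssm, "GCGCGC")
```

A sequence resembling the training motif scores well above an unrelated one, so per-position scores of this kind can serve as input features for a downstream classifier.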
Attachment-based classifications of children's family drawings: psychometric properties and relations with children's adjustment in kindergarten.

PubMed

Pianta, R C; Longmaid, K; Ferguson, J E

1999-06-01

Investigated an attachment-based theoretical framework and classification system, introduced by Kaplan and Main (1986), for interpreting children's family drawings. This study concentrated on the psychometric properties of the system and the relation between drawings classified using this system and teacher ratings of classroom social-emotional and behavioral functioning, controlling for child age, ethnic status, intelligence, and fine motor skills. This nonclinical sample consisted of 200 kindergarten children of diverse racial and socioeconomic status (SES). Limited support for reliability of this classification system was obtained. Kappas for overall classifications of drawings (e.g., secure) exceeded .80 and mean kappa for discrete drawing features (e.g., figures with smiles) was .82. Coders' endorsement of the presence of certain discrete drawing features predicted their overall classification at 82.5% accuracy. Drawing classification was related to teacher ratings of classroom functioning independent of child age, sex, race, SES, intelligence, and fine motor skills (with p values for the multivariate effects ranging from .043-.001). Results are discussed in terms of the psychometric properties of this system for classifying children's representations of family and the limitations of family drawing techniques for young children.
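The kappa values reported above measure agreement between coders beyond what chance would produce. The abstract does not name the kappa variant; the sketch below assumes Cohen's kappa for two coders, and the ratings shown are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: chance-corrected agreement between two raters."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    # Agreement expected if both raters labelled independently at their own base rates.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical overall classifications of four drawings by two coders.
coder_1 = ["secure", "secure", "insecure", "insecure"]
coder_2 = ["secure", "secure", "insecure", "secure"]

kappa = cohens_kappa(coder_1, coder_2)
perfect = cohens_kappa(coder_1, coder_1)
```

Here the coders agree on three of four drawings (75%), yet kappa is only 0.5 because half of that agreement would be expected by chance.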
Evidence-based provisional clinical classification criteria for autoinflammatory periodic fevers.

PubMed

Federici, Silvia; Sormani, Maria Pia; Ozen, Seza; Lachmann, Helen J; Amaryan, Gayane; Woo, Patricia; Koné-Paut, Isabelle; Dewarrat, Natacha; Cantarini, Luca; Insalaco, Antonella; Uziel, Yosef; Rigante, Donato; Quartier, Pierre; Demirkaya, Erkan; Herlin, Troels; Meini, Antonella; Fabio, Giovanna; Kallinich, Tilmann; Martino, Silvana; Butbul, Aviel Yonatan; Olivieri, Alma; Kuemmerle-Deschner, Jasmin; Neven, Benedicte; Simon, Anna; Ozdogan, Huri; Touitou, Isabelle; Frenkel, Joost; Hofer, Michael; Martini, Alberto; Ruperto, Nicolino; Gattorno, Marco

2015-05-01

The objective of this work was to develop and validate a set of clinical criteria for the classification of patients affected by periodic fevers. Patients with inherited periodic fevers (familial Mediterranean fever (FMF); mevalonate kinase deficiency (MKD); tumour necrosis factor receptor-associated periodic fever syndrome (TRAPS); cryopyrin-associated periodic syndromes (CAPS)) enrolled in the Eurofever Registry up until March 2013 were evaluated. Patients with periodic fever, aphthosis, pharyngitis and adenitis (PFAPA) syndrome were used as negative controls. For each genetic disease, patients were considered to be 'gold standard' on the basis of the presence of a confirmatory genetic analysis. Clinical criteria were formulated on the basis of univariate and multivariate analysis in an initial group of patients (training set) and validated in an independent set of patients (validation set). A total of 1215 consecutive patients with periodic fevers were identified, and 518 gold standard patients (291 FMF, 74 MKD, 86 TRAPS, 67 CAPS) and 199 patients with PFAPA as disease controls were evaluated. The univariate and multivariate analyses identified a number of clinical variables that correlated independently with each disease, and four provisional classification scores were created. Cut-off values of the classification scores were chosen using receiver operating characteristic curve analysis as those giving the highest sensitivity and specificity. The classification scores were then tested in an independent set of patients (validation set) with an area under the curve of 0.98 for FMF, 0.95 for TRAPS, 0.96 for MKD, and 0.99 for CAPS. In conclusion, evidence-based provisional clinical criteria with high sensitivity and specificity for the clinical classification of patients with inherited periodic fevers have been developed. Published by the BMJ Publishing Group Limited. 
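The cut-off selection step described in this abstract can be sketched as follows. The abstract says only that cut-offs were chosen from the ROC curve as those giving the highest sensitivity and specificity; maximising Youden's J (sensitivity + specificity - 1) is one common way to do that and is an assumption here, as are the example scores and labels.

```python
def best_cutoff(scores, labels):
    """Return (cutoff, sensitivity, specificity) maximising Youden's J.

    A case is called positive when its score is >= cutoff.
    labels: 1 for the disease of interest, 0 for controls.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best = None
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
        sens = tp / positives
        spec = tn / negatives
        j = sens + spec - 1
        # Keep the first (lowest) cutoff reaching the maximum J.
        if best is None or j > best[0]:
            best = (j, cutoff, sens, spec)
    return best[1:]


# Hypothetical classification scores and gold-standard labels.
scores = [1, 2, 3, 4, 5, 6, 7, 8]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
cutoff, sens, spec = best_cutoff(scores, labels)
```

In practice the chosen cut-off would then be checked on an independent validation set, as the abstract describes, because a threshold tuned on the training set tends to look better there than on new patients.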
Multivariate detrending of fMRI signal drifts for real-time multiclass pattern classification.

PubMed

Lee, Dongha; Jang, Changwon; Park, Hae-Jeong

2015-03-01

Signal drift in functional magnetic resonance imaging (fMRI) is an unavoidable artifact that limits classification performance in multi-voxel pattern analysis of fMRI. As conventional methods to reduce signal drift, global demeaning or proportional scaling disregards regional variations of drift, whereas voxel-wise univariate detrending is too sensitive to noisy fluctuations. To overcome these drawbacks, we propose a multivariate real-time detrending method for multiclass classification that involves spatial demeaning at each scan and the recursive detrending of drifts in the classifier outputs driven by a multiclass linear support vector machine. Experiments using binary and multiclass data showed that the linear trend estimation of the classifier output drift for each class (a weighted sum of drifts in the class-specific voxels) was more robust against voxel-wise artifacts that lead to inconsistent spatial patterns and the effect of online processing than voxel-wise detrending. The classification performance of the proposed method was significantly better, especially for multiclass data, than that of voxel-wise linear detrending, global demeaning, and classifier output detrending without demeaning. We concluded that the multivariate approach using classifier output detrending of fMRI signals with spatial demeaning preserves spatial patterns, is less sensitive than conventional methods to sample size, and increases classification performance, which is a useful feature for real-time fMRI classification. Copyright © 2014 Elsevier Inc. All rights reserved.
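Two building blocks named in this abstract, spatial demeaning of each scan and linear detrending of a classifier output, can be illustrated in a few lines. This is only a sketch under simplifying assumptions: the detrending below is a batch least-squares fit, whereas the paper describes a recursive real-time estimate, and all numbers are hypothetical.

```python
def spatial_demean(scan):
    """Subtract the mean across voxels from one scan (one time point)."""
    mean = sum(scan) / len(scan)
    return [v - mean for v in scan]


def linear_detrend(series):
    """Remove a least-squares straight line from a time series."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    slope = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    slope /= sum((t - t_mean) ** 2 for t in range(n))
    return [y - (y_mean + slope * (t - t_mean)) for t, y in enumerate(series)]


# Two hypothetical scans with the same spatial pattern; the second is shifted
# by a global drift of +1.0. Demeaning maps both onto the same pattern.
scans = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]]
demeaned = [spatial_demean(s) for s in scans]

# A hypothetical classifier output that drifts linearly over six scans.
outputs = [2.0 + 0.5 * t for t in range(6)]
residual = linear_detrend(outputs)
```

Demeaning removes the part of the drift shared by all voxels while leaving the spatial pattern intact, and the detrending step then operates on a single output per class rather than on every voxel separately.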
Multivariate pattern analysis of fMRI data reveals deficits in distributed representations in schizophrenia

PubMed Central

Yoon, Jong H.; Tamir, Diana; Minzenberg, Michael J.; Ragland, J. Daniel; Ursu, Stefan; Carter, Cameron S.

2009-01-01

Background Multivariate pattern analysis is an alternative method of analyzing fMRI data, which is capable of decoding distributed neural representations. We applied this method to test the hypothesis of the impairment in distributed representations in schizophrenia. We also compared the results of this method with traditional GLM-based univariate analysis. Methods 19 schizophrenia and 15 control subjects viewed two runs of stimuli--exemplars of faces, scenes, objects, and scrambled images. To verify engagement with stimuli, subjects completed a 1-back matching task. A multi-voxel pattern classifier was trained to identify category-specific activity patterns on one run of fMRI data. Classification testing was conducted on the remaining run. Correlation of voxel-wise activity across runs evaluated variance over time in activity patterns. Results Patients performed the task less accurately. This group difference was reflected in the pattern analysis results with diminished classification accuracy in patients compared to controls, 59% and 72% respectively. In contrast, there was no group difference in GLM-based univariate measures. In both groups, classification accuracy was significantly correlated with behavioral measures. Both groups showed highly significant correlation between inter-run correlations and classification accuracy. Conclusions Distributed representations of visual objects are impaired in schizophrenia. This impairment is correlated with diminished task performance, suggesting that decreased integrity of cortical activity patterns is reflected in impaired behavior. Comparisons with univariate results suggest greater sensitivity of pattern analysis in detecting group differences in neural activity and reduced likelihood of non-specific factors driving these results. PMID:18822407
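The train-on-one-run, test-on-the-other design described above can be sketched with a simple correlation-based pattern classifier. The abstract does not specify which classifier the authors used, so the correlation rule is a swapped-in technique for illustration, and the voxel patterns in `run1` and `run2` are invented.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equal-length activity patterns."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def classify_by_correlation(train_patterns, test_pattern):
    """Pick the category whose training pattern correlates best with the test pattern."""
    return max(train_patterns, key=lambda c: pearson(train_patterns[c], test_pattern))


# Hypothetical mean activity over four voxels per stimulus category.
run1 = {"faces": [2.0, 0.5, 0.1, 1.0],
        "scenes": [0.2, 1.8, 1.1, 0.3],
        "objects": [0.9, 0.4, 2.0, 1.5]}
run2 = {"faces": [1.7, 0.7, 0.3, 1.2],
        "scenes": [0.1, 1.5, 1.3, 0.2],
        "objects": [1.1, 0.2, 1.6, 1.9]}

# Train on run 1, test on run 2.
predictions = {c: classify_by_correlation(run1, p) for c, p in run2.items()}
accuracy = sum(predictions[c] == c for c in run2) / len(run2)
```

Because the classifier looks at the whole pattern rather than at the mean level of each voxel, it can separate categories even when a univariate comparison of average activity shows no difference.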
Three-Way Analysis of Spectrospatial Electromyography Data: Classification and Interpretation

PubMed Central

Kauppi, Jukka-Pekka; Hahne, Janne; Müller, Klaus-Robert; Hyvärinen, Aapo

2015-01-01

Classifying multivariate electromyography (EMG) data is an important problem in prosthesis control as well as in neurophysiological studies and diagnosis. With modern high-density EMG sensor technology, it is possible to capture the rich spectrospatial structure of the myoelectric activity. We hypothesize that multi-way machine learning methods can efficiently utilize this structure in classification as well as reveal interesting patterns in it. To this end, we investigate the suitability of existing three-way classification methods to EMG-based hand movement classification in spectrospatial domain, as well as extend these methods by sparsification and regularization. We propose to use Fourier-domain independent component analysis as preprocessing to improve classification and interpretability of the results. In high-density EMG experiments on hand movements across 10 subjects, three-way classification yielded higher average performance compared with state-of-the-art classification based on temporal features, suggesting that the three-way analysis approach can efficiently utilize detailed spectrospatial information of high-density EMG. Phase and amplitude patterns of features selected by the classifier in finger-movement data were found to be consistent with known physiology. Thus, our approach can accurately resolve hand and finger movements on the basis of detailed spectrospatial information, and at the same time allows for physiological interpretation of the results. PMID:26039100
Discriminative analysis of early Alzheimer's disease based on two intrinsically anti-correlated networks with resting-state fMRI.

PubMed

Wang, Kun; Jiang, Tianzi; Liang, Meng; Wang, Liang; Tian, Lixia; Zhang, Xinqing; Li, Kuncheng; Liu, Zhening

2006-01-01

In this work, we proposed a discriminative model of Alzheimer's disease (AD) on the basis of multivariate pattern classification and functional magnetic resonance imaging (fMRI). This model used the correlation/anti-correlation coefficients of two intrinsically anti-correlated networks in resting brains, which have been suggested by two recent studies, as the feature of classification. Pseudo-Fisher Linear Discriminative Analysis (pFLDA) was then performed on the feature space and a linear classifier was generated. Using leave-one-out (LOO) cross validation, our results showed a correct classification rate of 83%. We also compared the proposed model with another one based on the whole brain functional connectivity. Our proposed model outperformed the other one significantly, and this implied that the two intrinsically anti-correlated networks may be a more susceptible part of the whole brain network in the early stage of AD.
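Leave-one-out cross validation, used above to estimate the correct classification rate, holds out each subject in turn, trains on the rest, and scores the held-out prediction. The sketch below shows the procedure with a nearest-centroid classifier standing in for pFLDA (a deliberate simplification), and the two-feature data are hypothetical.

```python
def nearest_centroid_predict(train_x, train_y, x):
    """Predict the label whose class mean is closest to x (squared Euclidean)."""
    sums, counts = {}, {}
    for features, label in zip(train_x, train_y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1

    def dist2(label):
        return sum((s / counts[label] - v) ** 2 for s, v in zip(sums[label], x))

    return min(sums, key=dist2)


def leave_one_out_accuracy(xs, ys):
    """Hold out each sample once, train on the others, and count correct predictions."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if nearest_centroid_predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)


# Hypothetical feature vectors (e.g. two network correlation coefficients).
xs = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [1.0, 0.9], [0.8, 1.1], [0.45, 0.5]]
ys = ["control", "control", "control", "patient", "patient", "patient"]
loo_accuracy = leave_one_out_accuracy(xs, ys)
```

Here the last sample lies between the two groups and is misclassified once it is removed from training, which is exactly the kind of optimism that leave-one-out evaluation is meant to expose.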
Recursive Partitioning Analysis for New Classification of Patients With Esophageal Cancer Treated by Chemoradiotherapy

DOE Office of Scientific and Technical Information (OSTI.GOV)

Nomura, Motoo; Department of Clinical Oncology, Aichi Cancer Center Hospital, Nagoya; Department of Radiation Oncology, Aichi Cancer Center Hospital, Nagoya

2012-11-01

Background: The 7th edition of the American Joint Committee on Cancer staging system does not include lymph node size in the guidelines for staging patients with esophageal cancer. The objectives of this study were to determine the prognostic impact of the maximum metastatic lymph node diameter (ND) on survival and to develop and validate a new staging system for patients with esophageal squamous cell cancer who were treated with definitive chemoradiotherapy (CRT). Methods: Information on 402 patients with esophageal cancer undergoing CRT at two institutions was reviewed. Univariate and multivariate analyses of data from one institution were used to assess the impact of clinical factors on survival, and recursive partitioning analysis was performed to develop the new staging classification. To assess its clinical utility, the new classification was validated using data from the second institution. Results: By multivariate analysis, gender, T, N, and ND stages were independently and significantly associated with survival (p < 0.05). The resulting new staging classification was based on the T and ND. The four new stages led to good separation of survival curves in both the developmental and validation datasets (p < 0.05). Conclusions: Our results showed that lymph node size is a strong independent prognostic factor and that the new staging system, which incorporated lymph node size, provided good prognostic power, and discriminated effectively for patients with esophageal cancer undergoing CRT.
Evaluation of AMOEBA: a spectral-spatial classification method

USGS Publications Warehouse

Jenson, Susan K.; Loveland, Thomas R.; Bryant, J.

1982-01-01

Multispectral remotely sensed images have been treated as arbitrary multivariate spectral data for purposes of clustering and classifying. However, the spatial properties of image data can also be exploited. AMOEBA is a clustering and classification method that is based on a spatially derived model for image data. In an evaluation test, Landsat data were classified with both AMOEBA and a widely used spectral classifier. The test showed that irrigated crop types can be classified as accurately with the AMOEBA method as with the generally used spectral method ISOCLS; the AMOEBA method, however, requires less computer time.
Use of Neuroanatomical Pattern Classification to Identify Subjects in At-Risk Mental States of Psychosis and Predict Disease Transition

PubMed Central

Koutsouleris, Nikolaos; Meisenzahl, Eva M.; Davatzikos, Christos; Bottlender, Ronald; Frodl, Thomas; Scheuerecker, Johanna; Schmitt, Gisela; Zetzsche, Thomas; Decker, Petra; Reiser, Maximilian; Möller, Hans-Jürgen; Gaser, Christian

2014-01-01

Context Identification of individuals at high risk of developing psychosis has relied on prodromal symptomatology. Recently, machine learning algorithms have been successfully used for magnetic resonance imaging–based diagnostic classification of neuropsychiatric patient populations. Objective To determine whether multivariate neuroanatomical pattern classification facilitates identification of individuals in different at-risk mental states (ARMS) of psychosis and enables the prediction of disease transition at the individual level. Design Multivariate neuroanatomical pattern classification was performed on the structural magnetic resonance imaging data of individuals in early or late ARMS vs healthy controls (HCs). The predictive power of the method was then evaluated by categorizing the baseline imaging data of individuals with transition to psychosis vs those without transition vs HCs after 4 years of clinical follow-up. Classification generalizability was estimated by cross-validation and by categorizing an independent cohort of 45 new HCs. Setting Departments of Psychiatry and Psychotherapy, Ludwig-Maximilians-University, Munich, Germany. Participants The first classification analysis included 20 early and 25 late at-risk individuals and 25 matched HCs. The second analysis consisted of 15 individuals with transition, 18 without transition, and 17 matched HCs. Main Outcome Measures Specificity, sensitivity, and accuracy of classification. Results The 3-group, cross-validated classification accuracies of the first analysis were 86% (HCs vs the rest), 91% (early at-risk individuals vs the rest), and 86% (late at-risk individuals vs the rest). The accuracies in the second analysis were 90% (HCs vs the rest), 88% (individuals with transition vs the rest), and 86% (individuals without transition vs the rest). Independent HCs were correctly classified in 96% (first analysis) and 93% (second analysis) of cases. Conclusions Different ARMSs and their clinical outcomes may be reliably identified on an individual basis by assessing patterns of whole-brain neuroanatomical abnormalities. These patterns may serve as valuable biomarkers for the clinician to guide early detection in the prodromal phase of psychosis. PMID:19581561
Identification of Enterococcus, Streptococcus, and Staphylococcus by Multivariate Analysis of Proton Magnetic Resonance Spectroscopic Data from Plate Cultures

PubMed Central

Bourne, Roger; Himmelreich, Uwe; Sharma, Ansuiya; Mountford, Carolyn; Sorrell, Tania

2001-01-01

A new fingerprinting technique with the potential for rapid identification of bacteria was developed by combining proton magnetic resonance spectroscopy (1H MRS) with multivariate statistical analysis. This resulted in an objective identification strategy for common clinical isolates belonging to the bacterial species Staphylococcus aureus, Staphylococcus epidermidis, Enterococcus faecalis, Streptococcus pneumoniae, Streptococcus pyogenes, Streptococcus agalactiae, and the Streptococcus milleri group. Duplicate cultures of 104 different isolates were examined one or more times using 1H MRS. A total of 312 cultures were examined. An optimized classifier was developed using a bootstrapping process and a seven-group linear discriminant analysis to provide objective classification of the spectra. Identification of isolates was based on consistent high-probability classification of spectra from duplicate cultures and achieved 92% agreement with conventional methods of identification. Fewer than 1% of isolates were identified incorrectly. Identification of the remaining 7% of isolates was defined as indeterminate. PMID:11474013
A conceptual weather-type classification procedure for the Philadelphia, Pennsylvania, area

USGS Publications Warehouse

McCabe, Gregory J.

1990-01-01

A simple method of weather-type classification, based on a conceptual model of pressure systems that pass through the Philadelphia, Pennsylvania, area, has been developed. The only inputs required for the procedure are daily mean wind direction and cloud cover, which are used to index the relative position of pressure systems and fronts to Philadelphia. Daily mean wind-direction and cloud-cover data recorded at Philadelphia, Pennsylvania, from January 1954 through August 1988 were used to categorize daily weather conditions. The conceptual weather types reflect changes in daily air and dew-point temperatures, and changes in monthly mean temperature and monthly and annual precipitation. The weather-type classification produced by using the conceptual model was similar to a classification produced by using a multivariate statistical classification procedure. Even though the conceptual weather types are derived from a small amount of data, they appear to account for the variability of daily weather patterns sufficiently to describe distinct weather conditions for use in environmental analyses of weather-sensitive processes.
DEFINITION OF MULTIVARIATE GEOCHEMICAL ASSOCIATIONS WITH POLYMETALLIC MINERAL OCCURRENCES USING A SPATIALLY DEPENDENT CLUSTERING TECHNIQUE AND RASTERIZED STREAM SEDIMENT DATA - AN ALASKAN EXAMPLE.

USGS Publications Warehouse

Jenson, Susan K.; Trautwein, C.M.

1984-01-01

The application of an unsupervised, spatially dependent clustering technique (AMOEBA) to interpolated raster arrays of stream sediment data has been found to provide useful multivariate geochemical associations for modeling regional polymetallic resource potential. The technique is based on three assumptions regarding the compositional and spatial relationships of stream sediment data and their regional significance. These assumptions are: (1) compositionally separable classes exist and can be statistically distinguished; (2) the classification of multivariate data should minimize the pair probability of misclustering to establish useful compositional associations; and (3) a compositionally defined class represented by three or more contiguous cells within an array is a more important descriptor of a terrane than a class represented by spatial outliers.
False alarm reduction by the And-ing of multiple multivariate Gaussian classifiers

NASA Astrophysics Data System (ADS)

Dobeck, Gerald J.; Cobb, J. Tory

2003-09-01

The high-resolution sonar is one of the principal sensors used by the Navy to detect and classify sea mines in minehunting operations. For such sonar systems, substantial effort has been devoted to the development of automated detection and classification (D/C) algorithms. These have been spurred by several factors including (1) aids for operators to reduce work overload, (2) more optimal use of all available data, and (3) the introduction of unmanned minehunting systems. The environments where sea mines are typically laid (harbor areas, shipping lanes, and the littorals) give rise to many false alarms caused by natural, biologic, and man-made clutter. The objective of the automated D/C algorithms is to eliminate most of these false alarms while still maintaining a very high probability of mine detection and classification (PdPc). In recent years, the benefits of fusing the outputs of multiple D/C algorithms have been studied. We refer to this as Algorithm Fusion. The results have been remarkable, including reliable robustness to new environments. This paper describes a method for training several multivariate Gaussian classifiers such that their And-ing dramatically reduces false alarms while maintaining a high probability of classification. This training approach is referred to as the Focused-Training method. This work extends our 2001-2002 work where the Focused-Training method was used with three other types of classifiers: the Attractor-based K-Nearest Neighbor Neural Network (a type of radial-basis, probabilistic neural network), the Optimal Discrimination Filter Classifier (based on linear discrimination theory), and the Quadratic Penalty Function Support Vector Machine (QPFSVM). Although our experience has been gained in the area of sea mine detection and classification, the principles described herein are general and can be applied to a wide range of pattern recognition and automatic target recognition (ATR) problems.
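The And-ing rule itself is simple to illustrate: a contact is declared a target only when every classifier declares it one. The sketch below shows only this fusion step with hypothetical per-classifier decisions; it does not implement the multivariate Gaussian classifiers or the Focused-Training method, whose details are not given in the abstract.

```python
def and_fusion(decisions_per_classifier):
    """Declare a target only where every classifier declares one."""
    return [all(votes) for votes in zip(*decisions_per_classifier)]


def rates(decisions, truth):
    """Return (probability of classification, false alarm rate)."""
    detections = sum(d and t for d, t in zip(decisions, truth))
    false_alarms = sum(d and not t for d, t in zip(decisions, truth))
    n_targets = sum(truth)
    return detections / n_targets, false_alarms / (len(truth) - n_targets)


# Hypothetical ground truth (3 mines, 5 clutter objects) and decisions.
truth = [True, True, True, False, False, False, False, False]
clf_a = [True, True, True, True, True, False, False, False]
clf_b = [True, True, True, False, True, True, False, False]
clf_c = [True, True, True, False, False, True, True, False]

fused = and_fusion([clf_a, clf_b, clf_c])
single_rates = rates(clf_a, truth)
fused_rates = rates(fused, truth)
```

The reduction works in this toy case because the classifiers all accept the true targets but make their false alarms on different clutter objects. And-ing can only keep or lower the detection rate, so the individual classifiers need a very high probability of classification for the fused result to remain useful.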
Discrimination of irradiated MOX fuel from UOX fuel by multivariate statistical analysis of simulated activities of gamma-emitting isotopes

NASA Astrophysics Data System (ADS)

Åberg Lindell, M.; Andersson, P.; Grape, S.; Hellesen, C.; Håkansson, A.; Thulin, M.

2018-03-01

This paper investigates how concentrations of certain fission products and their related gamma-ray emissions can be used to discriminate between uranium oxide (UOX) and mixed oxide (MOX) type fuel. Discrimination of irradiated MOX fuel from irradiated UOX fuel is important in nuclear facilities and for transport of nuclear fuel, for purposes of both criticality safety and nuclear safeguards. Although facility operators keep records on the identity and properties of each fuel, tools for nuclear safeguards inspectors that enable independent verification of the fuel are critical in the recovery of continuity of knowledge, should it be lost. A discrimination methodology for classification of UOX and MOX fuel, based on passive gamma-ray spectroscopy data and multivariate analysis methods, is presented. Nuclear fuels and their gamma-ray emissions were simulated in the Monte Carlo code Serpent, and the resulting data was used as input to train seven different multivariate classification techniques. The trained classifiers were subsequently implemented and evaluated with respect to their capabilities to correctly predict the classes of unknown fuel items. The best results concerning successful discrimination of UOX and MOX-fuel were acquired when using non-linear classification techniques, such as the k nearest neighbors method and the Gaussian kernel support vector machine. For fuel with cooling times up to 20 years, when it is considered that gamma-rays from the isotope 134Cs can still be efficiently measured, success rates of 100% were obtained. A sensitivity analysis indicated that these methods were also robust.
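Of the classification techniques mentioned, the k nearest neighbors method is the easiest to show compactly: an unknown item takes the majority class of the k closest training items in feature space. The following is a minimal sketch in which the two features (imagined here as normalised gamma-ray activity ratios) and all values are hypothetical, not results from the Serpent simulations.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, x, k=3):
    """Majority vote among the k training points closest to x."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], x))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training items: two features per simulated fuel assembly.
train_x = [[1.0, 0.2], [0.9, 0.3], [1.1, 0.25],
           [0.4, 0.8], [0.5, 0.9], [0.45, 0.7]]
train_y = ["UOX", "UOX", "UOX", "MOX", "MOX", "MOX"]

# Classify two unknown items.
unknown_a = knn_predict(train_x, train_y, [0.95, 0.25])
unknown_b = knn_predict(train_x, train_y, [0.5, 0.8])
```

Because the decision depends on distances, features measured on different scales would normally be standardised first; that step is omitted here for brevity.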
Incremental Validity of Multidimensional Proficiency Scores from Diagnostic Classification Models: An Illustration for Elementary School Mathematics

ERIC Educational Resources Information Center

Kunina-Habenicht, Olga; Rupp, André A.; Wilhelm, Oliver

2017-01-01

Diagnostic classification models (DCMs) hold great potential for applications in summative and formative assessment by providing discrete multivariate proficiency scores that yield statistically driven classifications of students. Using data from a newly developed diagnostic arithmetic assessment that was administered to 2032 fourth-grade students…
Unique Characteristics of Diagnostic Classification Models: A Comprehensive Review of the Current State-of-the-Art

ERIC Educational Resources Information Center

Rupp, Andre A.; Templin, Jonathan L.

2008-01-01

"Diagnostic classification models" (DCM) are frequently promoted by psychometricians as important modelling alternatives for analyzing response data in situations where multivariate classifications of respondents are made on the basis of multiple postulated latent skills. In this review paper, a definitional boundary of the space of DCM…
Selected-ion flow-tube mass-spectrometry (SIFT-MS) fingerprinting versus chemical profiling for geographic traceability of Moroccan Argan oils.

PubMed

Kharbach, Mourad; Kamal, Rabie; Mansouri, Mohammed Alaoui; Marmouzi, Ilias; Viaene, Johan; Cherrah, Yahia; Alaoui, Katim; Vercammen, Joeri; Bouklouze, Abdelaziz; Vander Heyden, Yvan

2018-10-15

This study investigated the effectiveness of SIFT-MS versus chemical profiling, both coupled to multivariate data analysis, to classify 95 Extra Virgin Argan Oils (EVAO) originating from five Moroccan Argan forest locations. The full scan option of SIFT-MS is suitable to indicate the geographic origin of EVAO based on the fingerprints obtained using the three chemical ionization precursors (H3O+, NO+ and O2+). The chemical profiling (including acidity, peroxide value, spectrophotometric indices, fatty acids, tocopherol and sterol composition) was also used for classification. Partial least squares discriminant analysis (PLS-DA), soft independent modeling of class analogy (SIMCA), K-nearest neighbors (KNN), and support vector machines (SVM) were compared. The SIFT-MS data were therefore fed to variable-selection methods to find potential biomarkers for classification. The classification models based either on chemical profiling or SIFT-MS data were able to classify the samples with high accuracy. SIFT-MS was found to be advantageous for rapid geographic classification. Copyright © 2018 Elsevier Ltd. All rights reserved.
Biological validation of physical coastal waters classification along the NE Atlantic region based on rocky macroalgae distribution

NASA Astrophysics Data System (ADS)

Ramos, Elvira; Puente, Araceli; Juanes, José Antonio; Neto, João M.; Pedersen, Are; Bartsch, Inka; Scanlan, Clare; Wilkes, Robert; Van den Bergh, Erika; Ar Gall, Erwan; Melo, Ricardo

2014-06-01

A methodology to classify rocky shores along the North East Atlantic (NEA) region was developed. Previously, biotypes and the variability of environmental conditions within these were recognized based on abiotic data. A biological validation was required in order to support the ecological meaning of the physical typologies obtained. A database of intertidal macroalgae species occurring in the coastal area between Norway and the South Iberian Peninsula was generated. Semi-quantitative abundance data of the most representative macroalgal taxa were collected in three levels: common, rare or absent. Ordination and classification multivariate analyses revealed a clear latitudinal gradient in the distribution of macroalgae species resulting in two distinct groups: one northern and one southern group, separated at the coast of Brittany (France). In general, the results based on biological data coincided with the results based on physical characteristics. The ecological meaning of the coastal waters classification at a broad scale shown in this work demonstrates that it can be valuable as a practical tool for conservation and management purposes.
Towards exaggerated emphysema stereotypes

NASA Astrophysics Data System (ADS)

Chen, C.; Sørensen, L.; Lauze, F.; Igel, C.; Loog, M.; Feragen, A.; de Bruijne, M.; Nielsen, M.

2012-03-01

Classification is widely used in the context of medical image analysis and in order to illustrate the mechanism of a classifier, we introduce the notion of an exaggerated image stereotype based on training data and trained classifier. The stereotype of some image class of interest should emphasize/exaggerate the characteristic patterns in an image class and visualize the information the employed classifier relies on. This is useful for gaining insight into the classification and serves for comparison with the biological models of disease. In this work, we build exaggerated image stereotypes by optimizing an objective function which consists of a discriminative term based on the classification accuracy, and a generative term based on the class distributions. A gradient descent method based on iterated conditional modes (ICM) is employed for optimization. We use this idea with Fisher's linear discriminant rule and assume a multivariate normal distribution for samples within a class. The proposed framework is applied to computed tomography (CT) images of lung tissue with emphysema. The synthesized stereotypes illustrate the exaggerated patterns of lung tissue with emphysema, which is underpinned by three different quantitative evaluation methods.

FTIR microspectroscopy for rapid screening and monitoring of polyunsaturated fatty acid production in commercially valuable marine yeasts and protists.

PubMed

Vongsvivut, Jitraporn; Heraud, Philip; Gupta, Adarsha; Puri, Munish; McNaughton, Don; Barrow, Colin J

2013-10-21

The increase in polyunsaturated fatty acid (PUFA) consumption has prompted research into alternative resources other than fish oil. In this study, a new approach based on focal-plane-array Fourier transform infrared (FPA-FTIR) microspectroscopy and multivariate data analysis was developed for the characterisation of some marine microorganisms. Cell and lipid compositions in lipid-rich marine yeasts collected from the Australian coast were characterised in comparison to a commercially available PUFA-producing marine fungoid protist, thraustochytrid. Multivariate classification methods provided good discriminative accuracy evidenced from (i) separation of the yeasts from thraustochytrids and distinct spectral clusters among the yeasts that conformed well to their biological identities, and (ii) correct classification of yeasts from a totally independent set using cross-validation testing. The findings further indicated additional capability of the developed FPA-FTIR methodology, when combined with partial least squares regression (PLSR) analysis, for rapid monitoring of lipid production in one of the yeasts during the growth period, which was achieved at a high accuracy compared to the results obtained from the traditional lipid analysis based on gas chromatography. The developed FTIR-based approach when coupled to programmable withdrawal devices and a cytocentrifugation module would have strong potential as a novel online monitoring technology suited for bioprocessing applications and large-scale production.
A revision of chiggers of the minuta species-group (Acari: Trombiculidae: Neotrombicula Hirst, 1925) using multivariate morphometrics.

PubMed

Stekolnikov, Alexandr A; Klimov, Pavel B

2010-09-01

We revise chiggers belonging to the minuta-species group (genus Neotrombicula Hirst, 1925) from the Palaearctic using size-free multivariate morphometrics. This approach allowed us to resolve several diagnostic problems. We show that the widely distributed Neotrombicula scrupulosa Kudryashova, 1993 forms three spatially and ecologically isolated groups different from each other in size or shape (morphometric property) only: specimens from the Caucasus are distinct from those from Asia in shape, whereas the Asian specimens from plains and mountains are different from each other in size. We developed a multivariate classification model to separate three closely related species: N. scrupulosa, N. lubrica Kudryashova, 1993 and N. minuta Schluger, 1966. This model is based on five shape variables selected from an initial 17 variables by a best subset analysis using a custom size-correction subroutine. The variable selection procedure slightly improved the predictive power of the model, suggesting that it not only removed redundancy but also reduced 'noise' in the dataset. The overall classification accuracy of this model is 96.2, 96.2 and 95.5%, as estimated by internal validation, external validation and jackknife statistics, respectively. Our analyses resulted in one new synonymy: N. dimidiata Stekolnikov, 1995 is considered to be a synonym of N. lubrica. Both N. scrupulosa and N. lubrica are recorded from new localities. A key to species of the minuta-group incorporating results from our multivariate analyses is presented.
Determination of fragrance content in perfume by Raman spectroscopy and multivariate calibration

NASA Astrophysics Data System (ADS)

Godinho, Robson B.; Santos, Mauricio C.; Poppi, Ronei J.

2016-03-01

An alternative methodology is herein proposed for determination of fragrance content in perfumes and their classification according to the guidelines established by fine perfume manufacturers. The methodology is based on Raman spectroscopy associated with multivariate calibration, allowing the determination of fragrance content in a fast, nondestructive, and sustainable manner. The results were considered consistent with the conventional method, with standard error of prediction values lower than 1.0%. This result indicates that the proposed technology is a feasible analytical tool for determination of the fragrance content in a hydro-alcoholic solution for use in manufacturing, quality control and regulatory agencies.
Physiological sensor signals classification for healthcare using sensor data fusion and case-based reasoning.

PubMed

Begum, Shahina; Barua, Shaibal; Ahmed, Mobyen Uddin

2014-07-03

Today, clinicians often diagnose and classify diseases based on information collected from several physiological sensor signals. However, sensor signals are easily corrupted by noise or interference, and because of large individual variations, sensitivity to different physiological sensors can also vary. Therefore, fusing multiple sensor signals is valuable for reaching more robust and reliable decisions. This paper demonstrates a physiological sensor signal classification approach using sensor signal fusion and case-based reasoning. The proposed approach has been evaluated on classifying individuals as Stressed or Relaxed using sensor data fusion. Physiological sensor signals, i.e., Heart Rate (HR), Finger Temperature (FT), Respiration Rate (RR), Carbon dioxide (CO2) and Oxygen Saturation (SpO2), are collected during the data collection phase. Sensor fusion has been done in two different ways: (i) decision-level fusion using features extracted through traditional approaches; and (ii) data-level fusion using features extracted by means of Multivariate Multiscale Entropy (MMSE). Case-Based Reasoning (CBR) is applied for the classification of the signals. The experimental results show that the proposed system classified Stressed or Relaxed individuals with 87.5% accuracy compared to an expert in the domain. The approach therefore shows promising results in the psychophysiological domain and could be adapted to other relevant healthcare systems.
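The combination of case retrieval and decision-level fusion described above can be sketched as follows. This is an illustrative assumption, not the published system: each sensor gets its own tiny case base, retrieval is a plain nearest-neighbour lookup, and fusion is a majority vote. All case values and the names `retrieve` and `fuse_decisions` are hypothetical.

```python
import math
from collections import Counter

def retrieve(case_base, query, k=1):
    """Return the labels of the k stored cases closest to the query (Euclidean distance)."""
    ranked = sorted(case_base, key=lambda case: math.dist(case[0], query))
    return [label for _, label in ranked[:k]]

def fuse_decisions(decisions):
    """Decision-level fusion by majority vote over per-sensor decisions."""
    return Counter(decisions).most_common(1)[0][0]

# Hypothetical one-feature case bases, one per sensor.
case_bases = {
    "HR": [((60.0,), "Relaxed"), ((95.0,), "Stressed")],
    "FT": [((33.0,), "Relaxed"), ((28.0,), "Stressed")],
    "RR": [((12.0,), "Relaxed"), ((20.0,), "Stressed")],
}
query = {"HR": (90.0,), "FT": (29.0,), "RR": (13.0,)}

per_sensor = [retrieve(case_bases[name], query[name])[0] for name in case_bases]
fused = fuse_decisions(per_sensor)
print(per_sensor, fused)
```

Data-level fusion would instead concatenate or jointly transform the signals before a single retrieval step; that variant is not shown here.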
A statistical approach to root system classification

PubMed Central

Bodner, Gernot; Leitner, Daniel; Nakhforoosh, Alireza; Sobotik, Monika; Moder, Karl; Kaul, Hans-Peter

2013-01-01

Plant root systems have a key role in ecology and agronomy. Despite the rapid increase in root studies, there is still no classification that allows distinguishing among distinctive characteristics within the diversity of rooting strategies. Our hypothesis is that a multivariate approach for “plant functional type” identification in ecology can be applied to the classification of root systems. The classification method presented is based on a data-defined statistical procedure without a priori decision on the classifiers. The study demonstrates that principal component based rooting types provide efficient and meaningful multi-trait classifiers. The classification method is exemplified with simulated root architectures and morphological field data. Simulated root architectures showed that morphological attributes with spatial distribution parameters capture the most distinctive features within root system diversity. While developmental type (tap vs. shoot-borne systems) is a strong but coarse classifier, topological traits provide the most detailed differentiation among distinctive groups. Adequacy of commonly available morphologic traits for classification is supported by field data. The rooting types that emerged from measured data were mainly diameter/weight-dominated and density-dominated types. Similarity of root systems within distinctive groups was the joint result of phylogenetic relation and environmental as well as human selection pressure. We concluded that the data-defined classification is appropriate for integration of knowledge obtained with different root measurement methods and at various scales. Currently, root morphology is the most promising basis for classification due to widely used common measurement protocols. To capture details of root diversity, efforts in architectural measurement techniques are essential. PMID:23914200
Multivariate Classification of Structural MRI Data Detects Chronic Low Back Pain

PubMed Central

Ung, Hoameng; Brown, Justin E.; Johnson, Kevin A.; Younger, Jarred; Hush, Julia; Mackey, Sean

2014-01-01

Chronic low back pain (cLBP) has a tremendous personal and socioeconomic impact, yet the underlying pathology remains a mystery in the majority of cases. An objective measure of this condition, that augments self-report of pain, could have profound implications for diagnostic characterization and therapeutic development. Contemporary research indicates that cLBP is associated with abnormal brain structure and function. Multivariate analyses have shown potential to detect a number of neurological diseases based on structural neuroimaging. Therefore, we aimed to empirically evaluate such an approach in the detection of cLBP, with a goal to also explore the relevant neuroanatomy. We extracted brain gray matter (GM) density from magnetic resonance imaging scans of 47 patients with cLBP and 47 healthy controls. cLBP was classified with an accuracy of 76% by support vector machine analysis. Primary drivers of the classification included areas of the somatosensory, motor, and prefrontal cortices—all areas implicated in the pain experience. Differences in areas of the temporal lobe, including bordering the amygdala, medial orbital gyrus, cerebellum, and visual cortex, were also useful for the classification. Our findings suggest that cLBP is characterized by a pattern of GM changes that can have discriminative power and reflect relevant pathological brain morphology. PMID:23246778
Discriminant analysis of cardiovascular and respiratory variables for classification of road cyclists by specialty.

PubMed

Nikolić, Biljana; Martinović, Jelena; Matić, Milan; Stefanović, Đorđe

2018-05-29

Different variables determine the performance of cyclists, which raises the question of how these parameters may help in classifying them by specialty. The aim of the study was to determine differences in cardiorespiratory parameters of male cyclists according to their specialty, flat rider (N=21), hill rider (N=35) and sprinter (N=20), and to obtain a multivariate model for further classification of cyclists by specialty, based on selected variables. Seventeen variables were measured at submaximal and maximum load on the cycle ergometer Cosmed E 400HK (Cosmed, Rome, Italy) (initial 100W with 25W increase, 90-100 rpm). Multivariate discriminant analysis was used to determine which variables group cyclists within their specialty, and to predict which variables can direct cyclists to a particular specialty. Among nine variables that statistically contribute to the discriminant power of the model, achieved power at the anaerobic threshold and the produced CO2 had the biggest impact. The obtained discriminatory model correctly classified 91.43% of flat riders and 85.71% of hill riders, while sprinters were all classified correctly (100%); i.e., 92.10% of examinees were correctly classified, which points to the strength of the discriminatory model. Respiratory indicators contribute most to the discriminant power of the model, which may significantly contribute to training practice and laboratory tests in future.
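The reported rates can be checked with simple arithmetic. The abstract gives group sizes and percentages but no counts, so the counts below are assumptions chosen to reproduce the percentages. Note that 91.43% corresponds to 32 of 35 and 85.71% to 18 of 21, so the groups are labelled by size rather than by specialty in this sketch.

```python
def percent(correct, total):
    """Share of correctly classified subjects, in percent."""
    return 100.0 * correct / total

# Hypothetical (correct, total) counts; only the totals 21, 35 and 20 come from the abstract.
counts = {
    "group of 35": (32, 35),
    "group of 21": (18, 21),
    "sprinters (20)": (20, 20),
}

per_class = {name: percent(c, n) for name, (c, n) in counts.items()}
overall = percent(sum(c for c, _ in counts.values()),
                  sum(n for _, n in counts.values()))

for name, value in per_class.items():
    print(f"{name}: {value:.2f}%")
print(f"overall: {overall:.3f}%")
```

Under these assumed counts the overall rate is 70 of 76, which agrees with the reported 92.10% to within rounding.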
Non-targeted 1H NMR fingerprinting and multivariate statistical analyses for the characterisation of the geographical origin of Italian sweet cherries.

PubMed

Longobardi, F; Ventrella, A; Bianco, A; Catucci, L; Cafagna, I; Gallo, V; Mastrorilli, P; Agostiano, A

2013-12-01

In this study, non-targeted 1H NMR fingerprinting was used in combination with multivariate statistical techniques for the classification of Italian sweet cherries based on their different geographical origins (Emilia Romagna and Puglia). The classification techniques Soft Independent Modelling of Class Analogy (SIMCA), Partial Least Squares Discriminant Analysis (PLS-DA), and Linear Discriminant Analysis (LDA) were applied and the results were compared. For LDA, before performing a refined selection of the number/combination of variables, two different strategies for a preliminary reduction of the variable number were tested. The best average recognition and CV prediction abilities (both 100.0%) were obtained for all the LDA models, although PLS-DA also showed remarkable performances (94.6%). All the statistical models were validated by observing the prediction abilities with respect to an external set of cherry samples. The best result (94.9%) was obtained with LDA by performing a best subset selection procedure on a set of 30 principal components previously selected by a stepwise decorrelation. The metabolites that contributed most to the classification performance of this LDA model were found to be malate, glucose, fructose, glutamine and succinate. Copyright © 2013 Elsevier Ltd. All rights reserved.
Decoding the Traumatic Memory among Women with PTSD: Implications for Neurocircuitry Models of PTSD and Real-Time fMRI Neurofeedback

PubMed Central

Cisler, Josh M.; Bush, Keith; James, G. Andrew; Smitherman, Sonet; Kilts, Clinton D.

2015-01-01

Posttraumatic Stress Disorder (PTSD) is characterized by intrusive recall of the traumatic memory. While numerous studies have investigated the neural processing mechanisms engaged during trauma memory recall in PTSD, these analyses have only focused on group-level contrasts that reveal little about the predictive validity of the identified brain regions. By contrast, a multivariate pattern analysis (MVPA) approach towards identifying the neural mechanisms engaged during trauma memory recall would entail testing whether a multivariate set of brain regions is reliably predictive of (i.e., discriminates) whether an individual is engaging in trauma or non-trauma memory recall. Here, we use an MVPA approach to test 1) whether trauma memory vs neutral memory recall can be predicted reliably using a multivariate set of brain regions among women with PTSD related to assaultive violence exposure (N=16), 2) the methodological parameters (e.g., spatial smoothing, number of memory recall repetitions, etc.) that optimize classification accuracy and reproducibility of the feature weight spatial maps, and 3) the correspondence between brain regions that discriminate trauma memory recall and the brain regions predicted by neurocircuitry models of PTSD. Cross-validation classification accuracy was significantly above chance for all methodological permutations tested; mean accuracy across participants was 76% for the methodological parameters selected as optimal for both efficiency and accuracy. Classification accuracy was significantly better for a voxel-wise approach relative to voxels within restricted regions-of-interest (ROIs); classification accuracy did not differ when using PTSD-related ROIs compared to randomly generated ROIs. ROI-based analyses suggested the reliable involvement of the left hippocampus in discriminating memory recall across participants and that the contribution of the left amygdala to the decision function was dependent upon PTSD symptom severity. These results have methodological implications for real-time fMRI neurofeedback of the trauma memory in PTSD and conceptual implications for neurocircuitry models of PTSD that attempt to explain core neural processing mechanisms mediating PTSD. PMID:26241958
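One simple way to ask whether a two-class accuracy is above chance is an exact one-sided binomial test. The abstract does not say which test the authors used, so this is only a sketch under that assumption, and the trial count of 50 is invented for illustration.

```python
from math import comb

def binomial_p_value(n_correct, n_trials, chance=0.5):
    """One-sided P(X >= n_correct) for X ~ Binomial(n_trials, chance)."""
    return sum(
        comb(n_trials, k) * chance ** k * (1.0 - chance) ** (n_trials - k)
        for k in range(n_correct, n_trials + 1)
    )

# Hypothetical example: 38 of 50 recall trials decoded correctly (76%).
p = binomial_p_value(38, 50)
print(f"p = {p:.6f}")
```

The binomial test assumes independent trials. With cross-validated fMRI data that assumption is often doubtful, which is one reason permutation tests are commonly preferred.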
Multivariate analyses of Erzgebirge granite and rhyolite composition: Implications for classification of granites and their genetic relations

USGS Publications Warehouse

Forster, H.-J.; Davis, J.C.; Tischendorf, G.; Seltmann, R.

1999-01-01

High-precision major, minor and trace element analyses for 44 elements have been made of 329 Late Variscan granitic and rhyolitic rocks from the Erzgebirge metallogenic province of Germany. The intrusive histories of some of these granites are not completely understood and exposures of rock are not adequate to resolve relationships between what apparently are different plutons. Therefore, it is necessary to turn to chemical analyses to decipher the evolution of the plutons and their relationships. A new classification of Erzgebirge plutons into five major groups of granites, based on petrologic interpretations of geochemical and mineralogical relationships (low-F biotite granites; low-F two-mica granites; high-F, high-P2O5 Li-mica granites; high-F, low-P2O5 Li-mica granites; high-F, low-P2O5 biotite granites), was tested by multivariate techniques. Canonical analyses of major elements, minor elements, trace elements and ratio variables all distinguish the groups with differing amounts of success. Univariate ANOVAs, in combination with forward-stepwise and backward-elimination canonical analyses, were used to select ten variables which were most effective in distinguishing groups. In a biplot, groups form distinct clusters roughly arranged along a quadratic path. Within groups, individual plutons tend to be arranged in patterns possibly reflecting granitic evolution. Canonical functions were used to classify samples of rhyolites of unknown association into the five groups. Another canonical analysis was based on ten elements that are traditionally used in petrology and were important in the new classification of granites. Their biplot pattern is similar to that from statistically chosen variables but less effective at distinguishing the five groups of granites. This study shows that multivariate statistical techniques can provide significant insight into problems of granitic petrogenesis and may be superior to conventional procedures for petrological interpretation.
Comparison of Xenon-Enhanced Area-Detector CT and Krypton Ventilation SPECT/CT for Assessment of Pulmonary Functional Loss and Disease Severity in Smokers.

PubMed

Ohno, Yoshiharu; Fujisawa, Yasuko; Takenaka, Daisuke; Kaminaga, Shigeo; Seki, Shinichiro; Sugihara, Naoki; Yoshikawa, Takeshi

2018-02-01

The objective of this study was to compare the capability of xenon-enhanced area-detector CT (ADCT) performed with a subtraction technique and coregistered 81mKr-ventilation SPECT/CT for the assessment of pulmonary functional loss and disease severity in smokers. Forty-six consecutive smokers (32 men and 14 women; mean age, 67.0 years) underwent prospective unenhanced and xenon-enhanced ADCT, 81mKr-ventilation SPECT/CT, and pulmonary function tests. Disease severity was evaluated according to the Global Initiative for Chronic Obstructive Lung Disease (GOLD) classification. CT-based functional lung volume (FLV), the percentage of wall area to total airway area (WA%), and ventilated FLV on xenon-enhanced ADCT and SPECT/CT were calculated for each smoker. All indexes were correlated with percentage of forced expiratory volume in 1 second (%FEV1) using step-wise regression analyses, and univariate and multivariate logistic regression analyses were performed. In addition, the diagnostic accuracy of the proposed model was compared with that of each radiologic index by means of McNemar analysis. Multivariate logistic regression showed that %FEV1 was significantly affected (r = 0.77, r² = 0.59) by two factors: the first factor, ventilated FLV on xenon-enhanced ADCT (p < 0.0001); and the second factor, WA% (p = 0.004). Univariate logistic regression analyses indicated that all indexes significantly affected GOLD classification (p < 0.05). Multivariate logistic regression analyses revealed that ventilated FLV on xenon-enhanced ADCT and CT-based FLV significantly influenced GOLD classification (p < 0.0001). The diagnostic accuracy of the proposed model was significantly higher than that of ventilated FLV on SPECT/CT (p = 0.03) and WA% (p = 0.008). Xenon-enhanced ADCT is more effective than 81mKr-ventilation SPECT/CT for the assessment of pulmonary functional loss and disease severity.
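The McNemar analysis mentioned above compares two classifiers on the same subjects using only the discordant pairs. The following is a minimal sketch of the exact (binomial) form of the test. The abstract does not say whether the exact or the chi-square version was used, and the discordant counts are invented.

```python
from math import comb

def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value.

    b: subjects classified correctly by method A only
    c: subjects classified correctly by method B only
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)

# Hypothetical discordant counts for two diagnostic indexes on the same patients.
p_value = mcnemar_exact(2, 12)
print(f"p = {p_value:.4f}")
```

Subjects that both methods classify correctly, or both misclassify, carry no information about the difference and do not enter the calculation.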
Development of an ecological classification system for the Wayne National Forest

Treesearch

David M. Hix; Andrea M. Chech

1993-01-01

In 1991, a collaborative research project was initiated to create an ecological classification system for the Wayne National Forest of southeastern Ohio. The work focuses on the ecological land type (ELT) level of ecosystem classification. The most common ELTs are being identified and described using information from intensive field sampling and multivariate data...
Improving the analysis of near-spectroscopy data with multivariate classification of hemodynamic patterns: a theoretical formulation and validation.

PubMed

Gemignani, Jessica; Middell, Eike; Barbour, Randall L; Graber, Harry L; Blankertz, Benjamin

2018-04-04

The statistical analysis of functional near infrared spectroscopy (fNIRS) data based on the general linear model (GLM) is often made difficult by serial correlations, high inter-subject variability of the hemodynamic response, and the presence of motion artifacts. In this work we propose to extract information on the pattern of hemodynamic activations without using any a priori model for the data, by classifying the channels as 'active' or 'not active' with a multivariate classifier based on linear discriminant analysis (LDA). This work is developed in two steps. First we compared the performance of the two analyses, using a synthetic approach in which simulated hemodynamic activations were combined with either simulated or real resting-state fNIRS data. This procedure allowed for exact quantification of the classification accuracies of GLM and LDA. In the case of real resting-state data, the correlations between classification accuracy and demographic characteristics were investigated by means of a Linear Mixed Model. In the second step, to further characterize the reliability of the newly proposed analysis method, we conducted an experiment in which participants had to perform a simple motor task and data were analyzed with the LDA-based classifier as well as with the standard GLM analysis. The results of the simulation study show that the LDA-based method achieves higher classification accuracies than the GLM analysis, and that the LDA results are more uniform across different subjects and, in contrast to the accuracies achieved by the GLM analysis, have no significant correlations with any of the demographic characteristics. Findings from the real-data experiment are consistent with the results of the real-plus-simulation study, in that the GLM-analysis results show greater inter-subject variability than do the corresponding LDA results. The results obtained suggest that the outcome of GLM analysis is highly vulnerable to violations of theoretical assumptions, and that therefore a data-driven approach such as that provided by the proposed LDA-based method is to be favored.
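To make the idea of labelling channels 'active' or 'not active' with LDA concrete, here is a minimal two-class, two-feature Fisher discriminant in plain Python. It is a sketch only: the two features per channel, the training values, and the equal-prior threshold are all assumptions, and the authors' actual feature set is not given in the abstract.

```python
def mean_vec(rows):
    """Column-wise mean of a list of 2-D points."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]

def scatter(rows, mu):
    """2x2 within-class scatter matrix around mu."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mu[0], r[1] - mu[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def fit_lda(class0, class1):
    """Return (weights, bias) of a Fisher discriminant with equal priors."""
    m0, m1 = mean_vec(class0), mean_vec(class1)
    s0, s1 = scatter(class0, m0), scatter(class1, m1)
    dof = len(class0) + len(class1) - 2
    cov = [[(s0[i][j] + s1[i][j]) / dof for j in range(2)] for i in range(2)]
    det = cov[0][0] * cov[1][1] - cov[0][1] * cov[1][0]
    inv = [[cov[1][1] / det, -cov[0][1] / det],
           [-cov[1][0] / det, cov[0][0] / det]]
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    mid = [(m0[0] + m1[0]) / 2.0, (m0[1] + m1[1]) / 2.0]
    return w, -(w[0] * mid[0] + w[1] * mid[1])

def predict(model, x):
    """1 means 'active', 0 means 'not active'."""
    w, b = model
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0

# Hypothetical per-channel features (e.g. response amplitude and slope).
not_active = [(0.1, 0.2), (0.3, 0.1), (0.2, 0.4), (0.0, 0.3)]
active = [(1.0, 1.2), (1.3, 0.9), (1.1, 1.4), (0.9, 1.1)]
model = fit_lda(not_active, active)
print(predict(model, (1.0, 1.0)), predict(model, (0.2, 0.2)))
```

With more features than training channels, the pooled covariance becomes singular and some form of regularisation or shrinkage would be needed. That step is omitted here.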
Multivariate Density Estimation and Remote Sensing

NASA Technical Reports Server (NTRS)

Scott, D. W.

1983-01-01

Current efforts to develop methods and computer algorithms to effectively represent multivariate data commonly encountered in remote sensing applications are described. While this may involve scatter diagrams, multivariate representations of nonparametric probability density estimates are emphasized. The density function provides a useful graphical tool for looking at data and a useful theoretical tool for classification. This approach is called a thunderstorm data analysis.
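The use of a density estimate as a classification tool can be sketched with a Gaussian kernel density estimate per class, assigning a new observation to the class with the highest estimated density. The example is one-dimensional for brevity, whereas the abstract concerns multivariate data. The class names, sample values, bandwidth, and equal class priors are all assumptions.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density of 1-D samples."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density

def classify(x, densities):
    """Pick the class whose estimated density at x is largest (equal priors assumed)."""
    return max(densities, key=lambda label: densities[label](x))

# Hypothetical reflectance values for two land-cover classes.
densities = {
    "water": gaussian_kde([0.10, 0.15, 0.20], bandwidth=0.1),
    "vegetation": gaussian_kde([0.60, 0.70, 0.65], bandwidth=0.1),
}
print(classify(0.18, densities), classify(0.62, densities))
```

The bandwidth controls how smooth each estimate is and usually matters more than the choice of kernel.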
A Raman spectroscopy bio-sensor for tissue discrimination in surgical robotics.

PubMed

Ashok, Praveen C; Giardini, Mario E; Dholakia, Kishan; Sibbett, Wilson

2014-01-01

We report the development of a fiber-based Raman sensor to be used in tumour margin identification during endoluminal robotic surgery. Although this is a generic platform, the sensor we describe was adapted for the ARAKNES (Array of Robots Augmenting the KiNematics of Endoluminal Surgery) robotic platform. On such a platform, the Raman sensor is intended to identify ambiguous tissue margins during robot-assisted surgeries. To maintain sterility of the probe during surgical intervention, a disposable sleeve was specially designed. A straightforward user-compatible interface was implemented where a supervised multivariate classification algorithm was used to classify different tissue types based on specific Raman fingerprints so that it could be used without prior knowledge of spectroscopic data analysis. The protocol avoids inter-patient variability in data and the sensor system is not restricted for use in the classification of a particular tissue type. Representative tissue classification assessments were performed using this system on excised tissue. Copyright © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Reliable Early Classification on Multivariate Time Series with Numerical and Categorical Attributes

DTIC Science & Technology

2015-05-22

design a procedure of feature extraction in REACT named MEG (Mining Equivalence classes with shapelet Generators) based on the concept of...Equivalence Classes Mining [12, 15]. MEG can efficiently and effectively generate the discriminative features. In addition, several strategies are proposed...technique of parallel computing [4] to propose a process of parallel MEG for substantially reducing the computational overhead of discovering shapelet
Refined composite multivariate generalized multiscale fuzzy entropy: A tool for complexity analysis of multichannel signals

NASA Astrophysics Data System (ADS)

Azami, Hamed; Escudero, Javier

2017-01-01

Multiscale entropy (MSE) is an appealing tool to characterize the complexity of time series over multiple temporal scales. Recent developments in the field have tried to extend the MSE technique in different ways. Building on these trends, we propose the so-called refined composite multivariate multiscale fuzzy entropy (RCmvMFE) whose coarse-graining step uses variance (RCmvMFEσ2) or mean (RCmvMFEμ). We investigate the behavior of these multivariate methods on multichannel white Gaussian and 1/f noise signals, and two publicly available biomedical recordings. Our simulations demonstrate that RCmvMFEσ2 and RCmvMFEμ lead to more stable results and are less sensitive to the signals' length in comparison with the other existing multivariate multiscale entropy-based methods. The classification results also show that using both the variance and mean in the coarse-graining step offers complexity profiles with complementary information for biomedical signal analysis. We also made freely available all the Matlab codes used in this paper.
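The coarse-graining step that distinguishes the mean-based and variance-based variants can be illustrated on a single channel. This sketch covers only that step: the refined composite averaging over window offsets and the fuzzy entropy calculation itself are not shown, and the use of population variance is an assumption, since the abstract does not specify the estimator.

```python
from statistics import mean, pvariance

def coarse_grain(signal, scale, statistic=mean):
    """Summarise non-overlapping windows of length `scale` with the given statistic."""
    n_windows = len(signal) // scale
    return [
        statistic(signal[i * scale:(i + 1) * scale]) for i in range(n_windows)
    ]

# Hypothetical single-channel signal; the trailing incomplete window is dropped.
signal = [1.0, 3.0, 2.0, 4.0, 6.0, 8.0, 5.0]

by_mean = coarse_grain(signal, scale=2, statistic=mean)
by_variance = coarse_grain(signal, scale=2, statistic=pvariance)
print(by_mean)
print(by_variance)
```

For a multichannel recording the same function would be applied to each channel separately before the multivariate entropy is computed.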
Unsupervised classification of multivariate geostatistical data: Two algorithms

NASA Astrophysics Data System (ADS)

Romary, Thomas; Ors, Fabien; Rivoirard, Jacques; Deraisme, Jacques

2015-12-01

With the increasing development of remote sensing platforms and the evolution of sampling facilities in mining and oil industry, spatial datasets are becoming increasingly large, inform a growing number of variables and cover wider and wider areas. Therefore, it is often necessary to split the domain of study to account for radically different behaviors of the natural phenomenon over the domain and to simplify the subsequent modeling step. The definition of these areas can be seen as a problem of unsupervised classification, or clustering, where we try to divide the domain into homogeneous domains with respect to the values taken by the variables in hand. The application of classical clustering methods, designed for independent observations, does not ensure the spatial coherence of the resulting classes. Image segmentation methods, based on e.g. Markov random fields, are not adapted to irregularly sampled data. Other existing approaches, based on mixtures of Gaussian random functions estimated via the expectation-maximization algorithm, are limited to reasonable sample sizes and a small number of variables. In this work, we propose two algorithms based on adaptations of classical algorithms to multivariate geostatistical data. Both algorithms are model free and can handle large volumes of multivariate, irregularly spaced data. The first one proceeds by agglomerative hierarchical clustering. The spatial coherence is ensured by a proximity condition imposed for two clusters to merge. This proximity condition relies on a graph organizing the data in the coordinates space. The hierarchical algorithm can then be seen as a graph-partitioning algorithm. Following this interpretation, a spatial version of the spectral clustering algorithm is also proposed. The performances of both algorithms are assessed on toy examples and a mining dataset.
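The first algorithm, agglomerative clustering in which two clusters may merge only if they are close in the coordinate space, can be sketched as below. The details are assumptions rather than the authors' method: the proximity graph here links points within a fixed radius, and the merge criterion is the difference between cluster means of a single variable.

```python
import math
from itertools import combinations

def constrained_clustering(coords, values, radius, n_clusters):
    """Agglomerative clustering where only spatially adjacent clusters may merge."""
    clusters = [{i} for i in range(len(coords))]

    def adjacent(a, b):
        # Proximity condition: some pair of points lies within `radius`.
        return any(math.dist(coords[i], coords[j]) <= radius for i in a for j in b)

    def cluster_mean(c):
        return sum(values[i] for i in c) / len(c)

    while len(clusters) > n_clusters:
        candidates = [
            (abs(cluster_mean(a) - cluster_mean(b)), ia, ib)
            for (ia, a), (ib, b) in combinations(enumerate(clusters), 2)
            if adjacent(a, b)
        ]
        if not candidates:
            break  # no adjacent pair left to merge
        _, ia, ib = min(candidates)
        clusters[ia] |= clusters[ib]
        del clusters[ib]  # ib > ia, so index ia is unaffected
    return clusters

# Hypothetical samples along a line: similar values at both ends, far apart in space.
coords = [(0, 0), (1, 0), (2, 0), (3, 0), (10, 0), (11, 0)]
values = [1.0, 1.1, 5.0, 5.2, 1.05, 1.0]
result = constrained_clustering(coords, values, radius=1.5, n_clusters=3)
print(result)
```

An unconstrained method would group the two low-valued ends together. The proximity condition keeps them apart, which is the spatial coherence the abstract refers to.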

Geomorphic Classification and Assessment of Channel Dynamics in the Missouri National Recreational River, South Dakota and Nebraska

USGS Publications Warehouse

Elliott, Caroline M.; Jacobson, Robert B.

2006-01-01

A multiscale geomorphic classification was established for the 39-mile, 59-mile, and adjacent segments of the Missouri National Recreational River administered by the National Park Service in South Dakota and Nebraska. The objective of the classification was to define naturally occurring clusters of geomorphic characteristics that would be indicative of discrete sets of geomorphic processes, with the intent that such a classification would be useful in river-management and rehabilitation decisions. The statistical classification was based on geomorphic characteristics of the river collected from 1999 orthophotography and the persistence of classified units was evaluated by comparison with similar datasets for 2003 and 2004 and by evaluating variation of bank erosion rates by geomorphic class. Changes in channel location and form were also explored using imagery and maps from 1993-2004, 1941 and 1894. The multivariate classification identified a hierarchy of naturally occurring clusters of reach-scale geomorphic characteristics. The simplest level of the hierarchy divides the river from segments into discrete reaches characterized by single and multithread channels and additional hierarchical levels established 4-part and 10-part classifications. The classification system presents a physical framework that can be applied to prioritization and design of bank stabilization projects, design of habitat rehabilitation projects, and stratification of monitoring and assessment sampling programs.
Intrapartum fetal heart rate classification from trajectory in Sparse SVM feature space.

PubMed

Spilka, J; Frecon, J; Leonarduzzi, R; Pustelnik, N; Abry, P; Doret, M

2015-01-01

Intrapartum fetal heart rate (FHR) constitutes a prominent source of information for the assessment of fetal reactions to stress events during delivery. Yet, early detection of fetal acidosis remains a challenging signal processing task. The originality of the present contribution is three-fold: multiscale representations and wavelet leader based multifractal analysis are used to quantify FHR variability; supervised classification is achieved by means of Sparse-SVM, which aims jointly to achieve optimal detection performance and to select relevant features in a multivariate setting; and trajectories in the feature space, accounting for the evolution of features over time as labor progresses, are involved in the construction of indices quantifying fetal health. The classification performance permitted by this combination of tools is quantified on a large intrapartum FHR database (≃ 1250 subjects) collected at a French academic public hospital.
Grading the neuroendocrine tumors of the lung: an evidence-based proposal.

PubMed

Rindi, G; Klersy, C; Inzani, F; Fellegara, G; Ampollini, L; Ardizzoni, A; Campanini, N; Carbognani, P; De Pas, T M; Galetta, D; Granone, P L; Righi, L; Rusca, M; Spaggiari, L; Tiseo, M; Viale, G; Volante, M; Papotti, M; Pelosi, G

2014-02-01

Lung neuroendocrine tumors are catalogued in four categories by the World Health Organization (WHO 2004) classification. Its reproducibility and prognostic efficacy were disputed. The WHO 2010 classification of digestive neuroendocrine neoplasms is based on Ki67 proliferation assessment and proved prognostically effective. This study aims at comparing these two classifications and at defining a prognostic grading system for lung neuroendocrine tumors. The study included 399 patients who underwent surgery between 1989 and 2011 and had at least 1 year of follow-up. Data on 21 variables were collected, and performance of grading systems and their components was compared by Cox regression and multivariable analyses. All statistical tests were two-sided. At Cox analysis, WHO 2004 stratified patients into three major groups with statistically significant survival difference (typical carcinoid vs atypical carcinoid (AC), P=0.021; AC vs large-cell/small-cell lung neuroendocrine carcinomas, P<0.001). Optimal discrimination in three groups was observed by Ki67% (Ki67% cutoffs: G1 <4, G2 4-<25, G3 ≥25; G1 vs G2, P=0.021; and G2 vs G3, P≤0.001), mitotic count (G1 ≤2, G2 >2-47, G3 >47; G1 vs G2, P≤0.001; and G2 vs G3, P≤0.001), and presence of necrosis (G1 absent, G2 <10% of sample, G3 >10% of sample; G1 vs G2, P≤0.001; and G2 vs G3, P≤0.001) at uni- and multivariable analyses. The combination of these three variables resulted in a simple and effective grading system. A three-tier grading system based on Ki67 index, mitotic count, and necrosis with cutoffs specifically generated for lung neuroendocrine tumors is prognostically effective and accurate.
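The per-variable cutoffs listed in the abstract translate directly into three small grading functions. This is a sketch, not clinical software. The abstract leaves a necrosis extent of exactly 10% unassigned, so placing it in G3 is an assumption, and the rule for combining the three grades into a final grade is not given in the abstract and is therefore not implemented.

```python
def grade_ki67(ki67_percent):
    """G1 <4, G2 4 to <25, G3 >=25 (Ki67 index in percent)."""
    if ki67_percent < 4:
        return 1
    if ki67_percent < 25:
        return 2
    return 3

def grade_mitoses(mitotic_count):
    """G1 <=2, G2 >2 to 47, G3 >47."""
    if mitotic_count <= 2:
        return 1
    if mitotic_count <= 47:
        return 2
    return 3

def grade_necrosis(necrosis_percent):
    """G1 absent, G2 <10% of sample, G3 >10% (exactly 10% -> G3 is an assumption)."""
    if necrosis_percent == 0:
        return 1
    if necrosis_percent < 10:
        return 2
    return 3

# Hypothetical tumor: Ki67 12%, 3 mitoses, no necrosis.
grades = (grade_ki67(12), grade_mitoses(3), grade_necrosis(0))
print(grades)
```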
Discrimination of soft tissues using laser-induced breakdown spectroscopy in combination with k nearest neighbors (kNN) and support vector machine (SVM) classifiers

NASA Astrophysics Data System (ADS)

Li, Xiaohui; Yang, Sibo; Fan, Rongwei; Yu, Xin; Chen, Deying

2018-06-01

In this paper, discrimination of soft tissues using laser-induced breakdown spectroscopy (LIBS) in combination with multivariate statistical methods is presented. Fresh pork fat, skin, ham, loin and tenderloin muscle tissues are manually cut into slices and ablated using a 1064 nm pulsed Nd:YAG laser. Discrimination analyses between fat, skin and muscle tissues, and further between highly similar ham, loin and tenderloin muscle tissues, are performed based on the LIBS spectra in combination with multivariate statistical methods, including principal component analysis (PCA), k nearest neighbors (kNN) classification, and support vector machine (SVM) classification. Performances of the discrimination models, including accuracy, sensitivity and specificity, are evaluated using 10-fold cross validation. The classification models are optimized to achieve best discrimination performances. The fat, skin and muscle tissues can be definitely discriminated using both kNN and SVM classifiers, with accuracy of over 99.83%, sensitivity of over 0.995 and specificity of over 0.998. The highly similar ham, loin and tenderloin muscle tissues can also be discriminated with acceptable performances. The best performances are achieved with SVM classifier using Gaussian kernel function, with accuracy of 76.84%, sensitivity of over 0.742 and specificity of over 0.869. The results show that the LIBS technique assisted with multivariate statistical methods could be a powerful tool for online discrimination of soft tissues, even for tissues of high similarity, such as muscles from different parts of the animal body. This technique could be used for discrimination of tissues suffering minor clinical changes, thus may advance the diagnosis of early lesions and abnormalities.
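The evaluation protocol named in this abstract (a kNN classifier scored by 10-fold cross-validation on accuracy, sensitivity and specificity) can be sketched with the Python standard library alone. This is a minimal sketch under stated assumptions, not the authors' pipeline: the data are made-up two-dimensional points standing in for spectral features, the problem is reduced to two classes, and `k = 3`, the random seeds and all function names are arbitrary choices made for illustration.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def k_fold_indices(n, n_folds, seed=0):
    """Shuffle 0..n-1 and deal the indices into n_folds disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::n_folds] for i in range(n_folds)]


def cross_validate(X, y, n_folds=10, k=3, positive=1):
    """Pool held-out predictions over all folds and report the three metrics."""
    tp = tn = fp = fn = 0
    for fold in k_fold_indices(len(X), n_folds):
        held_out = set(fold)
        train_X = [X[i] for i in range(len(X)) if i not in held_out]
        train_y = [y[i] for i in range(len(X)) if i not in held_out]
        for i in fold:
            pred = knn_predict(train_X, train_y, X[i], k)
            if y[i] == positive:
                tp += pred == positive
                fn += pred != positive
            else:
                tn += pred != positive
                fp += pred == positive
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Toy data: two non-overlapping clusters standing in for two tissue classes.
rng = random.Random(42)
X = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(20)]
X += [(5 + rng.uniform(-1, 1), 5 + rng.uniform(-1, 1)) for _ in range(20)]
y = [0] * 20 + [1] * 20
metrics = cross_validate(X, y)
```

Because the two toy clusters cannot overlap, every held-out point is classified correctly here; real spectra of similar tissues would not separate this cleanly, which is what the lower muscle-tissue figures in the abstract reflect.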
Visual classification of very fine-grained sediments: Evaluation through univariate and multivariate statistics

USGS Publications Warehouse

Hohn, M. Ed; Nuhfer, E.B.; Vinopal, R.J.; Klanderman, D.S.

1980-01-01

Classifying very fine-grained rocks through fabric elements provides information about depositional environments, but is subject to the biases of visual taxonomy. To evaluate the statistical significance of an empirical classification of very fine-grained rocks, samples from Devonian shales in four cored wells in West Virginia and Virginia were measured for 15 variables: quartz, illite, pyrite and expandable clays determined by X-ray diffraction; total sulfur, organic content, inorganic carbon, matrix density, bulk density, porosity, silt, as well as density, sonic travel time, resistivity, and γ-ray response measured from well logs. The four lithologic types comprised: (1) sharply banded shale, (2) thinly laminated shale, (3) lenticularly laminated shale, and (4) nonbanded shale. Univariate and multivariate analyses of variance showed that the lithologic classification reflects significant differences for the variables measured, differences that can be detected independently of stratigraphic effects. Little-known statistical methods found useful in this work included: the multivariate analysis of variance with more than one effect, simultaneous plotting of samples and variables on canonical variates, and the use of parametric ANOVA and MANOVA on ranked data. © 1980 Plenum Publishing Corporation.
Estuarial fingerprinting through multidimensional fluorescence and multivariate analysis.

PubMed

Hall, Gregory J; Clow, Kerin E; Kenny, Jonathan E

2005-10-01

As part of a strategy for preventing the introduction of aquatic nuisance species (ANS) to U.S. estuaries, ballast water exchange (BWE) regulations have been imposed. Enforcing these regulations requires a reliable method for determining the port of origin of water in the ballast tanks of ships entering U.S. waters. This study shows that a three-dimensional fluorescence fingerprinting technique, excitation emission matrix (EEM) spectroscopy, holds great promise as a ballast water analysis tool. In our technique, EEMs are analyzed by multivariate classification and curve resolution methods, such as N-way partial least squares regression-discriminant analysis (NPLS-DA) and parallel factor analysis (PARAFAC). We demonstrate that classification techniques can be used to discriminate among sampling sites less than 10 miles apart, encompassing Boston Harbor and two tributaries in the Mystic River Watershed. To our knowledge, this work is the first to use multivariate analysis to classify water as to location of origin. Furthermore, it is shown that curve resolution can show seasonal features within the multidimensional fluorescence data sets, which correlate with difficulty in classification.
Large Uptake of Titania and Iron Oxide Nanoparticles in the Nucleus of Lung Epithelial Cells as Measured by Raman Imaging and Multivariate Classification

PubMed Central

Ahlinder, Linnea; Ekstrand-Hammarström, Barbro; Geladi, Paul; Österlund, Lars

2013-01-01

It is a challenging task to characterize the biodistribution of nanoparticles in cells and tissue on a subcellular level. Conventional methods to study the interaction of nanoparticles with living cells rely on labeling techniques that either selectively stain the particles or selectively tag them with tracer molecules. In this work, Raman imaging, a label-free technique that requires no extensive sample preparation, was combined with multivariate classification to quantify the spatial distribution of oxide nanoparticles inside living lung epithelial cells (A549). Cells were exposed to TiO2 (titania) and/or α-FeO(OH) (goethite) nanoparticles at various incubation times (4 or 48 h). Using multivariate classification of hyperspectral Raman data with partial least-squares discriminant analysis, we show that a surprisingly large fraction of spectra, classified as belonging to the cell nucleus, show Raman bands associated with nanoparticles. Up to 40% of spectra from the cell nucleus show Raman bands associated with nanoparticles. Complementary transmission electron microscopy data for thin cell sections qualitatively support the conclusions. PMID:23870252
Stability and bias of classification rates in biological applications of discriminant analysis

USGS Publications Warehouse

Williams, B.K.; Titus, K.; Hines, J.E.

1990-01-01

We assessed the sampling stability of classification rates in discriminant analysis by using a factorial design with factors for multivariate dimensionality, dispersion structure, configuration of group means, and sample size. A total of 32,400 discriminant analyses were conducted, based on data from simulated populations with appropriate underlying statistical distributions. Simulation results indicated strong bias in correct classification rates when group sample sizes were small and when overlap among groups was high. We also found that stability of the correct classification rates was influenced by these factors, indicating that the number of samples required for a given level of precision increases with the amount of overlap among groups. In a review of 60 published studies, we found that 57% of the articles presented results on classification rates, though few of them mentioned potential biases in their results. Wildlife researchers should choose the total number of samples per group to be at least 2 times the number of variables to be measured when overlap among groups is low. Substantially more samples are required as the overlap among groups increases.
Multivariate classification of the infrared spectra of cell and tissue samples

DOE Office of Scientific and Technical Information (OSTI.GOV)

Haaland, D.M.; Jones, H.D.; Thomas, E.V.

1997-03-01

Infrared microspectroscopy of biopsied canine lymph cells and tissue was performed to investigate the possibility of using IR spectra coupled with multivariate classification methods to classify the samples as normal, hyperplastic, or neoplastic (malignant). IR spectra were obtained in transmission mode through BaF2 windows and in reflection mode from samples prepared on gold-coated microscope slides. Cytology and histopathology samples were prepared by a variety of methods to identify the optimal methods of sample preparation. Cytospinning procedures that yielded a monolayer of cells on the BaF2 windows produced a limited set of IR transmission spectra. These transmission spectra were converted to absorbance and formed the basis for a classification rule that yielded 100% correct classification in a cross-validated context. Classifications of normal, hyperplastic, and neoplastic cell sample spectra were achieved by using both partial least-squares (PLS) and principal component regression (PCR) classification methods. Linear discriminant analysis applied to principal components obtained from the spectral data yielded a small number of misclassifications. PLS weight loading vectors yield valuable qualitative insight into the molecular changes that are responsible for the success of the infrared classification. These successful classification results show promise for assisting pathologists in the diagnosis of cell types and offer future potential for in vivo IR detection of some types of cancer. © 1997 Society for Applied Spectroscopy
Comparing statistical and machine learning classifiers: alternatives for predictive modeling in human factors research.

PubMed

Carnahan, Brian; Meyer, Gérard; Kuntz, Lois-Ann

2003-01-01

Multivariate classification models play an increasingly important role in human factors research. In the past, these models have been based primarily on discriminant analysis and logistic regression. Models developed from machine learning research offer the human factors professional a viable alternative to these traditional statistical classification methods. To illustrate this point, two machine learning approaches--genetic programming and decision tree induction--were used to construct classification models designed to predict whether or not a student truck driver would pass his or her commercial driver license (CDL) examination. The models were developed and validated using the curriculum scores and CDL exam performances of 37 student truck drivers who had completed a 320-hr driver training course. Results indicated that the machine learning classification models were superior to discriminant analysis and logistic regression in terms of predictive accuracy. Actual or potential applications of this research include the creation of models that more accurately predict human performance outcomes.
Determination of fragrance content in perfume by Raman spectroscopy and multivariate calibration.

PubMed

Godinho, Robson B; Santos, Mauricio C; Poppi, Ronei J

2016-03-15

An alternative methodology is herein proposed for determination of fragrance content in perfumes and their classification according to the guidelines established by fine perfume manufacturers. The methodology is based on Raman spectroscopy associated with multivariate calibration, allowing the determination of fragrance content in a fast, nondestructive, and sustainable manner. The results were considered consistent with the conventional method, with standard error of prediction values lower than 1.0%. This result indicates that the proposed technology is a feasible analytical tool for determination of the fragrance content in a hydro-alcoholic solution for use in manufacturing, quality control and regulatory agencies. Copyright © 2015 Elsevier B.V. All rights reserved.
Classification and regression tree analysis vs. multivariable linear and logistic regression methods as statistical tools for studying haemophilia.

PubMed

Henrard, S; Speybroeck, N; Hermans, C

2015-11-01

Haemophilia is a rare genetic haemorrhagic disease characterized by partial or complete deficiency of coagulation factor VIII, for haemophilia A, or IX, for haemophilia B. As in any other medical research domain, the field of haemophilia research is increasingly concerned with finding factors associated with binary or continuous outcomes through multivariable models. Traditional models include multiple logistic regressions, for binary outcomes, and multiple linear regressions for continuous outcomes. Yet these regression models are at times difficult to implement, especially for non-statisticians, and can be difficult to interpret. The present paper sought to didactically explain how, why, and when to use classification and regression tree (CART) analysis for haemophilia research. The CART method is non-parametric and non-linear, based on the repeated partitioning of a sample into subgroups based on a certain criterion. Breiman developed this method in 1984. Classification trees (CTs) are used to analyse categorical outcomes and regression trees (RTs) to analyse continuous ones. The CART methodology has become increasingly popular in the medical field, yet only a few examples of studies using this methodology specifically in haemophilia have to date been published. Two examples using CART analysis and previously published in this field are didactically explained in detail. There is increasing interest in using CART analysis in the health domain, primarily due to its ease of implementation, use, and interpretation, thus facilitating medical decision-making. This method should be promoted for analysing continuous or categorical outcomes in haemophilia, when applicable. © 2015 John Wiley & Sons Ltd.
Forensic Discrimination of Latent Fingerprints Using Laser-Induced Breakdown Spectroscopy (LIBS) and Chemometric Approaches.

PubMed

Yang, Jun-Ho; Yoh, Jack J

2018-01-01

A novel technique is reported for separating overlapping latent fingerprints using chemometric approaches that combine laser-induced breakdown spectroscopy (LIBS) and multivariate analysis. The LIBS technique provides the capability of real time analysis and high frequency scanning as well as the data regarding the chemical composition of overlapping latent fingerprints. These spectra offer valuable information for the classification and reconstruction of overlapping latent fingerprints by implementing appropriate statistical multivariate analysis. The current study employs principal component analysis and partial least square methods for the classification of latent fingerprints from the LIBS spectra. This technique was successfully demonstrated through a classification study of four distinct latent fingerprints using classification methods such as soft independent modeling of class analogy (SIMCA) and partial least squares discriminant analysis (PLS-DA). The novel method yielded an accuracy of more than 85% and was proven to be sufficiently robust. Furthermore, through laser scanning analysis at a spatial interval of 125 µm, the overlapping fingerprints were reconstructed as separate two-dimensional forms.
Identifying when tagged fishes have been consumed by piscivorous predators: application of multivariate mixture models to movement parameters of telemetered fishes

USGS Publications Warehouse

Romine, Jason G.; Perry, Russell W.; Johnston, Samuel V.; Fitzer, Christopher W.; Pagliughi, Stephen W.; Blake, Aaron R.

2013-01-01

Mixture models proved valuable as a means to differentiate between salmonid smolts and predators that consumed salmonid smolts. However, successful application of this method requires that telemetered fishes and their predators exhibit measurable differences in movement behavior. Our approach is flexible, allows inclusion of multiple track statistics and improves upon rule-based manual classification methods.
Characterization of Escherichia coli isolates from different fecal sources by means of classification tree analysis of fatty acid methyl ester (FAME) profiles.

PubMed

Seurinck, Sylvie; Deschepper, Ellen; Deboch, Bishaw; Verstraete, Willy; Siciliano, Steven

2006-03-01

Microbial source tracking (MST) methods need to be rapid, inexpensive and accurate. Unfortunately, many MST methods provide a wealth of information that is difficult to interpret by the regulators who use this information to make decisions. This paper describes the use of classification tree analysis to interpret the results of an MST method based on fatty acid methyl ester (FAME) profiles of Escherichia coli isolates, and to present results in a format readily interpretable by water quality managers. Raw sewage E. coli isolates and animal E. coli isolates from cow, dog, gull, and horse were isolated and their FAME profiles collected. Correct classification rates determined with leave-one-out cross-validation resulted in an overall low correct classification rate of 61%. A higher overall correct classification rate of 85% was obtained when the animal isolates were pooled together and compared to the raw sewage isolates. Bootstrap aggregation or adaptive resampling and combining of the FAME profile data increased correct classification rates substantially. Other MST methods may be better suited to differentiate between different fecal sources but classification tree analysis has enabled us to distinguish raw sewage from animal E. coli isolates, which previously had not been possible with other multivariate methods such as principal component analysis and cluster analysis.
Using Copula Distributions to Support More Accurate Imaging-Based Diagnostic Classifiers for Neuropsychiatric Disorders

PubMed Central

Bansal, Ravi; Hao, Xuejun; Liu, Jun; Peterson, Bradley S.

2014-01-01

Many investigators have tried to apply machine learning techniques to magnetic resonance images (MRIs) of the brain in order to diagnose neuropsychiatric disorders. Usually the number of brain imaging measures (such as measures of cortical thickness and measures of local surface morphology) derived from the MRIs (i.e., their dimensionality) has been large (e.g. >10) relative to the number of participants who provide the MRI data (<100). Sparse data in a high dimensional space increases the variability of the classification rules that machine learning algorithms generate, thereby limiting the validity, reproducibility, and generalizability of those classifiers. The accuracy and stability of the classifiers can improve significantly if the multivariate distributions of the imaging measures can be estimated accurately. To accurately estimate the multivariate distributions using sparse data, we propose to estimate first the univariate distributions of imaging data and then combine them using a Copula to generate more accurate estimates of their multivariate distributions. We then sample the estimated Copula distributions to generate dense sets of imaging measures and use those measures to train classifiers. We hypothesize that the dense sets of brain imaging measures will generate classifiers that are stable to variations in brain imaging measures, thereby improving the reproducibility, validity, and generalizability of diagnostic classification algorithms in imaging datasets from clinical populations. In our experiments, we used both computer-generated and real-world brain imaging datasets to assess the accuracy of multivariate Copula distributions in estimating the corresponding multivariate distributions of real-world imaging data. 
Our experiments showed that diagnostic classifiers generated using imaging measures sampled from the Copula were significantly more accurate and more reproducible than were the classifiers generated using either the real-world imaging measures or their multivariate Gaussian distributions. Thus, our findings demonstrate that estimated multivariate Copula distributions can generate dense sets of brain imaging measures that can in turn be used to train classifiers, and those classifiers are significantly more accurate and more reproducible than are those generated using real-world imaging measures alone. PMID:25093634
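The procedure described in this abstract (estimate each univariate distribution, join them with a copula, then sample a dense synthetic data set) can be illustrated for two variables. The abstract does not say which copula family or marginal estimator was used, so the sketch below assumes a Gaussian copula with empirical marginals; the function names, the toy measurements and the two-variable restriction are all illustrative assumptions rather than the authors' method.

```python
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def normal_scores(values):
    """Map each value to a standard-normal score through its rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    for rank, i in enumerate(order, start=1):
        scores[i] = _STD_NORMAL.inv_cdf(rank / (n + 1))
    return scores


def pearson(a, b):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / (var_a * var_b) ** 0.5


def empirical_quantile(sorted_values, u):
    """Empirical marginal: interpolate linearly between order statistics."""
    pos = u * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def sample_gaussian_copula(x, y, n_samples, seed=0):
    """Draw synthetic (x, y) pairs that keep the rank dependence of the data."""
    rng = random.Random(seed)
    rho = pearson(normal_scores(x), normal_scores(y))
    # Guard against rho drifting just past +/-1 through rounding.
    residual_sd = max(0.0, 1.0 - rho * rho) ** 0.5
    xs, ys = sorted(x), sorted(y)
    samples = []
    for _ in range(n_samples):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + residual_sd * rng.gauss(0.0, 1.0)
        samples.append(
            (
                empirical_quantile(xs, _STD_NORMAL.cdf(z1)),
                empirical_quantile(ys, _STD_NORMAL.cdf(z2)),
            )
        )
    return samples


# Made-up measurements for eight participants (two imaging measures each).
thickness = [2.1, 2.4, 2.2, 2.9, 3.1, 2.7, 2.5, 3.0]
curvature = [0.30, 0.35, 0.28, 0.44, 0.47, 0.41, 0.33, 0.45]
dense = sample_gaussian_copula(thickness, curvature, n_samples=500)
```

The 500 synthetic pairs could then be used to train a classifier in place of the eight original pairs. With empirical marginals the synthetic values never leave the observed range, which is one limitation of this particular simplification.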
Parental Perceptions of Their Adolescent's Weight Status: The ECHO Study

ERIC Educational Resources Information Center

Hearst, Mary O.; Sherwood, Nancy E.; Klein, Elizabeth G.; Pasch, Keryn E.; Lytle, Leslie A.

2011-01-01

Objectives: To assess the correlates of parental classification of adolescent weight status. Methods: Measured adolescent weight status was compared to parent self-report perception data (n = 374 dyads) using multivariate analyses with interactions to identify characteristics associated with inaccurate parent classification of adolescent weight…
Multivariate logistic regression analysis of postoperative complications and risk model establishment of gastrectomy for gastric cancer: A single-center cohort report.

PubMed

Zhou, Jinzhe; Zhou, Yanbing; Cao, Shougen; Li, Shikuan; Wang, Hao; Niu, Zhaojian; Chen, Dong; Wang, Dongsheng; Lv, Liang; Zhang, Jian; Li, Yu; Jiao, Xuelong; Tan, Xiaojie; Zhang, Jianli; Wang, Haibo; Zhang, Bingyuan; Lu, Yun; Sun, Zhenqing

2016-01-01

Surgical complications are commonly reported, but few reports provide information about their severity or estimate risk factors for them, and those that do lack specificity. We retrospectively analyzed data on 2795 gastric cancer patients who underwent surgery at the Affiliated Hospital of Qingdao University between June 2007 and June 2012, and established a multivariate logistic regression model to predict risk factors related to postoperative complications graded according to the Clavien-Dindo classification system. Twenty-four out of 86 variables were statistically significant in univariate logistic regression analysis; 11 significant variables from the multivariate analysis were used to produce the risk model. Liver cirrhosis, diabetes mellitus, Child classification, invasion of neighboring organs, combined resection, intraoperative transfusion, Billroth II anastomosis for reconstruction, malnutrition, surgical volume of surgeons, operating time and age were independent risk factors for postoperative complications after gastrectomy. Based on the logistic regression equation p = exp(ΣBiXi) / (1 + exp(ΣBiXi)), a multivariate logistic regression predictive model that calculates the risk of postoperative morbidity was developed: p = 1 / (1 + e^(4.810 - 1.287X1 - 0.504X2 - 0.500X3 - 0.474X4 - 0.405X5 - 0.318X6 - 0.316X7 - 0.305X8 - 0.278X9 - 0.255X10 - 0.138X11)). The accuracy, sensitivity and specificity of the model in predicting postoperative complications were 86.7%, 76.2% and 88.6%, respectively. This risk model, based on the Clavien-Dindo system for grading the severity of complications and on logistic regression analysis, can predict severe morbidity specific to an individual patient's risk factors, can estimate patients' risks and benefits of gastric surgery as an accurate decision-making tool, and may serve as a template for the development of risk models for other surgical groups.
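The fitted equation in this abstract can be evaluated directly. The Python sketch below uses only the constant and the eleven coefficients quoted above; the abstract does not state which risk factor corresponds to which of X1 to X11, nor how each factor is coded, so the inputs here are placeholders and the function name is hypothetical.

```python
import math

# Constant and coefficients as quoted in the abstract, in the order X1..X11.
CONSTANT = 4.810
COEFFICIENTS = [1.287, 0.504, 0.500, 0.474, 0.405, 0.318,
                0.316, 0.305, 0.278, 0.255, 0.138]


def complication_risk(x):
    """Evaluate p = 1 / (1 + e^(4.810 - sum(B_i * X_i))) for one patient."""
    if len(x) != len(COEFFICIENTS):
        raise ValueError("expected 11 predictor values (X1..X11)")
    exponent = CONSTANT - sum(b * xi for b, xi in zip(COEFFICIENTS, x))
    return 1.0 / (1.0 + math.exp(exponent))


# Placeholder inputs: no risk factor present vs. every X_i equal to 1.
baseline = complication_risk([0] * 11)
all_present = complication_risk([1] * 11)
```

Written this way, the link to the general form p = exp(ΣBiXi) / (1 + exp(ΣBiXi)) is visible: the linear predictor is -4.810 + 1.287X1 + ... + 0.138X11, so every coefficient is positive and each factor raises the predicted risk.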
The influence of different classification standards of age groups on prognosis in high-grade hemispheric glioma patients.

PubMed

Chen, Jian-Wu; Zhou, Chang-Fu; Lin, Zhi-Xiong

2015-09-15

Although age is thought to correlate with the prognosis of glioma patients, the most appropriate age-group classification standard to evaluate prognosis had not been fully studied. This study aimed to investigate the influence of age-group classification standards on the prognosis of patients with high-grade hemispheric glioma (HGG). This retrospective study of 125 HGG patients used three different classification standards of age-groups (≤ 50 and >50 years old, ≤ 60 and >60 years old, ≤ 45 and 45-65 and ≥ 65 years old) to evaluate the impact of age on prognosis. The primary end-point was overall survival (OS). The Kaplan-Meier method was applied for univariate analysis and Cox proportional hazards model for multivariate analysis. Univariate analysis showed a significant correlation between OS and all three classification standards of age-groups as well as between OS and pathological grade, gender, location of glioma, and regular chemotherapy and radiotherapy treatment. Multivariate analysis showed that the only independent predictors of OS were classification standard of age-groups ≤ 50 and > 50 years old, pathological grade and regular chemotherapy. In summary, the most appropriate classification standard of age-groups as an independent prognostic factor was ≤ 50 and > 50 years old. Pathological grade and chemotherapy were also independent predictors of OS in post-operative HGG patients. Copyright © 2015. Published by Elsevier B.V.
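The univariate step named in this abstract, the Kaplan-Meier method, estimates overall survival from follow-up times that may be censored. The Python sketch below is a minimal product-limit estimator applied to two made-up age groups; the data, the group sizes and the function name are illustrative assumptions, and no significance test (such as the log-rank test) or Cox model is included.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of survival.

    times  -- follow-up time for each patient
    events -- 1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each observed death time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve


# Made-up follow-up data in months for two age groups.
younger = kaplan_meier([6, 14, 14, 20, 31], [1, 1, 0, 1, 0])
older = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
```

Censored patients never lower the curve themselves; they only shrink the number at risk for later death times, which is why the estimate drops by a larger step after each censoring.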
Weather patterns as a downscaling tool - evaluating their skill in stratifying local climate variables

NASA Astrophysics Data System (ADS)

Murawski, Aline; Bürger, Gerd; Vorogushyn, Sergiy; Merz, Bruno

2016-04-01

The use of a weather-pattern-based approach for downscaling of coarse, gridded atmospheric data, as usually obtained from the output of general circulation models (GCMs), allows for investigating the impact of anthropogenic greenhouse gas emissions on fluxes and state variables of the hydrological cycle, such as runoff in large river catchments. Here we aim at attributing changes in high flows in the Rhine catchment to anthropogenic climate change. Therefore we run an objective classification scheme (simulated annealing and diversified randomisation - SANDRA, available from the cost733 classification software) on ERA20C reanalysis data and apply the established classification to GCMs from the CMIP5 project. After deriving weather pattern time series from GCM runs using forcing from all greenhouse gases (All-Hist) and using natural greenhouse gas forcing only (Nat-Hist), a weather generator will be employed to obtain climate data time series for the hydrological model. The parameters of the weather pattern classification (i.e. spatial extent, number of patterns, classification variables) need to be selected in a way that allows for good stratification of the meteorological variables that are of interest for the hydrological modelling. We evaluate the skill of the classification in stratifying meteorological data using a multi-variable approach. This allows for estimating the stratification skill for all meteorological variables together, not separately as is usually done in existing similar work. The advantage of the multi-variable approach is to properly account for situations where e.g. two patterns are associated with similar mean daily temperature, but one pattern is dry while the other one is related to considerable amounts of precipitation. Thus, the separation of these two patterns would not be justified when considering temperature only, but is perfectly reasonable when accounting for precipitation as well. 
Besides that, the weather patterns derived from reanalysis data should be well represented in the All-Hist GCM runs in terms of e.g. frequency, seasonality, and persistence. In this contribution we show how to select the most appropriate weather pattern classification and how the classes derived from it are reflected in the GCMs.

Diagnostic Classification of Schizophrenia Patients on the Basis of Regional Reward-Related fMRI Signal Patterns

PubMed Central

Koch, Stefan P.; Hägele, Claudia; Haynes, John-Dylan; Heinz, Andreas; Schlagenhauf, Florian; Sterzer, Philipp

2015-01-01

Functional neuroimaging has provided evidence for altered function of mesolimbic circuits implicated in reward processing, first and foremost the ventral striatum, in patients with schizophrenia. While such findings based on significant group differences in brain activations can provide important insights into the pathomechanisms of mental disorders, the use of neuroimaging results from standard univariate statistical analysis for individual diagnosis has proven difficult. In this proof of concept study, we tested whether the predictive accuracy for the diagnostic classification of schizophrenia patients vs. healthy controls could be improved using multivariate pattern analysis (MVPA) of regional functional magnetic resonance imaging (fMRI) activation patterns for the anticipation of monetary reward. With a searchlight MVPA approach using support vector machine classification, we found that the diagnostic category could be predicted from local activation patterns in frontal, temporal, occipital and midbrain regions, with a maximal cluster peak classification accuracy of 93% for the right pallidum. Region-of-interest based MVPA for the ventral striatum achieved a maximal cluster peak accuracy of 88%, whereas the classification accuracy on the basis of standard univariate analysis reached only 75%. Moreover, using support vector regression we could additionally predict the severity of negative symptoms from ventral striatal activation patterns. These results show that MVPA can be used to substantially increase the accuracy of diagnostic classification on the basis of task-related fMRI signal patterns in a regionally specific way. PMID:25799236
Spatial patterns of brain atrophy in MCI patients, identified via high-dimensional pattern classification, predict subsequent cognitive decline

PubMed Central

Fan, Yong; Batmanghelich, Nematollah; Clark, Chris M.; Davatzikos, Christos

2010-01-01

Spatial patterns of brain atrophy in mild cognitive impairment (MCI) and Alzheimer’s disease (AD) were measured via methods of computational neuroanatomy. These patterns were spatially complex and involved many brain regions. In addition to the hippocampus and the medial temporal lobe gray matter, a number of other regions displayed significant atrophy, including orbitofrontal and medial-prefrontal grey matter, cingulate (mainly posterior), insula, uncus, and temporal lobe white matter. Approximately 2/3 of the MCI group presented patterns of atrophy that overlapped with AD, whereas the remaining 1/3 overlapped with cognitively normal individuals, thereby indicating that some, but not all, MCI patients have significant and extensive brain atrophy in this cohort of MCI patients. Importantly, the group with AD-like patterns presented much higher rate of MMSE decline in follow-up visits; conversely, pattern classification provided relatively high classification accuracy (87%) of the individuals that presented relatively higher MMSE decline within a year from baseline. High-dimensional pattern classification, a nonlinear multivariate analysis, provided measures of structural abnormality that can potentially be useful for individual patient classification, as well as for predicting progression and examining multivariate relationships in group analyses. PMID:18053747
Are regional variations in activity of dispatcher-assisted cardiopulmonary resuscitation associated with out-of-hospital cardiac arrests outcomes? A nation-wide population-based cohort study.

PubMed

Nishi, Taiki; Kamikura, Takahisa; Funada, Akira; Myojo, Yasuhiro; Ishida, Tetsuya; Inaba, Hideo

2016-01-01

Dispatcher-assisted cardiopulmonary resuscitation (DA-CPR) impacts the rates of bystander CPR (BCPR) and survival after out-of-hospital cardiac arrests (OHCAs). This study aimed to elucidate whether regional variations in indexes for BCPR and emergency medical service (EMS) may be associated with OHCA outcomes. We conducted a population-based observational study involving 157,093 bystander-witnessed, resuscitation-attempted OHCAs without physician involvement between 2007 and 2011. For each index of BCPR and EMS, we classified the 47 prefectures into the following three groups: advanced, intermediate, and developing regions. Nominal logit analysis followed by multivariable logistic regression including OHCA backgrounds was employed to examine the association between neurologically favourable 1-month survival, and regional classifications based on BCPR- and EMS-related indexes. Logit analysis including all regional classifications revealed that the number of BLS training course participants per population or bystander's own performance of BCPR without DA-CPR was not associated with the survival. Multivariable logistic regression including the OHCA backgrounds known to be associated with survival (BCPR provision, arrest aetiology, initial rhythm, patient age, time intervals of witness-to-call and call-to-arrival at patient), the following regional classifications based on DA-CPR but not on EMS were associated with survival: sensitivity of DA-CPR [adjusted odds ratio (95% confidence intervals) for advanced region; those for intermediate region, with developing region as reference, 1.277 (1.131-1.441); 1.162 (1.058-1.277)]; the proportion of bystanders to follow DA-CPR [1.749 (1.554-1.967); 1.280 (1.188-1.380)]. Good outcomes of bystander-witnessed OHCAs correlate with regions having higher sensitivity of DA-CPR and larger proportion of bystanders to follow DA-CPR. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Authentication of Trappist beers by LC-MS fingerprints and multivariate data analysis.

PubMed

Mattarucchi, Elia; Stocchero, Matteo; Moreno-Rojas, José Manuel; Giordano, Giuseppe; Reniero, Fabiano; Guillou, Claude

2010-12-08

The aim of this study was to assess the applicability of LC-MS profiling to authenticate a selected Trappist beer as part of a program on traceability funded by the European Commission. A total of 232 beers were fingerprinted and classified through multivariate data analysis. The selected beer was clearly distinguished from beers of different brands, while only 3 samples (3.5% of the test set) were wrongly classified when compared with other types of beer of the same Trappist brewery. The fingerprints were further analyzed to extract the most discriminating variables, which proved to be sufficient for classification, even using a simplified unsupervised model. This reduced fingerprint allowed us to study the influence of batch-to-batch variability on the classification model. Our results can easily be applied to different matrices and they confirmed the effectiveness of LC-MS profiling in combination with multivariate data analysis for the characterization of food products.
Multivariate classification of edible salts: Simultaneous Laser-Induced Breakdown Spectroscopy and Laser-Ablation Inductively Coupled Plasma Mass Spectrometry Analysis

NASA Astrophysics Data System (ADS)

Lee, Yonghoon; Nam, Sang-Ho; Ham, Kyung-Sik; Gonzalez, Jhanis; Oropeza, Dayana; Quarles, Derrick; Yoo, Jonghyun; Russo, Richard E.

2016-04-01

Laser-Induced Breakdown Spectroscopy (LIBS) and Laser-Ablation Inductively Coupled Plasma Mass Spectrometry (LA-ICP-MS), both based on laser ablation sampling, can be employed simultaneously to obtain different chemical fingerprints from a sample. We demonstrated that this analysis approach can provide complementary information for improved classification of edible salts. LIBS could detect several of the minor metallic elements along with Na and Cl, while LA-ICP-MS spectra were used to measure non-metallic and trace heavy metal elements. Principal component analysis using LIBS and LA-ICP-MS spectra showed that their major spectral variations classified the sample salts in different ways. Three classification models were developed by using partial least squares-discriminant analysis based on the LIBS, LA-ICP-MS, and their fused data. From the cross-validation performances and confusion matrices of these models, the minor metallic elements (Mg, Ca, and K) detected by LIBS and the non-metallic (I) and trace heavy metal (Ba, W, and Pb) elements detected by LA-ICP-MS provided complementary chemical information to distinguish particular salt samples.
Quality classification of Spanish olive oils by untargeted gas chromatography coupled to hybrid quadrupole-time of flight mass spectrometry with atmospheric pressure chemical ionization and metabolomics-based statistical approach.

PubMed

Sales, C; Cervera, M I; Gil, R; Portolés, T; Pitarch, E; Beltran, J

2017-02-01

The novel atmospheric pressure chemical ionization (APCI) source has been used in combination with gas chromatography (GC) coupled to hybrid quadrupole time-of-flight (QTOF) mass spectrometry (MS) for determination of volatile components of olive oil, enhancing its potential for classification of olive oil samples according to their quality using a metabolomics-based approach. The full-spectrum acquisition has allowed the detection of volatile organic compounds (VOCs) in olive oil samples, including Extra Virgin, Virgin and Lampante qualities. A dynamic headspace extraction with cartridge solvent elution was applied. The metabolomics strategy consisted of three different steps: a full mass spectral alignment of GC-MS data using MzMine 2.0, a multivariate analysis using Ez-Info and the creation of the statistical model with combinations of responses for molecular fragments. The model was finally validated using blind samples, obtaining an accuracy in oil classification of 70%, taking the official established method, "PANEL TEST", as reference. Copyright © 2016 Elsevier Ltd. All rights reserved.
PyMVPA: A python toolbox for multivariate pattern analysis of fMRI data.

PubMed

Hanke, Michael; Halchenko, Yaroslav O; Sederberg, Per B; Hanson, Stephen José; Haxby, James V; Pollmann, Stefan

2009-01-01

Decoding patterns of neural activity onto cognitive states is one of the central goals of functional brain imaging. Standard univariate fMRI analysis methods, which correlate cognitive and perceptual function with the blood oxygenation-level dependent (BOLD) signal, have proven successful in identifying anatomical regions based on signal increases during cognitive and perceptual tasks. Recently, researchers have begun to explore new multivariate techniques that have proven to be more flexible, more reliable, and more sensitive than standard univariate analysis. Drawing on the field of statistical learning theory, these new classifier-based analysis techniques possess explanatory power that could provide new insights into the functional properties of the brain. However, unlike the wealth of software packages for univariate analyses, there are few packages that facilitate multivariate pattern classification analyses of fMRI data. Here we introduce a Python-based, cross-platform, and open-source software toolbox, called PyMVPA, for the application of classifier-based analysis techniques to fMRI datasets. PyMVPA makes use of Python's ability to access libraries written in a large variety of programming languages and computing environments to interface with the wealth of existing machine learning packages. We present the framework in this paper and provide illustrative examples on its usage, features, and programmability.
PyMVPA: A Python toolbox for multivariate pattern analysis of fMRI data

PubMed Central

Hanke, Michael; Halchenko, Yaroslav O.; Sederberg, Per B.; Hanson, Stephen José; Haxby, James V.; Pollmann, Stefan

2009-01-01

Decoding patterns of neural activity onto cognitive states is one of the central goals of functional brain imaging. Standard univariate fMRI analysis methods, which correlate cognitive and perceptual function with the blood oxygenation-level dependent (BOLD) signal, have proven successful in identifying anatomical regions based on signal increases during cognitive and perceptual tasks. Recently, researchers have begun to explore new multivariate techniques that have proven to be more flexible, more reliable, and more sensitive than standard univariate analysis. Drawing on the field of statistical learning theory, these new classifier-based analysis techniques possess explanatory power that could provide new insights into the functional properties of the brain. However, unlike the wealth of software packages for univariate analyses, there are few packages that facilitate multivariate pattern classification analyses of fMRI data. Here we introduce a Python-based, cross-platform, and open-source software toolbox, called PyMVPA, for the application of classifier-based analysis techniques to fMRI datasets. PyMVPA makes use of Python's ability to access libraries written in a large variety of programming languages and computing environments to interface with the wealth of existing machine-learning packages. We present the framework in this paper and provide illustrative examples on its usage, features, and programmability. PMID:19184561
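The classifier-based analysis described in this abstract generally means training a classifier on labelled activity patterns and estimating how well it generalizes by cross-validation. The following is a minimal sketch of that idea in plain Python; it does not use or imitate the PyMVPA API. The nearest-centroid classifier, the leave-one-out scheme, and the toy three-feature patterns with "faces" and "houses" labels are all assumptions chosen for illustration.

```python
import math


def nearest_centroid_predict(train_x, train_y, sample):
    """Assign `sample` to the class whose mean training pattern is closest."""
    sums, counts = {}, {}
    for features, label in zip(train_x, train_y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    best_label, best_dist = None, math.inf
    for label, acc in sums.items():
        centroid = [v / counts[label] for v in acc]
        dist = math.dist(centroid, sample)  # Euclidean distance
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def leave_one_out_accuracy(patterns, labels):
    """Hold out each pattern once, train on the rest, and score the prediction."""
    hits = 0
    for i in range(len(patterns)):
        train_x = patterns[:i] + patterns[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if nearest_centroid_predict(train_x, train_y, patterns[i]) == labels[i]:
            hits += 1
    return hits / len(patterns)


# Hypothetical activity patterns (three features each) for two made-up conditions.
patterns = [[1.0, 0.2, 0.1], [0.9, 0.1, 0.3], [1.1, 0.3, 0.2],
            [0.1, 1.0, 0.9], [0.2, 0.8, 1.1], [0.3, 1.1, 1.0]]
labels = ["faces", "faces", "faces", "houses", "houses", "houses"]
print(leave_one_out_accuracy(patterns, labels))
```

In real fMRI work the held-out unit is usually a whole scanning run rather than a single sample, so that training and test data are independent; the per-sample split here only keeps the sketch short.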
Data analysis techniques

NASA Technical Reports Server (NTRS)

Park, Steve

1990-01-01

A large and diverse number of computational techniques are routinely used to process and analyze remotely sensed data. These techniques include: univariate statistics; multivariate statistics; principal component analysis; pattern recognition and classification; other multivariate techniques; geometric correction; registration and resampling; radiometric correction; enhancement; restoration; Fourier analysis; and filtering. Each of these techniques will be considered, in order.
Discrimination of inflammatory bowel disease using Raman spectroscopy and linear discriminant analysis methods

NASA Astrophysics Data System (ADS)

Ding, Hao; Cao, Ming; DuPont, Andrew W.; Scott, Larry D.; Guha, Sushovan; Singhal, Shashideep; Younes, Mamoun; Pence, Isaac; Herline, Alan; Schwartz, David; Xu, Hua; Mahadevan-Jansen, Anita; Bi, Xiaohong

2016-03-01

Inflammatory bowel disease (IBD) is an idiopathic disease that is typically characterized by chronic inflammation of the gastrointestinal tract. Recently much effort has been devoted to the development of novel diagnostic tools that can assist physicians for fast, accurate, and automated diagnosis of the disease. Previous research based on Raman spectroscopy has shown promising results in differentiating IBD patients from normal screening cases. In the current study, we examined IBD patients in vivo through a colonoscope-coupled Raman system. Optical diagnosis for IBD discrimination was conducted based on full-range spectra using multivariate statistical methods. Further, we incorporated several feature selection methods in machine learning into the classification model. The diagnostic performance for disease differentiation was significantly improved after feature selection. Our results showed that improved IBD diagnosis can be achieved using Raman spectroscopy in combination with multivariate analysis and feature selection.
PREDICTING APHASIA TYPE FROM BRAIN DAMAGE MEASURED WITH STRUCTURAL MRI

PubMed Central

Yourganov, Grigori; Smith, Kimberly G.; Fridriksson, Julius; Rorden, Chris

2015-01-01

Chronic aphasia is a common consequence of a left-hemisphere stroke. Since the early insights by Broca and Wernicke, studying the relationship between the loci of cortical damage and patterns of language impairment has been one of the concerns of aphasiology. We utilized multivariate classification in a cross-validation framework to predict the type of chronic aphasia from the spatial pattern of brain damage. Our sample consisted of 98 patients with five types of aphasia (Broca’s, Wernicke’s, global, conduction, and anomic), classified based on scores on the Western Aphasia Battery. Binary lesion maps were obtained from structural MRI scans (obtained at least 6 months poststroke, and within 2 days of behavioural assessment); after spatial normalization, the lesions were parcellated into a disjoint set of brain areas. The proportion of damage to the brain areas was used to classify patients’ aphasia type. To create this parcellation, we relied on five brain atlases; our classifier (support vector machine) could differentiate between different kinds of aphasia using any of the five parcellations. In our sample, the best classification accuracy was obtained when using a novel parcellation that combined two previously published brain atlases, with the first atlas providing the segmentation of grey matter, and the second atlas used to segment the white matter. For each aphasia type, we computed the relative importance of different brain areas for distinguishing it from other aphasia types; our findings were consistent with previously published reports of lesion locations implicated in different types of aphasia. Overall, our results revealed that automated multivariate classification could distinguish between aphasia types based on damage to atlas-defined brain areas. PMID:26465238
Comparison of cystatin C and creatinine to determine the incidence of composite adverse outcomes in HIV-infected individuals.

PubMed

Yanagisawa, Naoki; Sasaki, Shugo; Suganuma, Akihiko; Imamura, Akifumi; Ajisawa, Atsushi; Ando, Minoru

2015-02-01

Cystatin C is an overall biomarker of pathophysiologic abnormalities that accompany chronic kidney disease (CKD). The utility of cystatin C is not fully understood in an HIV-infected population. This prospective study investigated 661 HIV-infected individuals for 4 years to determine the incidence of adverse outcomes, including all-cause mortality, cardiovascular disease, and renal dysfunction. The risk of developing the outcomes was discriminated with a 4 color-coded classification in a 3 × 6 contingency table, that combined 3 grades of dipstick proteinuria with 6 grades of estimated glomerular filtration rate (eGFR) calculated using either serum creatinine (eGFRcr) or cystatin C (eGFRcy): green, low risk; yellow, moderately increased risk; orange, high risk; and red, very high risk. The cumulative incidence of the outcomes was assessed by the Kaplan-Meier method, and the association between color-coded risk and the time to outcome was evaluated using multivariate proportional hazards analysis. Compared with eGFRcr, the use of eGFRcy reduced the prevalence of risk ≥ orange by 0.8%. The adverse outcomes were significantly more likely to occur to the patients with baseline risk category ≥orange than those with ≤ yellow, independent of risk categories based on eGFRcr or eGFRcy. However, in multivariate analysis, risk category ≥orange with eGFRcy-based classification was significantly associated with adverse outcomes, but not the one with eGFRcr. Replacing creatinine by cystatin C in the CKD color-coded risk classification may be appropriate to discriminate HIV-infected patients at increased risk of a poor prognosis. Copyright © 2014 Japanese Society of Chemotherapy and The Japanese Association for Infectious Diseases. Published by Elsevier Ltd. All rights reserved.
A novel latent gaussian copula framework for modeling spatial correlation in quantized SAR imagery with applications to ATR

NASA Astrophysics Data System (ADS)

Thelen, Brian T.; Xique, Ismael J.; Burns, Joseph W.; Goley, G. Steven; Nolan, Adam R.; Benson, Jonathan W.

2017-04-01

With all of the new remote sensing modalities available, and with ever increasing capabilities and frequency of collection, there is a desire to fundamentally understand/quantify the information content in the collected image data relative to various exploitation goals, such as detection/classification. A fundamental approach for this is the framework of Bayesian decision theory, but a daunting challenge is to have significantly flexible and accurate multivariate models for the features and/or pixels that capture a wide assortment of distributions and dependencies. In addition, data can come in the form of both continuous and discrete representations, where the latter is often generated based on considerations of robustness to imaging conditions and occlusions/degradations. In this paper we propose a novel suite of "latent" models fundamentally based on multivariate Gaussian copula models that can be used for quantized data from SAR imagery. For this Latent Gaussian Copula (LGC) model, we derive an approximate, maximum-likelihood estimation algorithm and demonstrate very reasonable estimation performance even for the larger images with many pixels. However, applying these LGC models to large dimensions/images within a Bayesian decision/classification theory is infeasible due to the computational/numerical issues in evaluating the true full likelihood, and we propose an alternative class of novel pseudo-likelihood detection statistics that are computationally feasible. We show in a few simple examples that these statistics have the potential to provide very good and robust detection/classification performance. All of this framework is demonstrated on a simulated SLICY data set, and the results show the importance of modeling the dependencies, and of utilizing the pseudo-likelihood methods.
Predicting aphasia type from brain damage measured with structural MRI.

PubMed

Yourganov, Grigori; Smith, Kimberly G; Fridriksson, Julius; Rorden, Chris

2015-12-01

Chronic aphasia is a common consequence of a left-hemisphere stroke. Since the early insights by Broca and Wernicke, studying the relationship between the loci of cortical damage and patterns of language impairment has been one of the concerns of aphasiology. We utilized multivariate classification in a cross-validation framework to predict the type of chronic aphasia from the spatial pattern of brain damage. Our sample consisted of 98 patients with five types of aphasia (Broca's, Wernicke's, global, conduction, and anomic), classified based on scores on the Western Aphasia Battery (WAB). Binary lesion maps were obtained from structural MRI scans (obtained at least 6 months poststroke, and within 2 days of behavioural assessment); after spatial normalization, the lesions were parcellated into a disjoint set of brain areas. The proportion of damage to the brain areas was used to classify patients' aphasia type. To create this parcellation, we relied on five brain atlases; our classifier (support vector machine - SVM) could differentiate between different kinds of aphasia using any of the five parcellations. In our sample, the best classification accuracy was obtained when using a novel parcellation that combined two previously published brain atlases, with the first atlas providing the segmentation of grey matter, and the second atlas used to segment the white matter. For each aphasia type, we computed the relative importance of different brain areas for distinguishing it from other aphasia types; our findings were consistent with previously published reports of lesion locations implicated in different types of aphasia. Overall, our results revealed that automated multivariate classification could distinguish between aphasia types based on damage to atlas-defined brain areas. Copyright © 2015 Elsevier Ltd. All rights reserved.
Texture as a basis for acoustic classification of substrate in the nearshore region

NASA Astrophysics Data System (ADS)

Dennison, A.; Wattrus, N. J.

2016-12-01

Segmentation and classification of substrate type from two locations in Lake Superior are predicted using multivariate statistical processing of textural measures derived from shallow-water, high-resolution multibeam bathymetric data. During a multibeam sonar survey, both bathymetric and backscatter data are collected. It is well documented that the statistical characteristic of a sonar backscatter mosaic is dependent on substrate type. While classifying the bottom-type on the basis of backscatter alone can accurately predict and map bottom-type, it lacks the ability to resolve and capture fine textural details, an important factor in many habitat mapping studies. Statistical processing can capture the pertinent details about the bottom-type that are rich in textural information. Further multivariate statistical processing can then isolate characteristic features, and provide the basis for an accurate classification scheme. Preliminary results from an analysis of bathymetric data and ground-truth samples collected from the Amnicon River, Superior, Wisconsin, and the Lester River, Duluth, Minnesota, demonstrate the ability to process and develop a novel classification scheme of the bottom type in two geomorphologically distinct areas.
Rapid differentiation of Ghana cocoa beans by FT-NIR spectroscopy coupled with multivariate classification

NASA Astrophysics Data System (ADS)

Teye, Ernest; Huang, Xingyi; Dai, Huang; Chen, Quansheng

2013-10-01

A quick, accurate and reliable technique for discriminating cocoa beans according to geographical origin is essential for quality control and traceability management. This study presents the application of near infrared spectroscopy and multivariate classification to the differentiation of Ghana cocoa beans. A total of 194 cocoa bean samples from seven cocoa growing regions were used. Principal component analysis (PCA) was used to extract relevant information from the spectral data and this gave visible cluster trends. The performance of four multivariate classification methods was compared: linear discriminant analysis (LDA), K-nearest neighbors (KNN), back propagation artificial neural network (BPANN) and support vector machine (SVM). The performances of the models were optimized by cross validation. The results revealed that the SVM model was superior to all the other methods, with a discrimination rate of 100% in both the training and prediction sets after preprocessing with mean centering (MC). BPANN had a discrimination rate of 99.23% for the training set and 96.88% for the prediction set, while the LDA model had 96.15% and 90.63% for the training and prediction sets respectively. The KNN model had 75.01% for the training set and 72.31% for the prediction set. The non-linear classification methods used were superior to the linear ones. Generally, the results revealed that NIR spectroscopy coupled with the SVM model could be used successfully to discriminate cocoa beans according to their geographical origins for effective quality assurance.
Risk factors associated with oroantral perforation during surgical removal of maxillary third molar teeth.

PubMed

Hasegawa, Takumi; Tachibana, Akira; Takeda, Daisuke; Iwata, Eiji; Arimoto, Satomi; Sakakibara, Akiko; Akashi, Masaya; Komori, Takahide

2016-12-01

The relationship between radiographic findings and the occurrence of oroantral perforation is controversial. Few studies have quantitatively analyzed the risk factors contributing to oroantral perforation, and no study has reported multivariate analysis of the relationship(s) between these various factors. This retrospective study aims to fill this void. Various risk factors for oroantral perforation during maxillary third molar extraction were investigated by univariate and multivariate analysis. The proximity of the roots to the maxillary sinus floor (root-sinus [RS] classification) was assessed using panoramic radiography and classified as types 1-5. The relationship between the maxillary second and third molars was classified according to a modified version of the Archer classification. The relative depth of the maxillary third molar in the bone was classified as class A-C, and its angulation relative to the long axis of the second molar was also recorded. Performance of an incision (OR 5.16), mesioangular tooth angulation (OR 6.05), and type 3 RS classification (i.e., significant superimposition of the roots of all posterior maxillary teeth with the sinus floor; OR 10.18) were all identified as risk factors with significant association to an outcome of oroantral perforation. To our knowledge, this is the first multivariate analysis of the risk factors for oroantral perforation during surgical extraction of the maxillary third molar. This RS classification may offer a new predictive parameter for estimating the risk of oroantral perforation.
Partial Least Squares with Structured Output for Modelling the Metabolomics Data Obtained from Complex Experimental Designs: A Study into the Y-Block Coding.

PubMed

Xu, Yun; Muhamadali, Howbeer; Sayqal, Ali; Dixon, Neil; Goodacre, Royston

2016-10-28

Partial least squares (PLS) is one of the most commonly used supervised modelling approaches for analysing multivariate metabolomics data. PLS is typically employed as either a regression model (PLS-R) or a classification model (PLS-DA). However, in metabolomics studies it is common to investigate multiple, potentially interacting, factors simultaneously following a specific experimental design. Such data often cannot be considered as a "pure" regression or a classification problem. Nevertheless, these data have often still been treated as a regression or classification problem and this could lead to ambiguous results. In this study, we investigated the feasibility of designing a hybrid target matrix Y that better reflects the experimental design than simple regression or binary class membership coding commonly used in PLS modelling. The new design of Y coding was based on the same principle used by structural modelling in machine learning techniques. Two real metabolomics datasets were used as examples to illustrate how the new Y coding can improve the interpretability of the PLS model compared to classic regression/classification coding.
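To make the idea of Y-block coding concrete, the sketch below builds two target matrices for a small two-factor design. The first is the classic binary class-membership coding used in PLS-DA. The second places one block of indicator columns per design factor side by side. The abstract does not spell out the authors' hybrid coding, so the second function is only one hypothetical way a target matrix could reflect an experimental design, and the factor names (`strain`, `treatment`) are invented for the example.

```python
def one_hot_y(class_labels):
    """Binary class-membership coding: one column per class, 1 marks membership."""
    classes = sorted(set(class_labels))
    rows = [[1 if label == c else 0 for c in classes] for label in class_labels]
    return classes, rows


def two_factor_y(factor_a, factor_b):
    """Hypothetical structured coding: one indicator block per design factor."""
    levels_a, block_a = one_hot_y(factor_a)
    levels_b, block_b = one_hot_y(factor_b)
    columns = [f"A={lv}" for lv in levels_a] + [f"B={lv}" for lv in levels_b]
    return columns, [row_a + row_b for row_a, row_b in zip(block_a, block_b)]


# Invented 2 x 2 design: two strains crossed with two treatments.
strain = ["wt", "wt", "mut", "mut"]
treatment = ["ctrl", "drug", "ctrl", "drug"]

# Classic coding treats every strain/treatment combination as an unrelated class.
flat_classes, flat_y = one_hot_y([f"{s}/{t}" for s, t in zip(strain, treatment)])
print(flat_classes, flat_y)

# Structured coding keeps the information that samples share a factor level.
columns, structured_y = two_factor_y(strain, treatment)
print(columns, structured_y)
```

With the flat coding, "wt/ctrl" and "wt/drug" share no column, so the model cannot see that they have the same strain; in the structured coding both rows carry a 1 in the `A=wt` column.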
Quantitative Outline-based Shape Analysis and Classification of Planetary Craterforms using Supervised Learning Models

NASA Astrophysics Data System (ADS)

Slezak, Thomas Joseph; Radebaugh, Jani; Christiansen, Eric

2017-10-01

The shapes of craterforms on planetary surfaces provide rich information about their origins and evolution. While morphologic information provides rich visual clues to geologic processes and properties, the ability to quantitatively communicate this information is less easily accomplished. This study examines the morphology of craterforms using the quantitative outline-based shape methods of geometric morphometrics, commonly used in biology and paleontology. We examine and compare landforms on planetary surfaces using shape, a property of morphology that is invariant to translation, rotation, and size. We quantify the shapes of paterae on Io, martian calderas, terrestrial basaltic shield calderas, terrestrial ash-flow calderas, and lunar impact craters using elliptic Fourier analysis (EFA) and the Zahn and Roskies (Z-R) shape function, or tangent angle approach, to produce multivariate shape descriptors. These shape descriptors are subjected to multivariate statistical analysis including canonical variate analysis (CVA), a multiple-comparison variant of discriminant analysis, to investigate the link between craterform shape and classification. Paterae on Io are most similar in shape to terrestrial ash-flow calderas and the shapes of terrestrial basaltic shield volcanoes are most similar to martian calderas. The shapes of lunar impact craters, including simple, transitional, and complex morphology, are classified with a 100% rate of success in all models. Multiple CVA models effectively predict and classify different craterforms using shape-based identification and demonstrate significant potential for use in the analysis of planetary surfaces.
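The tangent angle approach mentioned above can be illustrated with a short sketch. It computes, for a closed polygonal outline, the cumulative change in edge direction minus the change a circle would show over the same fraction of the perimeter. Several simplifications are assumed here and are not taken from the abstract: the outline is a polygon listed counter-clockwise, the function is sampled only at the start of each edge rather than at equal arc-length steps, and the sign convention is chosen so that a circle maps to zero.

```python
import math


def wrap_angle(angle):
    """Map an angle to the interval (-pi, pi]."""
    while angle <= -math.pi:
        angle += 2.0 * math.pi
    while angle > math.pi:
        angle -= 2.0 * math.pi
    return angle


def tangent_angle_function(vertices):
    """Normalized cumulative tangent angle at the start of each edge.

    `vertices` is a closed outline listed counter-clockwise, without
    repeating the first point. Returns (t, phi_star) pairs, t in [0, 2*pi).
    """
    n = len(vertices)
    headings, lengths = [], []
    for i in range(n):
        (x0, y0), (x1, y1) = vertices[i], vertices[(i + 1) % n]
        headings.append(math.atan2(y1 - y0, x1 - x0))
        lengths.append(math.hypot(x1 - x0, y1 - y0))
    perimeter = sum(lengths)
    result, phi, arc = [], 0.0, 0.0
    for i in range(n):
        t = 2.0 * math.pi * arc / perimeter  # rescaled arc length
        result.append((t, phi - t))          # subtract the turning of a circle
        arc += lengths[i]
        phi += wrap_angle(headings[(i + 1) % n] - headings[i])
    return result


# A 2 x 1 rectangle, and the same rectangle rotated, scaled, and shifted.
rect = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 1.0)]
a = 0.7  # rotation in radians, arbitrary
moved = [(5.0 + 3.0 * (x * math.cos(a) - y * math.sin(a)),
          -2.0 + 3.0 * (x * math.sin(a) + y * math.cos(a))) for x, y in rect]
for (t, p), (_, q) in zip(tangent_angle_function(rect), tangent_angle_function(moved)):
    print(f"t = {t:.3f}  phi* = {p:+.3f}  transformed phi* = {q:+.3f}")
```

Because only differences between edge directions and fractions of the perimeter enter the calculation, the two outlines should yield the same values, which is the invariance to translation, rotation, and size that the abstract attributes to shape.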
Patterns of brain structural connectivity differentiate normal weight from overweight subjects

PubMed Central

Gupta, Arpana; Mayer, Emeran A.; Sanmiguel, Claudia P.; Van Horn, John D.; Woodworth, Davis; Ellingson, Benjamin M.; Fling, Connor; Love, Aubrey; Tillisch, Kirsten; Labus, Jennifer S.

2015-01-01

Background Alterations in the hedonic component of ingestive behaviors have been implicated as a possible risk factor in the pathophysiology of overweight and obese individuals. Neuroimaging evidence from individuals with increasing body mass index suggests structural, functional, and neurochemical alterations in the extended reward network and associated networks. Aim To apply a multivariate pattern analysis to distinguish normal weight and overweight subjects based on gray and white-matter measurements. Methods Structural images (N = 120, overweight N = 63) and diffusion tensor images (DTI) (N = 60, overweight N = 30) were obtained from healthy control subjects. For the total sample the mean age for the overweight group (females = 32, males = 31) was 28.77 years (SD = 9.76) and for the normal weight group (females = 32, males = 25) was 27.13 years (SD = 9.62). Regional segmentation and parcellation of the brain images was performed using Freesurfer. Deterministic tractography was performed to measure the normalized fiber density between regions. A multivariate pattern analysis approach was used to examine whether brain measures can distinguish overweight from normal weight individuals. Results 1. White-matter classification: The classification algorithm, based on 2 signatures with 17 regional connections, achieved 97% accuracy in discriminating overweight individuals from normal weight individuals. For both brain signatures, greater connectivity as indexed by increased fiber density was observed in overweight compared to normal weight between the reward network regions and regions of the executive control, emotional arousal, and somatosensory networks. In contrast, the opposite pattern (decreased fiber density) was found between ventromedial prefrontal cortex and the anterior insula, and between thalamus and executive control network regions. 2. Gray-matter classification: The classification algorithm, based on 2 signatures with 42 morphological features, achieved 69% accuracy in discriminating overweight from normal weight. In both brain signatures regions of the reward, salience, executive control and emotional arousal networks were associated with lower morphological values in overweight individuals compared to normal weight individuals, while the opposite pattern was seen for regions of the somatosensory network. Conclusions 1. An increased BMI (i.e., overweight subjects) is associated with distinct changes in gray-matter and fiber density of the brain. 2. Classification algorithms based on white-matter connectivity involving regions of the reward and associated networks can identify specific targets for mechanistic studies and future drug development aimed at abnormal ingestive behavior and in overweight/obesity. PMID:25737959

Remodeling characteristics and collagen distribution in synthetic mesh materials explanted from human subjects after abdominal wall reconstruction: an analysis of remodeling characteristics by patient risk factors and surgical site classifications

PubMed Central

Cavallo, Jaime A.; Roma, Andres A.; Jasielec, Mateusz S.; Ousley, Jenny; Creamer, Jennifer; Pichert, Matthew D.; Baalman, Sara; Frisella, Margaret M.; Matthews, Brent D.

2014-01-01

Background The purpose of this study was to evaluate the associations between patient characteristics or surgical site classifications and the histologic remodeling scores of synthetic meshes biopsied from their abdominal wall repair sites in the first attempt to generate a multivariable risk prediction model of non-constructive remodeling. Methods Biopsies of the synthetic meshes were obtained from the abdominal wall repair sites of 51 patients during a subsequent abdominal re-exploration. Biopsies were stained with hematoxylin and eosin, and evaluated according to a semi-quantitative scoring system for remodeling characteristics (cell infiltration, cell types, extracellular matrix deposition, inflammation, fibrous encapsulation, and neovascularization) and a mean composite score (CR). Biopsies were also stained with Sirius Red and Fast Green, and analyzed to determine the collagen I:III ratio. Based on univariate analyses between subject clinical characteristics or surgical site classification and the histologic remodeling scores, cohort variables were selected for multivariable regression models using a threshold p value of ≤0.200. Results The model selection process for the extracellular matrix score yielded two variables: subject age at time of mesh implantation, and mesh classification (c-statistic = 0.842). For CR score, the model selection process yielded two variables: subject age at time of mesh implantation and mesh classification (r2 = 0.464). The model selection process for the collagen III area yielded a model with two variables: subject body mass index at time of mesh explantation and pack-year history (r2 = 0.244). Conclusion Host characteristics and surgical site assessments may predict degree of remodeling for synthetic meshes used to reinforce abdominal wall repair sites. These preliminary results constitute the first steps in generating a risk prediction model that predicts the patients and clinical circumstances for which non-constructive remodeling of an abdominal wall repair site with synthetic mesh reinforcement is most likely to occur. PMID:24442681
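Two steps in this abstract lend themselves to a short illustration: screening candidate variables by their univariate p value, and summarizing a model's discrimination with a c-statistic. The sketch below assumes a binary outcome for the c-statistic and computes it as the share of (event, non-event) pairs in which the event received the higher predicted score, with ties counted as one half. The variable names, p values, and scores are invented and are not data from the study.

```python
def screen_variables(univariate_p, threshold=0.200):
    """Keep variables whose univariate p value is at or below the threshold."""
    return [name for name, p in univariate_p.items() if p <= threshold]


def c_statistic(scores, outcomes):
    """Concordance between predicted scores and binary outcomes (1 = event)."""
    events = [s for s, y in zip(scores, outcomes) if y == 1]
    non_events = [s for s, y in zip(scores, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5  # ties count as half-concordant
    return concordant / (len(events) * len(non_events))


# Hypothetical univariate p values for candidate predictors.
candidates = {"age_at_implantation": 0.03, "body_mass_index": 0.25,
              "mesh_classification": 0.20, "pack_years": 0.41}
print(screen_variables(candidates))

# Hypothetical predicted scores and observed binary outcomes.
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
outcomes = [1, 1, 1, 0, 0, 0]
print(round(c_statistic(scores, outcomes), 3))
```

A c-statistic of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, which gives some context for the reported value of 0.842.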
Automated classification of single airborne particles from two-dimensional angle-resolved optical scattering (TAOS) patterns by non-linear filtering

NASA Astrophysics Data System (ADS)

Crosta, Giovanni Franco; Pan, Yong-Le; Aptowicz, Kevin B.; Casati, Caterina; Pinnick, Ronald G.; Chang, Richard K.; Videen, Gorden W.

2013-12-01

Measurement of two-dimensional angle-resolved optical scattering (TAOS) patterns is an attractive technique for detecting and characterizing micron-sized airborne particles. In general, the interpretation of these patterns and the retrieval of the particle refractive index, shape or size alone, are difficult problems. By reformulating the problem in statistical learning terms, a solution is proposed herewith: rather than identifying airborne particles from their scattering patterns, TAOS patterns themselves are classified through a learning machine, where feature extraction interacts with multivariate statistical analysis. Feature extraction relies on spectrum enhancement, which includes the discrete cosine Fourier transform and non-linear operations. Multivariate statistical analysis includes computation of the principal components and supervised training, based on the maximization of a suitable figure of merit. All algorithms have been combined to analyze TAOS patterns, organize feature vectors, design classification experiments, carry out supervised training, assign unknown patterns to classes, and fuse information from different training and recognition experiments. The algorithms have been tested on a data set with more than 3000 TAOS patterns. The parameters that control the algorithms at different stages have been allowed to vary within suitable bounds and are optimized to some extent. Classification has been targeted at discriminating aerosolized Bacillus subtilis particles, a simulant of anthrax, from atmospheric aerosol particles and interfering particles, like diesel soot. By assuming that all training and recognition patterns come from the respective reference materials only, the most satisfactory classification result corresponds to 20% false negatives from B. subtilis particles and <11% false positives from all other aerosol particles. The most effective operations have consisted of thresholding TAOS patterns in order to reject defective ones, and forming training sets from three or four pattern classes. The presented automated classification method may be adapted into a real-time operation technique, capable of detecting and characterizing micron-sized airborne particles.
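The false-negative and false-positive figures quoted in this abstract are ratios taken over the true class sizes. A minimal sketch of that arithmetic follows; the label strings and the counts are hypothetical, chosen only so that the rates come out at 20% and 10%, and are not the authors' data.

```python
def error_rates(true_labels, predicted_labels, positive="B. subtilis"):
    """Return (false_negative_rate, false_positive_rate) for a two-class task.

    The false-negative rate is taken over the true positives,
    the false-positive rate over the true negatives.
    """
    fn = fp = n_pos = n_neg = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == positive:
            n_pos += 1
            if pred != positive:
                fn += 1  # a target particle that was missed
        else:
            n_neg += 1
            if pred == positive:
                fp += 1  # a background particle flagged as target
    return fn / n_pos, fp / n_neg


# Hypothetical counts: 50 target patterns (10 missed), 100 others (10 flagged).
truth = ["B. subtilis"] * 50 + ["other"] * 100
pred = (["B. subtilis"] * 40 + ["other"] * 10
        + ["B. subtilis"] * 10 + ["other"] * 90)
fnr, fpr = error_rates(truth, pred)
```

Reporting the two rates separately, rather than a single accuracy, matters here because the two kinds of error have very different costs in a detection setting.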
Tools based on multivariate statistical analysis for classification of soil and groundwater in Apulian agricultural sites.

PubMed

Ielpo, Pierina; Leardi, Riccardo; Pappagallo, Giuseppe; Uricchio, Vito Felice

2017-06-01

In this paper, the results obtained from multivariate statistical techniques such as PCA (Principal component analysis) and LDA (Linear discriminant analysis) applied to a wide soil data set are presented. The results have been compared with those obtained on a groundwater data set, whose samples were collected together with soil ones, within the project "Improvement of the Regional Agro-meteorological Monitoring Network (2004-2007)". LDA, applied to soil data, made it possible to assign the geographical origin of a sample to one of two macroareas: Bari and Foggia provinces vs Brindisi, Lecce and Taranto provinces, with a percentage of correct predictions in cross validation of 87%. In the case of the groundwater data set, the best classification was obtained when the samples were grouped into three macroareas: Foggia province, Bari province and Brindisi, Lecce and Taranto provinces, reaching a percentage of correct predictions in cross validation of 84%. The obtained information can be very useful in supporting soil and water resource management, such as the reduction of water consumption and the reduction of energy and chemical (nutrients and pesticides) inputs in agriculture.
Predicting clinical diagnosis in Huntington's disease: An imaging polymarker

PubMed Central

Daws, Richard E.; Soreq, Eyal; Johnson, Eileanoir B.; Scahill, Rachael I.; Tabrizi, Sarah J.; Barker, Roger A.; Hampshire, Adam

2018-01-01

Objective: Huntington's disease (HD) gene carriers can be identified before clinical diagnosis; however, statistical models for predicting when overt motor symptoms will manifest are too imprecise to be useful at the level of the individual. Perfecting this prediction is integral to the search for disease modifying therapies. This study aimed to identify an imaging marker capable of reliably predicting real‐life clinical diagnosis in HD. Method: A multivariate machine learning approach was applied to resting‐state and structural magnetic resonance imaging scans from 19 premanifest HD gene carriers (preHD, 8 of whom developed clinical disease in the 5 years postscanning) and 21 healthy controls. A classification model was developed using cross‐group comparisons between preHD and controls, and within the preHD group in relation to “estimated” and “actual” proximity to disease onset. Imaging measures were modeled individually, and combined, and permutation modeling robustly tested classification accuracy. Results: Classification performance for preHDs versus controls was greatest when all measures were combined. The resulting polymarker predicted converters with high accuracy, including those who were not expected to manifest in that time scale based on the currently adopted statistical models. Interpretation: We propose that a holistic multivariate machine learning treatment of brain abnormalities in the premanifest phase can be used to accurately identify those patients within 5 years of developing motor features of HD, with implications for prognostication and preclinical trials. Ann Neurol 2018;83:532–543 PMID:29405351
How do we understand the disagreement in the frequency of surgical site infection between the CDC and Clavien-Dindo classifications?

PubMed

Yamamoto, Takanobu; Takahashi, Satoshi; Ichihara, Koji; Hiyama, Yoshiki; Uehara, Teruhisa; Hashimoto, Jiro; Hirobe, Megumi; Masumori, Naoya

2015-02-01

To clarify the discrepancy in the incidence and severity of surgical site infections (SSI) for radical cystectomy between reports based on the CDC guideline and those using the Clavien-Dindo classification, we evaluated 449 consecutive patients who underwent radical cystectomy for bladder cancer between 1990 and 2012. Of the 115 (25.6%) patients with SSI defined by the CDC guideline, 89 could be analyzed. We compared the SSI rates and severity defined by the CDC guideline and Clavien-Dindo classifications. There were 58 patients with superficial SSI, 16 with deep SSI, and 15 with organ/space SSI according to the CDC guideline. All patients with organ/space SSI were judged as "not having SSI" by the Clavien-Dindo classification. They were classified as having "intestinal prolapse", "intestinal fistula", "abdominal abscess" and "pelvic abscess." There was a significant association between the treatment duration and depth of SSI based on the CDC guideline by Spearman's rank-correlation coefficient (p < 0.001, r = 0.614) and with the grade of complications (p < 0.001, r = 0.632) in the Clavien-Dindo classification. Multivariate analysis showed that patients with grade III SSI in the Clavien-Dindo classification needed a significantly longer treatment duration. It is necessary to be aware that a discrepancy can occur automatically due to the different natures of the definitions. Using the CDC guideline, we can effectively estimate the future treatment period when SSI occurs. With the Clavien-Dindo classification, grade III SSI requires a longer treatment duration. Copyright © 2014 Japanese Society of Chemotherapy and The Japanese Association for Infectious Diseases. Published by Elsevier Ltd. All rights reserved.
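The association reported above rests on Spearman's rank-correlation coefficient, which is the Pearson correlation computed on ranks, with tied values sharing their mean rank. A minimal sketch is given below; the ordinal coding of SSI depth (1 = superficial, 2 = deep, 3 = organ/space) and the treatment durations are hypothetical illustrations, not the study data.

```python
def average_ranks(values):
    """Rank values from 1 to n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_r(x, y):
    """Spearman's coefficient: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical data: SSI depth code and treatment duration in days.
depth = [1, 1, 1, 2, 2, 3, 3]
days = [5, 7, 6, 14, 10, 30, 21]
r = spearman_r(depth, days)
```

Because only ranks enter the calculation, the coefficient is suited to ordinal scales such as SSI depth or complication grade, where the spacing between categories has no numerical meaning.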
The ITE Land classification: Providing an environmental stratification of Great Britain.

PubMed

Bunce, R G; Barr, C J; Gillespie, M K; Howard, D C

1996-01-01

The surface of Great Britain (GB) varies continuously in land cover from one area to another. The objective of any environmentally based land classification is to produce classes that match the patterns that are present by helping to define clear boundaries. The more appropriate the analysis and data used, the better the classes will fit the natural patterns. The observation of inter-correlations between ecological factors is the basis for interpreting ecological patterns in the field, and the Institute of Terrestrial Ecology (ITE) Land Classification formalises such subjective ideas. The data inevitably comprise a large number of factors in order to describe the environment adequately. Single factors, such as altitude, would only be useful on a national basis if they were the only dominant causative agent of ecological variation. The ITE Land Classification has defined 32 environmental categories called 'land classes', initially based on a sample of 1-km squares in Great Britain but subsequently extended to all 240 000 1-km squares. The original classification was produced using multivariate analysis of 75 environmental variables. The extension to all squares in GB was performed using a combination of logistic discrimination and discriminant functions. The classes have provided a stratification for successive ecological surveys, the results of which have characterised the classes in terms of botanical, zoological and landscape features. The classification has also been applied to integrate diverse datasets including satellite imagery, soils and socio-economic information. A variety of models have used the structure of the classification, for example to show potential land use change under different economic conditions. The principal data sets relevant for planning purposes have been incorporated into a user-friendly computer package, called the 'Countryside Information System'.
Gender, Race, and Survival: A Study in Non-Small-Cell Lung Cancer Brain Metastases Patients Utilizing the Radiation Therapy Oncology Group Recursive Partitioning Analysis Classification

DOE Office of Scientific and Technical Information (OSTI.GOV)

Videtic, Gregory M.M., E-mail: videtig@ccf.or; Reddy, Chandana A.; Chao, Samuel T.

Purpose: To explore whether gender and race influence survival in non-small-cell lung cancer (NSCLC) in patients with brain metastases, using our large single-institution brain tumor database and the Radiation Therapy Oncology Group recursive partitioning analysis (RPA) brain metastases classification. Methods and materials: A retrospective review of a single-institution brain metastasis database for the interval January 1982 to September 2004 yielded 835 NSCLC patients with brain metastases for analysis. Patient subsets based on combinations of gender, race, and RPA class were then analyzed for survival differences. Results: Median follow-up was 5.4 months (range, 0-122.9 months). There were 485 male patients (M) (58.4%) and 346 female patients (F) (41.6%). Of the 828 evaluable patients (99%), 143 (17%) were black/African American (B) and 685 (83%) were white/Caucasian (W). Median survival time (MST) from time of brain metastasis diagnosis for all patients was 5.8 months. Median survival time by gender (F vs. M) and race (W vs. B) was 6.3 months vs. 5.5 months (p = 0.013) and 6.0 months vs. 5.2 months (p = 0.08), respectively. For patients stratified by RPA class, gender, and race, MST significantly favored BFs over BMs in Class II: 11.2 months vs. 4.6 months (p = 0.021). On multivariable analysis, significant variables were gender (p = 0.041, relative risk [RR] 0.83) and RPA class (p < 0.0001, RR 0.28 for I vs. III; p < 0.0001, RR 0.51 for II vs. III) but not race. Conclusions: Gender significantly influences NSCLC brain metastasis survival. Race trended to significance in overall survival but was not significant on multivariable analysis. Multivariable analysis identified gender and RPA classification as significant variables with respect to survival.
Automatic classification of scar tissue in late gadolinium enhancement cardiac MRI for the assessment of left-atrial wall injury after radiofrequency ablation

PubMed Central

Morris, Alan; Burgon, Nathan; McGann, Christopher; MacLeod, Robert; Cates, Joshua

2013-01-01

Radiofrequency ablation is a promising procedure for treating atrial fibrillation (AF) that relies on accurate lesion delivery in the left atrial (LA) wall for success. Late Gadolinium Enhancement MRI (LGE MRI) at three months post-ablation has proven effective for noninvasive assessment of the location and extent of scar formation, which are important factors for predicting patient outcome and planning of redo ablation procedures. We have developed an algorithm for automatic classification in LGE MRI of scar tissue in the LA wall and have evaluated accuracy and consistency compared to manual scar classifications by expert observers. Our approach clusters voxels based on normalized intensity and was chosen through a systematic comparison of the performance of multivariate clustering on many combinations of image texture. Algorithm performance was determined by overlap with ground truth, using multiple overlap measures, and the accuracy of the estimation of the total amount of scar in the LA. Ground truth was determined using the STAPLE algorithm, which produces a probabilistic estimate of the true scar classification from multiple expert manual segmentations. Evaluation of the ground truth data set was based on both inter- and intra-observer agreement, with variation among expert classifiers indicating the difficulty of scar classification for a given dataset. Our proposed automatic scar classification algorithm performs well for both scar localization and estimation of scar volume: for ground truth datasets considered easy, variability from the ground truth was low; for those considered difficult, variability from ground truth was on par with the variability across experts. PMID:24236224
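The abstract refers to "multiple overlap measures" without naming them. As an illustration only, the sketch below computes two measures that are commonly used for comparing segmentations, the Dice coefficient and the Jaccard index. It assumes that both the automatic result and the ground truth have been reduced to binary voxel sets (the STAPLE estimate is probabilistic and would need thresholding first); the voxel coordinates are hypothetical.

```python
def dice(a, b):
    """Dice coefficient: 2|A ∩ B| / (|A| + |B|) for two voxel sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty segmentations agree trivially
    return 2 * len(a & b) / (len(a) + len(b))


def jaccard(a, b):
    """Jaccard index: |A ∩ B| / |A ∪ B| for two voxel sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical voxel indices (x, y, z) labelled as scar by each method.
automatic = {(0, 0, 0), (0, 1, 0), (1, 1, 0)}
ground_truth = {(0, 1, 0), (1, 1, 0), (2, 1, 0)}
d = dice(automatic, ground_truth)      # 2 * 2 / (3 + 3)
j = jaccard(automatic, ground_truth)   # 2 / 4
```

The two measures are monotonically related, so they rank methods identically; they differ in how strongly partial overlap is penalized.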
Automatic classification of scar tissue in late gadolinium enhancement cardiac MRI for the assessment of left-atrial wall injury after radiofrequency ablation

NASA Astrophysics Data System (ADS)

Perry, Daniel; Morris, Alan; Burgon, Nathan; McGann, Christopher; MacLeod, Robert; Cates, Joshua

2012-03-01

Radiofrequency ablation is a promising procedure for treating atrial fibrillation (AF) that relies on accurate lesion delivery in the left atrial (LA) wall for success. Late Gadolinium Enhancement MRI (LGE MRI) at three months post-ablation has proven effective for noninvasive assessment of the location and extent of scar formation, which are important factors for predicting patient outcome and planning of redo ablation procedures. We have developed an algorithm for automatic classification in LGE MRI of scar tissue in the LA wall and have evaluated accuracy and consistency compared to manual scar classifications by expert observers. Our approach clusters voxels based on normalized intensity and was chosen through a systematic comparison of the performance of multivariate clustering on many combinations of image texture. Algorithm performance was determined by overlap with ground truth, using multiple overlap measures, and the accuracy of the estimation of the total amount of scar in the LA. Ground truth was determined using the STAPLE algorithm, which produces a probabilistic estimate of the true scar classification from multiple expert manual segmentations. Evaluation of the ground truth data set was based on both inter- and intra-observer agreement, with variation among expert classifiers indicating the difficulty of scar classification for a given dataset. Our proposed automatic scar classification algorithm performs well for both scar localization and estimation of scar volume: for ground truth datasets considered easy, variability from the ground truth was low; for those considered difficult, variability from ground truth was on par with the variability across experts.
Metabolomic analysis applied to chemosystematics and evolution of megadiverse Brazilian Vernonieae (Asteraceae).

PubMed

Gallon, Marília Elias; Monge, Marcelo; Casoti, Rosana; Da Costa, Fernando Batista; Semir, João; Gobbo-Neto, Leonardo

2018-06-01

Vernonia sensu lato is the largest and most complex genus of the tribe Vernonieae (Asteraceae). The tribe is chemically characterized by the presence of sesquiterpene lactones and flavonoids. Over the years, several taxonomic classifications have been proposed for Vernonia s.l. and for the tribe; however, there has been no consensus among researchers. According to traditional classification, Vernonia s.l. comprises more than 1000 species divided into sections, subsections and series (sensu Bentham). In a more recent classification, these species have been segregated into other genera and some subtribes were proposed, while the genus Vernonia sensu stricto was restricted to 22 species distributed mainly in North America (sensu Robinson). In this study, species from the subtribes Vernoniinae, Lepidaploinae and Rolandrinae were analyzed by UHPLC-UV-HRMS followed by multivariate statistical analysis. Data mining was performed using unsupervised (HCA and PCA) and supervised methods (OPLS-DA). The HCA showed the segregation of the species into four main groups. Comparing the HCA with taxonomical classifications of Vernonieae, we observed that the groups of the dendrogram, based on metabolic profiling, were in accordance with the generic classification proposed by Robinson and with previous phylogenetic studies. The species of the genera Stenocephalum, Stilpnopappus, Strophopappus and Rolandra (Group 1) were revealed to be more related to the species of the genus Vernonanthura (Group 2), while the genera Cyrtocymura, Chrysolaena and Echinocoryne (Group 3) were chemically more similar to the genera Lessingianthus and Lepidaploa (Group 4). These findings indicated that the subtribes Vernoniinae and Lepidaploinae are not chemically homogeneous groups and highlighted the application of untargeted metabolomic tools for taxonomy and as indicators of species evolution. Discriminant compounds for the groups obtained by OPLS-DA were determined. Groups 1 and 2 were characterized by the presence of 3',4'-dimethoxyluteolin, glaucolide A and 8-tigloyloxyglaucolide A. The species of Groups 3 and 4 were characterized by the presence of putative acacetin 7-O-rutinoside and glaucolide B. Therefore, the untargeted metabolomic approach combined with multivariate statistical analysis, as proposed herein, allowed the identification of potential chemotaxonomic markers, helping in the taxonomic classifications. Copyright © 2018 Elsevier Ltd. All rights reserved.
Multivariate pattern classification reveals autonomic and experiential representations of discrete emotions.

PubMed

Kragel, Philip A; Labar, Kevin S

2013-08-01

Defining the structural organization of emotions is a central unresolved question in affective science. In particular, the extent to which autonomic nervous system activity signifies distinct affective states remains controversial. Most prior research on this topic has used univariate statistical approaches in attempts to classify emotions from psychophysiological data. In the present study, electrodermal, cardiac, respiratory, and gastric activity, as well as self-report measures were taken from healthy subjects during the experience of fear, anger, sadness, surprise, contentment, and amusement in response to film and music clips. Information pertaining to affective states present in these response patterns was analyzed using multivariate pattern classification techniques. Overall accuracy for classifying distinct affective states was 58.0% for autonomic measures and 88.2% for self-report measures, both of which were significantly above chance. Further, examining the error distribution of classifiers revealed that the dimensions of valence and arousal selectively contributed to decoding emotional states from self-report, whereas a categorical configuration of affective space was evident in both self-report and autonomic measures. Taken together, these findings extend recent multivariate approaches to study emotion and indicate that pattern classification tools may improve upon univariate approaches to reveal the underlying structure of emotional experience and physiological expression. PsycINFO Database Record (c) 2013 APA, all rights reserved.
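With six emotion categories, the chance level for a balanced design is 1/6, or about 16.7%, which is the baseline against which the 58.0% and 88.2% accuracies are judged. One simple way to check that an accuracy exceeds chance is a one-sided binomial tail probability, sketched below. The number of test trials is hypothetical, the balanced-class assumption is ours, and the abstract does not say which test the authors used.

```python
from math import comb


def binomial_tail(successes, trials, chance):
    """P(X >= successes) for X ~ Binomial(trials, chance)."""
    return sum(
        comb(trials, k) * chance ** k * (1 - chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )


n_classes = 6                      # fear, anger, sadness, surprise, contentment, amusement
chance = 1 / n_classes             # assumes equally frequent classes
trials = 120                       # hypothetical number of classified trials
correct = round(0.58 * trials)     # trials classified correctly at 58.0% accuracy
p_value = binomial_tail(correct, trials, chance)
```

A small tail probability says only that the classifier beats guessing; it does not indicate how the errors are distributed across categories, which the abstract examines separately.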
Multivariate Pattern Classification Reveals Autonomic and Experiential Representations of Discrete Emotions

PubMed Central

Kragel, Philip A.; LaBar, Kevin S.

2013-01-01

Defining the structural organization of emotions is a central unresolved question in affective science. In particular, the extent to which autonomic nervous system activity signifies distinct affective states remains controversial. Most prior research on this topic has used univariate statistical approaches in attempts to classify emotions from psychophysiological data. In the present study, electrodermal, cardiac, respiratory, and gastric activity, as well as self-report measures were taken from healthy subjects during the experience of fear, anger, sadness, surprise, contentment, and amusement in response to film and music clips. Information pertaining to affective states present in these response patterns was analyzed using multivariate pattern classification techniques. Overall accuracy for classifying distinct affective states was 58.0% for autonomic measures and 88.2% for self-report measures, both of which were significantly above chance. Further, examining the error distribution of classifiers revealed that the dimensions of valence and arousal selectively contributed to decoding emotional states from self-report, whereas a categorical configuration of affective space was evident in both self-report and autonomic measures. Taken together, these findings extend recent multivariate approaches to study emotion and indicate that pattern classification tools may improve upon univariate approaches to reveal the underlying structure of emotional experience and physiological expression. PMID:23527508
Design of neural networks for classification of remotely sensed imagery

NASA Technical Reports Server (NTRS)

Chettri, Samir R.; Cromp, Robert F.; Birmingham, Mark

1992-01-01

Classification accuracies of a backpropagation neural network are discussed and compared with a maximum likelihood classifier (MLC) with multivariate normal class models. We have found that, because of its nonparametric nature, the neural network outperforms the MLC in this area. In addition, we discuss techniques for constructing optimal neural nets on parallel hardware like the MasPar MP-1 currently at GSFC. Other important discussions are centered around training and classification times of the two methods, and sensitivity to the training data. Finally, we discuss future work in the area of classification and neural nets.
Identification of Reliable Components in Multivariate Curve Resolution-Alternating Least Squares (MCR-ALS): a Data-Driven Approach across Metabolic Processes.

PubMed

Motegi, Hiromi; Tsuboi, Yuuri; Saga, Ayako; Kagami, Tomoko; Inoue, Maki; Toki, Hideaki; Minowa, Osamu; Noda, Tetsuo; Kikuchi, Jun

2015-11-04

There is an increasing need to use multivariate statistical methods for understanding biological functions, identifying the mechanisms of diseases, and exploring biomarkers. In addition to classical analyses such as hierarchical cluster analysis, principal component analysis, and partial least squares discriminant analysis, various multivariate strategies, including independent component analysis, non-negative matrix factorization, and multivariate curve resolution, have recently been proposed. However, determining the number of components is problematic. Despite the proposal of several different methods, no satisfactory approach has yet been reported. To resolve this problem, we implemented a new idea: classifying a component as "reliable" or "unreliable" based on the reproducibility of its appearance, regardless of the number of components in the calculation. Using the clustering method for classification, we applied this idea to multivariate curve resolution-alternating least squares (MCR-ALS). Comparisons between conventional and modified methods applied to proton nuclear magnetic resonance (¹H-NMR) spectral datasets derived from known standard mixtures and biological mixtures (urine and feces of mice) revealed that more plausible results are obtained by the modified method. In particular, clusters containing little information were detected with reliability. This strategy, named "cluster-aided MCR-ALS," will facilitate the attainment of more reliable results in the metabolomics datasets.
Mapping Informative Clusters in a Hierarchial Framework of fMRI Multivariate Analysis

PubMed Central

Xu, Rui; Zhen, Zonglei; Liu, Jia

2010-01-01

Pattern recognition methods, which are powerful in discriminating between multi-voxel patterns of brain activity associated with different mental states, have become increasingly popular in fMRI data analysis. However, when they are used in functional brain mapping, the location of discriminative voxels varies significantly, raising difficulties in interpreting the locus of the effect. Here we proposed a hierarchical multivariate framework that maps informative clusters rather than voxels to achieve reliable functional brain mapping without compromising the discriminative power. In particular, we first searched for local homogeneous clusters that consisted of voxels with similar response profiles. Then, a multi-voxel classifier was built for each cluster to extract discriminative information from the multi-voxel patterns. Finally, through multivariate ranking, outputs from the classifiers served as a multi-cluster pattern to identify informative clusters by examining interactions among clusters. Results from both simulated and real fMRI data demonstrated that this hierarchical approach showed better performance in the robustness of functional brain mapping than traditional voxel-based multivariate methods. In addition, the mapped clusters overlapped substantially for two perceptually equivalent object categories, further confirming the validity of our approach. In short, the hierarchical multivariate framework is suitable for both pattern classification and brain mapping in fMRI studies. PMID:21152081
Introduction to multivariate discrimination

NASA Astrophysics Data System (ADS)

Kégl, Balázs

2013-07-01

Multivariate discrimination or classification is one of the best-studied problems in machine learning, with a plethora of well-tested and well-performing algorithms. There are also several good general textbooks [1-9] on the subject written for an average engineering, computer science, or statistics graduate student; most of them are also accessible to an average physics student with some background in computer science and statistics. Hence, instead of writing a generic introduction, we concentrate here on relating the subject to a practitioner experimental physicist. After a short introduction on the basic setup (Section 1) we delve into the practical issues of complexity regularization, model selection, and hyperparameter optimization (Section 2), since it is this step that makes high-complexity non-parametric fitting so different from low-dimensional parametric fitting. To emphasize that this issue is not restricted to classification, we illustrate the concept on a low-dimensional but non-parametric regression example (Section 2.1). Section 3 describes the common algorithmic-statistical formal framework that unifies the main families of multivariate classification algorithms. We explain here the large-margin principle that partly explains why these algorithms work. Section 4 is devoted to the description of the three main (families of) classification algorithms: neural networks, the support vector machine, and AdaBoost. We do not go into the algorithmic details; the goal is to give an overview of the form of the functions these methods learn and of the objective functions they optimize. Besides their technical description, we also make an attempt to put these algorithms into a socio-historical context. We then briefly describe some rather heterogeneous applications to illustrate the pattern recognition pipeline and to show how widespread the use of these methods is (Section 5). We conclude the chapter with three essentially open research problems that are either relevant to or even motivated by certain unorthodox applications of multivariate discrimination in experimental physics.
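The hyperparameter optimization step described in this abstract can be illustrated on a low-dimensional non-parametric regression problem. The sketch below is our own illustration rather than the chapter's example: it uses k-nearest-neighbour regression on one-dimensional data and picks k by leave-one-out cross-validation. The toy data set (a sine curve with a deterministic alternating perturbation) and the candidate values of k are hypothetical.

```python
import math


def knn_predict(train, x, k):
    """Average the targets of the k training points nearest to x (1-D inputs)."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    return sum(y for _, y in nearest) / k


def loo_error(data, k):
    """Leave-one-out mean squared error of k-NN regression on data."""
    total = 0.0
    for i, (x, y) in enumerate(data):
        rest = data[:i] + data[i + 1:]  # hold out point i
        total += (knn_predict(rest, x, k) - y) ** 2
    return total / len(data)


def select_k(data, candidates):
    """Return the candidate k with the smallest leave-one-out error."""
    return min(candidates, key=lambda k: loo_error(data, k))


# Hypothetical toy data: a sine curve with an alternating +/-0.1 perturbation.
data = [(i / 10, math.sin(i / 10) + 0.1 * (-1) ** i) for i in range(60)]
candidates = [1, 2, 4, 8, 16, 32]
best_k = select_k(data, candidates)
```

Small k follows the perturbation (high variance) and large k smooths away the curve itself (high bias); the held-out error is what exposes the trade-off, since the error on the training points alone would always favour k = 1.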
Functional Groups Based on Leaf Physiology: Are they Spatially and Temporally Robust?

NASA Technical Reports Server (NTRS)

Foster, Tammy E.; Brooks, J. Renee

2004-01-01

The functional grouping hypothesis, which suggests that complexity in ecosystem function can be simplified by grouping species with similar responses, was tested in the Florida scrub habitat. Functional groups were identified based on how species in fire-maintained Florida scrub regulate exchange of carbon and water with the atmosphere as indicated by both instantaneous gas exchange measurements and integrated measures of function (%N, delta C-13, delta N-15, C-N ratio). Using cluster analysis, five distinct physiologically-based functional groups were identified in the fire-maintained scrub. These functional groups were tested to determine if they were robust spatially, temporally, and with management regime. Analysis of Similarities (ANOSIM), a non-parametric multivariate analysis, indicated that these five physiologically-based groupings were not altered by plot differences (R = -0.115, p = 0.893) or by the three different management regimes: prescribed burn, mechanically treated and burn, and fire-suppressed (R = 0.018, p = 0.349). The physiological groupings also remained robust between the two climatically different years 1999 and 2000 (R = -0.027, p = 0.725). Easy-to-measure morphological characteristics indicating functional groups would be more practical for scaling and modeling ecosystem processes than detailed gas-exchange measurements; therefore, we tested a variety of morphological characteristics as functional indicators. A combination of non-parametric multivariate techniques (Hierarchical cluster analysis, non-metric Multi-Dimensional Scaling, and ANOSIM) was used to compare the ability of life form, leaf thickness, and specific leaf area classifications to identify the physiologically-based functional groups. Life form classifications (ANOSIM; R = 0.629, p 0.001) were able to depict the physiological groupings more adequately than either specific leaf area (ANOSIM; R = 0.426, p = 0.001) or leaf thickness (ANOSIM; R = 0.344, p 0.001). The ability of life forms to depict the physiological groupings was improved by separating the parasitic Ximenia americana from the shrub category (ANOSIM; R = 0.794, p = 0.001). Therefore, a life form classification including parasites was determined to be a good indicator of the physiological processes of scrub species, and would be a useful method of grouping for scaling physiological processes to the ecosystem level.
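The R values reported above come from ANOSIM, whose statistic compares the ranks of between-group and within-group dissimilarities: R = (mean between-group rank − mean within-group rank) / (N(N − 1)/4) for N samples, with significance assessed by permuting the group labels. A minimal sketch follows; the trait values, the group labels and the use of absolute difference as the dissimilarity are hypothetical, and the study's own analysis was presumably run in a dedicated statistics package.

```python
import random
from itertools import combinations


def average_ranks(values):
    """Rank values from 1 to n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def anosim_r(dist, groups):
    """ANOSIM R from a square dissimilarity matrix and one label per sample."""
    n = len(groups)
    pairs = list(combinations(range(n), 2))
    ranks = average_ranks([dist[i][j] for i, j in pairs])
    within = [r for r, (i, j) in zip(ranks, pairs) if groups[i] == groups[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if groups[i] != groups[j]]
    mean_within = sum(within) / len(within)
    mean_between = sum(between) / len(between)
    return (mean_between - mean_within) / (n * (n - 1) / 4)


def anosim_test(dist, groups, permutations=199, seed=0):
    """Return (R, p), with p estimated by randomly permuting the labels."""
    rng = random.Random(seed)
    observed = anosim_r(dist, groups)
    shuffled = list(groups)
    at_least_as_large = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if anosim_r(dist, shuffled) >= observed:
            at_least_as_large += 1
    return observed, (at_least_as_large + 1) / (permutations + 1)


# Hypothetical single trait measured on six samples from two groups.
trait = [1.0, 1.2, 1.1, 5.0, 5.3, 5.1]
labels = ["a", "a", "a", "b", "b", "b"]
dist = [[abs(u - v) for v in trait] for u in trait]
r_stat, p_value = anosim_test(dist, labels)
```

R near 1 means samples are more similar within groups than between them, R near 0 means the grouping carries no information, and slightly negative values such as those reported for plot and year simply indicate no detectable group structure.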
Multivariate data analysis and machine learning in Alzheimer's disease with a focus on structural magnetic resonance imaging.

PubMed

Falahati, Farshad; Westman, Eric; Simmons, Andrew

2014-01-01

Machine learning algorithms and multivariate data analysis methods have been widely utilized in the field of Alzheimer's disease (AD) research in recent years. Advances in medical imaging and medical image analysis have provided a means to generate and extract valuable neuroimaging information. Automatic classification techniques provide tools to analyze this information and observe inherent disease-related patterns in the data. In particular, these classifiers have been used to discriminate AD patients from healthy control subjects and to predict conversion from mild cognitive impairment to AD. In this paper, recent studies are reviewed that have used machine learning and multivariate analysis in the field of AD research. The main focus is on studies that used structural magnetic resonance imaging (MRI), but studies that included positron emission tomography and cerebrospinal fluid biomarkers in addition to MRI are also considered. A wide variety of materials and methods has been employed in different studies, resulting in a range of different outcomes. Influential factors such as classifiers, feature extraction algorithms, feature selection methods, validation approaches, and cohort properties are reviewed, as well as key MRI-based and multi-modal based studies. Current and future trends are discussed.
Identification and classification of silks using infrared spectroscopy

PubMed Central

Boulet-Audet, Maxime; Vollrath, Fritz; Holland, Chris

2015-01-01

ABSTRACT Lepidopteran silks number in the thousands and display a vast diversity of structures, properties and industrial potential. To map this remarkable biochemical diversity, we present an identification and screening method based on the infrared spectra of native silk feedstock and cocoons. Multivariate analysis of over 1214 infrared spectra obtained from 35 species allowed us to group silks into distinct hierarchies and a classification that agrees well with current phylogenetic data and taxonomies. This approach also provides information on the relative content of sericin, calcium oxalate, phenolic compounds, poly-alanine and poly(alanine-glycine) β-sheets. It emerged that the domesticated mulberry silkmoth Bombyx mori represents an outlier compared with other silkmoth taxa in terms of spectral properties. Interestingly, Epiphora bauhiniae was found to contain the highest amount of β-sheets reported to date for any wild silkmoth. We conclude that our approach provides a new route to determine cocoon chemical composition and in turn a novel, biological as well as material, classification of silks. PMID:26347557
Comparative study on fast classification of brick samples by combination of principal component analysis and linear discriminant analysis using stand-off and table-top laser-induced breakdown spectroscopy

NASA Astrophysics Data System (ADS)

Vítková, Gabriela; Prokeš, Lubomír; Novotný, Karel; Pořízka, Pavel; Novotný, Jan; Všianský, Dalibor; Čelko, Ladislav; Kaiser, Jozef

2014-11-01

From a historical perspective, during archeological excavation or restoration of buildings and other structures built from bricks, it is important to determine, preferably in situ and in real time, the locality of the bricks' origin. Fast classification of bricks on the basis of Laser-Induced Breakdown Spectroscopy (LIBS) spectra is possible using multivariate statistical methods. A combination of principal component analysis (PCA) and linear discriminant analysis (LDA) was applied in this case. LIBS was used to classify a total of 29 brick samples from 7 different localities. A comparative study using two different LIBS setups, stand-off and table-top, shows that stand-off LIBS has great potential for archeological in-field measurements.

A neuromorphic network for generic multivariate data classification

PubMed Central

Schmuker, Michael; Pfeil, Thomas; Nawrot, Martin Paul

2014-01-01

Computational neuroscience has uncovered a number of computational principles used by nervous systems. At the same time, neuromorphic hardware has matured to a state where fast silicon implementations of complex neural networks have become feasible. En route to future technical applications of neuromorphic computing the current challenge lies in the identification and implementation of functional brain algorithms. Taking inspiration from the olfactory system of insects, we constructed a spiking neural network for the classification of multivariate data, a common problem in signal and data analysis. In this model, real-valued multivariate data are converted into spike trains using “virtual receptors” (VRs). Their output is processed by lateral inhibition and drives a winner-take-all circuit that supports supervised learning. VRs are conveniently implemented in software, whereas the lateral inhibition and classification stages run on accelerated neuromorphic hardware. When trained and tested on real-world datasets, we find that the classification performance is on par with a naïve Bayes classifier. An analysis of the network dynamics shows that stable decisions in output neuron populations are reached within less than 100 ms of biological time, matching the time-to-decision reported for the insect nervous system. Through leveraging a population code, the network tolerates the variability of neuronal transfer functions and trial-to-trial variation that is inevitably present on the hardware system. Our work provides a proof of principle for the successful implementation of a functional spiking neural network on a configurable neuromorphic hardware system that can readily be applied to real-world computing problems. PMID:24469794
Fluorescent marker-based and marker-free discrimination between healthy and cancerous human tissues using hyper-spectral imaging

NASA Astrophysics Data System (ADS)

Arnold, Thomas; De Biasio, Martin; Leitner, Raimund

2015-06-01

Two problems are addressed in this paper: (i) the fluorescent marker-based and (ii) the marker-free discrimination between healthy and cancerous human tissues. For both applications the performance of hyper-spectral methods is quantified. Fluorescent marker-based tissue classification uses a number of fluorescent markers to dye specific parts of a human cell. The challenge is that the emission spectra of the fluorescent dyes overlap considerably. They are, furthermore, disturbed by the inherent auto-fluorescence of human tissue. This results in ambiguities and decreased image contrast, causing difficulties for the treatment decision. The higher spectral resolution introduced by tunable-filter-based spectral imaging in combination with spectral unmixing techniques results in an improvement of the image contrast and therefore more reliable information for the physician making the treatment decision. Marker-free tissue classification is based solely on the subtle spectral features of human tissue without the use of artificial markers. The challenge in this case is that the spectral differences between healthy and cancerous tissues are subtle and embedded in intra- and inter-patient variations of these features. The contributions of this paper are (i) the evaluation of hyper-spectral imaging in combination with spectral unmixing techniques for fluorescence marker-based tissue classification and (ii) the evaluation of spectral imaging for marker-free intra-surgery tissue classification. Within this paper, we consider real hyper-spectral fluorescence and endoscopy data sets to emphasize the practical capability of the proposed methods. It is shown that the combination of spectral imaging with multivariate statistical methods can improve the sensitivity and specificity of the detection and the staging of cancerous tissues compared to standard procedures.
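The spectral unmixing idea in this abstract can be illustrated with a minimal sketch, assuming a linear mixing model in which each measured pixel spectrum is a weighted sum of known reference spectra. The abstract does not say which unmixing algorithm was used, so ordinary least squares stands in here, and the Gaussian-shaped reference spectra, wavelength range and abundance values are hypothetical.

```python
import numpy as np

# Hypothetical wavelength axis (nm); not taken from the paper.
wavelengths = np.linspace(500.0, 700.0, 41)


def gaussian_band(center, width):
    """Return an illustrative emission band sampled on `wavelengths`."""
    return np.exp(-0.5 * ((wavelengths - center) / width) ** 2)


# Columns: two overlapping dyes and a broad auto-fluorescence background.
# All three shapes are assumptions made for illustration.
endmembers = np.column_stack([
    gaussian_band(580.0, 20.0),  # dye A
    gaussian_band(600.0, 20.0),  # dye B, overlapping dye A
    gaussian_band(560.0, 60.0),  # tissue auto-fluorescence
])

# Simulate one noise-free pixel with known abundances.
true_abundances = np.array([0.7, 0.2, 0.4])
pixel = endmembers @ true_abundances

# Least squares estimates the contribution of each component to the pixel.
estimated, *_ = np.linalg.lstsq(endmembers, pixel, rcond=None)
```

With real, noisy data a constrained solver (for example, non-negative least squares) would usually be preferred, since abundances cannot be negative.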
Classification of autistic individuals and controls using cross-task characterization of fMRI activity

PubMed Central

Chanel, Guillaume; Pichon, Swann; Conty, Laurence; Berthoz, Sylvie; Chevallier, Coralie; Grèzes, Julie

2015-01-01

Multivariate pattern analysis (MVPA) has been applied successfully to task-based and resting-based fMRI recordings to investigate which neural markers distinguish individuals with autistic spectrum disorders (ASD) from controls. While most studies have focused on brain connectivity during resting state episodes and regions of interest approaches (ROI), a wealth of task-based fMRI datasets have been acquired in these populations in the last decade. This calls for techniques that can leverage information not only from a single dataset, but from several existing datasets that might share some common features and biomarkers. We propose a fully data-driven (voxel-based) approach that we apply to two different fMRI experiments with social stimuli (faces and bodies). The method, based on Support Vector Machines (SVMs) and Recursive Feature Elimination (RFE), is first trained for each experiment independently and each output is then combined to obtain a final classification output. Second, this RFE output is used to determine which voxels are most often selected for classification to generate maps of significant discriminative activity. Finally, to further explore the clinical validity of the approach, we correlate phenotypic information with obtained classifier scores. The results reveal good classification accuracy (range between 69% and 92.3%). Moreover, we were able to identify discriminative activity patterns pertaining to the social brain without relying on a priori ROI definitions. Finally, social motivation was the only dimension which correlated with classifier scores, suggesting that it is the main dimension captured by the classifiers. Altogether, we believe that the present RFE method proves to be efficient and may help identifying relevant biomarkers by taking advantage of acquired task-based fMRI datasets in psychiatric populations. PMID:26793434
Multivariate qualitative analysis of banned additives in food safety using surface enhanced Raman scattering spectroscopy

NASA Astrophysics Data System (ADS)

He, Shixuan; Xie, Wanyi; Zhang, Wei; Zhang, Liqun; Wang, Yunxia; Liu, Xiaoling; Liu, Yulong; Du, Chunlei

2015-02-01

A novel strategy that combines an iterative cubic spline fitting baseline correction method with discriminant partial least squares qualitative analysis is employed to analyze the surface enhanced Raman scattering (SERS) spectroscopy of banned food additives, such as Sudan I dye and Rhodamine B in food and Malachite green residues in aquaculture fish. Multivariate qualitative analysis methods, combining iterative cubic spline fitting (ICSF) baseline correction as spectral preprocessing with principal component analysis (PCA) and discriminant partial least squares (DPLS) classification, respectively, are applied to investigate the effectiveness of SERS spectroscopy for predicting the class assignments of unknown banned food additives. PCA cannot be used to predict the class assignments of unknown samples. However, DPLS classification can discriminate the class assignment of unknown banned additives using the differences in relative intensities. The results demonstrate that SERS spectroscopy combined with the ICSF baseline correction method and the exploratory analysis methodology of DPLS classification can potentially be used for distinguishing banned food additives in the field of food safety.
Discrimination between Alzheimer's Disease and Late Onset Bipolar Disorder Using Multivariate Analysis.

PubMed

Besga, Ariadna; Gonzalez, Itxaso; Echeburua, Enrique; Savio, Alexandre; Ayerdi, Borja; Chyzhyk, Darya; Madrigal, Jose L M; Leza, Juan C; Graña, Manuel; Gonzalez-Pinto, Ana Maria

2015-01-01

Late onset bipolar disorder (LOBD) is often difficult to distinguish from degenerative dementias, such as Alzheimer disease (AD), due to comorbidities and common cognitive symptoms. Moreover, LOBD prevalence in the elderly population is not negligible and it is increasing. Both pathologies share pathophysiological neuroinflammation features. Improvements in differential diagnosis of LOBD and AD will help to select the best personalized treatment. The aim of this study is to assess the relative significance of clinical observations, neuropsychological tests, and specific blood plasma biomarkers (inflammatory and neurotrophic), separately and combined, in the differential diagnosis of LOBD versus AD. This was carried out by evaluating the accuracy achieved by classification-based computer-aided diagnosis (CAD) systems based on these variables. A sample of healthy controls (HC) (n = 26), AD patients (n = 37), and LOBD patients (n = 32) was recruited at the Alava University Hospital. Clinical observations, neuropsychological tests, and plasma biomarkers were measured at recruitment time. We applied multivariate machine learning classification methods to discriminate subjects from HC, AD, and LOBD populations in the study. We analyzed, for each classification contrast, feature sets combining clinical observations, neuropsychological measures, and biological markers, including inflammation biomarkers. Furthermore, we analyzed reduced feature sets containing variables with significant differences determined by a Welch's t-test. In addition, a battery of classifier architectures was applied, encompassing linear and non-linear Support Vector Machines (SVM), Random Forests (RF), and Classification and Regression Trees (CART), and their performance was evaluated in a leave-one-out (LOO) cross-validation scheme. Post hoc analysis of the Gini index in CART classifiers provided a measure of each variable's importance.
Welch's t-test found one biomarker (Malondialdehyde) with significant differences (p < 0.001) in the LOBD vs. AD contrast. Classification results with the best features are as follows: discrimination of HC vs. AD patients reaches accuracy 97.21% and AUC 98.17%. Discrimination of LOBD vs. AD patients reaches accuracy 90.26% and AUC 89.57%. Discrimination of HC vs. LOBD patients achieves accuracy 95.76% and AUC 88.46%. It is feasible to build CAD systems for differential diagnosis of LOBD and AD on the basis of a reduced set of clinical variables. Clinical observations provide the greatest discrimination. Neuropsychological tests are improved by the addition of biomarkers, and both contribute significantly to improve the overall predictive performance.
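The leave-one-out (LOO) cross-validation scheme mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the study's pipeline: a simple nearest-centroid rule is swapped in for the SVM, RF and CART classifiers, and the two-feature data set and its values are hypothetical.

```python
def nearest_centroid_predict(train_x, train_y, sample):
    """Assign `sample` to the class whose mean feature vector is closest."""
    centroids = {}
    for label in set(train_y):
        rows = [x for x, y in zip(train_x, train_y) if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def squared_distance(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    return min(centroids, key=lambda lab: squared_distance(centroids[lab], sample))


def leave_one_out_accuracy(features, labels):
    """Hold out each subject once, train on the rest, and score the held-out one."""
    correct = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if nearest_centroid_predict(train_x, train_y, features[i]) == labels[i]:
            correct += 1
    return correct / len(features)


# Hypothetical feature vectors (e.g. two standardized clinical variables).
features = [[1.0, 2.0], [1.2, 1.8], [0.9, 2.1],
            [3.0, 0.5], [3.2, 0.7], [2.9, 0.4]]
labels = ["AD", "AD", "AD", "LOBD", "LOBD", "LOBD"]

accuracy = leave_one_out_accuracy(features, labels)
```

Any feature selection step, such as the t-test filter described above, would need to be repeated inside each fold to avoid an optimistic bias in the reported accuracy.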
The classification of anxiety and hysterical states. Part I. Historical review and empirical delineation.

PubMed

Sheehan, D V; Sheehan, K H

1982-08-01

The history of the classification of anxiety, hysterical, and hypochondriacal disorders is reviewed. Problems in the ability of current classification schemes to predict, control, and describe the relationship between the symptoms and other phenomena are outlined. Existing classification schemes failed the first test of a good classification model--that of providing categories that are mutually exclusive. The independence of these diagnostic categories from each other does not appear to hold up on empirical testing. In the absence of inherently mutually exclusive categories, further empirical investigation of these classes is obstructed since statistically valid analysis of the nominal data and any useful multivariate analysis would be difficult if not impossible. It is concluded that the existing classifications are unsatisfactory and require some fundamental reconceptualization.
Towards intra-operative diagnosis of tumours during breast conserving surgery by selective-sampling Raman micro-spectroscopy

NASA Astrophysics Data System (ADS)

Kong, Kenny; Zaabar, Fazliyana; Rakha, Emad; Ellis, Ian; Koloydenko, Alexey; Notingher, Ioan

2014-10-01

Breast-conserving surgery (BCS) is increasingly employed for the treatment of early stage breast cancer. One of the key challenges in BCS is to ensure complete removal of the tumour while conserving as much healthy tissue as possible. In this study we have investigated the potential of Raman micro-spectroscopy (RMS) for automated intra-operative evaluation of tumour excision. First, a multivariate classification model based on Raman spectra of normal and malignant breast tissue samples was built and achieved diagnosis of mammary ductal carcinoma (DC) with 95.6% sensitivity and 96.2% specificity (5-fold cross-validation). The tumour regions were discriminated from the healthy tissue structures based on increased concentration of nucleic acids and reduced concentration of collagen and fat. The multivariate classification model was then applied to sections from fresh tissue of new patients to produce diagnosis images for DC. The diagnosis images obtained by raster scanning RMS were in agreement with the conventional histopathology diagnosis but were limited by long data acquisition times (typically 10 000 spectra mm⁻², which is equivalent to ~5 h mm⁻²). Selective-sampling based on integrated auto-fluorescence imaging and Raman spectroscopy was used to reduce the number of Raman spectra to ~20 spectra mm⁻², which is equivalent to an acquisition time of ~15 min for 5 × 5 mm² tissue samples. This study suggests that selective-sampling Raman microscopy has the potential to provide a rapid and objective intra-operative method to detect mammary carcinoma in tissue and assess resection margins.
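The acquisition-time figures quoted in this abstract can be cross-checked with a short calculation. The sketch below assumes a constant acquisition time per spectrum, which the abstract does not state explicitly; all input numbers are the approximate values given in the text.

```python
# Approximate figures quoted in the abstract.
raster_spectra_per_mm2 = 10_000
raster_hours_per_mm2 = 5.0
selective_spectra_per_mm2 = 20
sample_area_mm2 = 5 * 5  # a 5 x 5 mm tissue sample

# Implied time per spectrum under raster scanning (seconds),
# assuming every spectrum takes the same time to acquire.
seconds_per_spectrum = raster_hours_per_mm2 * 3600 / raster_spectra_per_mm2

# Selective sampling: number of spectra and total time for the whole sample.
selective_spectra = selective_spectra_per_mm2 * sample_area_mm2
selective_minutes = selective_spectra * seconds_per_spectrum / 60

# Full raster scan of the same area, for comparison (hours).
raster_hours = raster_hours_per_mm2 * sample_area_mm2

print(seconds_per_spectrum, selective_spectra, selective_minutes, raster_hours)
```

Under that assumption the quoted numbers are mutually consistent: roughly 1.8 s per spectrum, about 500 spectra and about 15 min per sample with selective sampling, against about 125 h for a full raster scan of the same area.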
Robust Averaging of Covariances for EEG Recordings Classification in Motor Imagery Brain-Computer Interfaces.

PubMed

Uehara, Takashi; Sartori, Matteo; Tanaka, Toshihisa; Fiori, Simone

2017-06-01

The estimation of covariance matrices is of prime importance to analyze the distribution of multivariate signals. In motor imagery-based brain-computer interfaces (MI-BCI), covariance matrices play a central role in the extraction of features from recorded electroencephalograms (EEGs); therefore, correctly estimating covariance is crucial for EEG classification. This letter discusses algorithms to average sample covariance matrices (SCMs) for the selection of the reference matrix in tangent space mapping (TSM)-based MI-BCI. Tangent space mapping is a powerful method of feature extraction and strongly depends on the selection of a reference covariance matrix. In general, the observed signals may include outliers; therefore, taking the geometric mean of SCMs as the reference matrix may not be the best choice. In order to deal with the effects of outliers, robust estimators have to be used. In particular, we discuss and test the use of geometric medians and trimmed averages (defined on the basis of several metrics) as robust estimators. The main idea behind trimmed averages is to eliminate data that exhibit the largest distance from the average covariance calculated on the basis of all available data. The results of the experiments show that while the geometric medians show little differences from conventional methods in terms of classification accuracy in the classification of electroencephalographic recordings, the trimmed averages show significant improvement for all subjects.
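The trimmed-average idea described in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions: the arithmetic mean and the Frobenius distance are used for brevity, whereas the letter considers several metrics, and the function name `trimmed_covariance_mean`, the trim fraction and the simulated trials are all hypothetical.

```python
import numpy as np


def trimmed_covariance_mean(covariances, trim_fraction=0.2):
    """Average covariance matrices after discarding the farthest ones.

    Distances are measured from the mean of all matrices using the
    Frobenius norm (an assumption made for this sketch).
    Returns the trimmed mean and the sorted indices of the kept matrices.
    """
    covariances = np.asarray(covariances, dtype=float)
    overall_mean = covariances.mean(axis=0)
    distances = np.linalg.norm(covariances - overall_mean, axis=(1, 2))
    n_drop = int(np.floor(trim_fraction * len(covariances)))
    keep = np.argsort(distances)[: len(covariances) - n_drop]
    return covariances[keep].mean(axis=0), np.sort(keep)


# Simulated data: nine well-behaved trials plus one artificial outlier.
rng = np.random.default_rng(0)
trials = [rng.standard_normal((200, 3)) for _ in range(9)]
scms = [np.cov(trial, rowvar=False) for trial in trials]
scms.append(50.0 * np.eye(3))  # outlier sample covariance matrix

reference, kept = trimmed_covariance_mean(scms, trim_fraction=0.1)
```

The resulting `reference` matrix would then play the role of the reference point for tangent space mapping; a Riemannian mean and distance could be substituted for the Euclidean ones used here.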
Identification and classification of failure modes in laminated composites by using a multivariate statistical analysis of wavelet coefficients

NASA Astrophysics Data System (ADS)

Baccar, D.; Söffker, D.

2017-11-01

Acoustic Emission (AE) is a suitable method to monitor the health of composite structures in real-time. However, AE-based failure mode identification and classification are still complex to apply due to the fact that AE waves are generally released simultaneously from all AE-emitting damage sources. Hence, the use of advanced signal processing techniques in combination with pattern recognition approaches is required. In this paper, AE signals generated from laminated carbon fiber reinforced polymer (CFRP) subjected to indentation test are examined and analyzed. A new pattern recognition approach involving a number of processing steps able to be implemented in real-time is developed. Unlike common classification approaches, here only CWT coefficients are extracted as relevant features. Firstly, Continuous Wavelet Transform (CWT) is applied to the AE signals. Furthermore, dimensionality reduction process using Principal Component Analysis (PCA) is carried out on the coefficient matrices. The PCA-based feature distribution is analyzed using Kernel Density Estimation (KDE) allowing the determination of a specific pattern for each fault-specific AE signal. Moreover, waveform and frequency content of AE signals are in depth examined and compared with fundamental assumptions reported in this field. A correlation between the identified patterns and failure modes is achieved. The introduced method improves the damage classification and can be used as a non-destructive evaluation tool.
Origin Discrimination of Osmanthus fragrans var. thunbergii Flowers using GC-MS and UPLC-PDA Combined with Multivariable Analysis Methods.

PubMed

Zhou, Fei; Zhao, Yajing; Peng, Jiyu; Jiang, Yirong; Li, Maiquan; Jiang, Yuan; Lu, Baiyi

2017-07-01

Osmanthus fragrans flowers are used as folk medicine and additives for teas, beverages and foods. The metabolites of O. fragrans flowers from different geographical origins are inconsistent to some extent. Chromatography and mass spectrometry combined with multivariable analysis methods provide an approach for discriminating the origin of O. fragrans flowers. The objective was to discriminate the Osmanthus fragrans var. thunbergii flowers from different origins with the identified metabolites. GC-MS and UPLC-PDA were conducted to analyse the metabolites in O. fragrans var. thunbergii flowers (in total 150 samples). Principal component analysis (PCA), soft independent modelling of class analogy analysis (SIMCA) and random forest (RF) analysis were applied to group the GC-MS and UPLC-PDA data. GC-MS identified 32 compounds common to all samples while UPLC-PDA/QTOF-MS identified 16 common compounds. PCA of the UPLC-PDA data generated a better clustering than PCA of the GC-MS data. Ten metabolites (six from GC-MS and four from UPLC-PDA) were selected as effective compounds for discrimination by PCA loadings. SIMCA and RF analysis were used to build classification models, and the RF model, based on the four effective compounds (caffeic acid derivative, acteoside, ligustroside and compound 15), yielded better results with the classification rate of 100% in the calibration set and 97.8% in the prediction set. GC-MS and UPLC-PDA combined with multivariable analysis methods can discriminate the origin of Osmanthus fragrans var. thunbergii flowers. Copyright © 2017 John Wiley & Sons, Ltd.
Unbiased metabolite profiling by liquid chromatography-quadrupole time-of-flight mass spectrometry and multivariate data analysis for herbal authentication: classification of seven Lonicera species flower buds.

PubMed

Gao, Wen; Yang, Hua; Qi, Lian-Wen; Liu, E-Hu; Ren, Mei-Ting; Yan, Yu-Ting; Chen, Jun; Li, Ping

2012-07-06

Plant-based medicines are becoming increasingly popular around the world. Authentication of herbal raw materials is important to ensure their safety and efficacy. Some herbs belonging to closely related species but differing in medicinal properties are difficult to identify because of similar morphological and microscopic characteristics. Chromatographic fingerprinting is an alternative method to distinguish them. Existing approaches do not allow a comprehensive analysis for herbal authentication. We have now developed a strategy consisting of (1) full metabolic profiling of herbal medicines by rapid resolution liquid chromatography (RRLC) combined with quadrupole time-of-flight mass spectrometry (QTOF MS), (2) global analysis of non-targeted compounds by molecular feature extraction algorithm, (3) multivariate statistical analysis for classification and prediction, and (4) marker compounds characterization. This approach has provided a fast and unbiased comparative multivariate analysis of the metabolite composition of 33-batch samples covering seven Lonicera species. Individual metabolic profiles are performed at the level of molecular fragments without prior structural assignment. In the entire set, the obtained classifier for seven Lonicera species flower buds showed good prediction performance and a total of 82 statistically different components were rapidly obtained by the strategy. The elemental compositions of discriminative metabolites were characterized by the accurate mass measurement of the pseudomolecular ions and their chemical types were assigned by the MS/MS spectra. The high-resolution, comprehensive and unbiased strategy for metabolite data analysis presented here is powerful and opens a new direction of authentication in herbal analysis. Copyright © 2012 Elsevier B.V. All rights reserved.
Pattern classification of brain activation during emotional processing in subclinical depression: psychosis proneness as potential confounding factor.

PubMed

Modinos, Gemma; Mechelli, Andrea; Pettersson-Yeo, William; Allen, Paul; McGuire, Philip; Aleman, Andre

2013-01-01

We used Support Vector Machine (SVM) to perform multivariate pattern classification based on brain activation during emotional processing in healthy participants with subclinical depressive symptoms. Six-hundred undergraduate students completed the Beck Depression Inventory II (BDI-II). Two groups were subsequently formed: (i) subclinical (mild) mood disturbance (n = 17) and (ii) no mood disturbance (n = 17). Participants also completed a self-report questionnaire on subclinical psychotic symptoms, the Community Assessment of Psychic Experiences Questionnaire (CAPE) positive subscale. The functional magnetic resonance imaging (fMRI) paradigm entailed passive viewing of negative emotional and neutral scenes. The pattern of brain activity during emotional processing allowed correct group classification with an overall accuracy of 77% (p = 0.002), within a network of regions including the amygdala, insula, anterior cingulate cortex and medial prefrontal cortex. However, further analysis suggested that the classification accuracy could also be explained by subclinical psychotic symptom scores (correlation with SVM weights r = 0.459, p = 0.006). Psychosis proneness may thus be a confounding factor for neuroimaging studies in subclinical depression.
Biomarkers for Musculoskeletal Pain Conditions: Use of Brain Imaging and Machine Learning.

PubMed

Boissoneault, Jeff; Sevel, Landrew; Letzen, Janelle; Robinson, Michael; Staud, Roland

2017-01-01

Chronic musculoskeletal pain conditions often show poor correlations between tissue abnormalities and clinical pain. Therefore, classification of pain conditions like chronic low back pain, osteoarthritis, and fibromyalgia depends mostly on self-report and less on objective findings like X-ray or magnetic resonance imaging (MRI) changes. However, recent advances in structural and functional brain imaging have identified brain abnormalities in chronic pain conditions that can be used for illness classification. Because the analysis of complex and multivariate brain imaging data is challenging, machine learning techniques have been increasingly utilized for this purpose. The goal of machine learning is to train specific classifiers to best identify variables of interest on brain MRIs (i.e., biomarkers). This report describes classification techniques capable of separating MRI-based brain biomarkers of chronic pain patients from healthy controls with high accuracy (70-92%) using machine learning, as well as critical scientific, practical, and ethical considerations related to their potential clinical application. Although self-report remains the gold standard for pain assessment, machine learning may aid in the classification of chronic pain disorders like chronic back pain and fibromyalgia as well as provide mechanistic information regarding their neural correlates.
Sex estimation of the tibia in modern Turkish: A computed tomography study.

PubMed

Ekizoglu, Oguzhan; Er, Ali; Bozdag, Mustafa; Akcaoglu, Mustafa; Can, Ismail Ozgur; García-Donas, Julieta G; Kranioti, Elena F

2016-11-01

The utilization of computed tomography is beneficial for the analysis of skeletal remains and it has important advantages for anthropometric studies. The present study investigated morphometry of the left tibia using CT images of a contemporary Turkish population. Seven parameters were measured on 203 individuals (124 males and 79 females) within the 19-92-years age group. The first objective of this study was to provide population-specific sex estimation equations for the contemporary Turkish population based on CT images. A second objective was to test the sex estimation formulae on Southern Europeans by Kranioti and Apostol (2015). Univariate discriminant functions resulted in classification accuracy that ranged from 66 to 86%. The best single variable was found to be upper epiphyseal breadth (86%) followed by lower epiphyseal breadth (85%). Multivariate discriminant functions resulted in classification accuracy for cross-validated data that ranged from 79 to 86%. Applying the multivariate sex estimation formulae on Southern Europeans (SE) by Kranioti and Apostol in our sample resulted in very high classification accuracy ranging from 81 to 88%. In addition, 35.5-47% of the total Turkish sample is correctly classified with over 95% posterior probability, which is actually higher than the one reported for the original sample (25-43%). We conclude that the tibia is a very useful bone for sex estimation in the contemporary Turkish population. Moreover, our test results support the hypothesis that the SE formulae are sufficient for the contemporary Turkish population and they can be used safely for criminal investigations when posterior probabilities are over 95%. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Crystallization tendency of active pharmaceutical ingredients following rapid solvent evaporation--classification and comparison with crystallization tendency from undercooled melts.

PubMed

Van Eerdenbrugh, Bernard; Baird, Jared A; Taylor, Lynne S

2010-09-01

In this study, the crystallization behavior of a variety of compounds was studied following rapid solvent evaporation using spin coating. Initial screening to determine model compound suitability was performed using a structurally diverse set of 51 compounds in three different solvent systems [dichloromethane (DCM), a 1:1 (w/w) dichloromethane/ethanol mixture (MIX), and ethanol (EtOH)]. Of this starting set of 153 drug-solvent combinations, 93 (40 compounds) were selected for further evaluation based on solubility, chemical solution stability, and processability criteria. These systems were spin coated and their crystallization was monitored using polarized light microscopy (7 days, dry conditions). The crystallization behavior of the samples could be classified as rapid (Class I: 39 cases), intermediate (Class II: 23 cases), or slow (Class III: 31 cases). The solvent system employed influenced the classification outcome for only four of the compounds. The various compounds showed very diverse crystallization behavior. Upon comparison of classification results with those of a previous study, where cooling from the melt was used as a preparation technique, a good similarity was found whereby 68% of the cases were identically classified. Multivariate analysis was performed using a set of relevant physicochemical compound characteristics. It was found that a number of these parameters tended to differ between the different classes. These could be further interpreted in terms of the nature of the crystallization process. Additional multivariate analysis on the separate classes of compounds indicated some potential in predicting the crystallization tendency of a given compound.
Automated detection of radioisotopes from an aircraft platform by pattern recognition analysis of gamma-ray spectra.

PubMed

Dess, Brian W; Cardarelli, John; Thomas, Mark J; Stapleton, Jeff; Kroutil, Robert T; Miller, David; Curry, Timothy; Small, Gary W

2018-03-08

A generalized methodology was developed for automating the detection of radioisotopes from gamma-ray spectra collected from an aircraft platform using sodium-iodide detectors. Employing data provided by the U.S. Environmental Protection Agency Airborne Spectral Photometric Environmental Collection Technology (ASPECT) program, multivariate classification models based on nonparametric linear discriminant analysis were developed for application to spectra that were preprocessed through a combination of altitude-based scaling and digital filtering. Training sets of spectra for use in building classification models were assembled from a combination of background spectra collected in the field and synthesized spectra obtained by superimposing laboratory-collected spectra of target radioisotopes onto field backgrounds. This approach eliminated the need for field experimentation with radioactive sources for use in building classification models. Through a bi-Gaussian modeling procedure, the discriminant scores that served as the outputs from the classification models were related to associated confidence levels. This provided an easily interpreted result regarding the presence or absence of the signature of a specific radioisotope in each collected spectrum. Through the use of this approach, classifiers were built for cesium-137 (¹³⁷Cs) and cobalt-60 (⁶⁰Co), two radioisotopes that are of interest in airborne radiological monitoring applications. The optimized classifiers were tested with field data collected from a set of six geographically diverse sites, three of which contained either ¹³⁷Cs, ⁶⁰Co, or both. When the optimized classification models were applied, the overall percentages of correct classifications for spectra collected at these sites were 99.9 and 97.9% for the ⁶⁰Co and ¹³⁷Cs classifiers, respectively. Copyright © 2018 Elsevier Ltd. All rights reserved.
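One plausible reading of the bi-Gaussian step in this abstract is that the discriminant scores of each class are modeled with their own Gaussian, and a new score is converted to a confidence level through the resulting posterior probability. The sketch below follows that assumption; the abstract does not give the actual procedure, and the score values, equal priors and the function name `detection_confidence` are hypothetical.

```python
from statistics import NormalDist

# Hypothetical discriminant scores from training spectra.
background_scores = [-2.1, -1.8, -2.5, -1.9, -2.2, -1.6]
target_scores = [1.7, 2.3, 1.9, 2.6, 2.0, 1.5]

# Fit one Gaussian per class (the "bi-Gaussian" model assumed here).
background_model = NormalDist.from_samples(background_scores)
target_model = NormalDist.from_samples(target_scores)


def detection_confidence(score, prior_target=0.5):
    """Posterior probability that `score` came from the target Gaussian."""
    p_target = prior_target * target_model.pdf(score)
    p_background = (1.0 - prior_target) * background_model.pdf(score)
    return p_target / (p_target + p_background)


high = detection_confidence(2.0)   # score typical of the target class
low = detection_confidence(-2.0)   # score typical of background
```

A score well inside the target distribution maps to a confidence near 1, and a score well inside the background distribution maps to a confidence near 0, which gives the kind of easily interpreted presence/absence output the abstract describes.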
PGI chicory (Cichorium intybus L.) traceability by means of HRMAS-NMR spectroscopy: a preliminary study.

PubMed

Ritota, Mena; Casciani, Lorena; Valentini, Massimiliano

2013-05-01

Analytical traceability of PGI and PDO foods (Protected Geographical Indication and Protected Denomination Origin respectively) is one of the most challenging tasks of current applied research. Here we proposed a metabolomic approach based on the combination of ¹H high-resolution magic angle spinning-nuclear magnetic resonance (HRMAS-NMR) spectroscopy with multivariate analysis, i.e. PLS-DA, as a reliable tool for the traceability of Italian PGI chicories (Cichorium intybus L.), i.e. Radicchio Rosso di Treviso and Radicchio Variegato di Castelfranco, also known as red and red-spotted, respectively. The metabolic profile was gained by means of HRMAS-NMR, and multivariate data analysis allowed us to build statistical models capable of providing clear discrimination between the two varieties and classification according to the geographical origin. Based on Variable Importance in Projection values, the molecular markers for classifying the different types of red chicories analysed were found, accounting for both the cultivar and the place of origin. © 2012 Society of Chemical Industry.
Comparative study of different approaches for multivariate image analysis in HPTLC fingerprinting of natural products such as plant resin.

PubMed

Ristivojević, Petar; Trifković, Jelena; Vovk, Irena; Milojković-Opsenica, Dušanka

2017-01-01

Considering the introduction of phytochemical fingerprint analysis as a method of screening complex natural products for the presence of the most bioactive compounds, the use of chemometric classification methods, the application of powerful scanning, image capturing and processing devices and algorithms, and advances in the development of novel stationary phases as well as various separation modalities, high-performance thin-layer chromatography (HPTLC) fingerprinting is becoming an attractive and fruitful field of separation science. Multivariate image analysis is crucial in the light of proper data acquisition. In the current study, different image processing procedures were studied and compared in detail using HPTLC chromatograms of plant resins as an example. In that sense, obtained variables such as gray intensities of pixels along the solvent front, peak areas and mean values of peaks were used as input data and compared to obtain the best classification models. Important steps in image analysis, namely baseline removal, denoising, target peak alignment and normalization, were pointed out. The numerical data set based on the mean values of selected bands and the intensities of pixels along the solvent front proved to be the most convenient for planar-chromatographic profiling, although it required at least basic knowledge of image processing methodology, and could be proposed for further investigation in HPTLC fingerprinting. Copyright © 2016 Elsevier B.V. All rights reserved.
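The four preprocessing steps named in the abstract (baseline removal, denoising, target peak alignment, normalization) can be sketched on a one-dimensional intensity profile. This is only an illustration under strong assumptions: the baseline is taken as the profile minimum, denoising is a moving average, alignment shifts the tallest peak to a chosen index, and normalization divides by the maximum. None of these choices is taken from the study, and the profile values are invented.

```python
def remove_baseline(profile):
    """Subtract a constant baseline, assumed here to be the profile minimum."""
    floor = min(profile)
    return [v - floor for v in profile]


def denoise(profile, window=3):
    """Smooth with a centered moving average (shorter window at the edges)."""
    half = window // 2
    smoothed = []
    for i in range(len(profile)):
        lo, hi = max(0, i - half), min(len(profile), i + half + 1)
        smoothed.append(sum(profile[lo:hi]) / (hi - lo))
    return smoothed


def align_to_peak(profile, target_index):
    """Shift the profile so its tallest peak lands on target_index; pad with zeros."""
    shift = target_index - profile.index(max(profile))
    n = len(profile)
    return [profile[i - shift] if 0 <= i - shift < n else 0.0 for i in range(n)]


def normalize(profile):
    """Scale so that the tallest peak equals 1 (no-op for an all-zero profile)."""
    peak = max(profile)
    return [v / peak for v in profile] if peak else list(profile)


def preprocess(profile, target_index):
    """Apply the four steps in the order listed in the abstract."""
    return normalize(align_to_peak(denoise(remove_baseline(profile)), target_index))


# Hypothetical gray-intensity profile along one track, for illustration only.
raw = [5, 5, 6, 9, 20, 9, 6, 5, 5, 5]
processed = preprocess(raw, target_index=5)
```

After preprocessing, every profile has its main band at the same index and on the same intensity scale, so profiles from different plates can be stacked into one data matrix for a classifier.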
Understanding perception of active noise control system through multichannel EEG analysis.

PubMed

Bagha, Sangeeta; Tripathy, R K; Nanda, Pranati; Preetam, C; Das, Debi Prasad

2018-06-01

In this Letter, a method is proposed to investigate the effect of noise with and without active noise control (ANC) on multichannel electroencephalogram (EEG) signal. The multichannel EEG signal is recorded during different listening conditions such as silent, music, noise, ANC with background noise and ANC with both background noise and music. The multiscale analysis of EEG signal of each channel is performed using the discrete wavelet transform. The multivariate multiscale matrices are formulated based on the sub-band signals of each EEG channel. The singular value decomposition is applied to the multivariate matrices of multichannel EEG at significant scales. The singular value features at significant scales and the extreme learning machine classifier with three different activation functions are used for classification of multichannel EEG signal. The experimental results demonstrate that, for ANC with noise and ANC with noise and music classes, the proposed method has sensitivity values of 75.831% (p < 0.001) and 99.31% (p < 0.001), respectively. The method has an accuracy value of 83.22% for the classification of EEG signal with music and ANC with music as stimuli. The important finding of this study is that by the introduction of ANC, music can be better perceived by the human brain.
Classification bias in commercial business lists for retail food stores in the U.S.

PubMed

Han, Euna; Powell, Lisa M; Zenk, Shannon N; Rimkus, Leah; Ohri-Vachaspati, Punam; Chaloupka, Frank J

2012-04-18

Aspects of the food environment such as the availability of different types of food stores have recently emerged as key modifiable factors that may contribute to the increased prevalence of obesity. Given that many of these studies have derived their results based on secondary datasets and the relationship of food stores with individual weight outcomes has been reported to vary by store type, it is important to understand the extent to which often-used secondary data correctly classify food stores. We evaluated the classification bias of food stores in Dun & Bradstreet (D&B) and InfoUSA commercial business lists. We performed a full census in 274 randomly selected census tracts in the Chicago metropolitan area and collected detailed store attributes inside stores for classification. Store attributes were compared by classification match status and store type. Systematic classification bias by census tract characteristics was assessed in multivariate regression. D&B had a higher classification match rate than InfoUSA for supermarkets and grocery stores, while InfoUSA was higher for convenience stores. Both lists were more likely to correctly classify large supermarkets, grocery stores, and convenience stores with more cash registers and different types of service counters (supermarkets and grocery stores only). The likelihood of a correct classification match for supermarkets and grocery stores did not vary systematically by tract characteristics whereas convenience stores were more likely to be misclassified in predominately Black tracts. Researchers can rely on classification of food stores in commercial datasets for supermarkets and grocery stores whereas classifications for convenience and specialty food stores are subject to some systematic bias by neighborhood racial/ethnic composition.

Classification bias in commercial business lists for retail food stores in the U.S.

PubMed Central

2012-01-01

Background: Aspects of the food environment such as the availability of different types of food stores have recently emerged as key modifiable factors that may contribute to the increased prevalence of obesity. Given that many of these studies have derived their results based on secondary datasets and the relationship of food stores with individual weight outcomes has been reported to vary by store type, it is important to understand the extent to which often-used secondary data correctly classify food stores. We evaluated the classification bias of food stores in Dun & Bradstreet (D&B) and InfoUSA commercial business lists. Methods: We performed a full census in 274 randomly selected census tracts in the Chicago metropolitan area and collected detailed store attributes inside stores for classification. Store attributes were compared by classification match status and store type. Systematic classification bias by census tract characteristics was assessed in multivariate regression. Results: D&B had a higher classification match rate than InfoUSA for supermarkets and grocery stores, while InfoUSA was higher for convenience stores. Both lists were more likely to correctly classify large supermarkets, grocery stores, and convenience stores with more cash registers and different types of service counters (supermarkets and grocery stores only). The likelihood of a correct classification match for supermarkets and grocery stores did not vary systematically by tract characteristics whereas convenience stores were more likely to be misclassified in predominately Black tracts. Conclusion: Researchers can rely on classification of food stores in commercial datasets for supermarkets and grocery stores whereas classifications for convenience and specialty food stores are subject to some systematic bias by neighborhood racial/ethnic composition. PMID:22512874
Panic disorder and agoraphobia: A direct comparison of their multivariate comorbidity patterns.

PubMed

Greene, Ashley L; Eaton, Nicholas R

2016-01-15

Scientific debate has long surrounded whether agoraphobia is a severe consequence of panic disorder or a frequently comorbid diagnosis. Multivariate comorbidity investigations typically treat these diagnoses as fungible in structural models, assuming both are manifestations of the fear-subfactor in the internalizing-externalizing model. No studies have directly compared these disorders' multivariate associations, which could clarify their conceptualization in classification and comorbidity research. In a nationally representative sample (N=43,093), we examined the multivariate comorbidity of panic disorder (1) without agoraphobia, (2) with agoraphobia, and (3) regardless of agoraphobia; and (4) agoraphobia without panic. We conducted exploratory and confirmatory factor analyses of these and 10 other lifetime DSM-IV diagnoses in a nationally representative sample (N=43,093). Differing bivariate and multivariate relations were found. Panic disorder without agoraphobia was largely a distress disorder, related to emotional disorders. Agoraphobia without panic was largely a fear disorder, related to phobias. When considered jointly, concomitant agoraphobia and panic was a fear disorder, and when panic was assessed without regard to agoraphobia (some individuals had agoraphobia while others did not) it was a mixed distress and fear disorder. Diagnoses were obtained from comprehensively trained lay interviewers, not clinicians and analyses used DSM-IV diagnoses (rather than DSM-5). These findings support the conceptualization of agoraphobia as a distinct diagnostic entity and the independent classification of both disorders in DSM-5, suggesting future multivariate comorbidity studies should not assume various panic/agoraphobia diagnoses are invariably fear disorders. Copyright © 2015 Elsevier B.V. All rights reserved.
Optical biopsy using fluorescence spectroscopy for prostate cancer diagnosis

NASA Astrophysics Data System (ADS)

Wu, Binlin; Gao, Xin; Smith, Jason; Bailin, Jacob

2017-02-01

Native fluorescence spectra are acquired from fresh normal and cancerous human prostate tissues. The fluorescence data are analyzed using a multivariate analysis algorithm such as non-negative matrix factorization. The nonnegative spectral components are retrieved and attributed to the native fluorophores such as collagen, reduced nicotinamide adenine dinucleotide (NADH), and flavin adenine dinucleotide (FAD) in tissue. The retrieved weights of the components, e.g. NADH and FAD are used to estimate the relative concentrations of the native fluorophores and the redox ratio. A machine learning algorithm such as support vector machine (SVM) is used for classification to distinguish normal and cancerous tissue samples based on either the relative concentrations of NADH and FAD or the redox ratio alone. The classification performance is shown based on statistical measures such as sensitivity, specificity, and accuracy, along with the area under receiver operating characteristic (ROC) curve. A cross validation method such as leave-one-out is used to evaluate the predictive performance of the SVM classifier to avoid bias due to overfitting.
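The evaluation scheme in this abstract (leave-one-out cross-validation reported as sensitivity, specificity and accuracy) can be sketched as follows. To stay self-contained, the sketch swaps the SVM for a much simpler nearest-class-mean rule on a single feature, so it illustrates the validation loop rather than the authors' classifier. The feature values stand in for a redox ratio and are invented.

```python
def nearest_mean_predict(train, x):
    """Predict a label (1 = cancerous, 0 = normal) from the closer class mean.

    This stand-in rule replaces the SVM used in the study.
    """
    means = {}
    for label in (0, 1):
        values = [v for v, y in train if y == label]
        means[label] = sum(values) / len(values)
    return min(means, key=lambda label: abs(x - means[label]))


def leave_one_out(samples):
    """Hold out each sample once, train on the rest, and tally the outcomes."""
    tp = tn = fp = fn = 0
    for i, (x, y) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]
        pred = nearest_mean_predict(train, x)
        if y == 1 and pred == 1:
            tp += 1
        elif y == 0 and pred == 0:
            tn += 1
        elif y == 0 and pred == 1:
            fp += 1
        else:
            fn += 1
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(samples),
    }


# Hypothetical (feature, label) pairs, invented for illustration only.
samples = [
    (0.30, 0), (0.35, 0), (0.32, 0), (0.38, 0),
    (0.55, 1), (0.60, 1), (0.58, 1), (0.52, 1),
]
metrics = leave_one_out(samples)
```

Because each prediction is made for a sample the model never saw during training, the resulting metrics are less optimistic than those computed on the training data, which is the overfitting bias the abstract refers to.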
Discrete Fourier Transform-Based Multivariate Image Analysis: Application to Modeling of Aromatase Inhibitory Activity.

PubMed

Barigye, Stephen J; Freitas, Matheus P; Ausina, Priscila; Zancan, Patricia; Sola-Penna, Mauro; Castillo-Garit, Juan A

2018-02-12

We recently generalized the formerly alignment-dependent multivariate image analysis applied to quantitative structure-activity relationships (MIA-QSAR) method through the application of the discrete Fourier transform (DFT), allowing for its application to noncongruent and structurally diverse chemical compound data sets. Here we report the first practical application of this method in the screening of molecular entities of therapeutic interest, with human aromatase inhibitory activity as the case study. We developed an ensemble classification model based on the two-dimensional (2D) DFT MIA-QSAR descriptors, with which we screened the NCI Diversity Set V (1593 compounds) and obtained 34 chemical compounds with possible aromatase inhibitory activity. These compounds were docked into the aromatase active site, and the 10 most promising compounds were selected for in vitro experimental validation. Of these compounds, 7419 (nonsteroidal) and 89 201 (steroidal) demonstrated satisfactory antiproliferative and aromatase inhibitory activities. The obtained results suggest that the 2D-DFT MIA-QSAR method may be useful in ligand-based virtual screening of new molecular entities of therapeutic utility.
Evaluation of Urinary Tract Dilation Classification System for Grading Postnatal Hydronephrosis.

PubMed

Hodhod, Amr; Capolicchio, John-Paul; Jednak, Roman; El-Sherif, Eid; El-Doray, Abd El-Alim; El-Sherbiny, Mohamed

2016-03-01

We assessed the reliability and validity of the Urinary Tract Dilation classification system as a new grading system for postnatal hydronephrosis. We retrospectively reviewed charts of patients who presented with hydronephrosis from 2008 to 2013. We included patients diagnosed prenatally and those with hydronephrosis discovered incidentally during the first year of life. We excluded cases involving urinary tract infection, neurogenic bladder and chromosomal anomalies, those associated with extraurinary congenital malformations and those with followup of less than 24 months without resolution. Hydronephrosis was graded postnatally using the Society for Fetal Urology system, and then the management protocol was chosen. All units were regraded using the Urinary Tract Dilation classification system and compared to the Society for Fetal Urology system to assess reliability. Univariate and multivariate analyses were performed to assess the validity of the Urinary Tract Dilation classification system in predicting hydronephrosis resolution and surgical intervention. A total of 490 patients (730 renal units) were eligible to participate. The Urinary Tract Dilation classification system was reliable in the assessment of hydronephrosis (parallel forms 0.92). Hydronephrosis resolved in 357 units (49%), and 86 units (12%) were managed by surgical intervention. The remainder of renal units demonstrated stable or improved hydronephrosis. Multivariate analysis revealed that the likelihood of surgical intervention was predicted independently by Urinary Tract Dilation classification system risk group, while Society for Fetal Urology grades were predictive of likelihood of resolution. The Urinary Tract Dilation classification system is reliable for evaluation of postnatal hydronephrosis and is valid in predicting surgical intervention. Copyright © 2016 American Urological Association Education and Research, Inc. Published by Elsevier Inc. All rights reserved.
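The percentages quoted in the abstract follow directly from the unit counts. The short check below recomputes them. The count for the remaining units is derived here by subtraction and is not stated in the abstract.

```python
total_units = 730   # renal units eligible
resolved = 357      # units in which hydronephrosis resolved
surgery = 86        # units managed by surgical intervention

# Units with stable or improved hydronephrosis (derived, not quoted in the abstract).
remainder = total_units - resolved - surgery


def percent(count):
    """Share of all renal units, rounded to the nearest whole percent."""
    return round(100 * count / total_units)


shares = {
    "resolved": percent(resolved),
    "surgery": percent(surgery),
    "stable_or_improved": percent(remainder),
}
```

The rounded shares for resolution and surgery reproduce the 49% and 12% given in the abstract.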
Models of Marine Fish Biodiversity: Assessing Predictors from Three Habitat Classification Schemes.

PubMed

Yates, Katherine L; Mellin, Camille; Caley, M Julian; Radford, Ben T; Meeuwig, Jessica J

2016-01-01

Prioritising biodiversity conservation requires knowledge of where biodiversity occurs. Such knowledge, however, is often lacking. New technologies for collecting biological and physical data coupled with advances in modelling techniques could help address these gaps and facilitate improved management outcomes. Here we examined the utility of environmental data, obtained using different methods, for developing models of both uni- and multivariate biodiversity metrics. We tested which biodiversity metrics could be predicted best and evaluated the performance of predictor variables generated from three types of habitat data: acoustic multibeam sonar imagery, predicted habitat classification, and direct observer habitat classification. We used boosted regression trees (BRT) to model metrics of fish species richness, abundance and biomass, and multivariate regression trees (MRT) to model biomass and abundance of fish functional groups. We compared model performance using different sets of predictors and estimated the relative influence of individual predictors. Models of total species richness and total abundance performed best; those developed for endemic species performed worst. Abundance models performed substantially better than corresponding biomass models. In general, BRT and MRTs developed using predicted habitat classifications performed less well than those using multibeam data. The most influential individual predictor was the abiotic categorical variable from direct observer habitat classification and models that incorporated predictors from direct observer habitat classification consistently outperformed those that did not. Our results show that while remotely sensed data can offer considerable utility for predictive modelling, the addition of direct observer habitat classification data can substantially improve model performance. 
Thus it appears that there are aspects of marine habitats that are important for modelling metrics of fish biodiversity that are not fully captured by remotely sensed data. As such, the use of remotely sensed data to model biodiversity represents a compromise between model performance and data availability.
Models of Marine Fish Biodiversity: Assessing Predictors from Three Habitat Classification Schemes

PubMed Central

Yates, Katherine L.; Mellin, Camille; Caley, M. Julian; Radford, Ben T.; Meeuwig, Jessica J.

2016-01-01

Prioritising biodiversity conservation requires knowledge of where biodiversity occurs. Such knowledge, however, is often lacking. New technologies for collecting biological and physical data coupled with advances in modelling techniques could help address these gaps and facilitate improved management outcomes. Here we examined the utility of environmental data, obtained using different methods, for developing models of both uni- and multivariate biodiversity metrics. We tested which biodiversity metrics could be predicted best and evaluated the performance of predictor variables generated from three types of habitat data: acoustic multibeam sonar imagery, predicted habitat classification, and direct observer habitat classification. We used boosted regression trees (BRT) to model metrics of fish species richness, abundance and biomass, and multivariate regression trees (MRT) to model biomass and abundance of fish functional groups. We compared model performance using different sets of predictors and estimated the relative influence of individual predictors. Models of total species richness and total abundance performed best; those developed for endemic species performed worst. Abundance models performed substantially better than corresponding biomass models. In general, BRT and MRTs developed using predicted habitat classifications performed less well than those using multibeam data. The most influential individual predictor was the abiotic categorical variable from direct observer habitat classification and models that incorporated predictors from direct observer habitat classification consistently outperformed those that did not. Our results show that while remotely sensed data can offer considerable utility for predictive modelling, the addition of direct observer habitat classification data can substantially improve model performance. 
Thus it appears that there are aspects of marine habitats that are important for modelling metrics of fish biodiversity that are not fully captured by remotely sensed data. As such, the use of remotely sensed data to model biodiversity represents a compromise between model performance and data availability. PMID:27333202
BANYAN. XI. The BANYAN Σ Multivariate Bayesian Algorithm to Identify Members of Young Associations with 150 pc

NASA Astrophysics Data System (ADS)

Gagné, Jonathan; Mamajek, Eric E.; Malo, Lison; Riedel, Adric; Rodriguez, David; Lafrenière, David; Faherty, Jacqueline K.; Roy-Loubier, Olivier; Pueyo, Laurent; Robin, Annie C.; Doyon, René

2018-03-01

BANYAN Σ is a new Bayesian algorithm to identify members of young stellar associations within 150 pc of the Sun. It includes 27 young associations with ages in the range ∼1–800 Myr, modeled with multivariate Gaussians in six-dimensional (6D) XYZUVW space. It is the first such multi-association classification tool to include the nearest sub-groups of the Sco-Cen OB star-forming region, the IC 2602, IC 2391, Pleiades and Platais 8 clusters, and the ρ Ophiuchi, Corona Australis, and Taurus star formation regions. A model of field stars is built from a mixture of multivariate Gaussians based on the Besançon Galactic model. The algorithm can derive membership probabilities for objects with only sky coordinates and proper motion, but can also include parallax and radial velocity measurements, as well as spectrophotometric distance constraints from sequences in color–magnitude or spectral type–magnitude diagrams. BANYAN Σ benefits from an analytical solution to the Bayesian marginalization integrals over unknown radial velocities and distances that makes it more accurate and significantly faster than its predecessor BANYAN II. A contamination versus hit rate analysis is presented and demonstrates that BANYAN Σ achieves a better classification performance than other moving group tools available in the literature, especially in terms of cross-contamination between young associations. An updated list of bona fide members in the 27 young associations, augmented by the Gaia-DR1 release, as well as all parameters for the 6D multivariate Gaussian models for each association and the Galactic field neighborhood within 300 pc are presented. This new tool will make it possible to analyze large data sets such as the upcoming Gaia-DR2 to identify new young stars. IDL and Python versions of BANYAN Σ are made available with this publication, and a more limited online web tool is available at http://www.exoplanetes.umontreal.ca/banyan/banyansigma.php.
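The core idea of Bayesian membership classification with Gaussian models can be sketched in a few lines. This is a simplified illustration, not the BANYAN Σ implementation: it assumes all six XYZUVW coordinates are known, uses diagonal covariances instead of full 6D covariance matrices, and omits the marginalization over unknown radial velocity and distance. The group names, means, widths and priors are hypothetical.

```python
import math


def log_gaussian_diag(x, mean, sigma):
    """Log-density of a Gaussian with diagonal covariance (simplifying assumption)."""
    return sum(
        -0.5 * ((xi - mi) / si) ** 2 - math.log(si) - 0.5 * math.log(2.0 * math.pi)
        for xi, mi, si in zip(x, mean, sigma)
    )


def membership_probabilities(x, models, priors):
    """Bayes' rule over competing hypotheses, evaluated in log space for stability."""
    log_post = {
        name: math.log(priors[name]) + log_gaussian_diag(x, mean, sigma)
        for name, (mean, sigma) in models.items()
    }
    peak = max(log_post.values())
    weights = {name: math.exp(lp - peak) for name, lp in log_post.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


# Hypothetical models: (mean, sigma) in XYZ [pc] and UVW [km/s].
models = {
    "assoc_A": ((10.0, -5.0, 3.0, -12.0, -20.0, -6.0), (5.0, 5.0, 5.0, 1.5, 1.5, 1.5)),
    "field": ((0.0, 0.0, 0.0, 0.0, 0.0, 0.0), (100.0, 100.0, 50.0, 30.0, 20.0, 15.0)),
}
priors = {"assoc_A": 0.01, "field": 0.99}  # assumed prior membership fractions

star_in_group = (10.0, -5.0, 3.0, -12.0, -20.0, -6.0)
star_in_field = (0.0, 0.0, 0.0, 0.0, 0.0, 0.0)

p_group = membership_probabilities(star_in_group, models, priors)
p_field = membership_probabilities(star_in_field, models, priors)
```

Even with a prior that strongly favors the field, a star sitting at the center of the compact association model receives a high membership probability, because the association's narrow velocity dispersion makes its likelihood far larger there.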
Comprehensive analysis of Polygoni Multiflori Radix of different geographical origins using ultra-high-performance liquid chromatography fingerprints and multivariate chemometric methods.

PubMed

Sun, Li-Li; Wang, Meng; Zhang, Hui-Jie; Liu, Ya-Nan; Ren, Xiao-Liang; Deng, Yan-Ru; Qi, Ai-Di

2018-01-01

Polygoni Multiflori Radix (PMR) is increasingly being used not just as a traditional herbal medicine but also as a popular functional food. In this study, multivariate chemometric methods and mass spectrometry were combined to analyze the ultra-high-performance liquid chromatograph (UPLC) fingerprints of PMR from six different geographical origins. A chemometric strategy based on multivariate curve resolution-alternating least squares (MCR-ALS) and three classification methods is proposed to analyze the UPLC fingerprints obtained. Common chromatographic problems, including the background contribution, baseline contribution, and peak overlap, were handled by the established MCR-ALS model. A total of 22 components were resolved. Moreover, relative species concentrations were obtained from the MCR-ALS model, which was used for multivariate classification analysis. Principal component analysis (PCA) and Ward's method have been applied to classify 72 PMR samples from six different geographical regions. The PCA score plot showed that the PMR samples fell into four clusters, which related to the geographical location and climate of the source areas. The results were then corroborated by Ward's method. In addition, according to the variance-weighted distance between cluster centers obtained from Ward's method, five components were identified as the most significant variables (chemical markers) for cluster discrimination. A counter-propagation artificial neural network has been applied to confirm and predict the effects of chemical markers on different samples. Finally, the five chemical markers were identified by UPLC-quadrupole time-of-flight mass spectrometer. Components 3, 12, 16, 18, and 19 were identified as 2,3,5,4'-tetrahydroxy-stilbene-2-O-β-d-glucoside, emodin-8-O-β-d-glucopyranoside, emodin-8-O-(6'-O-acetyl)-β-d-glucopyranoside, emodin, and physcion, respectively. In conclusion, the proposed method can be applied for the comprehensive analysis of natural samples. 
Copyright © 2016. Published by Elsevier B.V.
Multivariate evaluation of Thyroid Imaging Reporting and Data System (TI-RADS) in diagnosis malignant thyroid nodule: application to PCA and PLS-DA analysis.

PubMed

Zhang, Tan; Li, Fangxuan; Mu, Jiali; Liu, Juntian; Zhang, Sheng

2017-06-01

To explore the significance of ultrasonic features in the differential diagnosis of thyroid nodules by combining the thyroid imaging reporting and data system (TI-RADS) with multivariate statistical analysis. Patients who received surgical treatment and were diagnosed with a single thyroid nodule by postoperative pathology and preoperative ultrasound were enrolled in this study. Multivariate analysis was applied to assess the significant ultrasonic features that correlated with distinguishing benign from malignant nodules and with grading the TI-RADS classification of thyroid nodules. There were significant differences in the nodule size, aspect ratio, internal, echogenicity, boundary, presence or absence of calcifications, calcification type and CDFI between benign and malignant thyroid nodules. Multivariate analysis showed a clear-cut distinction both between benign and malignant nodules and among the different TI-RADS categories of malignant nodules. The shape and calcification of the nodule were important factors for distinguishing benign from malignant nodules. The height, aspect and calcification of the nodule were important factors for grading the TI-RADS categories of malignant thyroid nodules. An ill-defined boundary, irregular shape and the presence of calcification were related to a high risk of malignancy in thyroid nodules. Greater height and aspect and the presence of calcification were related to a higher TI-RADS classification of malignant thyroid nodules.
Screening analysis of biodiesel feedstock using UV-vis, NIR and synchronous fluorescence spectrometries and the successive projections algorithm.

PubMed

Insausti, Matías; Gomes, Adriano A; Cruz, Fernanda V; Pistonesi, Marcelo F; Araujo, Mario C U; Galvão, Roberto K H; Pereira, Claudete F; Band, Beatriz S F

2012-08-15

This paper investigates the use of UV-vis, near infrared (NIR) and synchronous fluorescence (SF) spectrometries coupled with multivariate classification methods to discriminate biodiesel samples with respect to the base oil employed in their production. More specifically, the present work extends previous studies by investigating the discrimination of corn-based biodiesel from two other biodiesel types (sunflower and soybean). Two classification methods are compared, namely full-spectrum SIMCA (soft independent modelling of class analogies) and SPA-LDA (linear discriminant analysis with variables selected by the successive projections algorithm). Regardless of the spectrometric technique employed, full-spectrum SIMCA did not provide an appropriate discrimination of the three biodiesel types. In contrast, all samples were correctly classified on the basis of a reduced number of wavelengths selected by SPA-LDA. It can be concluded that UV-vis, NIR and SF spectrometries can be successfully employed to discriminate corn-based biodiesel from the two other biodiesel types, but wavelength selection by SPA-LDA is key to the proper separation of the classes. Copyright © 2012 Elsevier B.V. All rights reserved.
Impact of FAB classification on predicting outcome in acute myeloid leukemia, not otherwise specified, patients undergoing allogeneic stem cell transplantation in CR1: An analysis of 1690 patients from the acute leukemia working party of EBMT.

PubMed

Canaani, Jonathan; Beohou, Eric; Labopin, Myriam; Socié, Gerard; Huynh, Anne; Volin, Liisa; Cornelissen, Jan; Milpied, Noel; Gedde-Dahl, Tobias; Deconinck, Eric; Fegueux, Nathalie; Blaise, Didier; Mohty, Mohamad; Nagler, Arnon

2017-04-01

The French, American, and British (FAB) classification system for acute myeloid leukemia (AML) is extensively used and is incorporated into the AML, not otherwise specified (NOS) category in the 2016 WHO edition of myeloid neoplasm classification. While recent data suggest that FAB classification does not provide additional prognostic information for patients for whom NPM1 status is available, it is unknown whether FAB still retains a prognostic role in predicting outcome of AML patients undergoing allogeneic stem cell transplantation. Using the European Society of Blood and Bone Marrow Transplantation registry we analyzed outcome of 1690 patients transplanted in CR1 to determine if FAB classification provides additional prognostic value. Multivariate analysis revealed that M6/M7 patients had decreased leukemia free survival (hazard ratio (HR) of 1.41, 95% confidence interval (CI), 1.01-1.99; P = .046) in addition to increased nonrelapse mortality (NRM) rates (HR, 1.79; 95% CI, 1.06-3.01; P = .028) compared with other FAB types. In the NPM1 wt AML, NOS cohort, FAB M6/M7 was also associated with increased NRM (HR, 2.17; 95% CI, 1.14-4.16; P = .019). Finally, in FLT3-ITD + patients, multivariate analyses revealed that specific FAB types were tightly associated with adverse outcome. In conclusion, FAB classification may predict outcome following transplantation in AML, NOS patients. © 2017 Wiley Periodicals, Inc.
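The reported hazard ratios, confidence intervals and P values can be cross-checked against one another with a standard approximation. The sketch assumes the intervals are Wald intervals that are symmetric on the log scale (the abstract does not state how they were computed), so the recovered P values are expected to agree only approximately with the published ones.

```python
import math


def p_value_from_hr_ci(hr, lower, upper, z_crit=1.96):
    """Approximate two-sided P value from a hazard ratio and its 95% CI.

    Assumes a Wald interval on the log scale: log(HR) +/- z_crit * SE.
    """
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return se, z, p


# (HR, CI lower, CI upper, reported P) as quoted in the abstract.
reported = [
    (1.41, 1.01, 1.99, 0.046),  # leukemia-free survival, M6/M7
    (1.79, 1.06, 3.01, 0.028),  # nonrelapse mortality, M6/M7
    (2.17, 1.14, 4.16, 0.019),  # nonrelapse mortality, NPM1 wt cohort
]

recomputed = [p_value_from_hr_ci(hr, lo, hi)[2] for hr, lo, hi, _ in reported]
```

For all three estimates the recomputed values fall within a few thousandths of the published P values, which is consistent with rounding of the hazard ratios and interval bounds.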
Dual-modal cancer detection based on optical pH sensing and Raman spectroscopy

NASA Astrophysics Data System (ADS)

Kim, Soogeun; Lee, Seung Ho; Min, Sun Young; Byun, Kyung Min; Lee, Soo Yeol

2017-10-01

A dual-modal approach using Raman spectroscopy and optical pH sensing was investigated to discriminate between normal and cancerous tissues. Raman spectroscopy has demonstrated the potential for in vivo cancer detection. However, Raman spectroscopy has suffered from the strong fluorescence background of biological samples and subtle spectral differences between normal and diseased tissues. To overcome those issues, pH sensing is combined with Raman spectroscopy as a dual-modal approach. Based on the fact that the pH level in cancerous tissues is lower than that in normal tissues due to insufficient vasculature formation, the dual-modal approach combining the chemical information of the Raman spectrum and the metabolic information of the pH level can improve the specificity of cancer diagnosis. From human breast tissue samples, Raman spectra and pH levels are measured using fiber-optic-based Raman and pH probes, respectively. The pH sensing is based on the dependence of the optical transmission spectrum on pH level. Multivariate statistical analysis is performed to evaluate the classification capability of the dual-modal method. The analytical results show that the dual-modal method based on Raman spectroscopy and optical pH sensing can improve the performance of cancer classification.
Photometric redshift estimation based on data mining with PhotoRApToR

NASA Astrophysics Data System (ADS)

Cavuoti, S.; Brescia, M.; De Stefano, V.; Longo, G.

2015-03-01

Photometric redshifts (photo-z) are crucial to the scientific exploitation of modern panchromatic digital surveys. In this paper we present PhotoRApToR (Photometric Research Application To Redshift): a Java/C++ based desktop application capable of solving non-linear regression and multivariate classification problems, specialized in particular for photo-z estimation. It embeds a machine learning algorithm, namely a multi-layer neural network trained by the quasi-Newton learning rule, and special tools dedicated to pre- and post-processing data. PhotoRApToR has been successfully tested on several scientific cases. The application is available for free download from the DAME Program web site.
Use of multivariate analysis to suggest a new molecular classification of colorectal cancer

PubMed Central

Domingo, Enric; Ramamoorthy, Rajarajan; Oukrif, Dahmane; Rosmarin, Daniel; Presz, Michal; Wang, Haitao; Pulker, Hannah; Lockstone, Helen; Hveem, Tarjei; Cranston, Treena; Danielsen, Havard; Novelli, Marco; Davidson, Brian; Xu, Zheng-Zhou; Molloy, Peter; Johnstone, Elaine; Holmes, Christopher; Midgley, Rachel; Kerr, David; Sieber, Oliver; Tomlinson, Ian

2013-01-01

Molecular classification of colorectal cancer (CRC) is currently based on microsatellite instability (MSI), KRAS or BRAF mutation and, occasionally, chromosomal instability (CIN). Whilst useful, these categories may not fully represent the underlying molecular subgroups. We screened 906 stage II/III CRCs from the VICTOR clinical trial for somatic mutations. Multivariate analyses (logistic regression, clustering, Bayesian networks) identified the primary molecular associations. Positive associations occurred between: CIN and TP53 mutation; MSI and BRAF mutation; and KRAS and PIK3CA mutations. Negative associations occurred between: MSI and CIN; MSI and NRAS mutation; and KRAS mutation, and each of NRAS, TP53 and BRAF mutations. Some complex relationships were elucidated: KRAS and TP53 mutations had both a direct negative association and a weaker, confounding, positive association via TP53–CIN–MSI–BRAF–KRAS. Our results suggested a new molecular classification of CRCs: (1) MSI+ and/or BRAF-mutant; (2) CIN+ and/or TP53-mutant, with wild-type KRAS and PIK3CA; (3) KRAS- and/or PIK3CA-mutant, CIN+, TP53-wild-type; (4) KRAS- and/or PIK3CA-mutant, CIN–, TP53-wild-type; (5) NRAS-mutant; (6) no mutations; (7) others. As expected, group 1 cancers were mostly proximal and poorly differentiated, usually occurring in women. Unexpectedly, two different types of CIN+ CRC were found: group 2 cancers were usually distal and occurred in men, whereas group 3 showed neither of these associations but were of higher stage. CIN+ cancers have conventionally been associated with all three of these variables, because they have been tested en masse. Our classification also showed potentially improved prognostic capabilities, with group 3, and possibly group 1, independently predicting disease-free survival. Copyright © 2012 Pathological Society of Great Britain and Ireland. Published by John Wiley & Sons, Ltd. PMID:23165447
Single-particle cryo-EM using alignment by classification (ABC): the structure of Lumbricus terrestris haemoglobin.

PubMed

Afanasyev, Pavel; Seer-Linnemayr, Charlotte; Ravelli, Raimond B G; Matadeen, Rishi; De Carlo, Sacha; Alewijnse, Bart; Portugal, Rodrigo V; Pannu, Navraj S; Schatz, Michael; van Heel, Marin

2017-09-01

Single-particle cryogenic electron microscopy (cryo-EM) can now yield near-atomic resolution structures of biological complexes. However, the reference-based alignment algorithms commonly used in cryo-EM suffer from reference bias, limiting their applicability (also known as the 'Einstein from random noise' problem). Low-dose cryo-EM therefore requires robust and objective approaches to reveal the structural information contained in the extremely noisy data, especially when dealing with small structures. A reference-free pipeline is presented for obtaining near-atomic resolution three-dimensional reconstructions from heterogeneous ('four-dimensional') cryo-EM data sets. The methodologies integrated in this pipeline include a posteriori camera correction, movie-based full-data-set contrast transfer function determination, movie-alignment algorithms, (Fourier-space) multivariate statistical data compression and unsupervised classification, 'random-startup' three-dimensional reconstructions, four-dimensional structural refinements and Fourier shell correlation criteria for evaluating anisotropic resolution. The procedures exclusively use information emerging from the data set itself, without external 'starting models'. Euler-angle assignments are performed by angular reconstitution rather than by the inherently slower projection-matching approaches. The comprehensive 'ABC-4D' pipeline is based on the two-dimensional reference-free 'alignment by classification' (ABC) approach, where similar images in similar orientations are grouped by unsupervised classification. Some fundamental differences between X-ray crystallography versus single-particle cryo-EM data collection and data processing are discussed. The structure of the giant haemoglobin from Lumbricus terrestris at a global resolution of ∼3.8 Å is presented as an example of the use of the ABC-4D procedure.
Sample classification for improved performance of PLS models applied to the quality control of deep-frying oils of different botanic origins analyzed using ATR-FTIR spectroscopy.

PubMed

Kuligowski, Julia; Carrión, David; Quintás, Guillermo; Garrigues, Salvador; de la Guardia, Miguel

2011-01-01

The selection of an appropriate calibration set is a critical step in multivariate method development. In this work, the effect of using different calibration sets, based on a previous classification of unknown samples, on the partial least squares (PLS) regression model performance is discussed. As an example, attenuated total reflection (ATR) mid-infrared spectra of deep-fried vegetable oil samples from three botanical origins (olive, sunflower, and corn oil), with increasing polymerized triacylglyceride (PTG) content induced by a deep-frying process, were employed. The use of a one-class-classifier partial least squares-discriminant analysis (PLS-DA) and a rooted binary directed acyclic graph tree provided accurate oil classification. Oil samples fried without foodstuff could be classified correctly, independent of their PTG content. However, class separation of oil samples fried with foodstuff was less evident. Double-cross model validation combined with permutation testing was used to validate the obtained PLS-DA classification models, confirming the results. To assess the usefulness of selecting an appropriate PLS calibration set, the PTG content was determined by calculating a PLS model based on the previously selected classes. In comparison to a PLS model calculated using a pooled calibration set containing samples from all classes, the root mean square error of prediction could be improved significantly using PLS models based on the selected calibration sets using PLS-DA, ranging between 1.06 and 2.91% (w/w).
Multivariate geometry as an approach to algal community analysis

USGS Publications Warehouse

Allen, T.F.H.; Skagen, S.

1973-01-01

Multivariate analyses are put in the context of more usual approaches to phycological investigations. The intuitive common sense involved in methods of ordination, classification and discrimination is emphasised by simple geometric accounts which avoid jargon and matrix algebra. Warnings are given that artifacts result from abuse of the techniques by the naive or over-enthusiastic. An analysis of a simple periphyton data set is presented as an example of the approach. Suggestions are made as to situations in phycological investigations where the techniques could be appropriate. The discipline is reprimanded for its neglect of the multivariate approach.
Accuracy assessments and areal estimates using two-phase stratified random sampling, cluster plots, and the multivariate composite estimator

Treesearch

Raymond L. Czaplewski

2000-01-01

Consider the following example of an accuracy assessment. Landsat data are used to build a thematic map of land cover for a multicounty region. The map classifier (e.g., a supervised classification algorithm) assigns each pixel into one category of land cover. The classification system includes 12 different types of forest and land cover: black spruce, balsam fir,...
Multivariate spline methods in surface fitting

NASA Technical Reports Server (NTRS)

Guseman, L. F., Jr. (Principal Investigator); Schumaker, L. L.

1984-01-01

The use of spline functions in the development of classification algorithms is examined. In particular, a method is formulated for producing spline approximations to bivariate density functions where the density function is described by a histogram of measurements. The resulting approximations are then incorporated into a Bayesian classification procedure for which the Bayes decision regions and the probability of misclassification are readily computed. Some preliminary numerical results are presented to illustrate the method.
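As a rough illustration of the Bayes decision rule mentioned in this abstract, the sketch below estimates class-conditional densities from plain histograms and assigns a measurement to the class with the largest prior-weighted density. It is a simplification under stated assumptions: it is one-dimensional rather than bivariate, it uses the raw histogram instead of the spline approximation that the report develops, and all function names, bin settings and data are hypothetical.

```python
def bin_index(x, lo, hi, bins):
    """Map x to a histogram bin on [lo, hi], clamping at the edges."""
    width = (hi - lo) / bins
    return max(0, min(int((x - lo) / width), bins - 1))


def histogram_density(samples, lo, hi, bins):
    """Piecewise-constant density estimate: count / (n * bin width)."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in samples:
        counts[bin_index(x, lo, hi, bins)] += 1
    return [c / (len(samples) * width) for c in counts]


def bayes_classify(x, densities, priors, lo, hi, bins):
    """Pick the class maximizing prior * class-conditional density at x."""
    idx = bin_index(x, lo, hi, bins)
    scores = {label: priors[label] * densities[label][idx] for label in densities}
    return max(scores, key=scores.get)


# Hypothetical training measurements for two classes (not from the report).
training = {"A": [0.10, 0.20, 0.25, 0.30], "B": [0.70, 0.75, 0.80, 0.90]}
LO, HI, BINS = 0.0, 1.0, 4
densities = {label: histogram_density(v, LO, HI, BINS) for label, v in training.items()}
priors = {"A": 0.5, "B": 0.5}

label_low = bayes_classify(0.2, densities, priors, LO, HI, BINS)
label_high = bayes_classify(0.85, densities, priors, LO, HI, BINS)
```

Replacing the histogram lookup with a smooth spline fit would give continuous decision boundaries, which is presumably what makes the decision regions and misclassification probability easy to compute in the report's setting.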

Serum C-reactive protein level in COPD patients stratified according to GOLD 2011 grading classification

PubMed Central

Lin, Yi-Hua; Wang, Wan-Yu; Hu, Su-Xian; Shi, Yong-Hong

2016-01-01

Background and Objective: The Global Initiative for Chronic Obstructive Lung Disease (GOLD) 2011 grading classification has been used to evaluate the severity of patients with chronic obstructive pulmonary disease (COPD). However, little is known about the relationship between systemic inflammation and this classification. We aimed to study the relationship between serum CRP and the components of the GOLD 2011 grading classification. Methods: C-reactive protein (CRP) levels were measured in 391 clinically stable COPD patients and in 50 controls from June 2, 2015 to October 31, 2015 in the First Affiliated Hospital of Xiamen University. The association between CRP levels and the components of the GOLD 2011 grading classification was assessed. Results: Correlation was found with the following variables: GOLD 2011 group (0.240), age (0.227), pack-years (0.136), forced expiratory volume in one second % predicted (FEV1%; -0.267), forced vital capacity % predicted (-0.210), number of acute exacerbations in the past year (0.265), number of hospitalized exacerbations in the past year (0.165), British Medical Research Council dyspnoea scale (0.121), COPD assessment test score (CAT, 0.233). Using multivariate analysis, FEV1% and CAT score manifested the strongest negative association with CRP levels. Conclusions: CRP levels differ in COPD patients among groups A-D based on GOLD 2011 grading classification. CRP levels are associated with several important clinical variables, of which FEV1% and CAT score manifested the strongest negative correlation. PMID:28083044
FDG-PET and CSF biomarker accuracy in prediction of conversion to different dementias in a large multicentre MCI cohort.

PubMed

Caminiti, Silvia Paola; Ballarini, Tommaso; Sala, Arianna; Cerami, Chiara; Presotto, Luca; Santangelo, Roberto; Fallanca, Federico; Vanoli, Emilia Giovanna; Gianolli, Luigi; Iannaccone, Sandro; Magnani, Giuseppe; Perani, Daniela

2018-01-01

In this multicentre study in clinical settings, we assessed the accuracy of optimized procedures for FDG-PET brain metabolism and CSF classifications in predicting or excluding the conversion to Alzheimer's disease (AD) dementia and non-AD dementias. We included 80 MCI subjects with neurological and neuropsychological assessments, FDG-PET scan and CSF measures at entry, all with clinical follow-up. FDG-PET data were analysed with a validated voxel-based SPM method. Resulting single-subject SPM maps were classified by five imaging experts according to the disease-specific patterns, as "typical-AD", "atypical-AD" (i.e. posterior cortical atrophy, asymmetric logopenic AD variant, frontal-AD variant), "non-AD" (i.e. behavioural variant FTD, corticobasal degeneration, semantic variant FTD; dementia with Lewy bodies) or "negative" patterns. To perform the statistical analyses, the individual patterns were grouped either as "AD dementia vs. non-AD dementia (all diseases)" or as "FTD vs. non-FTD (all diseases)". Aβ42, total and phosphorylated Tau CSF-levels were classified dichotomously, and using the Erlangen Score algorithm. Multivariate logistic models tested the prognostic accuracy of FDG-PET-SPM and CSF dichotomous classifications. Accuracy of Erlangen score and Erlangen Score aided by FDG-PET SPM classification was evaluated. The multivariate logistic model identified FDG-PET "AD" SPM classification (Expβ = 19.35, 95% C.I. 4.8-77.8, p < 0.001) and CSF Aβ42 (Expβ = 6.5, 95% C.I. 1.64-25.43, p < 0.05) as the best predictors of conversion from MCI to AD dementia. The "FTD" SPM pattern significantly predicted conversion to FTD dementias at follow-up (Expβ = 14, 95% C.I. 3.1-63, p < 0.001). Overall, FDG-PET-SPM classification was the most accurate biomarker, able to correctly differentiate either the MCI subjects who converted to AD or FTD dementias, and those who remained stable or reverted to normal cognition (Expβ = 17.9, 95% C.I. 4.55-70.46, p < 0.001). 
Our results support the relevant role of FDG-PET-SPM classification in predicting progression to different dementia conditions in prodromal MCI phase, and in the exclusion of progression, outperforming CSF biomarkers.
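The Expβ values quoted in this abstract are odds ratios, that is, exponentiated logistic-regression coefficients. As a hedged worked example, the sketch below backs out the coefficient and its standard error from the reported Expβ = 19.35 (95% C.I. 4.8-77.8) and rebuilds the interval. It assumes a Wald-type interval, exp(β ± 1.96·SE); the abstract does not state how the intervals were computed, and the function name is hypothetical.

```python
import math


def odds_ratio_ci(beta, se, z=1.96):
    """Return (odds ratio, lower, upper) for a logistic coefficient,
    assuming a Wald-type interval exp(beta +/- z * se)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Values reported in the abstract for the FDG-PET "AD" SPM classification.
reported_or, reported_lo, reported_hi = 19.35, 4.8, 77.8

# Implied coefficient and standard error under the Wald assumption.
beta = math.log(reported_or)
se = (math.log(reported_hi) - math.log(reported_lo)) / (2 * 1.96)

odds_ratio, lower, upper = odds_ratio_ci(beta, se)
print(f"beta = {beta:.3f}, SE = {se:.3f}")
print(f"OR = {odds_ratio:.2f}, 95% CI {lower:.2f}-{upper:.2f}")
```

The rebuilt limits land close to the reported 4.8 and 77.8, which suggests the reported interval is roughly symmetric on the log-odds scale, as a Wald interval would be.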
Multivariate qualitative analysis of banned additives in food safety using surface enhanced Raman scattering spectroscopy.

PubMed

He, Shixuan; Xie, Wanyi; Zhang, Wei; Zhang, Liqun; Wang, Yunxia; Liu, Xiaoling; Liu, Yulong; Du, Chunlei

2015-02-25

A novel strategy which combines the iteratively cubic spline fitting baseline correction method with discriminant partial least squares qualitative analysis is employed to analyze the surface enhanced Raman scattering (SERS) spectroscopy of banned food additives, such as Sudan I dye and Rhodamine B in food and Malachite green residues in aquaculture fish. Multivariate qualitative analysis methods, combining spectral preprocessing by iteratively cubic spline fitting (ICSF) baseline correction with principal component analysis (PCA) and discriminant partial least squares (DPLS) classification, respectively, are applied to investigate the effectiveness of SERS spectroscopy for predicting the class assignments of unknown banned food additives. PCA cannot be used to predict the class assignments of unknown samples. However, the DPLS classification can discriminate the class assignment of unknown banned additives using the information of differences in relative intensities. The results demonstrate that SERS spectroscopy combined with the ICSF baseline correction method and DPLS classification can potentially be used for distinguishing banned food additives in the field of food safety. Copyright © 2014 Elsevier B.V. All rights reserved.
(13)C NMR pattern recognition techniques for the classification of Atlantic salmon (Salmo salar L.) according to their wild, farmed, and geographical origin.

PubMed

Aursand, Marit; Standal, Inger B; Praël, Angelika; McEvoy, Lesley; Irvine, Joe; Axelson, David E

2009-05-13

(13)C nuclear magnetic resonance (NMR) in combination with multivariate data analysis was used to (1) discriminate between farmed and wild Atlantic salmon (Salmo salar L.), (2) discriminate between different geographical origins, and (3) verify the origin of market samples. Muscle lipids from 195 Atlantic salmon of known origin (wild and farmed salmon from Norway, Scotland, Canada, Iceland, Ireland, the Faroes, and Tasmania) in addition to market samples were analyzed by (13)C NMR spectroscopy and multivariate analysis. Both probabilistic neural networks (PNN) and support vector machines (SVM) provided excellent discrimination (98.5 and 100.0%, respectively) between wild and farmed salmon. Discrimination with respect to geographical origin was somewhat more difficult, with correct classification rates ranging from 82.2 to 99.3% by PNN and SVM, respectively. In the analysis of market samples, five fish labeled and purchased as wild salmon were classified as farmed salmon (indicating mislabeling), and there were also some discrepancies between the classification and the product declaration with regard to geographical origin.
Optimal statistical damage detection and classification in an experimental wind turbine blade using minimum instrumentation

NASA Astrophysics Data System (ADS)

Hoell, Simon; Omenzetter, Piotr

2017-04-01

The increasing demand for carbon neutral energy in a challenging economic environment is a driving factor for erecting ever larger wind turbines in harsh environments using novel wind turbine blade (WTB) designs characterized by high flexibilities and lower buckling capacities. To counteract the resulting increase in operation and maintenance costs, efficient structural health monitoring systems can be employed to prevent dramatic failures and to schedule maintenance actions according to the true structural state. This paper presents a novel methodology for classifying structural damages using vibrational responses from a single sensor. The method is based on statistical classification using Bayes' theorem and an advanced statistic, which allows controlling the performance by varying the number of samples which represent the current state. This is done for multivariate damage sensitive features (DSFs) defined as partial autocorrelation coefficients (PACCs) estimated from vibrational responses and principal component analysis scores from PACCs. Additionally, optimal DSFs are composed not only for damage classification but also for damage detection based on binary statistical hypothesis testing, where feature selections are found with a fast forward procedure. The method is applied to laboratory experiments with a small scale WTB with wind-like excitation and non-destructive damage scenarios. The obtained results demonstrate the advantages of the proposed procedure and are promising for future applications of vibration-based structural health monitoring in WTBs.
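The partial autocorrelation coefficients used here as damage sensitive features can be computed from a sample autocorrelation sequence. The sketch below uses the standard Durbin-Levinson recursion; the abstract does not say which estimator the authors used, so this is only one common choice, and the function names and the example autocorrelation values are hypothetical.

```python
def autocorrelation(x, max_lag):
    """Sample autocorrelation rho[0..max_lag] of a sequence (rho[0] == 1)."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
            for k in range(max_lag + 1)]


def pacf_from_acf(rho):
    """Durbin-Levinson recursion: partial autocorrelations for lags 1..len(rho)-1."""
    pacf = []
    phi_prev = []  # AR coefficients of the order-(k-1) model
    for k in range(1, len(rho)):
        if k == 1:
            phi_kk = rho[1]
            phi = [phi_kk]
        else:
            num = rho[k] - sum(phi_prev[j] * rho[k - 1 - j] for j in range(k - 1))
            den = 1.0 - sum(phi_prev[j] * rho[j + 1] for j in range(k - 1))
            phi_kk = num / den
            # Update the lower-order coefficients, then append the new one.
            phi = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)]
            phi.append(phi_kk)
        pacf.append(phi_kk)
        phi_prev = phi
    return pacf


# Theoretical autocorrelation of an AR(1) process with coefficient 0.5:
# the partial autocorrelation should vanish beyond lag 1.
ar1_rho = [1.0, 0.5, 0.25, 0.125]
ar1_pacf = pacf_from_acf(ar1_rho)
```

In a monitoring setting, `autocorrelation` would be applied to each measured vibration record and the resulting coefficient vector passed on as the feature for classification.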
Application of a Novel S3 Nanowire Gas Sensor Device in Parallel with GC-MS for the Identification of Rind Percentage of Grated Parmigiano Reggiano.

PubMed

Abbatangelo, Marco; Núñez-Carmona, Estefanía; Sberveglieri, Veronica; Zappa, Dario; Comini, Elisabetta; Sberveglieri, Giorgio

2018-05-18

Parmigiano Reggiano cheese is one of the most appreciated and consumed foods worldwide, especially in Italy, for its high content of nutrients and taste. However, these characteristics make this product subject to counterfeiting in different forms. In this study, a novel method based on an electronic nose has been developed to investigate the potential of this tool to distinguish rind percentages in grated Parmigiano Reggiano packages, which should be lower than 18%. Different samples, in terms of percentage, seasoning and rind working process, were considered to address the problem from all angles. In parallel, the GC-MS technique was used to identify the compounds that characterize Parmigiano and to relate them to the sensor responses. Data analysis consisted of two stages: multivariate analysis (PLS) and hierarchical classification with PLS-DA and ANNs. Results were promising in terms of correct classification of the samples. The correct classification rate (%) was higher for ANNs than for PLS-DA, with correct identification approaching 100 percent.
Nonlinear multivariate and time series analysis by neural network methods

NASA Astrophysics Data System (ADS)

Hsieh, William W.

2004-03-01

Methods in multivariate statistical analysis are essential for working with large amounts of geophysical data, data from observational arrays, from satellites, or from numerical model output. In classical multivariate statistical analysis, there is a hierarchy of methods, starting with linear regression at the base, followed by principal component analysis (PCA) and finally canonical correlation analysis (CCA). A multivariate time series method, the singular spectrum analysis (SSA), has been a fruitful extension of the PCA technique. The common drawback of these classical methods is that only linear structures can be correctly extracted from the data. Since the late 1980s, neural network methods have become popular for performing nonlinear regression and classification. More recently, neural network methods have been extended to perform nonlinear PCA (NLPCA), nonlinear CCA (NLCCA), and nonlinear SSA (NLSSA). This paper presents a unified view of the NLPCA, NLCCA, and NLSSA techniques and their applications to various data sets of the atmosphere and the ocean (especially for the El Niño-Southern Oscillation and the stratospheric quasi-biennial oscillation). These data sets reveal that the linear methods are often too simplistic to describe real-world systems, with a tendency to scatter a single oscillatory phenomenon into numerous unphysical modes or higher harmonics, which can be largely alleviated in the new nonlinear paradigm.
Geomorphic Classification and Evaluation of Channel Width and Emergent Sandbar Habitat Relations on the Lower Platte River, Nebraska

USGS Publications Warehouse

Elliott, Caroline M.

2011-01-01

This report presents a summary of geomorphic characteristics extracted from aerial imagery for three broad segments of the Lower Platte River. This report includes a summary of the longitudinal multivariate classification in Elliott and others (2009) and presents a new analysis of total channel width and habitat variables. Three segments on the lower 102.8 miles of the Lower Platte River are addressed in this report: the Loup River to the Elkhorn River (70 miles long), the Elkhorn River to Salt Creek (6.9 miles long), and Salt Creek to the Missouri River (25.9 miles long). The locations of these segments were determined by the locations of tributaries potentially significant to the hydrology or sediment supply of the Lower Platte River. This report summarizes channel characteristics as mapped from July 2006 aerial imagery including river width, valley width, channel curvature, and in-channel habitat features. In-channel habitat measurements were not made under consistent hydrologic conditions and must be considered general estimates of channel condition in late July 2006. Longitudinal patterns in these features are explored and are summarized in the context of the longitudinal multivariate classification in Elliott and others (2009) for the three Lower Platte River segments. Detailed descriptions of data collection and classification methods are described in Elliott and others (2009). Nesting data for the endangered interior least tern (Sternula antillarum) and threatened piping plover (Charadrius melodus) from 2006 through 2009 are examined within the context of the multivariate classification and Lower Platte River segments. The widest reaches of the Lower Platte River are located in the segment downstream from the Loup River to the Elkhorn River. This segment also has the widest valley and highest degree of braiding of the three segments and many large vegetated islands. 
The short segment of river between the Elkhorn River and Salt Creek has a fairly low valley width and high channel sinuosities at larger scales. The segment from Salt Creek to the Missouri River has narrow valleys and generally low channel sinuosity. Tern and plover nest sites from 2006 through 2009 in the multi-scale multivariate classification indicated relative nesting selection of cluster 2 reaches among the four-cluster classification and reaches containing clusters 2, 3, and 6 from the seven-cluster classification. These classes, with the exception of cluster 6 are common downstream from the Elkhorn River. Trends in total channel width indicated that reaches dominated by dark vegetation (islands) are the widest on the Lower Platte River. Reaches with high percentages of dry sand and dry sand plus light vegetation were the narrowest reaches. This suggests that narrow channel reaches have sufficient transport capacity to maintain sandbars under recent (2006) flow regimes and are likely to be most amenable to maintaining tern and plover habitat in the Lower Platte River. Further investigations into the dynamics of emergent sandbar habitat and the effects of bank stabilization on in-channel habitats will require the collection and analysis of new data, particularly detailed elevation information and an assessment of existing bank stabilization structures.
Delay differential analysis of time series.

PubMed

Lainscsek, Claudia; Sejnowski, Terrence J

2015-03-01

Nonlinear dynamical system analysis based on embedding theory has been used for modeling and prediction, but it also has applications to signal detection and classification of time series. An embedding creates a multidimensional geometrical object from a single time series. Traditionally either delay or derivative embeddings have been used. The delay embedding is composed of delayed versions of the signal, and the derivative embedding is composed of successive derivatives of the signal. The delay embedding has been extended to nonuniform embeddings to take multiple timescales into account. Both embeddings provide information on the underlying dynamical system without having direct access to all the system variables. Delay differential analysis is based on functional embeddings, a combination of the derivative embedding with nonuniform delay embeddings. Small delay differential equation (DDE) models that best represent relevant dynamic features of time series data are selected from a pool of candidate models for detection or classification. We show that the properties of DDEs support spectral analysis in the time domain where nonlinear correlation functions are used to detect frequencies, frequency and phase couplings, and bispectra. These can be efficiently computed with short time windows and are robust to noise. For frequency analysis, this framework is a multivariate extension of discrete Fourier transform (DFT), and for higher-order spectra, it is a linear and multivariate alternative to multidimensional fast Fourier transform of multidimensional correlations. This method can be applied to short or sparse time series and can be extended to cross-trial and cross-channel spectra if multiple short data segments of the same experiment are available. 
Together, this time-domain toolbox provides higher temporal resolution, increased frequency and phase coupling information, and it allows an easy and straightforward implementation of higher-order spectra across time compared with frequency-based methods such as the DFT and cross-spectral analysis.
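The delay embedding described in this abstract can be sketched in a few lines: each embedded point stacks the current sample with delayed copies of the same signal, and a nonuniform embedding simply uses unequally spaced delays. The function name and the delay values below are hypothetical, and the sketch covers only the embedding step, not the fitting of delay differential equation models.

```python
def delay_embedding(signal, delays):
    """Return vectors [x(t), x(t - d1), x(t - d2), ...] for every t
    at which all delayed samples exist. Delays are in samples."""
    max_delay = max(delays)
    return [[signal[t]] + [signal[t - d] for d in delays]
            for t in range(max_delay, len(signal))]


# A toy series and a nonuniform pair of delays (1 and 3 samples).
series = [0, 1, 2, 3, 4, 5]
embedded = delay_embedding(series, delays=(1, 3))
for point in embedded:
    print(point)
```

With delays of equal spacing, such as (1, 2, 3), the same function produces the traditional uniform delay embedding.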
Metabolite profiling in retinoblastoma identifies novel clinicopathological subgroups

PubMed Central

Kohe, Sarah; Brundler, Marie-Anne; Jenkinson, Helen; Parulekar, Manoj; Wilson, Martin; Peet, Andrew C; McConville, Carmel M

2015-01-01

Background: Tumour classification, based on histopathology or molecular pathology, is of value to predict tumour behaviour and to select appropriate treatment. In retinoblastoma, pathology information is not available at diagnosis and only exists for enucleated tumours. Alternative methods of tumour classification, using noninvasive techniques such as magnetic resonance spectroscopy, are urgently required to guide treatment decisions at the time of diagnosis. Methods: High-resolution magic-angle spinning magnetic resonance spectroscopy (HR-MAS MRS) was undertaken on enucleated retinoblastomas. Principal component analysis and cluster analysis of the HR-MAS MRS data was used to identify tumour subgroups. Individual metabolite concentrations were determined and were correlated with histopathological risk factors for each group. Results: Multivariate analysis identified three metabolic subgroups of retinoblastoma, with the most discriminatory metabolites being taurine, hypotaurine, total-choline and creatine. Metabolite concentrations correlated with specific histopathological features: taurine was correlated with differentiation, total-choline and phosphocholine with retrolaminar optic nerve invasion, and total lipids with necrosis. Conclusions: We have demonstrated that a metabolite-based classification of retinoblastoma can be obtained using ex vivo magnetic resonance spectroscopy, and that the subgroups identified correlate with histopathological features. This result justifies future studies to validate the clinical relevance of these subgroups and highlights the potential of in vivo MRS as a noninvasive diagnostic tool for retinoblastoma patient stratification. PMID:26348444
Nearest clusters based partial least squares discriminant analysis for the classification of spectral data.

PubMed

Song, Weiran; Wang, Hui; Maguire, Paul; Nibouche, Omar

2018-06-07

Partial Least Squares Discriminant Analysis (PLS-DA) is one of the most effective multivariate analysis methods for spectral data analysis, which extracts latent variables and uses them to predict responses. In particular, it is an effective method for handling high-dimensional and collinear spectral data. However, PLS-DA does not explicitly address data multimodality, i.e., within-class multimodal distribution of data. In this paper, we present a novel method termed nearest clusters based PLS-DA (NCPLS-DA) for addressing the multimodality and nonlinearity issues explicitly and improving the performance of PLS-DA on spectral data classification. The new method applies hierarchical clustering to divide samples into clusters and calculates the corresponding centre of every cluster. For a given query point, only clusters whose centres are nearest to such a query point are used for PLS-DA. Such a method can provide a simple and effective tool for separating multimodal and nonlinear classes into clusters which are locally linear and unimodal. Experimental results on 17 datasets, including 12 UCI and 5 spectral datasets, show that NCPLS-DA can outperform 4 baseline methods, namely, PLS-DA, kernel PLS-DA, local PLS-DA and k-NN, achieving the highest classification accuracy most of the time. Copyright © 2018 Elsevier B.V. All rights reserved.
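The nearest-cluster selection step can be illustrated as follows. This is a minimal sketch that assumes the clusters have already been formed (the paper uses hierarchical clustering) and that stops at choosing which clusters feed the local model; fitting PLS-DA on the selected samples is left out. The cluster labels, the data and the choice of Euclidean distance are assumptions for illustration.

```python
import math


def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def nearest_clusters(query, clusters, k):
    """Return the ids of the k clusters whose centres are closest to the query."""
    centres = {cid: centroid(pts) for cid, pts in clusters.items()}
    ranked = sorted(centres, key=lambda cid: math.dist(query, centres[cid]))
    return ranked[:k]


# Hypothetical pre-computed clusters of two-dimensional samples.
clusters = {
    "a": [[0.0, 0.0], [0.0, 2.0]],
    "b": [[10.0, 10.0], [12.0, 10.0]],
    "c": [[5.0, 5.0], [5.0, 7.0]],
}
selected = nearest_clusters([1.0, 1.0], clusters, k=2)

# Samples that a local model would then be trained on.
local_training = [p for cid in selected for p in clusters[cid]]
```

Restricting the training set this way is what lets a linear method behave locally: each query sees only the clusters nearest to it.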
Improved classification and visualization of healthy and pathological hard dental tissues by modeling specular reflections in NIR hyperspectral images

NASA Astrophysics Data System (ADS)

Usenik, Peter; Bürmen, Miran; Fidler, Aleš; Pernuš, Franjo; Likar, Boštjan

2012-03-01

Despite major improvements in dental healthcare and technology, dental caries remains one of the most prevalent chronic diseases of modern society. The initial stages of dental caries are characterized by demineralization of enamel crystals, commonly known as white spots, which are difficult to diagnose. Near-infrared (NIR) hyperspectral imaging is a promising new technique for early detection of demineralization which can classify healthy and pathological dental tissues. However, due to non-ideal illumination of the tooth surface, the hyperspectral images can exhibit specular reflections, in particular around the edges and the ridges of the teeth. These reflections significantly affect the performance of automated classification and visualization methods. A cross-polarized imaging setup can effectively remove the specular reflections; however, such a setup is not always possible because of its complexity and other imaging limitations. In this paper, we propose an alternative approach based on modeling the specular reflections of hard dental tissues, which significantly improves the classification accuracy in the presence of specular reflections. The method was evaluated on five extracted human teeth with a corresponding gold standard for 6 different healthy and pathological hard dental tissues including enamel, dentin, calculus, dentin caries, enamel caries and demineralized regions. Principal component analysis (PCA) was used for multivariate local modeling of healthy and pathological dental tissues. The classification was performed by employing multiple discriminant analysis. Based on the obtained results, we believe the proposed method can be considered an effective alternative to complex cross-polarized imaging setups.
Near Infrared Spectroscopy Detection and Quantification of Herbal Medicines Adulterated with Sibutramine.

PubMed

da Silva, Neirivaldo Cavalcante; Honorato, Ricardo Saldanha; Pimentel, Maria Fernanda; Garrigues, Salvador; Cervera, Maria Luisa; de la Guardia, Miguel

2015-09-01

There is an increasing demand for herbal medicines in weight loss treatment. Some synthetic chemicals, such as sibutramine (SB), have been detected as adulterants in herbal formulations. In this study, two strategies using near infrared (NIR) spectroscopy have been developed to evaluate potential adulteration of herbal medicines with SB: a qualitative screening approach and a quantitative methodology based on multivariate calibration. Samples comprised products commercialized as herbal medicines as well as laboratory-adulterated samples. Spectra were obtained in the range of 14,000-4000 cm⁻¹. Using PLS-DA, a correct classification of 100% was achieved for the external validation set. In the quantitative approach, the root mean square error of prediction (RMSEP), for both PLS and MLR models, was 0.2% w/w. The results demonstrate the potential of NIR spectroscopy and multivariate calibration in quantifying sibutramine in adulterated herbal medicine samples. © 2015 American Academy of Forensic Sciences.
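The root mean square error of prediction quoted above is the square root of the mean squared difference between reference and predicted concentrations over the validation samples. A minimal sketch follows; the function name and the example concentrations are hypothetical and are not data from the study.

```python
import math


def rmsep(reference, predicted):
    """Root mean square error of prediction over paired values."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    squared = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical validation set: reference vs. model-predicted content (% w/w).
reference_values = [1.0, 2.0, 3.0, 4.0]
predicted_values = [1.2, 1.8, 3.2, 3.8]
error = rmsep(reference_values, predicted_values)
print(f"RMSEP = {error:.2f}% w/w")
```

Because the error is expressed in the same units as the measured quantity, an RMSEP of 0.2% w/w can be read directly as the typical prediction deviation in sibutramine content.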
Classification of adulterated honeys by multivariate analysis.

PubMed

Amiry, Saber; Esmaiili, Mohsen; Alizadeh, Mohammad

2017-06-01

In this research, honey samples were adulterated with date syrup (DS) and invert sugar syrup (IS) at three concentrations (7%, 15% and 30%). 102 adulterated samples were prepared in six batches with 17 replications for each batch. For each sample, 32 parameters including color indices, rheological, physical, and chemical parameters were determined. To classify the samples based on type and concentration of adulterant, a multivariate analysis was applied using principal component analysis (PCA) followed by a linear discriminant analysis (LDA). Then, 21 principal components (PCs) were selected in five sets. Approximately two-thirds were identified correctly using color indices (62.75%) or rheological properties (67.65%). Powerful discrimination was obtained using physical properties (97.06%), and the best separations were achieved using two sets of chemical properties (set 1: lactone, diastase activity, sucrose - 100%) (set 2: free acidity, HMF, ash - 95%). Copyright © 2016 Elsevier Ltd. All rights reserved.
Reverse inference of memory retrieval processes underlying metacognitive monitoring of learning using multivariate pattern analysis.

PubMed

Stiers, Peter; Falbo, Luciana; Goulas, Alexandros; van Gog, Tamara; de Bruin, Anique

2016-05-15

Monitoring of learning is only accurate at some time after learning. It is thought that immediate monitoring is based on working memory, whereas later monitoring requires re-activation of stored items, yielding accurate judgements. Such interpretations are difficult to test because they require reverse inference, which presupposes specificity of brain activity for the hidden cognitive processes. We investigated whether multivariate pattern classification can provide this specificity. We used a word recall task to create single trial examples of immediate and long term retrieval and trained a learning algorithm to discriminate them. Next, participants performed a similar task involving monitoring instead of recall. The recall-trained classifier recognized the retrieval patterns underlying immediate and long term monitoring and classified delayed monitoring examples as long-term retrieval. This result demonstrates the feasibility of decoding cognitive processes, instead of their content. Copyright © 2016 Elsevier Inc. All rights reserved.
Influence of microclimatic ammonia levels on productive performance of different broilers' breeds estimated with univariate and multivariate approaches.

PubMed

Soliman, Essam S; Moawed, Sherif A; Hassan, Rania A

2017-08-01

Bird litter contains unutilized nitrogen in the form of uric acid that is converted into ammonia, a fact that not only affects poultry performance but also has a negative effect on the health of people around the farm and contributes to environmental degradation. The influence of microclimatic ammonia emissions on Ross and Hubbard broilers reared in different housing systems at two consecutive seasons (fall and winter) was evaluated using a discriminant function analysis to differentiate between Ross and Hubbard breeds. A total of 400 air samples were collected and analyzed for ammonia levels during the experimental period. Data were analyzed using univariate and multivariate statistical methods. Ammonia levels were significantly higher (p< 0.01) in the Ross compared to the Hubbard breed farm, although no significant differences (p>0.05) were found between the two farms in body weight, body weight gain, feed intake, feed conversion ratio, and performance index (PI) of broilers. Body weight, weight gain, and PI had increased values (p< 0.01) during fall compared to winter irrespective of broiler breed. Ammonia emissions were positively (although weakly) correlated with the ambient relative humidity (r=0.383; p< 0.01), but not with the ambient temperature (r=-0.045; p>0.05). Test of significance of discriminant function analysis did not show a classification based on the studied traits, suggesting that they cannot be used as predictor variables. The percentage of correct classification was 52% and it was improved after deletion of highly correlated traits to 57%. The study revealed that broilers' growth was negatively affected by increased microclimatic ammonia concentrations and recommended the analysis of broilers' growth performance parameters data using multivariate discriminant function analysis.
Influence of microclimatic ammonia levels on productive performance of different broilers’ breeds estimated with univariate and multivariate approaches

PubMed Central

Soliman, Essam S.; Moawed, Sherif A.; Hassan, Rania A.

2017-01-01

Background and Aim: Bird litter contains unutilized nitrogen in the form of uric acid that is converted into ammonia, a fact that not only affects poultry performance but also has a negative effect on the health of people around the farm and contributes to environmental degradation. The influence of microclimatic ammonia emissions on Ross and Hubbard broilers reared in different housing systems at two consecutive seasons (fall and winter) was evaluated using a discriminant function analysis to differentiate between Ross and Hubbard breeds. Materials and Methods: A total of 400 air samples were collected and analyzed for ammonia levels during the experimental period. Data were analyzed using univariate and multivariate statistical methods. Results: Ammonia levels were significantly higher (p< 0.01) in the Ross compared to the Hubbard breed farm, although no significant differences (p>0.05) were found between the two farms in body weight, body weight gain, feed intake, feed conversion ratio, and performance index (PI) of broilers. Body weight, weight gain, and PI had increased values (p< 0.01) during fall compared to winter irrespective of broiler breed. Ammonia emissions were positively (although weakly) correlated with the ambient relative humidity (r=0.383; p< 0.01), but not with the ambient temperature (r=−0.045; p>0.05). Test of significance of discriminant function analysis did not show a classification based on the studied traits, suggesting that they cannot be used as predictor variables. The percentage of correct classification was 52% and it was improved after deletion of highly correlated traits to 57%. Conclusion: The study revealed that broilers’ growth was negatively affected by increased microclimatic ammonia concentrations and recommended the analysis of broilers’ growth performance parameters data using multivariate discriminant function analysis. PMID:28919677
Suggestions for Lymph Node Classification of UICC/AJCC Staging System: A Retrospective Study Based on 1197 Nasopharyngeal Carcinoma Patients Treated With Intensity-Modulated Radiation Therapy

PubMed Central

Guo, Qiaojuan; Pan, Jianji; Zong, Jingfeng; Zheng, Wei; Zhang, Chun; Tang, Linbo; Chen, Bijuan; Cui, Xiaofei; Xiao, Youping; Chen, Yunbin; Lin, Shaojun

2015-01-01

Abstract This article provides suggestions for N classification of Union for International Cancer Control/American Joint Committee on Cancer (UICC/AJCC) staging system of nasopharyngeal carcinoma (NPC), purely based on magnetic resonance imaging (MRI) in intensity-modulated radiation therapy (IMRT) era. A total of 1197 nonmetastatic NPC patients treated with IMRT were enrolled, and all were scanned by MRI at nasopharynx and neck before treatment. MRI-based nodal variables including level, laterality, maximal axial diameter (MAD), extracapsular spread (ECS), and necrosis were analyzed as potential prognostic factors. Modifications of N classification were then proposed and verified. Only nodal level and laterality were considered to be significant variables affecting the treatment outcome. N classification was thus proposed accordingly: N0, no regional lymph node (LN) metastasis; N1, retropharyngeal LNs involvement (regardless of laterality), and/or unilateral levels I, II, III, and/or Va involvement; N2, bilateral levels I, II, III, and/or Va involvement; and N3, levels IV, Vb, and Vc involvement. This proposal showed significant predicting value in multivariate analysis. N3 patients indicated relatively inferior overall survival (OS) and distant metastasis-free survival (DMFS) than N2 patients; however, the difference showed no statistical significance (P = 0.673 and 0.265 for OS and DMFS, respectively), and this was considered to be correlated with the small sample sizes of N3 patients (79 patients, 6.6%). Nodal level and laterality, but not MAD, ECS, and necrosis, were considered to be significant predicting factors for NPC. The proposed N classification was proved to be powerfully predictive in our cohort; however, treatment outcome of the proposed N2 and N3 patients could not differ significantly from each other. This insignificance may be because of the small sample sizes of N3 patients. 
Our results are based on single-center data; to develop a new N classification that is universally acceptable, further verification with multicenter data is warranted. PMID:25997052
LSST Astroinformatics And Astrostatistics: Data-oriented Astronomical Research

NASA Astrophysics Data System (ADS)

Borne, Kirk D.; Stassun, K.; Brunner, R. J.; Djorgovski, S. G.; Graham, M.; Hakkila, J.; Mahabal, A.; Paegert, M.; Pesenson, M.; Ptak, A.; Scargle, J.; Informatics, LSST; Statistics Team

2011-01-01

The LSST Informatics and Statistics Science Collaboration (ISSC) focuses on research and scientific discovery challenges posed by the very large and complex data collection that LSST will generate. Application areas include astroinformatics, machine learning, data mining, astrostatistics, visualization, scientific data semantics, time series analysis, and advanced signal processing. Research problems to be addressed with these methodologies include transient event characterization and classification, rare class discovery, correlation mining, outlier/anomaly/surprise detection, improved estimators (e.g., for photometric redshift or early onset supernova classification), exploration of highly dimensional (multivariate) data catalogs, and more. We present sample science results from these data-oriented approaches to large-data astronomical research. We present results from LSST ISSC team members, including the EB (Eclipsing Binary) Factory, the environmental variations in the fundamental plane of elliptical galaxies, and outlier detection in multivariate catalogs.
Multivariate analysis of volatile compounds detected by headspace solid-phase microextraction/gas chromatography: A tool for sensory classification of cork stoppers.

PubMed

Prat, Chantal; Besalú, Emili; Bañeras, Lluís; Anticó, Enriqueta

2011-06-15

The volatile fraction of aqueous cork macerates of tainted and non-tainted agglomerate cork stoppers was analysed by headspace solid-phase microextraction (HS-SPME)/gas chromatography. Twenty compounds containing terpenoids, aliphatic alcohols, lignin-related compounds and others were selected and analysed in individual corks. Cork stoppers were previously classified in six different classes according to sensory descriptions including, 2,4,6-trichloroanisole taint and other frequent, non-characteristic odours found in cork. A multivariate analysis of the chromatographic data of 20 selected chemical compounds using linear discriminant analysis models helped in the differentiation of the a priori made groups. The discriminant model selected five compounds as the best combination. Selected compounds appear in the model in the following order; 2,4,6 TCA, fenchyl alcohol, 1-octen-3-ol, benzyl alcohol and benzothiazole. Unfortunately, not all six a priori differentiated sensory classes were clearly discriminated in the model, probably indicating that no measurable differences exist in the chromatographic data for some categories. The predictive analyses of a refined model in which two sensory classes were fused together resulted in a good classification. Prediction rates of control (non-tainted), TCA, musty-earthy-vegetative, vegetative and chemical descriptions were 100%, 100%, 85%, 67.3% and 100%, respectively, when the modified model was used. The multivariate analysis of chromatographic data will help in the classification of stoppers and provide a perfect complement to sensory analyses. Copyright © 2010 Elsevier Ltd. All rights reserved.

A Novel Acoustic Sensor Approach to Classify Seeds Based on Sound Absorption Spectra

PubMed Central

Gasso-Tortajada, Vicent; Ward, Alastair J.; Mansur, Hasib; Brøchner, Torben; Sørensen, Claus G.; Green, Ole

2010-01-01

A non-destructive and novel in situ acoustic sensor approach based on the sound absorption spectra was developed for identifying and classifying different seed types. The absorption coefficient spectra were determined by using the impedance tube measurement method. Subsequently, a multivariate statistical analysis, i.e., principal component analysis (PCA), was performed as a way to generate a classification of the seeds based on the soft independent modelling of class analogy (SIMCA) method. The results show that the sound absorption coefficient spectra of different seed types present characteristic patterns which are highly dependent on seed size and shape. In general, seed particle size and sphericity were inversely related with the absorption coefficient. PCA presented reliable grouping capabilities within the diverse seed types, since the 95% of the total spectral variance was described by the first two principal components. Furthermore, the SIMCA classification model based on the absorption spectra achieved optimal results as 100% of the evaluation samples were correctly classified. This study contains the initial structuring of an innovative method that will present new possibilities in agriculture and industry for classifying and determining physical properties of seeds and other materials. PMID:22163455
Classification of Malaysia aromatic rice using multivariate statistical analysis

NASA Astrophysics Data System (ADS)

Abdullah, A. H.; Adom, A. H.; Shakaff, A. Y. Md; Masnan, M. J.; Zakaria, A.; Rahim, N. A.; Omar, O.

2015-05-01

Aromatic rice (Oryza sativa L.) is considered the best-quality premium rice. The varieties are preferred by consumers because of criteria such as shape, colour, distinctive aroma and flavour. The price of aromatic rice is higher than that of ordinary rice because of its special growth requirements, for instance specific climate and soil. Presently, aromatic rice quality is identified by using its key elements and isotopic variables. The rice can also be classified via Gas Chromatography Mass Spectrometry (GC-MS) or human sensory panels. However, human sensory panels have significant drawbacks, such as lengthy training time, fatigue as the number of samples increases, and inconsistency. GC-MS analysis techniques, on the other hand, require detailed procedures and lengthy analysis and are quite costly. This paper presents the application of an in-house developed Electronic Nose (e-nose) to classify new aromatic rice varieties. The e-nose is used to classify the variety of aromatic rice based on the samples' odour. The samples were taken from the variety of rice. The instrument utilizes multivariate statistical data analysis, including Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA) and K-Nearest Neighbours (KNN), to classify the unknown rice samples. The Leave-One-Out (LOO) validation approach is applied to evaluate the ability of KNN to perform recognition and classification of the unspecified samples. Visual observation of the PCA and LDA plots of the rice shows that the instrument was able to separate the samples into different clusters accordingly. The results of LDA and KNN with low misclassification error support the above findings and we may conclude that the e-nose is successfully applied to the classification of the aromatic rice varieties.
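The combination of K-Nearest Neighbours with Leave-One-Out validation mentioned in the abstract above can be sketched in a few lines. This is only an illustration under stated assumptions: the two-feature readings, the variety labels "A" and "B", the choice of k = 3, and the function names are all hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Majority vote among the k training samples nearest to the query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(samples, k=3):
    """Hold out each sample once and classify it from all the others."""
    hits = 0
    for i, (features, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        hits += knn_predict(rest, features, k) == label
    return hits / len(samples)


# Hypothetical two-sensor readings for two rice varieties (illustrative only).
samples = [
    ((1.0, 1.1), "A"), ((1.2, 0.9), "A"), ((0.9, 1.0), "A"), ((1.1, 1.2), "A"),
    ((3.0, 3.1), "B"), ((3.2, 2.9), "B"), ((2.9, 3.0), "B"), ((3.1, 3.2), "B"),
]
accuracy = leave_one_out_accuracy(samples, k=3)
print(accuracy)
```

Leave-One-Out is a natural fit for small sample sets because every sample is used for testing exactly once while the remaining ones train the classifier.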
Atypia and DNA methylation in nipple duct lavage in relation to predicted breast cancer risk.

PubMed

Euhus, David M; Bu, Dawei; Ashfaq, Raheela; Xie, Xian-Jin; Bian, Aihua; Leitch, A Marilyn; Lewis, Cheryl M

2007-09-01

Tumor suppressor gene (TSG) methylation is identified more frequently in random periareolar fine needle aspiration samples from women at high risk for breast cancer than women at lower risk. It is not known whether TSG methylation or atypia in nipple duct lavage (NDL) samples is related to predicted breast cancer risk. 514 NDL samples obtained from 150 women selected to represent a wide range of breast cancer risk were evaluated cytologically and by quantitative multiplex methylation-specific PCR for methylation of cyclin D2, APC, HIN1, RASSF1A, and RAR-beta2. Based on methylation patterns and cytology, NDL retrieved cancer cells from only 9% of breasts ipsilateral to a breast cancer. Methylation of ≥2 genes correlated with marked atypia by univariate analysis, but not multivariate analysis, that adjusted for sample cellularity and risk group classification. Both marked atypia and TSG methylation independently predicted abundant cellularity in multivariate analyses. Discrimination between Gail lower-risk ducts and Gail high-risk ducts was similar for marked atypia [odds ratio (OR), 3.48; P = 0.06] and measures of TSG methylation (OR, 3.51; P = 0.03). However, marked atypia provided better discrimination between Gail lower-risk ducts and ducts contralateral to a breast cancer (OR, 6.91; P = 0.003, compared with methylation OR, 4.21; P = 0.02). TSG methylation in NDL samples does not predict marked atypia after correcting for sample cellularity and risk group classification. Rather, both methylation and marked atypia are independently associated with highly cellular samples, Gail model risk classifications, and a personal history of breast cancer. This suggests the existence of related, but independent, pathogenic pathways in breast epithelium.
Distinguishing early and late brain aging from the Alzheimer's disease spectrum: consistent morphological patterns across independent samples.

PubMed

Doan, Nhat Trung; Engvig, Andreas; Zaske, Krystal; Persson, Karin; Lund, Martina Jonette; Kaufmann, Tobias; Cordova-Palomera, Aldo; Alnæs, Dag; Moberget, Torgeir; Brækhus, Anne; Barca, Maria Lage; Nordvik, Jan Egil; Engedal, Knut; Agartz, Ingrid; Selbæk, Geir; Andreassen, Ole A; Westlye, Lars T

2017-09-01

Alzheimer's disease (AD) is a debilitating age-related neurodegenerative disorder. Accurate identification of individuals at risk is complicated as AD shares cognitive and brain features with aging. We applied linked independent component analysis (LICA) on three complementary measures of gray matter structure: cortical thickness, area and gray matter density of 137 AD, 78 mild (MCI) and 38 subjective cognitive impairment patients, and 355 healthy adults aged 18-78 years to identify dissociable multivariate morphological patterns sensitive to age and diagnosis. Using the lasso classifier, we performed group classification and prediction of cognition and age at different age ranges to assess the sensitivity and diagnostic accuracy of the LICA patterns in relation to AD, as well as early and late healthy aging. Three components showed high sensitivity to the diagnosis and cognitive status of AD, with different relationships with age: one reflected an anterior-posterior gradient in thickness and gray matter density and was uniquely related to diagnosis, whereas the other two, reflecting widespread cortical thickness and medial temporal lobe volume, respectively, also correlated significantly with age. Repeating the LICA decomposition and between-subject analysis on ADNI data, including 186 AD, 395 MCI and 220 age-matched healthy controls, revealed largely consistent brain patterns and clinical associations across samples. Classification results showed that multivariate LICA-derived brain characteristics could be used to predict AD and age with high accuracy (area under ROC curve up to 0.93 for classification of AD from controls). Comparison between classifiers based on feature ranking and feature selection suggests both common and unique feature sets implicated in AD and aging, and provides evidence of distinct age-related differences in early compared to late aging. Copyright © 2017 Elsevier Inc. All rights reserved.
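The area under the ROC curve reported in the abstract above can be read as the probability that a randomly chosen patient receives a higher classifier score than a randomly chosen control. A minimal sketch of that pairwise definition follows; the scores and the function name `roc_auc` are hypothetical, and the study's own lasso pipeline is not reproduced here.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correctly ranked pair.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical classifier scores (NOT data from the study).
patients = [0.9, 0.8, 0.6, 0.4]
controls = [0.7, 0.3, 0.2, 0.1]
auc = roc_auc(patients, controls)
print(auc)
```

The pairwise form is quadratic in the sample size, which is acceptable for an illustration; rank-based formulas give the same value more efficiently on large sets.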
Automatic Cell Segmentation Using a Shape-Classification Model in Immunohistochemically Stained Cytological Images

NASA Astrophysics Data System (ADS)

Shah, Shishir

This paper presents a segmentation method for detecting cells in immunohistochemically stained cytological images. A two-phase approach to segmentation is used where an unsupervised clustering approach coupled with cluster merging based on a fitness function is used as the first phase to obtain a first approximation of the cell locations. A joint segmentation-classification approach incorporating ellipse as a shape model is used as the second phase to detect the final cell contour. The segmentation model estimates a multivariate density function of low-level image features from training samples and uses it as a measure of how likely each image pixel is to be a cell. This estimate is constrained by the zero level set, which is obtained as a solution to an implicit representation of an ellipse. Results of segmentation are presented and compared to ground truth measurements.
Clustering of the human skeletal muscle fibers using linear programming and angular Hilbertian metrics.

PubMed

Neji, Radhouène; Besbes, Ahmed; Komodakis, Nikos; Deux, Jean-François; Maatouk, Mezri; Rahmouni, Alain; Bassez, Guillaume; Fleury, Gilles; Paragios, Nikos

2009-01-01

In this paper, we present a manifold clustering method for the classification of fibers obtained from diffusion tensor images (DTI) of the human skeletal muscle. Using a linear programming formulation of prototype-based clustering, we propose a novel fiber classification algorithm over manifolds that circumvents the necessity to embed the data in low dimensional spaces and determines automatically the number of clusters. Furthermore, we propose the use of angular Hilbertian metrics between multivariate normal distributions to define a family of distances between tensors that we generalize to fibers. These metrics are used to approximate the geodesic distances over the fiber manifold. We also discuss the case where only geodesic distances to a reduced set of landmark fibers are available. The experimental validation of the method is done using a manually annotated significant dataset of DTI of the calf muscle for healthy and diseased subjects.
Simultaneous Force Regression and Movement Classification of Fingers via Surface EMG within a Unified Bayesian Framework.

PubMed

Baldacchino, Tara; Jacobs, William R; Anderson, Sean R; Worden, Keith; Rowson, Jennifer

2018-01-01

This contribution presents a novel methodology for myoelectric-based control using surface electromyographic (sEMG) signals recorded during finger movements. A multivariate Bayesian mixture of experts (MoE) model is introduced which provides a powerful method for modeling force regression at the fingertips, while also performing finger movement classification as a by-product of the modeling algorithm. Bayesian inference of the model allows uncertainties to be naturally incorporated into the model structure. This method is tested using data from the publicly released NinaPro database which consists of sEMG recordings for 6 degree-of-freedom force activations for 40 intact subjects. The results demonstrate that the MoE model achieves similar performance compared to the benchmark set by the authors of NinaPro for finger force regression. Additionally, inherent to the Bayesian framework is the inclusion of uncertainty in the model parameters, naturally providing confidence bounds on the force regression predictions. Furthermore, the integrated clustering step allows a detailed investigation into classification of the finger movements, without incurring any extra computational effort. Subsequently, a systematic approach to assessing the importance of the number of electrodes needed for accurate control is performed via sensitivity analysis techniques. A slight degradation in regression performance is observed for a reduced number of electrodes, while classification performance is unaffected.
Guidelines to classification and nomenclature of Arabian felsic plutonic rocks

USGS Publications Warehouse

Ramsay, C.R.; Stoeser, D.B.; Drysdall, A.R.

1986-01-01

Well-defined procedures for classifying the felsic plutonic rocks of the Arabian Shield on the basis of petrographic, chemical and lithostratigraphic criteria and mineral-resource potential have been adopted and developed in the Saudi Arabian Deputy Ministry for Mineral Resources over the past decade. A number of problems with conventional classification schemes have been identified and resolved; others, notably those arising from difficulties in identifying precise mineral compositions, continue to present difficulties. The petrographic nomenclature used is essentially that recommended by the International Union of Geological Sciences. Problems that have arisen include the definition of: (1) rocks with sodic, zoned or perthitic feldspar, (2) trondhjemites, and (3) alkali granites. Chemical classification has been largely based on relative molar amounts of alumina, lime and alkalis, and the use of conventional variation diagrams, but pilot studies utilizing univariate and multivariate statistical techniques have been made. The classification used in Saudi Arabia for stratigraphic purposes is a hierarchy of formation-rank units, suites and super-suites as defined in the Saudi Arabian stratigraphic code. For genetic and petrological studies, a grouping as 'associations' of similar and genetically related lithologies is commonly used. In order to indicate mineral-resource potential, the felsic plutons are classed as common, precursor, specialized or mineralized, in order of increasing exploration significance. © 1986.
Is the Robson's classification system burdened by obstetric pathologies, maternal characteristics and assistential levels in comparing hospitals cesarean rates? A regional analysis of class 1 and 3.

PubMed

Gerli, Sandro; Favilli, Alessandro; Franchini, David; De Giorgi, Marcello; Casucci, Paola; Parazzini, Fabio

2018-01-01

To assess whether maternal risk profile and hospital assistential levels influence inter-hospital comparison in classes 1 and 3 of "The Ten Group Classification System" (TGCS). A population-based analysis using data from the institutional database of an Italian region was carried out. The 11 maternity wards were divided into two categories: second-level hospitals (SLH), and first-level hospitals (FLH). The recorded deliveries were classified according to the TGCS. To analyze whether different maternal characteristics and the hospitals' assistential level could influence the cesarean section (CS) risk, a multivariate analysis was done considering separately women in TGCS classes 1 and 3. From January 2011 to December 2013, 19,987 deliveries were recorded. Of those, 7,693 were in TGCS class 1 and 4,919 in class 3. The CS rates were 20.8% and 14.7% in class 1 (p < 0.0001) and 6.9% and 5.3% (p < 0.0230) in class 3, in the FLH and SLH respectively. The multivariate logistic regression showed that the FLH, older maternal age and gestational diabetes were independent risk factors for CS in groups 1 and 3. Obesity and gestational hypertension were also independent risk factors for group 1. TGCS is a useful tool to analyze the incidence of CS in a single center, but when comparing different hospitals, maternal characteristics and different assistential levels should be considered as potential sources of bias.
Correlation of Biomarker Expression in Colonic Mucosa with Disease Phenotype in Crohn's Disease and Ulcerative Colitis.

PubMed

Bruno, Maria E C; Rogier, Eric W; Arsenescu, Razvan I; Flomenhoft, Deborah R; Kurkjian, Cathryn J; Ellis, Gavin I; Kaetzel, Charlotte S

2015-10-01

Inflammatory bowel diseases (IBD), including Crohn's disease (CD) and ulcerative colitis (UC), are characterized by chronic intestinal inflammation due to immunological, microbial, and environmental factors in genetically predisposed individuals. Advances in the diagnosis, prognosis, and treatment of IBD require the identification of robust biomarkers that can be used for molecular classification of diverse disease presentations. We previously identified five genes, RELA, TNFAIP3 (A20), PIGR, TNF, and IL8, whose mRNA levels in colonic mucosal biopsies could be used in a multivariate analysis to classify patients with CD based on disease behavior and responses to therapy. We compared expression of these five biomarkers in IBD patients classified as having CD or UC, and in healthy controls. Patients with CD were characterized as having decreased median expression of TNFAIP3, PIGR, and TNF in non-inflamed colonic mucosa as compared to healthy controls. By contrast, UC patients exhibited decreased expression of PIGR and elevated expression of IL8 in colonic mucosa compared to healthy controls. A multivariate analysis combining mRNA levels for all five genes resulted in segregation of individuals based on disease presentation (CD vs. UC) as well as severity, i.e., patients in remission versus those with acute colitis at the time of biopsy. We propose that this approach could be used as a model for molecular classification of IBD patients, which could further be enhanced by the inclusion of additional genes that are identified by functional studies, global gene expression analyses, and genome-wide association studies.
Canonical Measure of Correlation (CMC) and Canonical Measure of Distance (CMD) between sets of data. Part 3. Variable selection in classification.

PubMed

Ballabio, Davide; Consonni, Viviana; Mauri, Andrea; Todeschini, Roberto

2010-01-11

In multivariate regression and classification issues variable selection is an important procedure used to select an optimal subset of variables with the aim of producing more parsimonious and eventually more predictive models. Variable selection is often necessary when dealing with methodologies that produce thousands of variables, such as Quantitative Structure-Activity Relationships (QSARs) and highly dimensional analytical procedures. In this paper a novel method for variable selection for classification purposes is introduced. This method exploits the recently proposed Canonical Measure of Correlation between two sets of variables (CMC index). The CMC index is in this case calculated for two specific sets of variables, the former being comprised of the independent variables and the latter of the unfolded class matrix. The CMC values, calculated by considering one variable at a time, can be sorted and a ranking of the variables on the basis of their class discrimination capabilities results. Alternatively, CMC index can be calculated for all the possible combinations of variables and the variable subset with the maximal CMC can be selected, but this procedure is computationally more demanding and classification performance of the selected subset is not always the best one. The effectiveness of the CMC index in selecting variables with discriminative ability was compared with that of other well-known strategies for variable selection, such as the Wilks' Lambda, the VIP index based on the Partial Least Squares-Discriminant Analysis, and the selection provided by classification trees. A variable Forward Selection based on the CMC index was finally used in conjunction of Linear Discriminant Analysis. This approach was tested on several chemical data sets. Obtained results were encouraging.
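Ranking variables one at a time by a class-discrimination score, as described in the abstract above, can be illustrated with the univariate Wilks' Lambda, one of the comparison strategies the abstract names. This sketch does not implement the CMC index itself, whose definition is not given in the abstract; the data, variable names, and function names are hypothetical.

```python
def wilks_lambda(values, labels):
    """Univariate Wilks' Lambda: within-class sum of squares / total sum of squares.

    Values near 0 indicate strong class separation; values near 1 indicate none.
    """
    grand_mean = sum(values) / len(values)
    total_ss = sum((v - grand_mean) ** 2 for v in values)
    within_ss = 0.0
    for cls in set(labels):
        group = [v for v, lab in zip(values, labels) if lab == cls]
        group_mean = sum(group) / len(group)
        within_ss += sum((v - group_mean) ** 2 for v in group)
    return within_ss / total_ss


def rank_variables(columns, labels):
    """Variable names sorted from most to least discriminating."""
    return sorted(columns, key=lambda name: wilks_lambda(columns[name], labels))


# Hypothetical data set with two classes and two candidate variables.
labels = ["a", "a", "a", "b", "b", "b"]
columns = {
    "x1": [1.0, 1.1, 0.9, 3.0, 3.1, 2.9],  # separates the classes well
    "x2": [1.0, 3.0, 2.0, 1.1, 2.9, 2.1],  # barely separates them
}
ranking = rank_variables(columns, labels)
print(ranking)
```

A one-variable-at-a-time ranking like this is cheap but ignores interactions between variables, which is the trade-off the abstract notes against evaluating all possible subsets.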
Multivariate analysis of the volatile components in tobacco based on infrared-assisted extraction coupled to headspace solid-phase microextraction and gas chromatography-mass spectrometry.

PubMed

Yang, Yanqin; Pan, Yuanjiang; Zhou, Guojun; Chu, Guohai; Jiang, Jian; Yuan, Kailong; Xia, Qian; Cheng, Changhe

2016-11-01

A novel infrared-assisted extraction coupled to headspace solid-phase microextraction followed by gas chromatography with mass spectrometry method has been developed for the rapid determination of the volatile components in tobacco. The optimal extraction conditions for maximizing the extraction efficiency were as follows: 65 μm polydimethylsiloxane-divinylbenzene fiber, extraction time of 20 min, infrared power of 175 W, and distance between the infrared lamp and the headspace vial of 2 cm. Under the optimum conditions, 50 components were found to exist in all ten tobacco samples from different geographical origins. Compared with conventional water-bath heating and nonheating extraction methods, the extraction efficiency of infrared-assisted extraction was greatly improved. Furthermore, multivariate analysis including principal component analysis, hierarchical cluster analysis, and similarity analysis were performed to evaluate the chemical information of these samples and divided them into three classifications, including rich, moderate, and fresh flavors. The above-mentioned classification results were consistent with the sensory evaluation, which was pivotal and meaningful for tobacco discrimination. As a simple, fast, cost-effective, and highly efficient method, the infrared-assisted extraction coupled to headspace solid-phase microextraction technique is powerful and promising for distinguishing the geographical origins of the tobacco samples coupled to suitable chemometrics. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Investigating the sex-related geometric variation of the human cranium.

PubMed

Bertsatos, Andreas; Papageorgopoulou, Christina; Valakos, Efstratios; Chovalopoulou, Maria-Eleni

2018-01-29

Accurate sexing methods are of great importance in forensic anthropology since sex assessment is among the principal tasks when examining human skeletal remains. The present study explores a novel approach in assessing the most accurate metric traits of the human cranium for sex estimation based on 80 ectocranial landmarks from 176 modern individuals of known age and sex from the Athens Collection. The purpose of the study is to identify those distance and angle measurements that can be most effectively used in sex assessment. Three-dimensional landmark coordinates were digitized with a Microscribe 3DX and analyzed in GNU Octave. An iterative linear discriminant analysis of all possible combinations of landmarks was performed for each unique set of the 3160 distances and 246,480 angles. Cross-validated correct classification as well as multivariate DFA on top-performing variables reported 13 craniometric distances with over 85% classification accuracy, 7 angles over 78%, as well as certain multivariate combinations yielding over 95%. Linear regression of these variables with the centroid size was used to assess their relation to the size of the cranium. In contrast to the use of generalized Procrustes analysis (GPA) and principal component analysis (PCA), which constitute the common analytical workflow for such data, our method, although computationally intensive, produced easily applicable discriminant functions of high accuracy, while at the same time exploring the maximum of cranial variability.
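The counts of 3160 distances and 246,480 angles can be reproduced from the 80 landmarks. The short sketch below assumes that a distance is an unordered pair of landmarks and that an angle is one vertex landmark plus an unordered pair of the remaining landmarks. The abstract does not spell out these definitions, but they give exactly the reported figures. The toy coordinates and the helper name are hypothetical.

```python
from itertools import combinations
from math import comb, dist

n_landmarks = 80

# Unordered pairs of landmarks.
n_distances = comb(n_landmarks, 2)

# Each landmark as vertex, combined with every unordered pair of the other 79.
n_angles = n_landmarks * comb(n_landmarks - 1, 2)


def all_distances(points):
    """Euclidean distance for every unordered pair of 3D landmarks."""
    return {
        (i, j): dist(points[i], points[j])
        for i, j in combinations(range(len(points)), 2)
    }


# Three hypothetical landmarks, just to show the enumeration.
toy_landmarks = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 5.0)]
toy_distances = all_distances(toy_landmarks)
```

Enumerating every candidate measurement this way is what makes the approach computationally heavy compared with a GPA and PCA workflow.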
The classification of secondary colorectal liver cancer in human biopsy samples using angular dispersive x-ray diffraction and multivariate analysis

NASA Astrophysics Data System (ADS)

Theodorakou, Chrysoula; Farquharson, Michael J.

2009-08-01

The motivation behind this study is to assess whether angular dispersive x-ray diffraction (ADXRD) data, processed using multivariate analysis techniques, can be used for classifying secondary colorectal liver cancer tissue and normal surrounding liver tissue in human liver biopsy samples. The ADXRD profiles from a total of 60 samples of normal liver tissue and colorectal liver metastases were measured using a synchrotron radiation source. The data were analysed for 56 samples using nonlinear peak-fitting software. Four peaks were fitted to all of the ADXRD profiles, and the amplitude, area, amplitude and area ratios for three of the four peaks were calculated and used for the statistical and multivariate analysis. The statistical analysis showed that there are significant differences between all the peak-fitting parameters and ratios between the normal and the diseased tissue groups. The technique of soft independent modelling of class analogy (SIMCA) was used to classify normal liver tissue and colorectal liver metastases resulting in 67% of the normal tissue samples and 60% of the secondary colorectal liver tissue samples being classified correctly. This study has shown that the ADXRD data of normal and secondary colorectal liver cancer are statistically different and x-ray diffraction data analysed using multivariate analysis have the potential to be used as a method of tissue classification.
Multivariate neuroanatomical classification of cognitive subtypes in schizophrenia: A support vector machine learning approach

PubMed Central

Gould, Ian C.; Shepherd, Alana M.; Laurens, Kristin R.; Cairns, Murray J.; Carr, Vaughan J.; Green, Melissa J.

2014-01-01

Heterogeneity in the structural brain abnormalities associated with schizophrenia has made identification of reliable neuroanatomical markers of the disease difficult. The use of more homogenous clinical phenotypes may improve the accuracy of predicting psychotic disorder/s on the basis of observable brain disturbances. Here we investigate the utility of cognitive subtypes of schizophrenia – ‘cognitive deficit’ and ‘cognitively spared’ – in determining whether multivariate patterns of volumetric brain differences can accurately discriminate these clinical subtypes from healthy controls, and from each other. We applied support vector machine classification to grey- and white-matter volume data from 126 schizophrenia patients previously allocated to the cognitive spared subtype, 74 cognitive deficit schizophrenia patients, and 134 healthy controls. Using this method, cognitive subtypes were distinguished from healthy controls with up to 72% accuracy. Cross-validation analyses between subtypes achieved an accuracy of 71%, suggesting that some common neuroanatomical patterns distinguish both subtypes from healthy controls. Notably, cognitive subtypes were best distinguished from one another when the sample was stratified by sex prior to classification analysis: cognitive subtype classification accuracy was relatively low (<60%) without stratification, and increased to 83% for females with sex stratification. Distinct neuroanatomical patterns predicted cognitive subtype status in each sex: sex-specific multivariate patterns did not predict cognitive subtype status in the other sex above chance, and weight map analyses demonstrated negative correlations between the spatial patterns of weights underlying classification for each sex. These results suggest that in typical mixed-sex samples of schizophrenia patients, the volumetric brain differences between cognitive subtypes are relatively minor in contrast to the large common disease-associated changes. 
Volumetric differences that distinguish between cognitive subtypes on a case-by-case basis appear to occur in a sex-specific manner that is consistent with previous evidence of disrupted relationships between brain structure and cognition in male, but not female, schizophrenia patients. Consideration of sex-specific differences in brain organization is thus likely to assist future attempts to distinguish subgroups of schizophrenia patients on the basis of neuroanatomical features. PMID:25379435
Spatial scale and distribution of neurovascular signals underlying decoding of orientation and eye of origin from fMRI data

PubMed Central

Harrison, Charlotte; Jackson, Jade; Oh, Seung-Mock; Zeringyte, Vaida

2016-01-01

Multivariate pattern analysis of functional magnetic resonance imaging (fMRI) data is widely used, yet the spatial scales and origin of neurovascular signals underlying such analyses remain unclear. We compared decoding performance for stimulus orientation and eye of origin from fMRI measurements in human visual cortex with predictions based on the columnar organization of each feature and estimated the spatial scales of patterns driving decoding. Both orientation and eye of origin could be decoded significantly above chance in early visual areas (V1–V3). Contrary to predictions based on a columnar origin of response biases, decoding performance for eye of origin in V2 and V3 was not significantly lower than that in V1, nor did decoding performance for orientation and eye of origin differ significantly. Instead, response biases for both features showed large-scale organization, evident as a radial bias for orientation, and a nasotemporal bias for eye preference. To determine whether these patterns could drive classification, we quantified the effect on classification performance of binning voxels according to visual field position. Consistent with large-scale biases driving classification, binning by polar angle yielded significantly better decoding performance for orientation than random binning in V1–V3. Similarly, binning by hemifield significantly improved decoding performance for eye of origin. Patterns of orientation and eye preference bias in V2 and V3 showed a substantial degree of spatial correlation with the corresponding patterns in V1, suggesting that response biases in these areas originate in V1. Together, these findings indicate that multivariate classification results need not reflect the underlying columnar organization of neuronal response selectivities in early visual areas. NEW & NOTEWORTHY Large-scale response biases can account for decoding of orientation and eye of origin in human early visual areas V1–V3. 
For eye of origin this pattern is a nasotemporal bias; for orientation it is a radial bias. Differences in decoding performance across areas and stimulus features are not well predicted by differences in columnar-scale organization of each feature. Large-scale biases in extrastriate areas are spatially correlated with those in V1, suggesting biases originate in primary visual cortex. PMID:27903637
Improved sparse decomposition based on a smoothed L0 norm using a Laplacian kernel to select features from fMRI data.

PubMed

Zhang, Chuncheng; Song, Sutao; Wen, Xiaotong; Yao, Li; Long, Zhiying

2015-04-30

Feature selection plays an important role in improving the classification accuracy of multivariate classification techniques in the context of fMRI-based decoding due to the "few samples and large features" nature of functional magnetic resonance imaging (fMRI) data. Recently, several sparse representation methods have been applied to the voxel selection of fMRI data. Despite the low computational efficiency of the sparse representation methods, they still displayed promise for applications that select features from fMRI data. In this study, we proposed the Laplacian smoothed L0 norm (LSL0) approach for feature selection of fMRI data. Based on the fast sparse decomposition using smoothed L0 norm (SL0) (Mohimani, 2007), the LSL0 method used the Laplacian function to approximate the L0 norm of sources. Results of the simulated and real fMRI data demonstrated the feasibility and robustness of LSL0 for the sparse source estimation and feature selection. Simulated results indicated that LSL0 produced more accurate source estimation than SL0 at high noise levels. The classification accuracy using voxels that were selected by LSL0 was higher than that by SL0 in both simulated and real fMRI experiments. Moreover, both LSL0 and SL0 showed higher classification accuracy and required less time than ICA and t-test for the fMRI decoding. LSL0 outperformed SL0 in sparse source estimation at high noise levels and in feature selection. Moreover, LSL0 and SL0 showed better performance than ICA and t-test for feature selection. Copyright © 2015 Elsevier B.V. All rights reserved.
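The idea of replacing the discontinuous L0 norm with a smooth surrogate can be sketched in a few lines. The Gaussian form below follows the usual SL0 formulation, where the L0 norm is approximated by n minus a sum of Gaussian bumps centred at zero. The abstract does not state the exact Laplacian kernel, so the form exp(-|s|/sigma) used here is an assumption, as are the function names and the toy source vector.

```python
import math


def sl0_gaussian(s, sigma):
    """Gaussian-smoothed L0 estimate: n - sum(exp(-s_i^2 / (2 * sigma^2)))."""
    return len(s) - sum(math.exp(-(x * x) / (2.0 * sigma ** 2)) for x in s)


def sl0_laplacian(s, sigma):
    """Laplacian-smoothed L0 estimate, assuming the kernel exp(-|s_i| / sigma)."""
    return len(s) - sum(math.exp(-abs(x) / sigma) for x in s)


# Hypothetical sparse source vector with two nonzero entries.
source = [0.0, 0.0, 1.5, 0.0, -2.0]
exact_l0 = sum(1 for x in source if x != 0)

# As sigma shrinks, both surrogates approach the exact count of nonzeros.
gaussian_estimate = sl0_gaussian(source, sigma=0.01)
laplacian_estimate = sl0_laplacian(source, sigma=0.01)
```

In smoothed-L0 methods the surrogate is typically minimized for a decreasing sequence of sigma values, so that early iterations see a smooth objective and later ones approach the true sparsity count.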
Novel high-resolution computed tomography-based radiomic classifier for screen-identified pulmonary nodules in the National Lung Screening Trial.

PubMed

Peikert, Tobias; Duan, Fenghai; Rajagopalan, Srinivasan; Karwoski, Ronald A; Clay, Ryan; Robb, Richard A; Qin, Ziling; Sicks, JoRean; Bartholmai, Brian J; Maldonado, Fabien

2018-01-01

Optimization of the clinical management of screen-detected lung nodules is needed to avoid unnecessary diagnostic interventions. Herein we demonstrate the potential value of a novel radiomics-based approach for the classification of screen-detected indeterminate nodules. Independent quantitative variables assessing various radiologic nodule features such as sphericity, flatness, elongation, spiculation, lobulation and curvature were developed from the NLST dataset using 726 indeterminate nodules (all ≥ 7 mm, benign, n = 318 and malignant, n = 408). Multivariate analysis was performed using the least absolute shrinkage and selection operator (LASSO) method for variable selection and regularization in order to enhance the prediction accuracy and interpretability of the multivariate model. The bootstrapping method was then applied for the internal validation and the optimism-corrected AUC was reported for the final model. Eight of the originally considered 57 quantitative radiologic features were selected by LASSO multivariate modeling. These 8 features include variables capturing Location: vertical location (Offset carina centroid z), Size: volume estimate (Minimum enclosing brick), Shape: flatness, Density: texture analysis (Score Indicative of Lesion/Lung Aggression/Abnormality (SILA) texture), and surface characteristics: surface complexity (Maximum shape index and Average shape index), and estimates of surface curvature (Average positive mean curvature and Minimum mean curvature), all with P<0.01. The optimism-corrected AUC for these 8 features is 0.939. Our novel radiomic LDCT-based approach for indeterminate screen-detected nodule characterization appears extremely promising; however, independent external validation is needed.
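The optimism-corrected AUC mentioned above can be sketched as follows. The block assumes the common bootstrap recipe: refit the model on each bootstrap sample, take its AUC on that sample minus its AUC on the original data, average these differences as the "optimism", and subtract it from the apparent AUC. The abstract does not give the exact procedure, and all scores and AUC values below are made up for illustration.

```python
def auc(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def optimism_corrected(apparent, boot_apparent, boot_on_original):
    """Apparent AUC minus the mean bootstrap optimism."""
    diffs = [b - o for b, o in zip(boot_apparent, boot_on_original)]
    return apparent - sum(diffs) / len(diffs)


# Hypothetical model scores for malignant and benign nodules.
malignant = [0.9, 0.8, 0.4]
benign = [0.7, 0.3, 0.2]
apparent_auc = auc(malignant, benign)

# Hypothetical AUCs from two bootstrap refits (real use needs many more).
corrected = optimism_corrected(0.95, [0.96, 0.94], [0.94, 0.93])
```

The correction only addresses overfitting within the development data, which is why external validation on independent nodules is still called for.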
Decoding of visual activity patterns from fMRI responses using multivariate pattern analyses and convolutional neural network.

PubMed

Zafar, Raheel; Kamel, Nidal; Naufal, Mohamad; Malik, Aamir Saeed; Dass, Sarat C; Ahmad, Rana Fayyaz; Abdullah, Jafri M; Reza, Faruque

2017-01-01

Decoding of human brain activity has always been a primary goal in neuroscience, especially with functional magnetic resonance imaging (fMRI) data. In recent years, the convolutional neural network (CNN) has become a popular method for the extraction of features due to its higher accuracy; however, it requires substantial computation and training data. In this study, an algorithm is developed using multivariate pattern analysis (MVPA) and a modified CNN to decode the behavior of the brain for different images with a limited data set. Selection of significant features is an important part of fMRI data analysis, since it reduces the computational burden and improves the prediction performance; significant features are selected using a t-test. MVPA uses machine learning algorithms to classify different brain states and helps in prediction during the task. A general linear model (GLM) is used to find the unknown parameters of every individual voxel and the classification is done using a multi-class support vector machine (SVM). The proposed MVPA-CNN based algorithm is compared with a region of interest (ROI) based method and MVPA based estimated values. The proposed method showed better overall accuracy (68.6%) compared to ROI (61.88%) and estimation values (64.17%).
Importance of recurrence rating, morphology, hernial gap size, and risk factors in ventral and incisional hernia classification.

PubMed

Dietz, U A; Winkler, M S; Härtel, R W; Fleischhacker, A; Wiegering, A; Isbert, C; Jurowich, Ch; Heuschmann, P; Germer, C-T

2014-02-01

There is limited evidence on the natural course of ventral and incisional hernias and the results of hernia repair, which might partially be explained by the lack of an accepted classification system. The aim of the present study is to investigate the association of the criteria included in the Wuerzburg classification system of ventral and incisional hernias with postoperative complications and long-term recurrence. In a retrospective cohort study, the data on 330 consecutive patients who underwent surgery to repair ventral and incisional hernias were analyzed. The following four classification criteria were applied: (a) recurrence rating (ventral, incisional or incisional recurrent); (b) morphology (location); (c) size of the hernial gap; and (d) risk factors. The primary endpoint was the occurrence of a recurrence during follow-up. The secondary endpoint was the incidence of postoperative complications. Independent association between classification criteria, type of surgical procedures and postoperative complications was calculated by multivariate logistic regression analysis and between classification criteria, type of surgical procedures and risk of long-term recurrence by Cox regression analysis. Follow-up lasted a mean 47.7 ± 23.53 months (median 45 months) or 3.9 ± 1.96 years. The criterion "recurrence rating" was found to be a predictive factor for postoperative complications in the multivariate analysis (OR 2.04; 95 % CI 1.09-3.84; incisional vs. ventral hernia). The criterion "morphology" had influence neither on the incidence of the critical event "recurrence during follow-up" nor on the incidence of postoperative complications. Hernial gap "width" predicted postoperative complications in the multivariate analysis (OR 1.98; 95 % CI 1.19-3.29; ≤5 vs. >5 cm). Length of the hernial gap was found to be an independent prognostic factor for the critical event "recurrence during follow-up" (HR 2.05; 95 % CI 1.25-3.37; ≤5 vs. >5 cm). 
The presence of 3 or more risk factors was a consistent predictor for "recurrence during follow-up" (HR 2.25; 95 % CI 1.28-9.92). Mesh repair was an independent protective factor for "recurrence during follow-up" compared to suture (HR 0.53; 95 % CI 0.32-0.86). The ventral and incisional hernia classification of Dietz et al. employs a clinically proven terminology and has an open classification structure. Hernial gap size and the number of risk factors are independent predictors for "recurrence during follow-up", whereas recurrence rating and hernial gap size correlated significantly with the incidence of postoperative complications. We propose the application of these criteria for future clinical research, as larger patient numbers will be needed to refine the results.

From sensation to perception: Using multivariate classification of visual illusions to identify neural correlates of conscious awareness in space and time.

PubMed

Hogendoorn, Hinze

2015-01-01

An important goal of cognitive neuroscience is understanding the neural underpinnings of conscious awareness. Although the low-level processing of sensory input is well understood in most modalities, it remains a challenge to understand how the brain translates such input into conscious awareness. Here, I argue that the application of multivariate pattern classification techniques to neuroimaging data acquired while observers experience perceptual illusions provides a unique way to dissociate sensory mechanisms from mechanisms underlying conscious awareness. Using this approach, it is possible to directly compare patterns of neural activity that correspond to the contents of awareness, independent from changes in sensory input, and to track these neural representations over time at high temporal resolution. I highlight five recent studies using this approach, and provide practical considerations and limitations for future implementations.
Fast discrimination of hydroxypropyl methyl cellulose using portable Raman spectrometer and multivariate methods

NASA Astrophysics Data System (ADS)

Song, Biao; Lu, Dan; Peng, Ming; Li, Xia; Zou, Ye; Huang, Meizhen; Lu, Feng

2017-02-01

Raman spectroscopy is developed as a fast and non-destructive method for the discrimination and classification of hydroxypropyl methyl cellulose (HPMC) samples. Forty-four E series and 41 K series HPMC samples are measured by a self-developed portable Raman spectrometer (Hx-Raman), which is excited by a 785 nm diode laser and covers a spectral range of 200-2700 cm⁻¹ with a resolution (FWHM) of 6 cm⁻¹. Multivariate analysis is applied for discrimination of E series from K series. Using principal component analysis (PCA) and Fisher discriminant analysis (FDA), a discrimination result with sensitivity of 90.91% and specificity of 95.12% is achieved. The corresponding area under the receiver operating characteristic (ROC) curve is 0.99, indicating the accuracy of the predictive model. This result demonstrates the prospect of the portable Raman spectrometer for rapid, non-destructive classification and discrimination of E series and K series samples of HPMC.
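The reported sensitivity and specificity are consistent with simple counts over the 44 E series and 41 K series samples. The sketch below assumes that E series is the positive class and that 40 of 44 E series and 39 of 41 K series samples were assigned correctly. These counts are inferred from the percentages and are not stated in the abstract.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # true positives among all positives
    specificity = tn / (tn + fp)   # true negatives among all negatives
    return sensitivity, specificity


# Assumed counts: 40/44 E series (positive) and 39/41 K series (negative) correct.
sens, spec = sensitivity_specificity(tp=40, fn=4, tn=39, fp=2)
sens_pct = round(100 * sens, 2)
spec_pct = round(100 * spec, 2)
```

With samples this few, a single additional misclassification shifts either figure by more than two percentage points.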
Differences in chewing sounds of dry-crisp snacks by multivariate data analysis

NASA Astrophysics Data System (ADS)

De Belie, N.; Sivertsvik, M.; De Baerdemaeker, J.

2003-09-01

Chewing sounds of different types of dry-crisp snacks (two types of potato chips, prawn crackers, cornflakes and low calorie snacks from extruded starch) were analysed to assess differences in sound emission patterns. The emitted sounds were recorded by a microphone placed over the ear canal. The first bite and the first subsequent chew were selected from the time signal and a fast Fourier transformation provided the power spectra. Different multivariate analysis techniques were used for classification of the snack groups. These included principal component analysis (PCA) and unfold partial least-squares (PLS) algorithms, as well as multi-way techniques such as three-way PLS, three-way PCA (Tucker3), and parallel factor analysis (PARAFAC) on the first bite and subsequent chew. The models were evaluated by calculating the classification errors and the root mean square error of prediction (RMSEP) for independent validation sets. It appeared that the logarithm of the power spectra obtained from the chewing sounds could be used successfully to distinguish the different snack groups. When different chewers were used, recalibration of the models was necessary. Multi-way models distinguished better between chewing sounds of different snack groups than PCA on bite or chew separately and than unfold PLS. Of all the three-way models applied, N-PLS with three components showed the best classification capabilities, resulting in classification errors of 14-18%. Most of the incorrect classifications were due to one type of potato chips that had a very irregular shape, resulting in a wide variation of the emitted sounds.
A machine learning approach to identify functional biomarkers in human prefrontal cortex for individuals with traumatic brain injury using functional near-infrared spectroscopy.

PubMed

Karamzadeh, Nader; Amyot, Franck; Kenney, Kimbra; Anderson, Afrouz; Chowdhry, Fatima; Dashtestani, Hadis; Wassermann, Eric M; Chernomordik, Victor; Boccara, Claude; Wegman, Edward; Diaz-Arrastia, Ramon; Gandjbakhche, Amir H

2016-11-01

We have explored potential prefrontal hemodynamic biomarkers to characterize subjects with Traumatic Brain Injury (TBI) by employing a multivariate machine learning approach and introducing a novel task-related hemodynamic response detection followed by a heuristic search for the optimum set of hemodynamic features. To achieve this goal, the hemodynamic responses from a group of 31 healthy controls and 30 chronic TBI subjects were recorded as they performed a complexity task. To determine the optimum hemodynamic features, we considered 11 features and their combinations in characterizing TBI subjects. We investigated the significance of the features by utilizing a machine learning classification algorithm to score all the possible combinations of features according to their predictive power. The identified optimum feature elements resulted in classification accuracy, sensitivity, and specificity of 85%, 85%, and 84%, respectively. Classification improvement was achieved for TBI subject classification through feature combination. This signifies the major advantage of multivariate analysis over the commonly used univariate analysis, suggesting that features that are individually irrelevant in characterizing the data may become relevant when used in combination. We also conducted a spatio-temporal classification to identify regions within the prefrontal cortex (PFC) that contribute to distinguishing between TBI and healthy subjects. As expected, Brodmann area (BA) 10 within the PFC was isolated as the region in which healthy subjects (unlike subjects with TBI) showed major hemodynamic activity in response to the High Complexity task. Overall, our results indicate that the identified temporal and spatio-temporal features from the PFC's hemodynamic activity are promising biomarkers in classifying subjects with TBI.
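Scoring every combination of 11 features means evaluating 2^11 - 1 = 2047 non-empty subsets. The sketch below enumerates them and picks the best one under a scoring function. The feature names and the toy scorer are hypothetical stand-ins: in the study the score would come from a classifier's predictive power, which is not reproduced here. The toy scorer is built so that two features are useless alone but informative together, mirroring the point made in the abstract.

```python
from itertools import combinations


def all_subsets(features):
    """Yield every non-empty subset of the feature list as a tuple."""
    for k in range(1, len(features) + 1):
        yield from combinations(features, k)


def best_subset(features, score):
    """Exhaustively search all subsets and return the highest-scoring one."""
    return max(all_subsets(features), key=score)


# Eleven hypothetical feature names.
features = [f"f{i}" for i in range(1, 12)]
n_subsets = sum(1 for _ in all_subsets(features))


def toy_score(subset):
    """Hypothetical scorer: f2 and f7 help only jointly; size is penalized."""
    joint_bonus = 1.0 if {"f2", "f7"} <= set(subset) else 0.0
    return joint_bonus - 0.01 * len(subset)


best = best_subset(features, toy_score)
```

Exhaustive search is feasible at this size but doubles with every added feature, which is why larger feature sets usually call for a heuristic search.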
Dual-modal cancer detection based on optical pH sensing and Raman spectroscopy.

PubMed

Kim, Soogeun; Lee, Seung Ho; Min, Sun Young; Byun, Kyung Min; Lee, Soo Yeol

2017-10-01

A dual-modal approach using Raman spectroscopy and optical pH sensing was investigated to discriminate between normal and cancerous tissues. Raman spectroscopy has demonstrated the potential for in vivo cancer detection. However, Raman spectroscopy has suffered from the strong fluorescence background of biological samples and subtle spectral differences between normal and diseased tissues. To overcome those issues, pH sensing is combined with Raman spectroscopy as a dual-modal approach. Based on the fact that the pH level in cancerous tissues is lower than that in normal tissues due to insufficient vasculature formation, the dual-modal approach combining the chemical information of the Raman spectrum and the metabolic information of the pH level can improve the specificity of cancer diagnosis. From human breast tissue samples, Raman spectra and pH levels are measured using fiber-optic-based Raman and pH probes, respectively. The pH sensing is based on the dependence of the optical transmission spectrum on the pH level. Multivariate statistical analysis is performed to evaluate the classification capability of the dual-modal method. The analytical results show that the dual-modal method based on Raman spectroscopy and optical pH sensing can improve the performance of cancer classification. (2017) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE).
Application of ¹H NMR for the characterisation and authentication of "Tonda Gentile Trilobata" hazelnuts from Piedmont (Italy).

PubMed

Caligiani, Augusta; Coisson, Jean Daniel; Travaglia, Fabiano; Acquotti, Domenico; Palla, Gerardo; Palla, Luigi; Arlorio, Marco

2014-04-01

The Italian hazelnut (Corylus avellana L.) cultivar "Tonda Gentile Trilobata" (TGT) is covered by the protected geographical indication "Nocciola Piemonte" and is well-known as the best-suited hazelnut for industrial transformation into roasted kernels. Hazelnut cultivar identification is primarily based on morphological characteristics, so there is a need for more objective analytical methods for high quality hazelnut authentication. This study reports the ¹H NMR fingerprinting of raw and roasted hazelnuts, with the aim of obtaining hazelnut classification based on their spectroscopic pattern. ¹H NMR analyses were carried out on polar extracts of TGT and other cultivars; the data were analysed with multivariate statistical methods. Results showed that ¹H NMR combined with chemometrics is useful to characterise the hazelnuts as a function of the cultivars, in both raw and roasted form. The classification models allowed the identification of molecular markers useful to distinguish TGT from other types, among them trigonelline, amino acids and an unidentified ortho-disubstituted aromatic compound. Copyright © 2013 Elsevier Ltd. All rights reserved.
Discrimination and characterization of strawberry juice based on electronic nose and tongue: comparison of different juice processing approaches by LDA, PLSR, RF, and SVM.

PubMed

Qiu, Shanshan; Wang, Jun; Gao, Liping

2014-07-09

An electronic nose (E-nose) and an electronic tongue (E-tongue) have been used to characterize five types of strawberry juices based on processing approaches (i.e., microwave pasteurization, steam blanching, high temperature short time pasteurization, frozen-thawed, and freshly squeezed). Juice quality parameters (vitamin C, pH, total soluble solid, total acid, and sugar/acid ratio) were detected by traditional measuring methods. Multivariate statistical methods (linear discriminant analysis (LDA) and partial least squares regression (PLSR)) and machine learning methods (Random Forest (RF) and Support Vector Machines (SVM)) were employed for qualitative classification and quantitative regression. The E-tongue system reached higher accuracy rates than the E-nose did, and the simultaneous utilization did have an advantage in LDA classification and PLSR regression. According to cross-validation, RF has shown outstanding and indisputable performance in the qualitative and quantitative analysis. This work indicates that the simultaneous utilization of E-nose and E-tongue can discriminate processed fruit juices and predict quality parameters successfully for the beverage industry.
Classification of white wine aromas with an electronic nose.

PubMed

Lozano, J; Santos, J P; Horrillo, M C

2005-09-15

This paper reports the use of a tin dioxide multisensor array based electronic nose for recognition of 29 typical aromas in white wine. Headspace technique has been used to extract aroma of the wine. Multivariate analysis, including principal component analysis (PCA) as well as probabilistic neural networks (PNNs), has been used to identify the main aroma added to the wine. The results showed that in spite of the strong influence of ethanol and other majority compounds of wine, the system could discriminate correctly the aromatic compounds added to the wine with a minimum accuracy of 97.2%.
Objective classification of ecological status in marine water bodies using ecotoxicological information and multivariate analysis.

PubMed

Beiras, Ricardo; Durán, Iria

2014-12-01

Some relevant shortcomings have been identified in the current approach for the classification of ecological status in marine water bodies, leading to delays in the fulfillment of the Water Framework Directive objectives. Natural variability makes it difficult to set fixed reference values and boundary values for the Ecological Quality Ratios (EQR) for the biological quality elements. Biological responses to environmental degradation are frequently of a nonmonotonic nature, hampering the EQR approach. Community structure traits respond only once ecological damage has already been done and do not provide early warning signals. An alternative methodology for the classification of ecological status integrating chemical measurements, ecotoxicological bioassays and community structure traits (species richness and diversity), and using multivariate analyses (multidimensional scaling and cluster analysis), is proposed. This approach does not depend on the arbitrary definition of fixed reference values and EQR boundary values, and it is suitable to integrate nonlinear, sensitive signals of ecological degradation. As a disadvantage, this approach demands the inclusion of sampling sites representing the full range of ecological status in each monitoring campaign. National or international agencies in charge of coastal pollution monitoring have comprehensive data sets available to overcome this limitation.
Workshop on Algorithms for Time-Series Analysis

NASA Astrophysics Data System (ADS)

Protopapas, Pavlos

2012-04-01

This Workshop covered the four major subjects listed below in two 90-minute sessions. Each talk or tutorial allowed questions, and concluded with a discussion. Classification: Automatic classification using machine-learning methods is becoming a standard in surveys that generate large datasets. Ashish Mahabal (Caltech) reviewed various methods, and presented examples of several applications. Time-Series Modelling: Suzanne Aigrain (Oxford University) discussed autoregressive models and multivariate approaches such as Gaussian Processes. Meta-classification/mixture of expert models: Karim Pichara (Pontificia Universidad Católica, Chile) described the substantial promise which machine-learning classification methods are now showing in automatic classification, and discussed how the various methods can be combined together. Event Detection: Pavlos Protopapas (Harvard) addressed methods of fast identification of events with low signal-to-noise ratios, enlarging on the characterization and statistical issues of low signal-to-noise ratios and rare events.
Automated classification and visualization of healthy and pathological dental tissues based on near-infrared hyper-spectral imaging

NASA Astrophysics Data System (ADS)

Usenik, Peter; Bürmen, Miran; Vrtovec, Tomaž; Fidler, Aleš; Pernuš, Franjo; Likar, Boštjan

2011-03-01

Despite major improvements in dental healthcare and technology, dental caries remains one of the most prevalent chronic diseases of modern society. The initial stages of dental caries are characterized by demineralization of enamel crystals, commonly known as white spots which are difficult to diagnose. If detected early enough, such demineralization can be arrested and reversed by non-surgical means through well established dental treatments (fluoride therapy, anti-bacterial therapy, low intensity laser irradiation). Near-infrared (NIR) hyper-spectral imaging is a new promising technique for early detection of demineralization based on distinct spectral features of healthy and pathological dental tissues. In this study, we apply NIR hyper-spectral imaging to classify and visualize healthy and pathological dental tissues including enamel, dentin, calculus, dentin caries, enamel caries and demineralized areas. For this purpose, a standardized teeth database was constructed consisting of 12 extracted human teeth with different degrees of natural dental lesions imaged by NIR hyper-spectral system, X-ray and digital color camera. The color and X-ray images of teeth were presented to a clinical expert for localization and classification of the dental tissues, thereby obtaining the gold standard. Principal component analysis was used for multivariate local modeling of healthy and pathological dental tissues. Finally, the dental tissues were classified by employing multiple discriminant analysis. High agreement was observed between the resulting classification and the gold standard with the classification sensitivity and specificity exceeding 85 % and 97 %, respectively. This study demonstrates that NIR hyper-spectral imaging has considerable diagnostic potential for imaging hard dental tissues.
Climate Classification is an Important Factor in Assessing Hospital Performance Metrics

NASA Astrophysics Data System (ADS)

Boland, M. R.; Parhi, P.; Gentine, P.; Tatonetti, N. P.

2017-12-01

Context/Purpose: Climate is a known modulator of disease, but its impact on hospital performance metrics remains unstudied. Methods: We assess the relationship between Köppen-Geiger climate classification and hospital performance metrics, specifically 30-day mortality, as reported in Hospital Compare and collected for the period July 1, 2013 through June 30, 2014. A hospital-level multivariate linear regression analysis was performed while controlling for known socioeconomic factors to explore the relationship between all-cause mortality and climate. Hospital performance scores were obtained from 4,524 hospitals belonging to 15 distinct Köppen-Geiger climates and 2,373 unique counties. Results: Model results revealed that hospital performance metrics for mortality showed significant climate dependence (p<0.001) after adjusting for socioeconomic factors. Interpretation: Currently, hospitals are reimbursed by government agencies using 30-day mortality rates along with 30-day readmission rates. These metrics allow government agencies to rank hospitals according to their 'performance' along these metrics. Various socioeconomic factors are taken into consideration when determining an individual hospital's performance. However, no climate-based adjustment is made within the existing framework. Our results indicate that climate-based variability in 30-day mortality rates does exist even after socioeconomic confounder adjustment. Standardized high-level climate classification systems (such as Köppen-Geiger) would be useful to incorporate in future metrics. Conclusion: Climate is a significant factor in evaluating hospital 30-day mortality rates. These results demonstrate that climate classification is an important factor when comparing hospital performance across the United States.
Photoacoustic discrimination of vascular and pigmented lesions using classical and Bayesian methods

NASA Astrophysics Data System (ADS)

Swearingen, Jennifer A.; Holan, Scott H.; Feldman, Mary M.; Viator, John A.

2010-01-01

Discrimination of pigmented and vascular lesions in skin can be difficult due to factors such as size, subungual location, and the nature of lesions containing both melanin and vascularity. Misdiagnosis may lead to precancerous or cancerous lesions not receiving proper medical care. To aid in the rapid and accurate diagnosis of such pathologies, we develop a photoacoustic system to determine the nature of skin lesions in vivo. By irradiating skin with two laser wavelengths, 422 and 530 nm, we induce photoacoustic responses, and the relative response at these two wavelengths indicates whether the lesion is pigmented or vascular. This response is due to the distinct absorption spectrum of melanin and hemoglobin. In particular, pigmented lesions have ratios of photoacoustic amplitudes of approximately 1.4 to 1 at the two wavelengths, while vascular lesions have ratios of about 4.0 to 1. Furthermore, we consider two statistical methods for conducting classification of lesions: standard multivariate analysis classification techniques and a Bayesian-model-based approach. We study 15 human subjects with eight vascular and seven pigmented lesions. Using the classical method, we achieve a perfect classification rate, while the Bayesian approach has an error rate of 20%.
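The two-wavelength ratio rule described above can be sketched as a simple threshold classifier. This is an illustration only: the cutoff (the geometric mean of the two reported ratios), the function name, and the choice to divide the first-listed wavelength's amplitude by the second are all assumptions, not the authors' classical or Bayesian method.

```python
import math

PIGMENTED_RATIO = 1.4  # approximate amplitude ratio reported for pigmented lesions
VASCULAR_RATIO = 4.0   # approximate amplitude ratio reported for vascular lesions

# Assumed cutoff: geometric mean of the two reported ratios (not from the paper).
THRESHOLD = math.sqrt(PIGMENTED_RATIO * VASCULAR_RATIO)

def classify_lesion(amp_first, amp_second):
    """Label a lesion from photoacoustic amplitudes at the two wavelengths."""
    ratio = amp_first / amp_second
    return "vascular" if ratio > THRESHOLD else "pigmented"

print(classify_lesion(1.4, 1.0))  # ratio near the pigmented value
print(classify_lesion(4.0, 1.0))  # ratio near the vascular value
```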
Semi-supervised anomaly detection - towards model-independent searches of new physics

NASA Astrophysics Data System (ADS)

Kuusela, Mikael; Vatanen, Tommi; Malmi, Eric; Raiko, Tapani; Aaltonen, Timo; Nagai, Yoshikazu

2012-06-01

Most classification algorithms used in high energy physics fall under the category of supervised machine learning. Such methods require a training set containing both signal and background events and are prone to classification errors should this training data be systematically inaccurate for example due to the assumed MC model. To complement such model-dependent searches, we propose an algorithm based on semi-supervised anomaly detection techniques, which does not require a MC training sample for the signal data. We first model the background using a multivariate Gaussian mixture model. We then search for deviations from this model by fitting to the observations a mixture of the background model and a number of additional Gaussians. This allows us to perform pattern recognition of any anomalous excess over the background. We show by a comparison to neural network classifiers that such an approach is a lot more robust against misspecification of the signal MC than supervised classification. In cases where there is an unexpected signal, a neural network might fail to correctly identify it, while anomaly detection does not suffer from such a limitation. On the other hand, when there are no systematic errors in the training data, both methods perform comparably.
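The two-step idea above (model the background, then fit the observations with the fixed background plus additional Gaussians) can be sketched in one dimension with a single background Gaussian and a single extra component. This is a deliberately reduced illustration under those assumptions, not the multivariate mixture model of the paper; all names and the toy data are hypothetical.

```python
import math
import random
import statistics

def gauss_pdf(x, mu, sigma):
    """Density of a one-dimensional Gaussian."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def fit_excess(observed, bg_mu, bg_sigma, n_iter=200):
    """EM fit of p(x) = (1 - w) * background + w * N(mu, sigma), background held fixed."""
    w, mu, sigma = 0.1, max(observed), 1.0  # crude initial guess for the extra component
    for _ in range(n_iter):
        # E-step: probability that each observation belongs to the extra component.
        resp = []
        for x in observed:
            s = w * gauss_pdf(x, mu, sigma)
            b = (1.0 - w) * gauss_pdf(x, bg_mu, bg_sigma)
            resp.append(s / (s + b))
        total = sum(resp)
        if total == 0.0:
            break  # no support for an excess; keep the last estimates
        # M-step: update only the extra component and its weight.
        w = total / len(observed)
        mu = sum(r * x for r, x in zip(resp, observed)) / total
        var = sum(r * (x - mu) ** 2 for r, x in zip(resp, observed)) / total
        sigma = max(math.sqrt(var), 1e-3)
    return w, mu, sigma

rng = random.Random(0)
# Step 1: learn the background from a background-only (simulated) sample.
background_sample = [rng.gauss(0.0, 1.0) for _ in range(2000)]
bg_mu = statistics.mean(background_sample)
bg_sigma = statistics.stdev(background_sample)
# Step 2: fit the observations, which hide a 10% excess centred near 5.
observed = [rng.gauss(0.0, 1.0) for _ in range(900)]
observed += [rng.gauss(5.0, 0.5) for _ in range(100)]
w, mu, sigma = fit_excess(observed, bg_mu, bg_sigma)
print(f"excess fraction={w:.3f}, mean={mu:.2f}, width={sigma:.2f}")
```

Note that no labelled signal sample is used anywhere: only the background is modelled in advance, which is the sense in which the approach is semi-supervised.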
Job titles classified into socioeconomic and occupational groups identify subjects with increased risk for respiratory symptoms independent of occupational exposure to vapour, gas, dust, or fumes.

PubMed

Schyllert, Christian; Andersson, Martin; Hedman, Linnea; Ekström, Magnus; Backman, Helena; Lindberg, Anne; Rönmark, Eva

2018-01-01

Objectives: To evaluate the ability of three different job title classification systems to identify subjects at risk for respiratory symptoms and asthma while also taking the effect of exposure to vapours, gas, dust, and fumes (VGDF) into account. Background: Respiratory symptoms and asthma may be caused by occupational factors. There are different ways to classify occupational exposure. In this study, self-reported occupational exposure to vapours, gas, dust and fumes was used as well as job titles classified into occupational and socioeconomic groups according to three different systems. Design: This was a large population-based study of adults aged 30-69 years in Northern Sweden (n = 9,992, 50% women). Information on job titles, VGDF-exposure, smoking habits, asthma and respiratory symptoms was collected by a postal survey. Job titles were used for classification into socioeconomic and occupational groups based on three classification systems: Socioeconomic classification (SEI), the Nordic Occupations Classification 1983 (NYK), and the Swedish Standard Classification of Occupations 2012 (SSYK). Associations were analysed by multivariable logistic regression. Results: Occupational exposure to VGDF was a risk factor for all respiratory symptoms and asthma (odds ratios (ORs) 1.3-2.4). Productive cough was associated with the socioeconomic groups of manual workers (ORs 1.5-2.1) and non-manual employees (ORs 1.6-1.9). These groups include occupations such as construction and transportation workers, service workers, nurses, teachers and administration clerks which by the SSYK classification were associated with productive cough (ORs 2.4-3.7). Recurrent wheeze was significantly associated with the SEI group manual workers (ORs 1.5-1.7). After adjustment also for VGDF, productive cough remained significantly associated with the SEI groups manual workers in service and non-manual employees, and the SSYK-occupational groups administration, service, and elementary occupations. Conclusions: In this cross-sectional study, two of the three classification systems, SSYK and SEI, gave similar results and identified groups with increased risk for respiratory symptoms, while NYK did not give conclusive results. Furthermore, several associations were independent of exposure to VGDF, indicating that job-related factors other than VGDF are also of importance.
Pyelocaliceal Distribution of Kidney Stones Used as an Outcome Predictor in Percutaneous Nephrolithotomy After Being Evaluated with Preoperative and Postoperative CT Scan.

PubMed

Tirapegui, Federico Ignacio; González, Mariano Sebastian; González, Ignacio Pablo Tobía; Daels, Francisco P

2015-06-01

To identify kidney stone characteristics that determine either success or failure of a percutaneous nephrolithotomy (PCNL) and design a classification system to predict results according to these characteristics. One hundred thirty-eight patients were assessed with multislice abdominal and pelvic CT before and after PCNL. With regard to pyelocaliceal stone distribution, we classified our patients into two groups that we called "no extra stone in middle calix" (NESMC) and "extra stone in middle calix" (ESMC), according to the difficulty in reaching the stones. We did a univariate and a multivariate analysis, as well as a receiver operating characteristic (ROC) curve of the proposed classification, based on the predicted probabilities, to determine the diagnostic yield. Global residual lithiasis (RL) was 26.08%. The proportion of patients with RL according to classification was NESMC 11.5% and ESMC 59.5%. In the univariate logistic regression analysis of the distribution, number, total volumetry, side, type, radio-opacity of stones, and the presence or absence of preoperative urinary tract infection, the variables related to RL were the distribution (odds ratio [OR] 11.3; 95% confidence interval [95% CI] 4.7, 27.4), volumetry (OR 1.01; 95% CI 1.004, 1.014), and the presence of staghorn stones (OR 6.64; 95% CI 2.463, 17.905). In the multivariate analysis, distribution was statistically significant (OR 8.687; 95% CI 2.69, 28.06), whereas total volumetry and the presence of staghorn stones were not (OR 1; 95% CI 1.000, 1.000 and OR 2.7; 95% CI 0.35, 20.57, respectively). The ROC showed an area under the curve of 0.77. In our experience, the distribution of kidney stones is the most important predictor of RL after PCNL. The results also suggest that the presence of stones in the middle calix has a direct impact on the stone-free rate. We put forward a simple and reproducible classification, easy to apply, and useful to estimate the chances of success of the procedure using preoperative CT scans.
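The residual lithiasis proportions quoted above (NESMC 11.5%, ESMC 59.5%) imply a crude odds ratio that can be checked against the univariate value of 11.3 reported for stone distribution. The short calculation below is a worked check using only those two rounded percentages; the function name is an illustrative choice.

```python
def odds_ratio(p_group_a, p_group_b):
    """Odds ratio of an outcome in group A relative to group B, from proportions."""
    odds_a = p_group_a / (1.0 - p_group_a)
    odds_b = p_group_b / (1.0 - p_group_b)
    return odds_a / odds_b

# Residual lithiasis proportions from the abstract: ESMC 59.5%, NESMC 11.5%.
or_distribution = odds_ratio(0.595, 0.115)
print(round(or_distribution, 1))  # 11.3
```

Agreement is expected here because a logistic regression with a single binary predictor reproduces the crude odds ratio of the corresponding two-by-two table.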
Alterations of functional connectivities from early to middle adulthood: Clues from multivariate pattern analysis of resting-state fMRI data.

PubMed

Tian, Lixia; Ma, Lin; Wang, Linlin

2016-04-01

In contrast to extended research interests in the maturation and aging of human brain, alterations of brain structure and function from early to middle adulthood have been much less studied. The aim of the present study was to investigate the extent and pattern of the alterations of functional interactions between brain regions from early to middle adulthood. We carried out the study by multivariate pattern analysis of resting-state fMRI (RS-fMRI) data of 63 adults aged 18 to 45 years. Specifically, using elastic net, we performed brain age estimation and age-group classification (young adults aged 18-28 years vs. middle-aged adults aged 35-45 years) based on the resting-state functional connectivities (RSFCs) between 160 regions of interest (ROIs) evaluated on the RS-fMRI data of each subject. The results indicate that the estimated brain ages were significantly correlated with the chronological age (R=0.78, MAE=4.81), and a classification rate of 94.44% and area under the receiver operating characteristic curve (AUC) of 0.99 were obtained when classifying the young and middle-aged adults. These results provide strong evidence that functional interactions between brain regions undergo notable alterations from early to middle adulthood. By analyzing the RSFCs that contribute to brain age estimation/age-group classification, we found that a majority of the RSFCs were inter-network, and we speculate that inter-network RSFCs might mature late but age early as compared to intra-network ones. In addition, the strengthening/weakening of the RSFCs associated with the left/right hemispheric ROIs, the weakening of cortico-cerebellar RSFCs and the strengthening of the RSFCs between the default mode network and other networks contributed much to both brain age estimation and age-group classification. All these alterations might reflect that aging of brain function is already in progress in middle adulthood. 
Overall, the present study indicated that the RSFCs undergo notable alterations from early to middle adulthood and highlighted the necessity of careful considerations of possible influences of these alterations in related studies. Copyright © 2016 Elsevier Inc. All rights reserved.
Study of the questionnaire of the Polytechnic University of Valencia (UPV) teaching staff, using students opinion survey. Statistical treatment

NASA Astrophysics Data System (ADS)

Martinez Gomez, Monica

Quality improvement of university institutions represents the most important challenge in the next years, and the potential tool to achieve it is based on the institutional evaluation in general, and specially the evaluation of the teaching performance. The opinion questionnaire from the students is the most generalised tool used to evaluate the teaching performance at Spanish universities. The general objective of this thesis is to develop a statistical methodology suitable to extract, analyse and interpret the information contained in the Questionnaire of Teaching Evaluation from Student Opinion (CEDA) of the UPV, aimed at optimising its practical use. The study is centred in the application of different multivariate techniques and has been structured in three parts: (1) Evaluation of the reliability, validity and dimensionality of the tool. The multivariate method used for this purpose is the Factorial Analysis. (2) Determination of the capacity of the questionnaire to identify different profiles of lecturers based on the quality perceived by students. This target is conducted with different multivariate classification techniques: hierarchical cluster analysis, non-hierarchical and two-stage analysis. Moreover, those items that best discriminate among the teaching typologies obtained are identified in the questionnaire. (3) Identification of the teaching typologies according to different descriptive characteristics referent to the subject and lecturer, with the use of decision trees. Once identified these typologies, a new discriminant analysis is conducted aimed at identifying those items that best characterise each typology. Finally, a study is carried out with the classification method SIMCA (Soft Independent Modelling of Class Analogy) in order to determine the discriminant loading of every item among the identified teaching typologies, allowing the identification of those that best distinguish the different classes obtained. 
With the combined use of the proposed techniques, it is expected to optimise the use of CEDA as a measuring tool and an indicator of the teaching quality at the university, that would allow the introduction of actions for the continuous improvement in the teaching processes of the UPV.
Evaluation of the 7th edition of the UICC-AJCC tumor, node, metastasis classification for esophageal cancer in a Chinese cohort.

PubMed

Huang, Yan; Guo, Weigang; Shi, Shiming; He, Jian

2016-07-01

To assess and evaluate the prognostic value of the 7th edition of the Union for International Cancer Control-American Joint Committee on Cancer (UICC-AJCC) tumor, node, metastasis (TNM) staging system for Chinese patients with esophageal cancer in comparison with the 6th edition. A retrospective review was performed on 766 consecutive esophageal cancer patients treated with esophagectomy between 2008 and 2012. Patients were staged according to the 6th and 7th editions for esophageal cancer respectively. Survival was calculated by the Kaplan-Meier method, and multivariate analysis was performed using the Cox regression model. Overall 3-year survival rate was 59.5%. There were significant differences in 3-year survival rates among T stages according to both the 6th edition and the 7th edition (P<0.001). According to the 7th edition, the 3-year survival rates of N0 (75.4%), N1 (65.2%), N2 (39.7%) and N3 (27.3%) patients differed significantly (P<0.001). The Kaplan-Meier curves revealed a good discriminatory ability from stage I to IV, except for stage IB, IIA and IIB in the 7th edition staging system. Based on the 7th edition, the degree of differentiation, tumor length and tumor location were not independent prognostic factors on multivariate analysis. The multivariate analyses suggested that pT-, pN-, and pTNM-category were all independent prognostic factors based on the 6th and 7th edition staging systems. The 7th edition of the AJCC TNM staging system of esophageal cancer should discriminate pT2-3N0M0 (stage IB, IIA and IIB) better when considering esophageal squamous cell cancer patients. Therefore, to improve and optimize the AJCC TNM classification for Chinese patients with esophageal cancer, more consideration should be given to the value of tumor grade and tumor location in pT2-3N0M0 esophageal squamous cell cancer in the next TNM staging system.
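For readers unfamiliar with the Kaplan-Meier method named above, a minimal product-limit estimator can be sketched as follows. The follow-up times and event flags are invented toy values, not data from the cohort, and the function name is an illustrative choice.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct time with an observed death.

    times:  follow-up duration for each patient
    events: 1 if death was observed at that time, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk  # product-limit update
            curve.append((t, survival))
        at_risk -= removed  # deaths and censored patients both leave the risk set
    return curve

# Toy follow-up in months; 0 marks a censored patient.
curve = kaplan_meier([6, 12, 12, 20, 30, 36], [1, 1, 0, 1, 0, 0])
for t, s in curve:
    print(t, round(s, 3))
```

A stage-specific comparison such as the one in the abstract would apply the estimator separately to each stage group and compare the resulting curves, for example with a log-rank test.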
Multivariate pattern analysis of obsessive-compulsive disorder using structural neuroanatomy.

PubMed

Hu, Xinyu; Liu, Qi; Li, Bin; Tang, Wanjie; Sun, Huaiqiang; Li, Fei; Yang, Yanchun; Gong, Qiyong; Huang, Xiaoqi

2016-02-01

Magnetic resonance imaging (MRI) studies have revealed brain structural abnormalities in obsessive-compulsive disorder (OCD) patients, involving both gray matter (GM) and white matter (WM). However, the results of previous publications were based on average differences between groups, which limited their usages in clinical practice. Therefore, the aim of this study was to examine whether the application of multivariate pattern analysis (MVPA) to high-dimensional structural images would allow accurate discrimination between OCD patients and healthy control subjects (HCS). High-resolution T1-weighted images were acquired from 33 OCD patients and 33 demographically matched HCS in a 3.0 T scanner. Differences in GM and WM volume between OCD and HCS were examined using two types of well-established MVPA techniques: support vector machine (SVM) and Gaussian process classifier (GPC). We also drew a receiver operating characteristic (ROC) curve to evaluate the performance of each classifier. The classification accuracies for both classifiers using GM and WM anatomy were all above 75%. The highest classification accuracy (81.82%, P<0.001) was achieved with the SVM classifier using WM information. Regional brain anomalies with high discriminative power were based on three distributed networks including the fronto-striatal circuit, the temporo-parieto-occipital junction and the cerebellum. Our study illustrated that both GM and WM anatomical features may be useful in differentiating OCD patients from HCS. WM volume using the SVM approach showed the highest accuracy in our population for revealing group differences, which suggested its potential diagnostic role in detecting highly enriched OCD patients at the level of the individual. Copyright © 2015 Elsevier B.V. and ECNP. All rights reserved.

Development and validation of a Partial Least Squares-Discriminant Analysis (PLS-DA) model based on the determination of ethyl glucuronide (EtG) and fatty acid ethyl esters (FAEEs) in hair for the diagnosis of chronic alcohol abuse.

PubMed

Alladio, E; Giacomelli, L; Biosa, G; Corcia, D Di; Gerace, E; Salomone, A; Vincenti, M

2018-01-01

The chronic intake of an excessive amount of alcohol is currently ascertained by determining the concentration of direct alcohol metabolites in the hair samples of the alleged abusers, including ethyl glucuronide (EtG) and, less frequently, fatty acid ethyl esters (FAEEs). Indirect blood biomarkers of alcohol abuse are still determined to support hair EtG results and diagnose a consequent liver impairment. In the present study, the supporting role of hair FAEEs is compared with indirect blood biomarkers with respect to the contexts in which hair EtG interpretation is uncertain. Receiver Operating Characteristics (ROC) curves and multivariate Principal Component Analysis (PCA) demonstrated much stronger correlation of EtG results with FAEEs than with any single indirect biomarker or their combinations. Partial Least Squares Discriminant Analysis (PLS-DA) models based on hair EtG and FAEEs were developed to maximize the biomarkers' information content on a multivariate background. The final PLS-DA model yielded 100% correct classification on a training/evaluation dataset of 155 subjects, including both chronic alcohol abusers and social drinkers. Then, the PLS-DA model was validated on an external dataset of 81 individuals, providing optimal discrimination ability between chronic alcohol abusers and social drinkers, in terms of specificity and sensitivity. The PLS-DA scores obtained for each subject, with respect to the PLS-DA model threshold that separates the probabilistic distributions for the two classes, furnished a likelihood ratio value, which in turn conveys the strength of the experimental data support to the classification decision, within a Bayesian logic. Typical boundary real cases from daily work are discussed, too. Copyright © 2017 Elsevier B.V. All rights reserved.
Prognostic value of Ki-67 index in adult medulloblastoma after accounting for molecular subgroup: a retrospective clinical and molecular analysis.

PubMed

Zhao, Fu; Zhang, Jing; Li, Peng; Zhou, Qiangyi; Zhang, Shun; Zhao, Chi; Wang, Bo; Yang, Zhijun; Li, Chunde; Liu, Pinan

2018-04-23

Medulloblastoma (MB) is a rare primary brain tumor in adults. We previously showed that combining both clinical and molecular classification could improve current risk stratification for adult MB. In this study, we aimed to identify the prognostic value of Ki-67 index in adult MB. Ki-67 index of 51 primary adult MBs was reassessed using a computer-based image analysis (Image-Pro Plus). All patients were followed up for periods ranging from 12 months up to 15 years. Gene expression profiling and immunochemistry were used to establish the molecular subgroups in adult MB. Combined risk stratification models were designed based on clinical characteristics, molecular classification and Ki-67 index, and identified by multivariable Cox proportional hazards analysis. In our cohort, the mean Ki-67 value was 30.0 ± 11.3% (range 6.56-63.55%). The average Ki-67 value was significantly higher in LC/AMB than in CMB and DNMB (P = .001). Among three molecular subgroups, Group 4-tumors had the highest average Ki-67 value compared with WNT- and SHH-tumors (P = .004). Patients with Ki-67 index larger than 30% displayed poorer overall survival (OS) and progression free survival (PFS) than those with Ki-67 less than 30% (OS: P = .001; PFS: P = .006). Ki-67 index (i.e. > 30%, < 30%) was identified as an independent significant prognostic factor (OS: P = .017; PFS: P = .024) by using the multivariate Cox proportional hazards model. In conclusion, Ki-67 index can be considered as a valuable independent prognostic biomarker for adult patients with MB.
Application of a MRI based index to longitudinal atrophy change in Alzheimer disease, mild cognitive impairment and healthy older individuals in the AddNeuroMed cohort

PubMed Central

Aguilar, Carlos; Muehlboeck, J-Sebastian; Mecocci, Patrizia; Vellas, Bruno; Tsolaki, Magda; Kloszewska, Iwona; Soininen, Hilkka; Lovestone, Simon; Wahlund, Lars-Olof; Simmons, Andrew; Westman, Eric

2014-01-01

Cross sectional studies of patients at risk of developing Alzheimer disease (AD) have identified several brain regions known to be prone to degeneration suitable as biomarkers, including hippocampal, ventricular, and whole brain volume. The aim of this study was to longitudinally evaluate an index based on morphometric measures derived from MRI data that could be used for classification of AD and healthy control subjects, as well as prediction of conversion from mild cognitive impairment (MCI) to AD. Patients originated from the AddNeuroMed project at baseline (119 AD, 119 MCI, 110 controls (CTL)) and 1-year follow-up (62 AD, 73 MCI, 79 CTL). Data consisted of 3D T1-weighted MR images, demographics, MMSE, ADAS-Cog, CERAD and CDR scores, and APOE e4 status. We computed an index using a multivariate classification model (AD vs. CTL), using orthogonal partial least squares to latent structures (OPLS). Sensitivity, specificity and AUC were determined. Performance of the classifier (AD vs. CTL) was high at baseline (10-fold cross-validation, 84% sensitivity, 91% specificity, 0.93 AUC) and at 1-year follow-up (92% sensitivity, 74% specificity, 0.93 AUC). Predictions of conversion from MCI to AD were good at baseline (77% of MCI converters) and at follow-up (91% of MCI converters). MCI carriers of the APOE e4 allele manifested more atrophy and presented a faster cognitive decline when compared to non-carriers. The derived index demonstrated a steady increase in atrophy over time, yielding higher accuracy in prediction at the time of clinical conversion. Neuropsychological tests appeared less sensitive to changes over time. However, taking the average of the two time points yielded better correlation between the index and cognitive scores as opposed to using cross-sectional data only. Thus, multivariate classification seemed to detect patterns of AD changes before conversion from MCI to AD and including longitudinal information is of great importance. PMID:25071554
Combination of multivariate curve resolution and multivariate classification techniques for comprehensive high-performance liquid chromatography-diode array absorbance detection fingerprints analysis of Salvia reuterana extracts.

PubMed

Hakimzadeh, Neda; Parastar, Hadi; Fattahi, Mohammad

2014-01-24

In this study, multivariate curve resolution (MCR) and multivariate classification methods are proposed to develop a new chemometric strategy for comprehensive analysis of high-performance liquid chromatography-diode array absorbance detection (HPLC-DAD) fingerprints of sixty Salvia reuterana samples from five different geographical regions. Different chromatographic problems occurred during HPLC-DAD analysis of S. reuterana samples, such as baseline/background contribution and noise, low signal-to-noise ratio (S/N), asymmetric peaks, elution time shifts, and peak overlap are handled using the proposed strategy. In this way, chromatographic fingerprints of sixty samples are properly segmented to ten common chromatographic regions using local rank analysis and then, the corresponding segments are column-wise augmented for subsequent MCR analysis. Extended multivariate curve resolution-alternating least squares (MCR-ALS) is used to obtain pure component profiles in each segment. In general, thirty-one chemical components were resolved using MCR-ALS in sixty S. reuterana samples and the lack of fit (LOF) values of MCR-ALS models were below 10.0% in all cases. Pure spectral profiles are considered for identification of chemical components by comparing their resolved spectra with the standard ones and twenty-four components out of thirty-one components were identified. Additionally, pure elution profiles are used to obtain relative concentrations of chemical components in different samples for multivariate classification analysis by principal component analysis (PCA) and k-nearest neighbors (kNN). Inspection of the PCA score plot (explaining 76.1% of variance accounted for three PCs) showed that S. reuterana samples belong to four clusters. The degree of class separation (DCS) which quantifies the distance separating clusters in relation to the scatter within each cluster is calculated for four clusters and it was in the range of 1.6-5.8. 
These results are then confirmed by kNN. In addition, according to the PCA loading plot and kNN dendrogram of thirty-one variables, five chemical constituents of luteolin-7-o-glucoside, salvianolic acid D, rosmarinic acid, lithospermic acid and trijuganone A are identified as the most important variables (i.e., chemical markers) for clusters discrimination. Finally, the effect of different chemical markers on samples differentiation is investigated using counter-propagation artificial neural network (CP-ANN) method. It is concluded that the proposed strategy can be successfully applied for comprehensive analysis of chromatographic fingerprints of complex natural samples. Copyright © 2013 Elsevier B.V. All rights reserved.
Proposal for a new risk stratification classification for meningioma based on patient age, WHO tumor grade, size, localization, and karyotype

PubMed Central

Domingues, Patrícia Henriques; Sousa, Pablo; Otero, Álvaro; Gonçalves, Jesus Maria; Ruiz, Laura; de Oliveira, Catarina; Lopes, Maria Celeste; Orfao, Alberto; Tabernero, Maria Dolores

2014-01-01

Background Tumor recurrence remains the major clinical complication of meningiomas, the majority of recurrences occurring among WHO grade I/benign tumors. In the present study, we propose a new scoring system for the prognostic stratification of meningioma patients based on analysis of a large series of meningiomas followed for a median of >5 years. Methods Tumor cytogenetics were systematically investigated by interphase fluorescence in situ hybridization in 302 meningioma samples, and the proposed classification was further validated in an independent series of cases (n = 132) analyzed by high-density (500K) single-nucleotide polymorphism (SNP) arrays. Results Overall, we found an adverse impact on patient relapse-free survival (RFS) for males, presence of brain edema, younger patients (<55 years), tumor size >50 mm, tumor localization at intraventricular and anterior cranial base areas, WHO grade II/III meningiomas, and complex karyotypes; the latter 5 variables showed an independent predictive value in multivariate analysis. Based on these parameters, a prognostic score was established for each individual case, and patients were stratified into 4 risk categories with significantly different (P < .001) outcomes. These included a good prognosis group, consisting of approximately 20% of cases, that showed a RFS of 100% ± 0% at 10 years and a very poor-prognosis group with a RFS rate of 0% ± 0% at 10 years. The prognostic impact of the scoring system proposed here was also retained when WHO grade I cases were considered separately (P < .001). Conclusions Based on this risk-stratification classification, different strategies may be adopted for follow-up, and eventually also for treatment, of meningioma patients at different risks for relapse. PMID:24536048
Increased prognostic accuracy of TBI when a brain electrical activity biomarker is added to loss of consciousness (LOC).

PubMed

Hack, Dallas; Huff, J Stephen; Curley, Kenneth; Naunheim, Roseanne; Ghosh Dastidar, Samanwoy; Prichep, Leslie S

2017-07-01

Extremely high accuracy for predicting CT+ traumatic brain injury (TBI) using a quantitative EEG (QEEG) based multivariate classification algorithm was demonstrated in an independent validation trial, in Emergency Department (ED) patients, using an easy to use handheld device. This study compares the predictive power using that algorithm (which includes LOC and amnesia), to the predictive power of LOC alone or LOC plus traumatic amnesia. ED patients 18-85 years presenting within 72 h of closed head injury, with GCS 12-15, were study candidates. 680 patients with known absence or presence of LOC were enrolled (145 CT+ and 535 CT- patients). 5-10 min of eyes closed EEG was acquired using the Ahead 300 handheld device, from frontal and frontotemporal regions. The same classification algorithm methodology was used for both the EEG based and the LOC based algorithms. Predictive power was evaluated using area under the ROC curve (AUC) and odds ratios. The QEEG based classification algorithm demonstrated significant improvement in predictive power compared with LOC alone, both in improved AUC (83% improvement) and odds ratio (increase from 4.65 to 16.22). Adding RGA and/or PTA to LOC did not improve on LOC alone. Rapid triage of TBI relies on strong initial predictors. Addition of an electrophysiological based marker was shown to outperform report of LOC alone or LOC plus amnesia, in determining risk of an intracranial bleed. In addition, ease of use at point-of-care, non-invasiveness, and rapid results using such technology suggest significant value added to standard clinical prediction. Copyright © 2017 Elsevier Inc. All rights reserved.
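The area under the ROC curve used above as a measure of predictive power equals the probability that a randomly chosen positive case scores higher than a randomly chosen negative case. A minimal sketch of that pairwise definition follows; the classifier scores are hypothetical and unrelated to the trial data.

```python
def auc(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical classifier outputs for CT+ and CT- patients.
ct_positive = [0.9, 0.8, 0.4]
ct_negative = [0.7, 0.3, 0.2, 0.1]
print(round(auc(ct_positive, ct_negative), 4))  # 11 of 12 pairs ranked correctly
```

The pairwise loop is quadratic in the number of patients, which is acceptable for illustration; a rank-based formulation gives the same value more efficiently on large samples.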
REGIONAL-SCALE WIND FIELD CLASSIFICATION EMPLOYING CLUSTER ANALYSIS

DOE Office of Scientific and Technical Information (OSTI.GOV)

Glascoe, L G; Glaser, R E; Chin, H S

2004-06-17

The classification of time-varying multivariate regional-scale wind fields at a specific location can assist event planning as well as consequence and risk analysis. Further, wind field classification involves data transformation and inference techniques that effectively characterize stochastic wind field variation. Such a classification scheme is potentially useful for addressing overall atmospheric transport uncertainty and meteorological parameter sensitivity issues. Different methods to classify wind fields over a location include the principal component analysis of wind data (e.g., Hardy and Walton, 1978) and the use of cluster analysis for wind data (e.g., Green et al., 1992; Kaufmann and Weber, 1996). The goal of this study is to use a clustering method to classify the winds of a gridded data set, i.e., from meteorological simulations generated by a forecast model.
Resolution of co-eluting compounds of Cannabis Sativa in comprehensive two-dimensional gas chromatography/mass spectrometry detection with Multivariate Curve Resolution-Alternating Least Squares.

PubMed

Omar, Jone; Olivares, Maitane; Amigo, José Manuel; Etxebarria, Nestor

2014-04-01

Comprehensive Two Dimensional Gas Chromatography - Mass Spectrometry (GC × GC/qMS) analysis of Cannabis sativa extracts shows a high complexity due to the large variety of terpenes and cannabinoids and to the fact that the complete resolution of the peaks is not straightforwardly achieved. In order to support the resolution of the co-eluted peaks in the sesquiterpene and the cannabinoid chromatographic region the combination of Multivariate Curve Resolution and Alternating Least Squares algorithms was satisfactorily applied. As a result, four co-eluting areas were totally resolved in the sesquiterpene region and one in the cannabinoid region in different samples of Cannabis sativa. The comparison of the mass spectral profiles obtained for each resolved peak with theoretical mass spectra allowed the identification of some of the co-eluted peaks. Finally, the classification of the studied samples was achieved based on the relative concentrations of the resolved peaks. Copyright © 2014 Elsevier B.V. All rights reserved.
Multivariate Pattern Classification of Facial Expressions Based on Large-Scale Functional Connectivity.

PubMed

Liang, Yin; Liu, Baolin; Li, Xianglin; Wang, Peiyuan

2018-01-01

It is an important question how human beings achieve efficient recognition of others' facial expressions in cognitive neuroscience, and it has been identified that specific cortical regions show preferential activation to facial expressions in previous studies. However, the potential contributions of the connectivity patterns in the processing of facial expressions remained unclear. The present functional magnetic resonance imaging (fMRI) study explored whether facial expressions could be decoded from the functional connectivity (FC) patterns using multivariate pattern analysis combined with machine learning algorithms (fcMVPA). We employed a block design experiment and collected neural activities while participants viewed facial expressions of six basic emotions (anger, disgust, fear, joy, sadness, and surprise). Both static and dynamic expression stimuli were included in our study. A behavioral experiment after scanning confirmed the validity of the facial stimuli presented during the fMRI experiment with classification accuracies and emotional intensities. We obtained whole-brain FC patterns for each facial expression and found that both static and dynamic facial expressions could be successfully decoded from the FC patterns. Moreover, we identified the expression-discriminative networks for the static and dynamic facial expressions, which span beyond the conventional face-selective areas. Overall, these results reveal that large-scale FC patterns may also contain rich expression information to accurately decode facial expressions, suggesting a novel mechanism, which includes general interactions between distributed brain regions, and that contributes to the human facial expression recognition.
Multivariate Pattern Classification of Facial Expressions Based on Large-Scale Functional Connectivity

PubMed Central

Liang, Yin; Liu, Baolin; Li, Xianglin; Wang, Peiyuan

2018-01-01

It is an important question how human beings achieve efficient recognition of others’ facial expressions in cognitive neuroscience, and it has been identified that specific cortical regions show preferential activation to facial expressions in previous studies. However, the potential contributions of the connectivity patterns in the processing of facial expressions remained unclear. The present functional magnetic resonance imaging (fMRI) study explored whether facial expressions could be decoded from the functional connectivity (FC) patterns using multivariate pattern analysis combined with machine learning algorithms (fcMVPA). We employed a block design experiment and collected neural activities while participants viewed facial expressions of six basic emotions (anger, disgust, fear, joy, sadness, and surprise). Both static and dynamic expression stimuli were included in our study. A behavioral experiment after scanning confirmed the validity of the facial stimuli presented during the fMRI experiment with classification accuracies and emotional intensities. We obtained whole-brain FC patterns for each facial expression and found that both static and dynamic facial expressions could be successfully decoded from the FC patterns. Moreover, we identified the expression-discriminative networks for the static and dynamic facial expressions, which span beyond the conventional face-selective areas. Overall, these results reveal that large-scale FC patterns may also contain rich expression information to accurately decode facial expressions, suggesting a novel mechanism, which includes general interactions between distributed brain regions, and that contributes to the human facial expression recognition. PMID:29615882
Magnetic Resonance Imaging Findings Predict the Recurrence of Chronic Subdural Hematoma

PubMed Central

GOTO, Haruo; ISHIKAWA, Osamu; NOMURA, Masashi; TANAKA, Kentaro; NOMURA, Seiji; MAEDA, Keiichiro

2015-01-01

The exact predictive factors for postoperative recurrence of chronic subdural hematoma (CSDH) are still unknown. Based on preoperative magnetic resonance imaging (MRI), a low recurrence rate of T1-hyperintensity hematoma was previously reported. We investigated the other types of radiological findings that are related to the recurrence rate of CSDH in a large number of patients, analyzed with a multivariate logistic regression model. Preoperative MRI and postoperative computed tomography (CT) were performed, and the influence of the preoperative use of antiplatelet or anticoagulant drugs was also studied. The overall recurrence rate was 9.3% (47 of 505 hematomas). The MRI T1-iso/hypointensity group showed a significantly higher recurrence rate (18.2%, 29 of 159) compared to the other groups (5.2%, 18 of 346; p < 0.001). Multivariate logistic regression analysis showed that T1 classification was the sole significant prognostic predictor among various factors such as bilateral hematoma, antiplatelet or anticoagulant drug usage, residual hematoma on postoperative CT, and MRI classification (p < 0.001): the adjusted odds ratio for recurrence in the T1-iso/hypointensity group relative to the T1-hyperintensity group was 5.58 [95% confidence interval (CI), 2.09–14.86] (p = 0.001). Postoperative residual hematoma and antiplatelet or anticoagulant drug usage did not increase the recurrence risk. The preoperative MRI findings, especially T1WI findings, have predictive value for postoperative recurrence of CSDH, and the T1-iso/hypointensity group can be assumed to be a high recurrence risk group. PMID:25746312
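The recurrence rates quoted above can be reproduced directly from the reported counts, and the same counts give a crude (unadjusted) odds ratio. The Python sketch below is only a worked piece of arithmetic; the function names are illustrative. The crude value is not comparable with the adjusted odds ratio of 5.58, which comes from a multivariate model and uses the T1-hyperintensity group, not all other groups, as its reference.

```python
def recurrence_rate(events, total):
    """Proportion of hematomas that recurred."""
    return events / total


def crude_odds_ratio(events_a, total_a, events_b, total_b):
    """Unadjusted odds ratio of group A relative to group B."""
    odds_a = events_a / (total_a - events_a)
    odds_b = events_b / (total_b - events_b)
    return odds_a / odds_b


# Counts as reported in the abstract.
overall = recurrence_rate(47, 505)
iso_hypo = recurrence_rate(29, 159)
other = recurrence_rate(18, 346)
crude_or = crude_odds_ratio(29, 159, 18, 346)

print(f"overall {overall:.1%}, T1-iso/hypo {iso_hypo:.1%}, other {other:.1%}")
print(f"crude odds ratio = {crude_or:.2f}")
```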
Comparison of the 7th and proposed 8th editions of the AJCC/UICC TNM staging system for non-small cell lung cancer undergoing radical surgery.

PubMed

Jin, Ying; Chen, Ming; Yu, Xinmin

2016-09-19

The present study aims to compare the 7th and the proposed 8th editions of the AJCC/UICC TNM staging system for NSCLC in a cohort of patients from a single institution. A total of 408 patients with NSCLC who underwent radical surgery were analyzed retrospectively. Survival was analyzed using the Kaplan-Meier method and compared using the log-rank test. Multivariate analysis was performed with the Cox proportional hazards model. The Akaike information criterion (AIC) and C-index were applied to compare the two prognostic systems with different numbers of stages. The 7th AJCC T categories, the proposed 8th AJCC T categories, N categories, visceral pleural invasion, and vessel invasion were found to have statistically significant associations with disease-free survival (DFS) on univariate analysis. In the 7th edition staging system as well as in the proposed 8th edition, T categories, N categories, and pleural invasion were independent factors for DFS on multivariate analysis. The AIC value was smaller for the 8th edition compared to the 7th edition staging system. The C-index value was larger for the 8th edition compared to the 7th edition staging system. Based on the data from our single center, the proposed 8th AJCC T classification seems to be superior to the 7th AJCC T classification in terms of DFS for patients with NSCLC who underwent radical surgery.
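The two comparison criteria named above pull in the same direction: a smaller AIC and a larger C-index both favor a staging system. The Python sketch below illustrates the standard definitions on toy inputs. The log-likelihoods, parameter counts, follow-up times, and stage codes are hypothetical, and the concordance routine is a plain pairwise version of Harrell's C that ignores tied event times; it does not reproduce the study's Cox models.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L). Smaller is better."""
    return 2 * n_params - 2 * log_likelihood


def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs ordered correctly by risk.

    A pair (i, j) is comparable when subject i had an observed event
    (events[i] is truthy) strictly before subject j's follow-up time.
    Equal risk scores count as half concordant.
    """
    concordant = 0.0
    comparable = 0
    for i in range(len(times)):
        for j in range(len(times)):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable


# Hypothetical follow-up (months), event flags (1 = relapse, 0 = censored),
# and stage codes from two staging systems (higher = worse prognosis).
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
finer_stages = [4, 3, 2, 1]
coarser_stages = [2, 2, 1, 1]

print(concordance_index(times, events, finer_stages))
print(concordance_index(times, events, coarser_stages))
print(aic(-98.0, 4), aic(-100.0, 3))
```

In this toy case the finer staging separates every comparable pair, while the coarser one leaves a tie, so its C-index is lower. The AIC adds a penalty of 2 per parameter, which is what allows systems with different numbers of stages to be compared.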
Profiling and classification of French propolis by combined multivariate data analysis of planar chromatograms and scanning direct analysis in real time mass spectra.

PubMed

Chasset, Thibaut; Häbe, Tim T; Ristivojevic, Petar; Morlock, Gertrud E

2016-09-23

Quality control of propolis is challenging, as it is a complex natural mixture of compounds, and thus, very difficult to analyze and standardize. Using 30 French propolis samples as an example, a strategy for improved quality control was demonstrated in which high-performance thin-layer chromatography (HPTLC) fingerprints were evaluated in combination with selected mass signals obtained by desorption-based scanning mass spectrometry (MS). The French propolis sample extracts were separated by a newly developed reversed phase (RP)-HPTLC method. The fingerprints obtained by two different detection modes, i.e. after (1) derivatization and fluorescence detection (FLD) at UV 366 nm and (2) scanning direct analysis in real time (DART)-MS, were analyzed by multivariate data analysis. Thus, RP-HPTLC-FLD and RP-HPTLC-DART-MS fingerprints were explored and the best classification was obtained using both methods in combination with pattern recognition techniques, such as principal component analysis. All investigated French propolis samples were divided into two types and characteristic patterns were observed. Phenolic compounds such as caffeic acid, p-coumaric acid, chrysin, pinobanksin, pinobanksin-3-acetate, galangin, kaempferol, tectochrysin and pinocembrin were identified as characteristic marker compounds of French propolis samples. This study expanded the research on the European poplar type of propolis and confirmed the presence of two botanically different types of propolis, known as the blue and orange types. Copyright © 2016 Elsevier B.V. All rights reserved.
A Novel approach to monitor chlorophyll-a concentration using an adaptive model from MODIS data at 250 metres spatial resolution

NASA Astrophysics Data System (ADS)

El Alem, A.; Chokmani, K.; Laurion, I.; El Adlouni, S.

2013-12-01

The occurrence and extent of harmful algal blooms (HABs) have increased in inland water bodies around the world. The appearance of these blooms reflects the advanced state of eutrophication of several aquatic systems caused by urban, agricultural, and industrial development. Algal blooms, especially those of cyanobacterial origin, can produce and release toxins, threatening human and animal health, the quality of drinking water, and recreational water bodies. Conventional monitoring networks, based on infrequent sampling at a few fixed monitoring stations, cannot provide the information needed because HABs are spatially and temporally heterogeneous. Remote sensing represents an interesting alternative to provide the required spatial and temporal coverage. The usefulness of airborne and satellite remote sensing data for detecting HABs was demonstrated three decades ago, and since then several empirical and semi-empirical models using satellite imagery have been developed to estimate chlorophyll-a concentration [Chl-a] as a proxy to detect bloom proliferations. However, most of those models presented several weaknesses that are generally linked to the range of [Chl-a] to be estimated. Indeed, models originally calibrated for high [Chl-a] fail to estimate low concentrations and vice versa. In this study, an adaptive model to estimate [Chl-a] over a wide range of concentrations is developed for optically complex inland water bodies, based on a combination of water spectral response classification and three semi-empirical algorithms developed using multivariate regression. Three distinct water types (low, medium, and high [Chl-a]) are first identified using the Classification and Regression Tree (CART) method performed on remote sensing reflectance over a dataset of 44 [Chl-a] samples collected from lakes across Quebec province.
Based on the water classification, a specific multivariate model for each water type is developed using the same dataset and the MODIS data at 250-m spatial resolution. With inland water bodies pre-clustered in this way, the cross-validation yielded determination coefficients of 0.99, 0.98, and 0.95 and relative RMSE values of 0.5%, 8%, and 17% for high, medium, and low [Chl-a], respectively. In addition, the adaptive model reached a global success rate of 92% on an independent set of semi-qualitative [Chl-a] samples collected over more than twenty inland water bodies in Quebec province in 2009 and 2010.
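The two-stage idea described above (classify the water type first, then apply a regression calibrated for that type) can be sketched in Python as follows. Everything numeric here is a hypothetical placeholder: the split points stand in for what a fitted CART would learn, the per-type coefficients stand in for the three semi-empirical algorithms, and relative RMSE is taken as RMSE divided by the mean observed value, which is one common definition and is an assumption about the one used in the study.

```python
import math


def water_type(index, low_cut=0.2, high_cut=0.6):
    """Assign a water type from one reflectance-derived index.

    The cut points are hypothetical; a fitted CART would supply them.
    """
    if index < low_cut:
        return "low"
    if index < high_cut:
        return "medium"
    return "high"


# Hypothetical per-type linear models: (intercept, band coefficients).
MODELS = {
    "low": (0.5, [2.0, -1.0]),
    "medium": (1.0, [10.0, -4.0]),
    "high": (5.0, [60.0, -20.0]),
}


def estimate_chla(index, bands):
    """Pick the model for the water type, then evaluate it on the bands."""
    intercept, coefs = MODELS[water_type(index)]
    return intercept + sum(c * b for c, b in zip(coefs, bands))


def relative_rmse(observed, predicted):
    """RMSE divided by the mean observed value (assumed definition)."""
    n = len(observed)
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
    return rmse / (sum(observed) / n)


print(water_type(0.7), estimate_chla(0.7, [0.5, 0.25]))
print(relative_rmse([10.0, 30.0], [12.0, 28.0]))
```

The design point is that each regression only has to be accurate within its own concentration range, which addresses the weakness noted above for models calibrated at one end of the range.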
Dietary characterization of terrestrial mammals.

PubMed

Pineda-Munoz, Silvia; Alroy, John

2014-08-22

Understanding the feeding behaviour of the species that make up any ecosystem is essential for designing further research. Mammals have been studied intensively, but the criteria used for classifying their diets are far from being standardized. We built a database summarizing the dietary preferences of terrestrial mammals using published data regarding their stomach contents. We performed multivariate analyses in order to set up a standardized classification scheme. Ideally, food consumption percentages should be used instead of qualitative classifications. However, when highly detailed information is not available we propose classifying animals based on their main feeding resources. They should be classified as generalists when none of the feeding resources constitute over 50% of the diet. The term 'omnivore' should be avoided because it does not communicate all the complexity inherent to food choice. Moreover, the so-called omnivore diets actually involve several distinctive adaptations. Our dataset shows that terrestrial mammals are generally highly specialized and that some degree of food mixing may even be required for most species.
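The proposed rule (classify by the main feeding resource, and call a species a generalist when no resource exceeds 50% of the diet) is simple enough to state as code. The Python sketch below assumes diets are given as percentages per resource; the function name and the resource labels are illustrative and do not come from the authors' database.

```python
def classify_diet(percentages, threshold=50.0):
    """Return the main feeding resource, or 'generalist'.

    A species is a generalist when no single resource makes up more
    than `threshold` percent of its diet.
    """
    resource, share = max(percentages.items(), key=lambda item: item[1])
    return resource if share > threshold else "generalist"


# Hypothetical stomach-content summaries (percent of diet).
print(classify_diet({"fruit": 70, "insects": 30}))
print(classify_diet({"fruit": 40, "seeds": 35, "insects": 25}))
```

Note that a resource at exactly 50% does not qualify under a strict reading of "over 50%", so such a species would be labeled a generalist here.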
[Value of the albumin to globulin ratio in predicting severity and prognosis in myasthenia gravis patients].

PubMed

Yang, D H; Su, Z Q; Chen, Y; Chen, Z B; Ding, Z N; Weng, Y Y; Li, J; Li, X; Tong, Q L; Han, Y X; Zhang, X

2016-03-08

To assess the predictive value of the albumin to globulin ratio (AGR) in evaluation of disease severity and prognosis in myasthenia gravis patients. A total of 135 myasthenia gravis (MG) patients were enrolled between February 2009 and March 2015. The AGR was measured on the first day of hospitalization and ranked from lowest to highest, and the patients were divided into three tertiles according to the AGR values: T1 (AGR <1.34), T2 (1.34≤AGR≤1.53) and T3 (AGR>1.53). The Kaplan-Meier curve was used to evaluate the prognostic value of AGR. Cox model analysis was used to evaluate the relevant factors. Multivariate logistic regression analysis was used to find the predictors of myasthenia crisis during hospitalization. The median length of hospital stay was 21 days (15-35.5) for T1, 18 days (14-27.5) for T2, and 16 days (12-22.5) for T3 (P<0.01), and Kaplan-Meier curves showed a significant difference among the three groups. In the univariate model, serum albumin, creatinine, AGR and MGFA clinical classification were related to prognosis of myasthenia gravis. In the multivariate Cox regression analysis, the AGR (P<0.001) and MGFA clinical classification (P<0.001) were independent predictive factors of disease severity and prognosis in myasthenia gravis patients. The hazard ratios (HR) were 4.655 (95% CI: 2.355-9.202) and 0.596 (95% CI: 0.492-0.723), respectively. Multivariate logistic regression analysis showed that the AGR (P<0.001) and MGFA clinical classification were related to myasthenia crisis. The AGR may represent a simple, potentially useful predictive biomarker for evaluating the disease severity and prognosis of patients with myasthenia gravis.
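The tertile assignment described above uses the cut points reported in the abstract (1.34 and 1.53). A minimal Python sketch of that grouping follows; the function name and the example AGR values are illustrative.

```python
def agr_tertile(agr):
    """Map an albumin to globulin ratio to the tertile labels used above.

    T1: AGR < 1.34, T2: 1.34 <= AGR <= 1.53, T3: AGR > 1.53.
    """
    if agr < 1.34:
        return "T1"
    if agr <= 1.53:
        return "T2"
    return "T3"


# Hypothetical admission values, one per tertile plus both boundaries.
for value in (1.20, 1.34, 1.53, 1.60):
    print(value, agr_tertile(value))
```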
Information spreading by a combination of MEG source estimation and multivariate pattern classification.

PubMed

Sato, Masashi; Yamashita, Okito; Sato, Masa-Aki; Miyawaki, Yoichi

2018-01-01

To understand information representation in human brain activity, it is important to investigate its fine spatial patterns at high temporal resolution. One possible approach is to use source estimation of magnetoencephalography (MEG) signals. Previous studies have mainly quantified accuracy of this technique according to positional deviations and dispersion of estimated sources, but it remains unclear how accurately MEG source estimation restores information content represented by spatial patterns of brain activity. In this study, using simulated MEG signals representing artificial experimental conditions, we performed MEG source estimation and multivariate pattern analysis to examine whether MEG source estimation can restore information content represented by patterns of cortical current in source brain areas. Classification analysis revealed that the corresponding artificial experimental conditions were predicted accurately from patterns of cortical current estimated in the source brain areas. However, accurate predictions were also possible from brain areas whose original sources were not defined. Searchlight decoding further revealed that this unexpected prediction was possible across wide brain areas beyond the original source locations, indicating that information contained in the original sources can spread through MEG source estimation. This phenomenon of "information spreading" may easily lead to false-positive interpretations when MEG source estimation and classification analysis are combined to identify brain areas that represent target information. Real MEG data analyses also showed that presented stimuli were able to be predicted in the higher visual cortex at the same latency as in the primary visual cortex, also suggesting that information spreading took place. These results indicate that careful inspection is necessary to avoid false-positive interpretations when MEG source estimation and multivariate pattern analysis are combined.
Information spreading by a combination of MEG source estimation and multivariate pattern classification

PubMed Central

Sato, Masashi; Yamashita, Okito; Sato, Masa-aki

2018-01-01

To understand information representation in human brain activity, it is important to investigate its fine spatial patterns at high temporal resolution. One possible approach is to use source estimation of magnetoencephalography (MEG) signals. Previous studies have mainly quantified accuracy of this technique according to positional deviations and dispersion of estimated sources, but it remains unclear how accurately MEG source estimation restores information content represented by spatial patterns of brain activity. In this study, using simulated MEG signals representing artificial experimental conditions, we performed MEG source estimation and multivariate pattern analysis to examine whether MEG source estimation can restore information content represented by patterns of cortical current in source brain areas. Classification analysis revealed that the corresponding artificial experimental conditions were predicted accurately from patterns of cortical current estimated in the source brain areas. However, accurate predictions were also possible from brain areas whose original sources were not defined. Searchlight decoding further revealed that this unexpected prediction was possible across wide brain areas beyond the original source locations, indicating that information contained in the original sources can spread through MEG source estimation. This phenomenon of “information spreading” may easily lead to false-positive interpretations when MEG source estimation and classification analysis are combined to identify brain areas that represent target information. Real MEG data analyses also showed that presented stimuli were able to be predicted in the higher visual cortex at the same latency as in the primary visual cortex, also suggesting that information spreading took place. These results indicate that careful inspection is necessary to avoid false-positive interpretations when MEG source estimation and multivariate pattern analysis are combined. PMID:29912968
Association between gastric cancer and the Kyoto classification of gastritis.

PubMed

Shichijo, Satoki; Hirata, Yoshihiro; Niikura, Ryota; Hayakawa, Yoku; Yamada, Atsuo; Koike, Kazuhiko

2017-09-01

Histological gastritis is associated with gastric cancer, but its diagnosis requires biopsy. Many classifications of endoscopic gastritis are available, but not all are useful for risk stratification of gastric cancer. The Kyoto Classification of Gastritis was proposed at the 85th Congress of the Japan Gastroenterological Endoscopy Society. This cross-sectional study evaluated the usefulness of the Kyoto Classification of Gastritis for risk stratification of gastric cancer. From August 2013 to September 2014, esophagogastroduodenoscopy was performed and the gastric findings evaluated according to the Kyoto Classification of Gastritis in a total of 4062 patients. The following five endoscopic findings were selected based on previous reports: atrophy, intestinal metaplasia, enlarged folds, nodularity, and diffuse redness. A total of 3392 patients (1746 [51%] men and 1646 [49%] women) were analyzed. Among them, 107 gastric cancers were diagnosed. Atrophy was found in 2585 (78%) and intestinal metaplasia in 924 (27%). Enlarged folds, nodularity, and diffuse redness were found in 197 (5.8%), 22 (0.6%), and 573 (17%), respectively. In univariate analyses, the severity of atrophy, intestinal metaplasia, diffuse redness, age, and male sex were associated with gastric cancer. In a multivariate analysis, atrophy and male sex were found to be independent risk factors. Younger age and severe atrophy were determined to be associated with diffuse-type gastric cancer. Endoscopic detection of atrophy was associated with the risk of gastric cancer. Thus, patients with severe atrophy should be examined carefully and may require intensive follow-up. © 2017 Journal of Gastroenterology and Hepatology Foundation and John Wiley & Sons Australia, Ltd.
A comparative evaluation of Raman and fluorescence spectroscopy for optical diagnosis of oral neoplasia

NASA Astrophysics Data System (ADS)

Majumder, S. K.; Krishna, H.; Sidramesh, M.; Chaturvedi, P.; Gupta, P. K.

2011-08-01

We report the results of a comparative evaluation of in vivo fluorescence and Raman spectroscopy for diagnosis of oral neoplasia. The study, carried out at Tata Memorial Hospital, Mumbai, involved 26 healthy volunteers and 138 patients being screened for neoplasm of the oral cavity. Spectral measurements were taken from multiple sites of abnormal as well as apparently uninvolved contra-lateral regions of the oral cavity in each patient. The different tissue sites investigated belonged to one of four histopathology categories: 1) squamous cell carcinoma (SCC), 2) oral sub-mucous fibrosis (OSMF), 3) leukoplakia (LP) and 4) normal squamous tissue. A probability based multivariate statistical algorithm utilizing nonlinear Maximum Representation and Discrimination Feature for feature extraction and Sparse Multinomial Logistic Regression for classification was developed for direct multi-class classification in a leave-one-patient-out cross validation mode. The results reveal that the performance of Raman spectroscopy is considerably superior to that of fluorescence in stratifying the oral tissues into respective histopathologic categories. The best classification accuracy was observed to be 90%, 93%, 94%, and 89% for SCC, OSMF, leukoplakia, and normal oral tissues, respectively, on the basis of leave-one-patient-out cross-validation, with an overall accuracy of 91%. However, when a binary classification was employed to distinguish spectra from all the SCC, OSMF and leukoplakic tissue sites together from normal, fluorescence and Raman spectroscopy were seen to have almost comparable performances, with Raman yielding a marginally better classification accuracy of 98.5% as compared to 94% for fluorescence.
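Because several spectra come from each patient, leave-one-patient-out cross-validation holds out all spectra of one patient at a time, so that no patient contributes to both training and testing in the same fold. The Python sketch below shows only the splitting logic; the function name and patient identifiers are hypothetical, and the feature extraction and classifier from the study are not reproduced.

```python
def leave_one_patient_out(patient_ids):
    """Yield (held_out_patient, train_indices, test_indices) per fold.

    Every spectrum belonging to the held-out patient goes to the test
    set; all remaining spectra form the training set.
    """
    for held_out in sorted(set(patient_ids)):
        test = [i for i, pid in enumerate(patient_ids) if pid == held_out]
        train = [i for i, pid in enumerate(patient_ids) if pid != held_out]
        yield held_out, train, test


# One entry per spectrum, labeled with a hypothetical patient identifier.
patient_ids = ["p1", "p1", "p2", "p3", "p3", "p3"]
folds = list(leave_one_patient_out(patient_ids))
for held_out, train, test in folds:
    print(held_out, "train:", train, "test:", test)
```

Splitting by spectrum instead of by patient would let spectra from the same person appear on both sides of a fold, which tends to inflate accuracy estimates.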

Fourier Transform Infrared (FT-IR) and Laser Ablation Inductively Coupled Plasma-Mass Spectrometry (LA-ICP-MS) Imaging of Cerebral Ischemia: Combined Analysis of Rat Brain Thin Cuts Toward Improved Tissue Classification.

PubMed

Balbekova, Anna; Lohninger, Hans; van Tilborg, Geralda A F; Dijkhuizen, Rick M; Bonta, Maximilian; Limbeck, Andreas; Lendl, Bernhard; Al-Saad, Khalid A; Ali, Mohamed; Celikic, Minja; Ofner, Johannes

2018-02-01

Microspectroscopic techniques are widely used to complement histological studies. Due to recent developments in the field of chemical imaging, combined chemical analysis has become attractive. This technique facilitates a deepened analysis compared to single techniques or side-by-side analysis. In this study, rat brains harvested one week after induction of photothrombotic stroke were investigated. Adjacent thin cuts from rats' brains were imaged using Fourier transform infrared (FT-IR) microspectroscopy and laser ablation inductively coupled plasma mass spectrometry (LA-ICP-MS). The LA-ICP-MS data were normalized using an internal standard (a thin gold layer). The acquired hyperspectral data cubes were fused and subjected to multivariate analysis. Brain regions affected by stroke as well as unaffected gray and white matter were identified and classified using a model based on either partial least squares discriminant analysis (PLS-DA) or random decision forest (RDF) algorithms. The RDF algorithm demonstrated the best results for classification. Improved classification was observed in the case of fused data in comparison to individual data sets (either FT-IR or LA-ICP-MS). Variable importance analysis demonstrated that both molecular and elemental content contribute to the improved RDF classification. Univariate spectral analysis identified biochemical properties of the assigned tissue types. Classification of multisensor hyperspectral data sets using an RDF algorithm allows access to a novel and in-depth understanding of biochemical processes and solid chemical allocation of different brain regions.
Natural Resources Inventory and Land Evaluation in Switzerland

NASA Technical Reports Server (NTRS)

Haefner, H. (Principal Investigator)

1975-01-01

The author has identified the following significant results. A system was developed to operationally map and measure the areal extent of various land use categories for updating existing and producing new and actual thematic maps showing the latest state of rural and urban landscapes and its changes. The processing system includes: (1) preprocessing steps for radiometric and geometric corrections; (2) classification of the data by a multivariate procedure, using a stepwise linear discriminant analysis based on carefully selected training cells; and (3) output in form of color maps by printing black and white theme overlays of a selected scale with photomation system and its coloring and combination into a color composite.
Classification of Physical Activity: Information to Artificial Pancreas Control Systems in Real Time.

PubMed

Turksoy, Kamuran; Paulino, Thiago Marques Luz; Zaharieva, Dessi P; Yavelberg, Loren; Jamnik, Veronica; Riddell, Michael C; Cinar, Ali

2015-10-06

Physical activity has a wide range of effects on glucose concentrations in type 1 diabetes (T1D) depending on the type (ie, aerobic, anaerobic, mixed) and duration of activity performed. This variability in glucose responses to physical activity makes the development of artificial pancreas (AP) systems challenging. Automatic detection of exercise type and intensity, and its classification as aerobic or anaerobic would provide valuable information to AP control algorithms. This can be achieved by using a multivariable AP approach where biometric variables are measured and reported to the AP at high frequency. We developed a classification system that identifies, in real time, the exercise intensity and its reliance on aerobic or anaerobic metabolism and tested this approach using clinical data collected from 5 persons with T1D and 3 individuals without T1D in a controlled laboratory setting using a variety of common types of physical activity. The classifier had an average sensitivity of 98.7% for physiological data collected over a range of exercise modalities and intensities in these subjects. The classifier will be added as a new module to the integrated multivariable adaptive AP system to enable the detection of aerobic and anaerobic exercise for enhancing the accuracy of insulin infusion strategies during and after exercise. © 2015 Diabetes Technology Society.
Classification of Physical Activity

PubMed Central

Turksoy, Kamuran; Paulino, Thiago Marques Luz; Zaharieva, Dessi P.; Yavelberg, Loren; Jamnik, Veronica; Riddell, Michael C.; Cinar, Ali

2015-01-01

Physical activity has a wide range of effects on glucose concentrations in type 1 diabetes (T1D) depending on the type (ie, aerobic, anaerobic, mixed) and duration of activity performed. This variability in glucose responses to physical activity makes the development of artificial pancreas (AP) systems challenging. Automatic detection of exercise type and intensity, and its classification as aerobic or anaerobic would provide valuable information to AP control algorithms. This can be achieved by using a multivariable AP approach where biometric variables are measured and reported to the AP at high frequency. We developed a classification system that identifies, in real time, the exercise intensity and its reliance on aerobic or anaerobic metabolism and tested this approach using clinical data collected from 5 persons with T1D and 3 individuals without T1D in a controlled laboratory setting using a variety of common types of physical activity. The classifier had an average sensitivity of 98.7% for physiological data collected over a range of exercise modalities and intensities in these subjects. The classifier will be added as a new module to the integrated multivariable adaptive AP system to enable the detection of aerobic and anaerobic exercise for enhancing the accuracy of insulin infusion strategies during and after exercise. PMID:26443291
FT-Raman and NIR spectroscopy data fusion strategy for multivariate qualitative analysis of food fraud.

PubMed

Márquez, Cristina; López, M Isabel; Ruisánchez, Itziar; Callao, M Pilar

2016-12-01

Two data fusion strategies (high- and mid-level) combined with a multivariate classification approach (Soft Independent Modelling of Class Analogy, SIMCA) have been applied to take advantage of the synergistic effect of the information obtained from two spectroscopic techniques: FT-Raman and NIR. Mid-level data fusion consists of merging some of the previously selected variables from the spectra obtained from each spectroscopic technique and then applying the classification technique. High-level data fusion combines the SIMCA classification results obtained individually from each spectroscopic technique. Of the possible ways to make the necessary combinations, we decided to use fuzzy aggregation connective operators. As a case study, we considered the possible adulteration of hazelnut paste with almond. Using the two-class SIMCA approach, class 1 consisted of unadulterated hazelnut samples and class 2 of samples adulterated with almond. Model performance was also studied with samples adulterated with chickpea. The results show that data fusion is an effective strategy since the performance parameters are better than the individual ones: sensitivity and specificity values between 75% and 100% for the individual techniques and of 96-100% and 88-100% for the mid- and high-level data fusion strategies, respectively. Copyright © 2016 Elsevier B.V. All rights reserved.
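As a rough illustration of high-level fusion, the sketch below combines two per-technique class-membership degrees with common fuzzy connectives. The membership values, the function name `fuse`, and the choice of operators (minimum, product, maximum, arithmetic mean) are assumptions made for illustration; the abstract does not state which operators were used.

```python
def fuse(mu_raman, mu_nir):
    """Combine two membership degrees in [0, 1] with several fuzzy connectives."""
    return {
        "min": min(mu_raman, mu_nir),       # conjunctive operator (t-norm)
        "product": mu_raman * mu_nir,       # conjunctive operator (t-norm)
        "max": max(mu_raman, mu_nir),       # disjunctive operator (t-conorm)
        "mean": (mu_raman + mu_nir) / 2.0,  # compromise (averaging) operator
    }

# Hypothetical degrees of membership of one sample in the "unadulterated" class.
fused = fuse(0.9, 0.6)
for name, value in fused.items():
    print(f"{name}: {value:.2f}")
```

Conjunctive operators accept a sample only when both techniques agree, whereas disjunctive operators accept it when either does, so the choice shifts the balance between sensitivity and specificity.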
Newer classification and regression tree techniques: Bagging and Random Forests for ecological prediction

Treesearch

Anantha M. Prasad; Louis R. Iverson; Andy Liaw

2006-01-01

We evaluated four statistical models - Regression Tree Analysis (RTA), Bagging Trees (BT), Random Forests (RF), and Multivariate Adaptive Regression Splines (MARS) - for predictive vegetation mapping under current and future climate scenarios according to the Canadian Climate Centre global circulation model.
Study of archaeological coins of different dynasties using libs coupled with multivariate analysis

NASA Astrophysics Data System (ADS)

Awasthi, Shikha; Kumar, Rohit; Rai, G. K.; Rai, A. K.

2016-04-01

Laser Induced Breakdown Spectroscopy (LIBS) is an atomic emission spectroscopic technique with a unique capability as an in-situ monitoring tool for detection and quantification of elements present in different artifacts. Archaeological coins collected from the G.R. Sharma Memorial Museum, University of Allahabad, India, have been analyzed using the LIBS technique. These coins were obtained from the excavation of Kausambi, Uttar Pradesh, India. A LIBS system assembled in the laboratory (laser Nd:YAG 532 nm, 4 ns pulse width FWHM with Ocean Optics LIBS 2000+ spectrometer) was employed for spectral acquisition. The spectral lines of Ag, Cu, Ca, Sn, Si, Fe and Mg are identified in the LIBS spectra of different coins. LIBS combined with multivariate analysis plays an effective role in classifying the coins and in assessing the contribution of spectral lines in different coins. The discrimination between five coins of archaeological interest has been carried out using Principal Component Analysis (PCA). The results show the potential relevancy of the methodology used in the elemental identification and classification of artifacts with high accuracy and robustness.
Craters on Earth, Moon, and Mars: Multivariate classification and mode of origin

USGS Publications Warehouse

Pike, R.J.

1974-01-01

Testing extraterrestrial craters and candidate terrestrial analogs for morphologic similitude is treated as a problem in numerical taxonomy. According to a principal-components solution and a cluster analysis, 402 representative craters on the Earth, the Moon, and Mars divide into two major classes of contrasting shapes and modes of origin. Craters of net accumulation of material (cratered lunar domes, Martian "calderas," and all terrestrial volcanoes except maars and tuff rings) group apart from craters of excavation (terrestrial meteorite impact and experimental explosion craters, typical Martian craters, and all other lunar craters). Maars and tuff rings belong to neither group but are transitional. The classification criteria are four independent attributes of topographic geometry derived from seven descriptive variables by the principal-components transformation. Morphometric differences between crater bowl and raised rim constitute the strongest of the four components. Although single topographic variables cannot confidently predict the genesis of individual extraterrestrial craters, multivariate statistical models constructed from several variables can distinguish consistently between large impact craters and volcanoes. © 1974.
Neuroendocrine tumors of colon and rectum: validation of clinical and prognostic values of the World Health Organization 2010 grading classifications and European Neuroendocrine Tumor Society staging systems.

PubMed

Shen, Chaoyong; Yin, Yuan; Chen, Huijiao; Tang, Sumin; Yin, Xiaonan; Zhou, Zongguang; Zhang, Bo; Chen, Zhixin

2017-03-28

This study evaluated and compared the clinical and prognostic values of the grading criteria used by the World Health Organization (WHO) and the European Neuroendocrine Tumors Society (ENETS). Moreover, this work assessed the current best prognostic model for colorectal neuroendocrine tumors (CRNETs). The 2010 WHO classifications and the ENETS systems can both stratify the patients into prognostic groups, although the 2010 WHO criteria are more applicable to CRNET patients. Along with tumor location, the 2010 WHO criteria are important independent prognostic parameters for CRNETs in both univariate and multivariate analyses through Cox regression (P<0.05). Data from 192 consecutive patients who were histopathologically diagnosed with CRNETs and had undergone surgical resection from January 2009 to May 2016 in a single center were retrospectively analyzed. Findings suggest that the WHO classifications are superior to the ENETS classification system in predicting the prognosis of CRNETs. Additionally, the WHO classifications can be widely used in clinical practice.
The neural basis of visual word form processing: a multivariate investigation.

PubMed

Nestor, Adrian; Behrmann, Marlene; Plaut, David C

2013-07-01

Current research on the neurobiological bases of reading points to the privileged role of a ventral cortical network in visual word processing. However, the properties of this network and, in particular, its selectivity for orthographic stimuli such as words and pseudowords remain topics of significant debate. Here, we approached this issue from a novel perspective by applying pattern-based analyses to functional magnetic resonance imaging data. Specifically, we examined whether, where and how, orthographic stimuli elicit distinct patterns of activation in the human cortex. First, at the category level, multivariate mapping found extensive sensitivity throughout the ventral cortex for words relative to false-font strings. Secondly, at the identity level, the multi-voxel pattern classification provided direct evidence that different pseudowords are encoded by distinct neural patterns. Thirdly, a comparison of pseudoword and face identification revealed that both stimulus types exploit common neural resources within the ventral cortical network. These results provide novel evidence regarding the involvement of the left ventral cortex in orthographic stimulus processing and shed light on its selectivity and discriminability profile. In particular, our findings support the existence of sublexical orthographic representations within the left ventral cortex while arguing for the continuity of reading with other visual recognition skills.
On the effect of experimental noise on the classification of biological samples using Raman micro-spectroscopy

NASA Astrophysics Data System (ADS)

Barton, Sinead J.; Kerr, Laura T.; Domijan, Katarina; Hennelly, Bryan M.

2016-04-01

Raman micro-spectroscopy is an optoelectronic technique that can be used to evaluate the chemical composition of biological samples and has been shown to be a powerful diagnostic tool for the investigation of various cancer related diseases including bladder, breast, and cervical cancer. Raman scattering is an inherently weak process with approximately 1 in 10^7 photons undergoing scattering and for this reason, noise from the recording system can have a significant impact on the quality of the signal, and its suitability for diagnostic classification. The main sources of noise in the recorded signal are shot noise, CCD dark current, and CCD readout noise. Shot noise results from the low signal photon count while dark current results from thermally generated electrons in the semiconductor pixels. Both of these noise sources are time dependent; readout noise is time independent but is inherent in each individual recording and results in the fundamental limit of measurement, arising from the internal electronics of the camera. In this paper, each of the aforementioned noise sources is analysed in isolation, and used to experimentally validate a mathematical model. This model is then used to simulate spectra that might be acquired under various experimental conditions including the use of different cameras, different source wavelengths, and different power levels. Simulated noisy datasets of T24 and RT112 cell line spectra are generated based on true cell Raman spectrum irradiance values (recorded using very long exposure times) and the addition of simulated noise. These datasets are then input to multivariate classification using Principal Components Analysis and Linear Discriminant Analysis. This method enables an investigation into the effect of noise on the sensitivity and specificity of Raman based classification under various experimental conditions and using different equipment.
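The interplay of the three noise sources can be summarised with the standard CCD signal-to-noise expression, given here as a sketch rather than as the validated model of the paper. The symbols are assumptions: $S$ is the signal photoelectron rate, $D$ the dark-current rate, $t$ the exposure time, and $\sigma_r$ the readout noise in electrons RMS.

```latex
\mathrm{SNR}(t) = \frac{S\,t}{\sqrt{S\,t + D\,t + \sigma_r^{2}}}
```

The shot-noise and dark-current variances grow with $t$, whereas $\sigma_r^{2}$ is fixed per recording, so readout noise dominates at short exposure times.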
Evaluation of the biomarker candidate MFAP4 for non-invasive assessment of hepatic fibrosis in hepatitis C patients.

PubMed

Bracht, Thilo; Mölleken, Christian; Ahrens, Maike; Poschmann, Gereon; Schlosser, Anders; Eisenacher, Martin; Stühler, Kai; Meyer, Helmut E; Schmiegel, Wolff H; Holmskov, Uffe; Sorensen, Grith L; Sitek, Barbara

2016-07-04

The human microfibrillar-associated protein 4 (MFAP4) is localized to extracellular matrix fibers and plays a role in disease-related tissue remodeling. Previously, we identified MFAP4 as a serum biomarker candidate for hepatic fibrosis and cirrhosis in hepatitis C patients. The aim of the present study was to elucidate the potential of MFAP4 as a biomarker for hepatic fibrosis with a focus on the differentiation of no to moderate (F0-F2) and severe fibrosis stages and cirrhosis (F3 and F4, Desmet-Scheuer scoring system). MFAP4 levels were measured using an AlphaLISA immunoassay in a retrospective study including n = 542 hepatitis C patients. We applied a univariate logistic regression model based on MFAP4 serum levels and furthermore derived a multivariate model including also age and gender. Youden-optimal cutoffs for binary classification were determined for both models without restrictions and considering a lower limit of 80 % sensitivity (correct classification of F3 and F4), respectively. To assess the generalization error, leave-one-out cross validation (LOOCV) was performed. MFAP4 levels were shown to differ between no to moderate fibrosis stages F0-F2 and severe stages (F3 and F4) with high statistical significance (t test on log scale, p value < 2.2·10^-16). In the LOOCV, the univariate classification resulted in 85.8 % sensitivity and 54.9 % specificity while the multivariate model yielded 81.3 % sensitivity and 61.5 % specificity (restricted approaches). We confirmed the applicability of MFAP4 as a novel serum biomarker for assessment of hepatic fibrosis and identification of high-risk patients with severe fibrosis stages in hepatitis C. The combination of MFAP4 with existing tests might lead to a more accurate non-invasive diagnosis of hepatic fibrosis and allow a cost-effective disease management in the era of new direct acting antivirals.
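The cutoff selection described above can be sketched as follows, assuming a score where higher values indicate the positive class. The function name `youden_cutoff`, the toy scores, and the labels are hypothetical; the sketch covers only the Youden criterion with an optional sensitivity floor, not the logistic regression or the LOOCV step.

```python
def youden_cutoff(scores, labels, min_sensitivity=0.0):
    """Return (cutoff, sensitivity, specificity) maximising J = sens + spec - 1.

    A sample is called positive when score >= cutoff. Cutoffs whose
    sensitivity falls below min_sensitivity are skipped.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
        sens, spec = tp / n_pos, tn / n_neg
        if sens < min_sensitivity:
            continue
        j = sens + spec - 1.0
        if best is None or j > best[0]:
            best = (j, cutoff, sens, spec)
    if best is None:
        raise ValueError("no cutoff meets the sensitivity constraint")
    return best[1:]

# Hypothetical marker levels; label 1 = severe fibrosis (F3/F4), 0 = F0-F2.
scores = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
labels = [0, 0, 1, 0, 0, 1, 1, 1]
unrestricted = youden_cutoff(scores, labels)
restricted = youden_cutoff(scores, labels, min_sensitivity=0.8)
print(unrestricted, restricted)
```

On this toy data the sensitivity floor moves the cutoff lower, trading specificity for sensitivity, which is the same direction of trade-off as in the restricted results reported in the abstract.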
Proposal of a new staging system for intrahepatic cholangiocarcinoma: Analysis of surgical patients from a nationwide survey of the Liver Cancer Study Group of Japan.

PubMed

Sakamoto, Yoshihiro; Kokudo, Norihiro; Matsuyama, Yutaka; Sakamoto, Michiie; Izumi, Namiki; Kadoya, Masumi; Kaneko, Shuichi; Ku, Yonson; Kudo, Masatoshi; Takayama, Tadatoshi; Nakashima, Osamu

2016-01-01

In the current American Joint Committee on Cancer/International Union Against Cancer staging system (seventh edition) for intrahepatic cholangiocarcinoma (ICC), tumor size was excluded, and periductal invasion was added as a new tumor classification-defining factor. The objective of the current report was to propose a new staging system for ICC that would be better for stratifying the survival of patients based on data from the nationwide Liver Cancer Study Group of Japan database. Of 756 patients who underwent surgical resection for ICC between 2000 and 2005, multivariate analyses of the clinicopathologic factors of 419 patients who had complete data sets were performed to elucidate relevant factors for inclusion in a new tumor classification and staging system. Overall survival data were best stratified using a cutoff value of 2 cm using a minimal P value approach to discriminate patient survival. The 5-year survival rate of 15 patients who had ICC measuring ≤ 2 cm in greatest dimension without lymph node metastasis or vascular invasion was 100%, and this cohort was defined as T1. Multivariate analysis of prognostic factors for 267 patients with lymph node-negative and metastasis-negative (N0M0) disease indicated that the number of tumors, the presence of arterial invasion, and the presence of major biliary invasion were independent and significant prognostic factors. The proposed new system, which included tumor number, tumor size, arterial invasion, and major biliary invasion for tumor classification, provided good stratification of overall patient survival according to disease stage. Macroscopic periductal invasion was associated with major biliary invasion and an inferior prognosis. The proposed new staging system, which includes a tumor cutoff size of 2 cm and major biliary invasion, may be useful for assigning patients to surgery. © 2015 The Authors. Cancer published by Wiley Periodicals, Inc. on behalf of American Cancer Society.
The Classification of Ground Roasted Decaffeinated Coffee Using UV-VIS Spectroscopy and SIMCA Method

NASA Astrophysics Data System (ADS)

Yulia, M.; Asnaning, A. R.; Suhandy, D.

2018-05-01

In this work, the classification of decaffeinated and non-decaffeinated coffee samples using UV-VIS spectroscopy and the SIMCA method was investigated. A total of 200 samples of ground roasted coffee were used (100 samples of decaffeinated coffee and 100 samples of non-decaffeinated coffee). After extraction and dilution, the spectra of the coffee sample solutions were acquired using a UV-VIS spectrometer (Genesys™ 10S UV-VIS, Thermo Scientific, USA) in the range of 190-1100 nm. The multivariate analyses of the spectra were performed using principal component analysis (PCA) and soft independent modeling of class analogy (SIMCA). The SIMCA model classified decaffeinated and non-decaffeinated coffee samples with 100% sensitivity and specificity.
Update on Automated Classification of Interplanetary Dust Particles

NASA Technical Reports Server (NTRS)

Maroger, I.; Lasue, J.; Zolensky, M.

2018-01-01

Every year, the Earth accretes about 40,000 tons of extraterrestrial material less than 1 mm in size on its surface. These dust particles originate from active comets, from impacts between asteroids and, for the very small particles, possibly from interstellar space. Since 1981, NASA Johnson Space Center (JSC) has been systematically collecting the dust from Earth's stratosphere with airborne collectors and gathering it into "Cosmic Dust Catalogs". In those catalogs, a preliminary analysis of the dust particles based on SEM images, some geological characteristics and X-ray energy-dispersive spectrometry (EDS) composition is compiled. Based on those properties, the IDPs are classified into four main groups: C (Cosmic), TCN (Natural Terrestrial Contaminant), TCA (Artificial Terrestrial Contaminant) and AOS (Aluminium Oxide Sphere). Nevertheless, 20% of those particles remain ambiguously classified. Lasue et al. presented a methodology to help automatically classify the particles published in catalog 15 based on their EDS spectra and nonlinear multivariate projections (as shown in Fig. 1). This work made it possible to relabel 155 of the 467 particles in catalog 15 and to reclassify some contaminants as potential cosmic dust. Further analyses of three such particles indicated their probable cosmic origin. The current work aims to bring complementary information to the automatic classification of IDPs to improve identification criteria.
Structure/activity relationships for biodegradability and their role in environmental assessment

DOE Office of Scientific and Technical Information (OSTI.GOV)

Boethling, R.S.

1994-12-31

Assessment of biodegradability is an important part of the review process for both new and existing chemicals under the Toxic Substances Control Act. It is often necessary to estimate biodegradability because experimental data are unavailable. Structure/biodegradability relationships (SBR) are a means to this end. Quantitative SBR have been developed, but this approach has not been very useful because they apply only to a few narrowly defined classes of chemicals. In response to the need for more widely applicable methods, multivariate analysis has been used to develop biodegradability classification models. For example, recent efforts have produced four new models. Two calculate the probability of rapid biodegradation and can be used for classification; the other two models allow semi-quantitative estimation of primary and ultimate biodegradation rates. All are based on multiple regressions against 36 preselected substructures plus molecular weight. Such efforts have been fairly successful by statistical criteria, but in general are hampered by a lack of large and consistent datasets. Knowledge-based expert systems may represent the next step in the evolution of SBR. In principle such systems need not be as severely limited by imperfect datasets. However, the codification of expert knowledge and reasoning is a critical prerequisite. Results of knowledge acquisition exercises and modeling based on them will also be described.
Characteristic fingerprinting based on macamides for discrimination of maca (Lepidium meyenii) by LC/MS/MS and multivariate statistical analysis.

PubMed

Pan, Yu; Zhang, Ji; Li, Hong; Wang, Yuan-Zhong; Li, Wan-Yi

2016-10-01

Macamides with a benzylalkylamide nucleus are characteristic and major bioactive compounds in the functional food maca (Lepidium meyenii Walp). The aim of this study was to explore variations in macamide content among maca from China and Peru. Twenty-seven batches of maca hypocotyls with different phenotypes, sampled from different geographical origins, were extracted and profiled by liquid chromatography with ultraviolet detection/tandem mass spectrometry (LC-UV/MS/MS). Twelve macamides were identified by MS operated in multiple scanning modes. Similarity analysis showed that maca samples differed significantly in their macamide fingerprinting. Partial least squares discriminant analysis (PLS-DA) was used to differentiate samples according to their geographical origin and to identify the most relevant variables in the classification model. The prediction accuracy for raw maca was 91% and five macamides were selected and considered as chemical markers for sample classification. When combined with a PLS-DA model, characteristic fingerprinting based on macamides could be recommended for labelling for the authentication of maca from different geographical origins. The results provided potential evidence for the relationships between environmental or other factors and distribution of macamides. © 2016 Society of Chemical Industry.
Classification of communication signals of the little brown bat

NASA Astrophysics Data System (ADS)

Melendez, Karla V.; Jones, Douglas L.; Feng, Albert S.

2005-09-01

Little brown bats, Myotis lucifugus, are known for their ability to echolocate and utilize their echolocation system to navigate, locate, and identify prey. Their echolocation signals have been characterized in detail, but their communication signals are poorly understood despite their widespread use during social interactions. The goal of this study was to characterize the communication signals of little brown bats. Sound recordings were made overnight on five individual bats (housed separately from a large group of captive bats) for 7 nights, using a Pettersson D240x ultrasound bat detector and a Nagra ARES-BB digital recorder. The spectral and temporal characteristics of recorded sounds were first analyzed using BATSOUND software from Pettersson. Sounds were first classified by visual observation of the calls' temporal pattern and spectral composition, and later using an automatic classification scheme based on multivariate statistical parameters in MATLAB. Human- and machine-based analysis revealed five discrete classes of bat communication signals: downward frequency-modulated calls, constant frequency calls, broadband noise bursts, broadband chirps, and broadband click trains. Future studies will focus on analysis of the calls' spectrotemporal modulations to discriminate any subclasses that may exist. [Research supported by Grant R01-DC-04998 from the National Institute for Deafness and Communication Disorders.]
Clinical presentation and outcome prediction of clinical, serological, and histopathological classification schemes in ANCA-associated vasculitis with renal involvement.

PubMed

Córdova-Sánchez, Bertha M; Mejía-Vilet, Juan M; Morales-Buenrostro, Luis E; Loyola-Rodríguez, Georgina; Uribe-Uribe, Norma O; Correa-Rotter, Ricardo

2016-07-01

Several classification schemes have been developed for anti-neutrophil cytoplasmic antibody (ANCA)-associated vasculitis (AAV), with current debate focusing on their clinical and prognostic performance. Sixty-two patients with renal biopsy-proven AAV from a single center in Mexico City diagnosed between 2004 and 2013 were analyzed and classified under clinical (granulomatosis with polyangiitis [GPA], microscopic polyangiitis [MPA], renal limited vasculitis [RLV]), serological (proteinase 3 anti-neutrophil cytoplasmic antibodies [PR3-ANCA], myeloperoxidase anti-neutrophil cytoplasmic antibodies [MPO-ANCA], ANCA negative), and histopathological (focal, crescentic, mixed-type, sclerosing) categories. Clinical presentation parameters were compared at baseline between classification groups, and the predictive value of different classification categories for disease and renal remission, relapse, renal, and patient survival was analyzed. Serological classification predicted relapse rate (PR3-ANCA hazard ratio for relapse 2.93, 1.20-7.17, p = 0.019). There were no differences in disease or renal remission, renal, or patient survival between clinical and serological categories. Histopathological classification predicted response to therapy, with a poorer renal remission rate for the sclerosing group and those with less than 25 % normal glomeruli; in addition, it adequately delimited the 24-month evolution of the estimated glomerular filtration rate (eGFR), but it did not predict renal or patient survival. On multivariate models, renal replacement therapy (RRT) requirement (HR 8.07, CI 1.75-37.4, p = 0.008) and proteinuria (HR 1.49, CI 1.03-2.14, p = 0.034) at presentation predicted renal survival, while age (HR 1.10, CI 1.01-1.21, p = 0.041) and infective events during the induction phase (HR 4.72, 1.01-22.1, p = 0.049) negatively influenced patient survival.
At present, ANCA-based serological classification may predict AAV relapses, but neither clinical nor serological categories predict renal or patient survival. Age, renal function and proteinuria at presentation, histopathology, and infectious complications constitute the main outcome predictors and should be considered for individualized management.
The classification of gunshot residue using laser electrospray mass spectrometry and offline multivariate statistical analysis

USDA-ARS's Scientific Manuscript database

Nonresonant laser vaporization combined with high-resolution electrospray time-of-flight mass spectrometry enables analysis of a casing after discharge of a firearm revealing organic signature molecules including methyl centralite (MC), diphenylamine (DPA), N-nitrosodiphenylamine (N-NO-DPA), 4-nitro...

Mapping the Diversity among Runaways: A Descriptive Multivariate Analysis of Selected Social Psychological Background Conditions.

ERIC Educational Resources Information Center

Brennan, Tim

1980-01-01

A review of prior classification systems of runaways is followed by a descriptive taxonomy of runaways developed using cluster-analytic methods. The empirical types illustrate patterns of weakness in bonds between runaways and families, schools, or peer relationships. (Author)
Fatty acid methyl ester analysis to identify sources of soil in surface water.

PubMed

Banowetz, Gary M; Whittaker, Gerald W; Dierksen, Karen P; Azevedo, Mark D; Kennedy, Ann C; Griffith, Stephen M; Steiner, Jeffrey J

2006-01-01

Efforts to improve land-use practices to prevent contamination of surface waters with soil are limited by an inability to identify the primary sources of soil present in these waters. We evaluated the utility of fatty acid methyl ester (FAME) profiles of dry reference soils for multivariate statistical classification of soils collected from surface waters adjacent to agricultural production fields and a wooded riparian zone. Trials that compared approaches to concentrate soil from surface water showed that aluminum sulfate precipitation provided comparable yields to that obtained by vacuum filtration and was more suitable for handling large numbers of samples. Fatty acid methyl ester profiles were developed from reference soils collected from contrasting land uses in different seasons to determine whether specific fatty acids would consistently serve as variables in multivariate statistical analyses to permit reliable classification of soils. We used a Bayesian method and an independent iterative process to select appropriate fatty acids and found that variable selection was strongly impacted by the season during which soil was collected. The apparent seasonal variation in the occurrence of marker fatty acids in FAME profiles from reference soils prevented preparation of a standardized set of variables. Nevertheless, accurate classification of soil in surface water was achieved utilizing fatty acid variables identified in seasonally matched reference soils. Correlation analysis of entire chromatograms and subsequent discriminant analyses utilizing a restricted number of fatty acid variables showed that FAME profiles of soils exposed to the aquatic environment still had utility for classification at least 1 wk after submersion.
Multiclass fMRI data decoding and visualization using supervised self-organizing maps.

PubMed

Hausfeld, Lars; Valente, Giancarlo; Formisano, Elia

2014-08-01

When multivariate pattern decoding is applied to fMRI studies entailing more than two experimental conditions, a most common approach is to transform the multiclass classification problem into a series of binary problems. Furthermore, for decoding analyses, classification accuracy is often the only outcome reported although the topology of activation patterns in the high-dimensional features space may provide additional insights into underlying brain representations. Here we propose to decode and visualize voxel patterns of fMRI datasets consisting of multiple conditions with a supervised variant of self-organizing maps (SSOMs). Using simulations and real fMRI data, we evaluated the performance of our SSOM-based approach. Specifically, the analysis of simulated fMRI data with varying signal-to-noise and contrast-to-noise ratio suggested that SSOMs perform better than a k-nearest-neighbor classifier for medium and large numbers of features (i.e. 250 to 1000 or more voxels) and similar to support vector machines (SVMs) for small and medium numbers of features (i.e. 100 to 600 voxels). However, for a larger number of features (>800 voxels), SSOMs performed worse than SVMs. When applied to a challenging 3-class fMRI classification problem with datasets collected to examine the neural representation of three human voices at individual speaker level, the SSOM-based algorithm was able to decode speaker identity from auditory cortical activation patterns. Classification performances were similar between SSOMs and other decoding algorithms; however, the ability to visualize decoding models and underlying data topology of SSOMs promotes a more comprehensive understanding of classification outcomes. We further illustrated this visualization ability of SSOMs with a re-analysis of a dataset examining the representation of visual categories in the ventral visual cortex (Haxby et al., 2001).
This analysis showed that SSOMs could retrieve and visualize topography and neighborhood relations of the brain representation of eight visual categories. We conclude that SSOMs are particularly suited for decoding datasets consisting of more than two classes and are optimally combined with approaches that reduce the number of voxels used for classification (e.g. region-of-interest or searchlight approaches). Copyright © 2014. Published by Elsevier Inc.
Big genomics and clinical data analytics strategies for precision cancer prognosis.

PubMed

Ow, Ghim Siong; Kuznetsov, Vladimir A

2016-11-07

The field of personalized and precise medicine in the era of big data analytics is growing rapidly. Previously, we proposed our model of patient classification termed Prognostic Signature Vector Matching (PSVM) and identified a 37 variable signature comprising 36 let-7b associated prognostic significant mRNAs and the age risk factor that stratified large high-grade serous ovarian cancer patient cohorts into three survival-significant risk groups. Here, we investigated the predictive performance of PSVM via optimization of the prognostic variable weights, which represent the relative importance of one prognostic variable over the others. In addition, we compared several multivariate prognostic models based on PSVM with classical machine learning techniques such as K-nearest-neighbor, support vector machine, random forest, neural networks and logistic regression. Our results revealed that negative log-rank p-values provide more robust weight values as opposed to the use of other quantities such as hazard ratios, fold change, or a combination of those factors. PSVM, together with the classical machine learning classifiers, was combined in an ensemble (multi-test) voting system, which collectively provides a more precise and reproducible patient stratification. The use of the multi-test system approach, rather than the search for the ideal classification/prediction method, might help to address limitations of the individual classification algorithms in specific situations.
Land cover and land use mapping of the iSimangaliso Wetland Park, South Africa: comparison of oblique and orthogonal random forest algorithms

NASA Astrophysics Data System (ADS)

Bassa, Zaakirah; Bob, Urmilla; Szantoi, Zoltan; Ismail, Riyad

2016-01-01

In recent years, the popularity of tree-based ensemble methods for land cover classification has increased significantly. Using WorldView-2 image data, we evaluate the potential of the oblique random forest algorithm (oRF) to classify a highly heterogeneous protected area. In contrast to the random forest (RF) algorithm, the oRF algorithm builds multivariate trees by learning the optimal split using a supervised model. The oRF binary algorithm is adapted to a multiclass land cover and land use application using both the "one-against-one" and "one-against-all" combination approaches. Results show that the oRF algorithms are capable of achieving high classification accuracies (>80%). However, there was no statistical difference in classification accuracies obtained by the oRF algorithms and the more popular RF algorithm. For all the algorithms, user accuracies (UAs) and producer accuracies (PAs) >80% were recorded for most of the classes. Both the RF and oRF algorithms poorly classified the indigenous forest class as indicated by the low UAs and PAs. Finally, the results from this study advocate and support the utility of the oRF algorithm for land cover and land use mapping of protected areas using WorldView-2 image data.
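The two combination approaches named above differ in how many binary models they need and in how the binary outputs are merged. The sketch below illustrates this under stated assumptions: the class names, the 1-D class centres, and the toy rule `nearest_centre` are hypothetical stand-ins for trained binary oRF models, and majority voting is one common way to merge one-against-one outputs, not necessarily the one used in the study.

```python
from collections import Counter
from itertools import combinations

def n_binary_models(n_classes):
    """Number of binary models each decomposition needs."""
    return {
        "one_against_one": n_classes * (n_classes - 1) // 2,
        "one_against_all": n_classes,
    }

def one_against_one_predict(x, classes, binary_predict):
    """Majority vote over all class pairs; ties go to the class listed first."""
    votes = Counter()
    for a, b in combinations(classes, 2):
        votes[binary_predict(x, a, b)] += 1
    return max(classes, key=lambda c: votes[c])

# Hypothetical 1-D class centres standing in for trained binary models.
centres = {"water": 0.0, "grassland": 5.0, "forest": 10.0}

def nearest_centre(x, a, b):
    """Toy binary rule: pick whichever of the two classes has the closer centre."""
    return a if abs(x - centres[a]) <= abs(x - centres[b]) else b

classes = list(centres)
print(n_binary_models(len(classes)))
print(one_against_one_predict(6.2, classes, nearest_centre))
```

The one-against-one count grows quadratically with the number of classes, while each of its models is trained on only two classes; one-against-all needs fewer models but each sees the full, usually imbalanced, training set.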
Robust diagnosis of non-Hodgkin lymphoma phenotypes validated on gene expression data from different laboratories.

PubMed

Bhanot, Gyan; Alexe, Gabriela; Levine, Arnold J; Stolovitzky, Gustavo

2005-01-01

A major challenge in cancer diagnosis from microarray data is the need for robust, accurate classification models which are independent of the analysis techniques used and can combine data from different laboratories. We propose such a classification scheme originally developed for phenotype identification from mass spectrometry data. The method uses a robust multivariate gene selection procedure and combines the results of several machine learning tools trained on raw and pattern data to produce an accurate meta-classifier. We illustrate and validate our method by applying it to gene expression datasets: the oligonucleotide HuGeneFL microarray dataset of Shipp et al. (www.genome.wi.mit.du/MPR/lymphoma) and the Hu95Av2 Affymetrix dataset (DallaFavera's laboratory, Columbia University). Our pattern-based meta-classification technique achieves higher predictive accuracies than each of the individual classifiers, is robust against data perturbations and provides subsets of related predictive genes. Our techniques predict that combinations of some genes in the p53 pathway are highly predictive of phenotype. In particular, we find that in 80% of DLBCL cases the mRNA level of at least one of the three genes p53, PLK1 and CDK2 is elevated, while in 80% of FL cases, the mRNA level of at most one of them is elevated.
Interannual rainfall variability and SOM-based circulation classification

NASA Astrophysics Data System (ADS)

Wolski, Piotr; Jack, Christopher; Tadross, Mark; van Aardenne, Lisa; Lennard, Christopher

2018-01-01

Self-Organizing Maps (SOM) based classifications of synoptic circulation patterns are increasingly being used to interpret large-scale drivers of local climate variability, and as part of statistical downscaling methodologies. These applications rely on a basic premise of synoptic climatology, i.e. that local weather is conditioned by the large-scale circulation. While it is clear that this relationship holds in principle, the implications of its implementation through SOM-based classification, particularly at interannual and longer time scales, are not well recognized. Here we use a SOM to understand the interannual synoptic drivers of climate variability at two locations in the winter and summer rainfall regimes of South Africa. We quantify the portion of variance in seasonal rainfall totals that is explained by year to year differences in the synoptic circulation, as schematized by a SOM. We furthermore test how different spatial domain sizes and synoptic variables affect the ability of the SOM to capture the dominant synoptic drivers of interannual rainfall variability. Additionally, we identify systematic synoptic forcing that is not captured by the SOM classification. The results indicate that the frequency of synoptic states, as schematized by a relatively disaggregated SOM (7 × 9) of prognostic atmospheric variables, including specific humidity, air temperature and geostrophic winds, captures only 20-45% of interannual local rainfall variability, and that the residual variance contains a strong systematic component. A multivariate linear regression framework demonstrates that this residual variance can largely be explained using synoptic variables over a particular location; although these variables are used in the development of the SOM, their influence diminishes with the size of the SOM spatial domain.
The influence of the SOM domain size, the choice of SOM atmospheric variables and grid-point explanatory variables on the levels of explained variance, is consistent with the general understanding of the dominant processes and atmospheric variables that affect rainfall variability at a particular location.
Classification of Spanish white wines using their electrophoretic profiles obtained by capillary zone electrophoresis with amperometric detection.

PubMed

Arribas, Alberto Sánchez; Martínez-Fernández, Marta; Moreno, Mónica; Bermejo, Esperanza; Zapardiel, Antonio; Chicharro, Manuel

2014-06-01

A method was developed for the simultaneous detection of eight polyphenols (t-resveratrol, (+)-catechin, quercetin and p-coumaric, caffeic, sinapic, ferulic, and gallic acids) by CZE with electrochemical detection. Separation of these polyphenols was achieved within 25 min using a 200 mM borate buffer (pH 9.4) containing 10% methanol as separation electrolyte. Amperometric detection of polyphenols was carried out with a glassy carbon electrode (GCE) modified with a multiwalled carbon nanotubes (CNT) layer obtained from a dispersion of CNT in polyethylenimine. The excellent electrochemical properties of this modified electrode allowed the detection and quantification of the selected polyphenols in white wines without any pretreatment step, showing remarkable signal stability despite the presence of potential fouling substances in wine. The electrophoretic profiles of white wines, obtained using this methodology, have proven to be useful for the classification of these wines by means of chemometric multivariate techniques. Principal component analysis and discriminant analysis allowed accurate classification of wine samples on the basis of their grape varietal (verdejo and airén) using the information contained in selected zones of the electropherogram. The utility of the proposed CZE methodology based on the electrochemical response of CNT-modified electrodes appears to be promising in the field of wine industry and it is expected to be successfully extended to classification of a wider range of wines made of other grape varietals. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Multivariate Feature Selection of Image Descriptors Data for Breast Cancer with Computer-Assisted Diagnosis

PubMed Central

Galván-Tejada, Carlos E.; Zanella-Calzada, Laura A.; Galván-Tejada, Jorge I.; Celaya-Padilla, José M.; Gamboa-Rosales, Hamurabi; Garza-Veloz, Idalia; Martinez-Fierro, Margarita L.

2017-01-01

Breast cancer is an important global health problem, and the most common type of cancer among women. Late diagnosis significantly decreases the survival rate of the patient; however, using mammography for early detection has been demonstrated to be a very important tool increasing the survival rate. The purpose of this paper is to obtain a multivariate model to classify benign and malignant tumor lesions using a computer-assisted diagnosis with a genetic algorithm in training and test datasets from mammography image features. A multivariate search was conducted to obtain predictive models with different approaches, in order to compare and validate results. The multivariate models were constructed using: Random Forest, Nearest centroid, and K-Nearest Neighbor (K-NN) strategies as cost function in a genetic algorithm applied to the features in the BCDR public databases. Results suggest that the two texture descriptor features obtained in the multivariate model have a similar or better prediction capability to classify the data outcome compared with the multivariate model composed of all the features, according to their fitness value. This model can help to reduce the workload of radiologists and present a second opinion in the classification of tumor lesions. PMID:28216571
Multivariate Feature Selection of Image Descriptors Data for Breast Cancer with Computer-Assisted Diagnosis.

PubMed

Galván-Tejada, Carlos E; Zanella-Calzada, Laura A; Galván-Tejada, Jorge I; Celaya-Padilla, José M; Gamboa-Rosales, Hamurabi; Garza-Veloz, Idalia; Martinez-Fierro, Margarita L

2017-02-14

Breast cancer is an important global health problem, and the most common type of cancer among women. Late diagnosis significantly decreases the survival rate of the patient; however, using mammography for early detection has been demonstrated to be a very important tool increasing the survival rate. The purpose of this paper is to obtain a multivariate model to classify benign and malignant tumor lesions using a computer-assisted diagnosis with a genetic algorithm in training and test datasets from mammography image features. A multivariate search was conducted to obtain predictive models with different approaches, in order to compare and validate results. The multivariate models were constructed using: Random Forest, Nearest centroid, and K-Nearest Neighbor (K-NN) strategies as cost function in a genetic algorithm applied to the features in the BCDR public databases. Results suggest that the two texture descriptor features obtained in the multivariate model have a similar or better prediction capability to classify the data outcome compared with the multivariate model composed of all the features, according to their fitness value. This model can help to reduce the workload of radiologists and present a second opinion in the classification of tumor lesions.
Artificial Neural Networks in Policy Research: A Current Assessment.

ERIC Educational Resources Information Center

Woelfel, Joseph

1993-01-01

Suggests that artificial neural networks (ANNs) exhibit properties that promise usefulness for policy researchers. Notes that ANNs have found extensive use in areas once reserved for multivariate statistical programs such as regression and multiple classification analysis and are developing an extensive community of advocates for processing text…
MMPI Modal Profiles in a Juvenile Delinquent Population.

ERIC Educational Resources Information Center

Pickett, Lawrence K., Jr.

1981-01-01

The MMPI results obtained from 245 adolescent males referred to the evaluation unit of a Juvenile Court were submitted to a multivariate classification system. By correlating individual subject profiles with the modal profiles, six membership groups were formed. No relationship was found between group membership and age or race. (Author)
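Assigning a subject to a membership group by correlating the individual profile with each modal profile can be sketched as below. The sketch assumes Pearson correlation and assignment to the modal profile with the highest coefficient; the profile values and group names are hypothetical, and the abstract does not state any cutoff the original system may have applied.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    sd_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    sd_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    return cov / (sd_a * sd_b)

def assign_group(profile, modal_profiles):
    """Return the name of the modal profile most correlated with `profile`."""
    return max(modal_profiles,
               key=lambda name: pearson(profile, modal_profiles[name]))

# Hypothetical scale scores for two modal profiles and one subject.
modal_profiles = {
    "A": [50, 60, 70, 55],
    "B": [70, 55, 50, 65],
}
group = assign_group([52, 63, 72, 58], modal_profiles)
```

Because correlation compares profile shape rather than elevation, a subject whose scores are uniformly higher than a modal profile can still match it closely.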
Impact of Resident Rotations on Critically Ill Patient Outcomes: Results of a French Multicenter Observational Study.

PubMed

Chousterman, Benjamin G; Pirracchio, Romain; Guidet, Bertrand; Aegerter, Philippe; Mentec, Hervé

2016-01-01

The impact of resident rotation on patient outcomes in the intensive care unit (ICU) has been poorly studied. The aim of this study was to address this question using a large ICU database. We retrospectively analyzed the French CUB-REA database. French residents rotate every six months. Two periods were compared: the first (POST) and fifth (PRE) months of the rotation. The primary endpoint was ICU mortality. The secondary endpoints were the length of ICU stay (LOS), the number of organ supports, and the duration of mechanical ventilation (DMV). The impact of resident rotation was explored using multivariate regression, classification tree and random forest models. 262,772 patients were included between 1996 and 2010 in the database. The patient characteristics were similar between the PRE (n = 44,431) and POST (n = 49,979) periods. Multivariate analysis did not reveal any impact of resident rotation on ICU mortality (OR = 1.01, 95% CI = 0.94; 1.07, p = 0.91). Based on the classification trees, the SAPS II and the number of organ failures were the strongest predictors of ICU mortality. In the less severe patients (SAPS II<24), the POST period was associated with increased mortality (OR = 1.65, 95%CI = 1.17-2.33, p = 0.004). After adjustment, no significant association was observed between the rotation period and the LOS, the number of organ supports, or the DMV. Resident rotation exerts no impact on overall ICU mortality at French teaching hospitals but might affect the prognosis of less severe ICU patients. Surveillance should be reinforced when treating those patients.
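For readers unfamiliar with how an odds ratio and its 95% confidence interval are obtained, the following is a minimal sketch of the unadjusted calculation from a 2 × 2 table using the standard log-odds (Wald) interval. The counts are hypothetical; the figures reported in the abstract come from multivariate models and cannot be reproduced this way.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: deaths in the exposed group      b: survivors in the exposed group
    c: deaths in the reference group    d: survivors in the reference group
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper

# Hypothetical counts, not taken from the CUB-REA database.
odds_ratio, lower, upper = odds_ratio_ci(20, 80, 10, 90)
```

An interval that contains 1 is compatible with no association at the chosen confidence level.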
Rapid determination of chemical composition and classification of bamboo fractions using visible-near infrared spectroscopy coupled with multivariate data analysis.

PubMed

Yang, Zhong; Li, Kang; Zhang, Maomao; Xin, Donglin; Zhang, Junhua

2016-01-01

During conversion of bamboo into biofuels and chemicals, it is necessary to efficiently predict the chemical composition and digestibility of biomass. However, traditional methods for determination of lignocellulosic biomass composition are expensive and time consuming. In this work, a novel and fast method for quantitative and qualitative analysis of chemical composition and enzymatic digestibilities of juvenile bamboo and mature bamboo fractions (bamboo green, bamboo timber, bamboo yellow, bamboo node, and bamboo branch) using visible-near infrared spectra was evaluated. The developed partial least squares models yielded coefficients of determination in calibration of 0.88, 0.94, and 0.96, for cellulose, xylan, and lignin of bamboo fractions in raw spectra, respectively. After the visible-near infrared spectra were pretreated, the corresponding coefficients of determination in calibration yielded by the developed partial least squares models were 0.994, 0.990, and 0.996, respectively. The score plots of principal component analysis of mature bamboo, juvenile bamboo, and different fractions of mature bamboo were clearly distinguished in raw spectra. Based on partial least squares discriminant analysis, the classification accuracies of mature bamboo, juvenile bamboo, and different fractions of bamboo (bamboo green, bamboo timber, bamboo yellow, and bamboo branch) all reached 100%. In addition, high accuracies of evaluation of the enzymatic digestibilities of bamboo fractions after pretreatment with aqueous ammonia were also observed. The results showed the potential of visible-near infrared spectroscopy in combination with multivariate analysis in efficiently analyzing the chemical composition and hydrolysabilities of lignocellulosic biomass, such as bamboo fractions.
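The coefficient of determination quoted for the calibration models can be computed from reference values and model predictions as sketched below. This assumes the common definition R² = 1 - SS_res / SS_tot; the abstract does not say which variant the authors used, and the sample values are hypothetical.

```python
def r_squared(observed, predicted):
    """Coefficient of determination, R^2 = 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot

# Hypothetical reference cellulose contents (%) and model predictions.
reference = [40.0, 42.5, 45.0, 47.5]
predicted = [40.5, 42.0, 45.5, 47.0]
r2 = r_squared(reference, predicted)
```

A value of 1 means the predictions match the reference values exactly, and a value of 0 means the model does no better than predicting the mean.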
Parallel Multivariate Spatio-Temporal Clustering of Large Ecological Datasets on Hybrid Supercomputers

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sreepathi, Sarat; Kumar, Jitendra; Mills, Richard T.

A proliferation of data from vast networks of remote sensing platforms (satellites, unmanned aircraft systems (UAS), airborne etc.), observational facilities (meteorological, eddy covariance etc.), state-of-the-art sensors, and simulation models offers unprecedented opportunities for scientific discovery. Unsupervised classification is a widely applied data mining approach to derive insights from such data. However, classification of very large data sets is a complex computational problem that requires efficient numerical algorithms and implementations on high performance computing (HPC) platforms. Additionally, increasing power, space, cooling and efficiency requirements have led to the deployment of hybrid supercomputing platforms with complex architectures and memory hierarchies like the Titan system at Oak Ridge National Laboratory. The advent of such accelerated computing architectures offers new challenges and opportunities for big data analytics in general and specifically, large scale cluster analysis in our case. Although there is an existing body of work on parallel cluster analysis, those approaches do not fully meet the needs imposed by the nature and size of our large data sets. Moreover, they had scaling limitations and were mostly limited to traditional distributed memory computing platforms. We present a parallel Multivariate Spatio-Temporal Clustering (MSTC) technique based on k-means cluster analysis that can target hybrid supercomputers like Titan. We developed a hybrid MPI, CUDA and OpenACC implementation that can utilize both CPU and GPU resources on computational nodes. We describe performance results on Titan that demonstrate the scalability and efficacy of our approach in processing large ecological data sets.
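The k-means cluster analysis underlying MSTC can be illustrated with a serial sketch of Lloyd's algorithm. This is a plain single-process version for clarity, not the hybrid MPI, CUDA and OpenACC implementation described above; the function name, the fixed iteration count and the sample points are assumptions.

```python
def kmeans(points, centroids, iterations=10):
    """Serial Lloyd's algorithm on tuples of floats.

    points:    list of equal-length tuples (one per observation)
    centroids: initial centroids, one tuple per cluster
    """
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c))
                     for c in centroids]
            clusters[dists.index(min(dists))].append(p)
        # Update step: move each centroid to the mean of its cluster;
        # an empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(dim) / len(cl) for dim in zip(*cl)) if cl else c
            for cl, c in zip(clusters, centroids)
        ]
    # `clusters` holds the assignment made in the final iteration.
    return centroids, clusters

# Hypothetical two-variable observations forming two obvious groups.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centroids, clusters = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
```

The assignment step is independent for every point, which is why it is the part that parallel implementations typically distribute across nodes and accelerators.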
A multivariate analytical method to characterize sediment attributes from high-frequency acoustic backscatter and ground-truthing data (Jade Bay, German North Sea coast)

NASA Astrophysics Data System (ADS)

Biondo, Manuela; Bartholomä, Alexander

2017-04-01

One of the burning issues on the topic of acoustic seabed classification is the lack of solid, repeatable, statistical procedures that can support the verification of acoustic variability in relation to seabed properties. Acoustic sediment classification schemes often lead to biased and subjective interpretation, as they ultimately aim at an oversimplified categorization of the seabed based on conventionally defined sediment types. However, grain size variability alone cannot account for acoustic diversity, which will be ultimately affected by multiple physical processes, scale of heterogeneity, instrument settings, data quality, image processing and segmentation performances. Understanding and assessing the weight of all of these factors on backscatter is a difficult task, due to the spatially limited and fragmentary knowledge of the seabed from direct observations (e.g. grab samples, cores, videos). In particular, large-scale mapping requires an enormous availability of ground-truthing data that is often obtained from heterogeneous and multidisciplinary sources, resulting in a further chance of misclassification. Independently of all of these limitations, acoustic segments still contain signals for seabed changes that, if appropriate procedures are established, can be translated into meaningful knowledge. In this study we design a simple, repeatable method, based on multivariate procedures, with the aim of classifying a 100 km², high-frequency (450 kHz) sidescan sonar mosaic acquired in the year 2012 in the shallow upper-mesotidal inlet of the Jade Bay (German North Sea coast). The tool used for the automated classification of the backscatter mosaic is the QTC SWATHVIEW™ software. The ground-truthing database included grab sample data from multiple sources (2009-2011). The method was designed to extrapolate quantitative descriptors for acoustic backscatter and model their spatial changes in relation to grain size distribution and morphology.
The modelled relationships were used to: 1) assess the automated segmentation performance, 2) obtain a ranking of most discriminant seabed attributes responsible for acoustic diversity, 3) select the best-fit ground-truthing information to characterize each acoustic class. Using a supervised Linear Discriminant Analysis (LDA), relationships between seabed parameters and acoustic classes discrimination were modelled, and acoustic classes for each data point were predicted. The model predicted a success rate of 63.5%. An unsupervised LDA was used to model relationships between acoustic variables and clustered seabed categories with the aim of identifying misrepresentative ground-truthing data points. The model prediction scored a success rate of 50.8%. Misclassified data points were disregarded for final classification. Analyses led to clearer, more accurate appreciation of relationship patterns and improved understanding of site-specific processes affecting the acoustic signal. Value to the qualitative classification output was added by comparing the latter with a more recent set of acoustic and ground-truthing information (2014). Classification resulted in the first acoustic sediment map ever produced in the area and offered valuable knowledge for detailed sediment variability. The method proved to be a simple, repeatable strategy that may be applied to similar work and environments.
Real-time Neuroimaging and Cognitive Monitoring Using Wearable Dry EEG

PubMed Central

Mullen, Tim R.; Kothe, Christian A.E.; Chi, Mike; Ojeda, Alejandro; Kerth, Trevor; Makeig, Scott; Jung, Tzyy-Ping; Cauwenberghs, Gert

2015-01-01

Goal We present and evaluate a wearable high-density dry electrode EEG system and an open-source software framework for online neuroimaging and state classification. Methods The system integrates a 64-channel dry EEG form-factor with wireless data streaming for online analysis. A real-time software framework is applied, including adaptive artifact rejection, cortical source localization, multivariate effective connectivity inference, data visualization, and cognitive state classification from connectivity features using a constrained logistic regression approach (ProxConn). We evaluate the system identification methods on simulated 64-channel EEG data. Then we evaluate system performance, using ProxConn and a benchmark ERP method, in classifying response errors in 9 subjects using the dry EEG system. Results Simulations yielded high accuracy (AUC=0.97±0.021) for real-time cortical connectivity estimation. Response error classification using cortical effective connectivity (sdDTF) was significantly above chance with similar performance (AUC) for cLORETA (0.74±0.09) and LCMV (0.72±0.08) source localization. Cortical ERP-based classification was equivalent to ProxConn for cLORETA (0.74±0.16) but significantly better for LCMV (0.82±0.12). Conclusion We demonstrated the feasibility for real-time cortical connectivity analysis and cognitive state classification from high-density wearable dry EEG. Significance This paper is the first validated application of these methods to 64-channel dry EEG. The work addresses a need for robust real-time measurement and interpretation of complex brain activity in the dynamic environment of the wearable setting. Such advances can have broad impact in research, medicine, and brain-computer interfaces. The pipelines are made freely available in the open-source SIFT and BCILAB toolboxes. PMID:26415149
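The AUC figures reported above summarize how well classifier scores separate two classes. A minimal sketch of the rank-based definition follows: the AUC equals the probability that a randomly chosen positive trial scores higher than a randomly chosen negative one, with ties counted as one half. The scores and labels are hypothetical, and this is not code from the SIFT or BCILAB toolboxes.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for the positive class (e.g. response error), 0 otherwise.
    """
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    # Count positive-negative pairs ranked correctly; ties count half.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical classifier outputs for four trials.
value = auc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0])
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two classes.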
An NRG Oncology/GOG study of molecular classification for risk prediction in endometrioid endometrial cancer

PubMed Central

Cosgrove, Casey M; Tritchler, David L; Cohn, David E; Mutch, David G; Rush, Craig M; Lankes, Heather A; Creasman, William T.; Miller, David S; Ramirez, Nilsa C; Geller, Melissa A; Powell, Matthew A; Backes, Floor J; Landrum, Lisa M; Timmers, Cynthia; Suarez, Adrian A; Zaino, Richard J; Pearl, Michael L; DiSilvestro, Paul A; Lele, Shashikant B; Goodfellow, Paul J

2017-01-01

Objectives The purpose of this study was to assess the prognostic significance of a simplified, clinically accessible classification system for endometrioid endometrial cancers combining Lynch syndrome screening and molecular risk stratification. Methods Tumors from NRG/GOG GOG210 were evaluated for mismatch repair defects (MSI, MMR IHC, and MLH1 methylation), POLE mutations, and loss of heterozygosity. TP53 was evaluated in a subset of cases. Tumors were assigned to four molecular classes. Relationships between molecular classes and clinicopathologic variables were assessed using contingency tests and Cox proportional methods. Results Molecular classification was successful for 982 tumors. Based on the NCI consensus MSI panel assessing MSI and loss of heterozygosity combined with POLE testing, 49% of tumors were classified copy number stable (CNS), 39% MMR deficient, 8% copy number altered (CNA) and 4% POLE mutant. Cancer-specific mortality occurred in 5% of patients with CNS tumors; 2.6% with POLE tumors; 7.6% with MMR deficient tumors and 19% with CNA tumors. The CNA group had worse progression-free (HR 2.31, 95%CI 1.53–3.49) and cancer-specific survival (HR 3.95; 95%CI 2.10–7.44). The POLE group had improved outcomes, but the differences were not statistically significant. CNA class remained significant for cancer-specific survival (HR 2.11; 95%CI 1.04–4.26) in multivariable analysis. The CNA molecular class was associated with TP53 mutation and expression status. Conclusions A simple molecular classification for endometrioid endometrial cancers that can be easily combined with Lynch syndrome screening provides important prognostic information. These findings support prospective clinical validation and further studies on the predictive value of a simplified molecular classification system. PMID:29132872
Mild Depressive Symptoms Among Americans in Relation to Physical Activity, Current Overweight/Obesity, and Self-Reported History of Overweight/Obesity.

PubMed

Dankel, Scott J; Loenneke, Jeremy P; Loprinzi, Paul D

2016-10-01

Overweight/obese individuals are at an increased risk for depression with some evidence of a bidirectional association. The preventative effects of physical activity among overweight/obese individuals have been well documented; however, less is known on how the duration of overweight/obesity alters the association with negative health outcomes. Therefore, the purpose of this investigation was to determine how the classification, and more specifically duration, of overweight/obesity alters the association between physical activity and depressive symptoms. The 2005-2006 National Health and Nutrition Examination Survey (NHANES) data were used (n = 764), and individuals were divided into six mutually exclusive groups based on physical activity status, weight classification (measured BMI), and duration of weight classification (assessed via recall). Multivariable linear and logistic regression analyses were computed to examine odds of depressive symptoms (patient health questionnaire (PHQ)-9) among groups. After adjusting for covariates, only individuals who were inactive and overweight/obese at the examination and 10 years prior were at an increased odds of depressive symptoms in comparison to those who were active and normal weight (odds ratio (OR) = 2.40; 95 % confidence interval (CI) 1.03, 5.61; p = 0.04). Physical activity appeared to ameliorate the association with depressive symptoms independent of overweight/obesity classification or duration. The cyclic nature of overweight/obesity and depression (i.e., bidirectional association) appears to increase the odds of depression as the length of overweight/obesity is increased. These results provide support for clinicians to assess not only their clients' current BMI but also the duration in which they have been at a certain weight classification and to further promote physical activity as a preventative measure against depressive symptoms.
Can the new RCP R0/R1 classification predict the clinical outcome in ductal adenocarcinoma of the pancreatic head?

PubMed

Janot, M S; Kersting, S; Belyaev, O; Matuschek, A; Chromik, A M; Suelberg, D; Uhl, W; Tannapfel, A; Bergmann, U

2012-08-01

According to the International Union Against Cancer (UICC), R1 is defined as the microscopic presence of tumor cells at the surface of the resection margin (RM). In contrast, the Royal College of Pathologists (RCP) suggested declaring R1 when tumor cells are found within 1 mm of the RM. The aim of this study was to determine the significance of the RM concerning the prognosis of pancreatic ductal adenocarcinoma (PDAC). From 2007 to 2009, 62 patients underwent a curative operation for PDAC of the pancreatic head. The relevance of R status on cumulative overall survival (OS) was assessed on univariate and multivariate analysis for both the classic R classification (UICC) and the suggestion of the RCP. Following the UICC criteria, a positive RM was detected in 8 %. Along with grading and lymph node ratio, R status revealed a significant impact on OS on univariate and multivariate analysis. Applying the suggestion of the RCP, R1 rate rose to 26 % resulting in no significant impact on OS in univariate analysis. Our study has shown that the RCP suggestion for R status has no impact on the prognosis of PDAC. In contrast, our data confirmed the UICC R classification of RM as well as N category, grading, and lymph node ratio as significant prognostic factors.

The Raman spectrum character of skin tumor induced by UVB

NASA Astrophysics Data System (ADS)

Wu, Shulian; Hu, Liangjun; Wang, Yunxia; Li, Yongzeng

2016-03-01

In our study, the skin canceration processes induced by UVB were analyzed from the perspective of tissue spectrum. A home-made Raman spectral system with a millimeter-order excitation laser spot size, combined with multivariate statistical analysis, was used to monitor the skin changes induced by UVB irradiation, and its discrimination performance was evaluated. Raman scattering signals of the SCC and normal skin were acquired. Spectral differences in Raman spectra were revealed. Linear discriminant analysis (LDA) based on principal component analysis (PCA) was employed to generate diagnostic algorithms for the classification of SCC and normal skin. The results indicated that Raman spectroscopy combined with PCA-LDA demonstrated good potential for improving the diagnosis of skin cancers.
Multi-label classification of chronically ill patients with bag of words and supervised dimensionality reduction algorithms.

PubMed

Bromuri, Stefano; Zufferey, Damien; Hennebert, Jean; Schumacher, Michael

2014-10-01

This research is motivated by the issue of classifying illnesses of chronically ill patients for decision support in clinical settings. Our main objective is to propose multi-label classification of multivariate time series contained in medical records of chronically ill patients, by means of quantization methods, such as bag of words (BoW), and multi-label classification algorithms. Our second objective is to compare supervised dimensionality reduction techniques to state-of-the-art multi-label classification algorithms. The hypothesis is that kernel methods and locality preserving projections make such algorithms good candidates to study multi-label medical time series. We combine BoW and supervised dimensionality reduction algorithms to perform multi-label classification on health records of chronically ill patients. The considered algorithms are compared with state-of-the-art multi-label classifiers in two real world datasets. Portavita dataset contains 525 diabetes type 2 (DT2) patients, with co-morbidities of DT2 such as hypertension, dyslipidemia, and microvascular or macrovascular issues. MIMIC II dataset contains 2635 patients affected by thyroid disease, diabetes mellitus, lipoid metabolism disease, fluid electrolyte disease, hypertensive disease, thrombosis, hypotension, chronic obstructive pulmonary disease (COPD), liver disease and kidney disease. The algorithms are evaluated using multi-label evaluation metrics such as hamming loss, one error, coverage, ranking loss, and average precision. Non-linear dimensionality reduction approaches behave well on medical time series quantized using the BoW algorithm, with results comparable to state-of-the-art multi-label classification algorithms. Chaining the projected features has a positive impact on the performance of the algorithm with respect to pure binary relevance approaches. 
The evaluation highlights the feasibility of representing medical health records using the BoW for multi-label classification tasks. The study also highlights that dimensionality reduction algorithms based on kernel methods, locality preserving projections or both are good candidates to deal with multi-label classification tasks in medical time series with many missing values and high label density. Copyright © 2014 Elsevier Inc. All rights reserved.
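Two of the multi-label evaluation metrics named above, Hamming loss and one error, can be sketched as follows under their usual definitions. Hamming loss is the fraction of label slots predicted wrongly, and one error is the fraction of patients whose top-ranked label is not among their true labels. The label names and predictions are hypothetical.

```python
def hamming_loss(true_sets, pred_sets, n_labels):
    """Fraction of (patient, label) slots where prediction and truth differ."""
    wrong = sum(len(t ^ p) for t, p in zip(true_sets, pred_sets))
    return wrong / (len(true_sets) * n_labels)

def one_error(true_sets, ranked_labels):
    """Fraction of patients whose top-ranked label is not a true label."""
    misses = sum(1 for t, ranking in zip(true_sets, ranked_labels)
                 if ranking[0] not in t)
    return misses / len(true_sets)

# Hypothetical co-morbidity labels for two patients (3 possible labels).
true_sets = [{"hypertension", "dyslipidemia"}, {"hypertension"}]
pred_sets = [{"hypertension"}, {"hypertension", "dyslipidemia"}]
rankings = [
    ["hypertension", "dyslipidemia", "microvascular"],
    ["dyslipidemia", "hypertension", "microvascular"],
]
h_loss = hamming_loss(true_sets, pred_sets, n_labels=3)
o_err = one_error(true_sets, rankings)
```

Both metrics are losses, so lower values indicate better multi-label predictions.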
Temporal abstraction for the analysis of intensive care information

NASA Astrophysics Data System (ADS)

Hadad, Alejandro J.; Evin, Diego A.; Drozdowicz, Bartolomé; Chiotti, Omar

2007-11-01

This paper proposes a scheme for the analysis of time-stamped series data from multiple monitoring devices of intensive care units, using Temporal Abstraction concepts. This scheme is oriented to obtain a description of the patient state evolution in an unsupervised way. The case study is based on a dataset clinically classified with Pulmonary Edema. For this dataset a trends-based Temporal Abstraction mechanism is proposed, by means of a Behaviours Base of time-stamped series, and then used in a classification step. Combining this approach with the introduction of expert knowledge, using Fuzzy Logic, and multivariate analysis by means of Self-Organizing Maps, a states characterization model is obtained. This model can feasibly be extended to different patient groups and states. The proposed scheme yields descriptions of the intermediate states through which the patient passes, which could be used to anticipate alert situations.
Single-particle cryo-EM using alignment by classification (ABC): the structure of Lumbricus terrestris haemoglobin

PubMed Central

Seer-Linnemayr, Charlotte; Ravelli, Raimond B. G.; Matadeen, Rishi; De Carlo, Sacha; Alewijnse, Bart; Portugal, Rodrigo V.; Pannu, Navraj S.; Schatz, Michael; van Heel, Marin

2017-01-01

Single-particle cryogenic electron microscopy (cryo-EM) can now yield near-atomic resolution structures of biological complexes. However, the reference-based alignment algorithms commonly used in cryo-EM suffer from reference bias, limiting their applicability (also known as the ‘Einstein from random noise’ problem). Low-dose cryo-EM therefore requires robust and objective approaches to reveal the structural information contained in the extremely noisy data, especially when dealing with small structures. A reference-free pipeline is presented for obtaining near-atomic resolution three-dimensional reconstructions from heterogeneous (‘four-dimensional’) cryo-EM data sets. The methodologies integrated in this pipeline include a posteriori camera correction, movie-based full-data-set contrast transfer function determination, movie-alignment algorithms, (Fourier-space) multivariate statistical data compression and unsupervised classification, ‘random-startup’ three-dimensional reconstructions, four-dimensional structural refinements and Fourier shell correlation criteria for evaluating anisotropic resolution. The procedures exclusively use information emerging from the data set itself, without external ‘starting models’. Euler-angle assignments are performed by angular reconstitution rather than by the inherently slower projection-matching approaches. The comprehensive ‘ABC-4D’ pipeline is based on the two-dimensional reference-free ‘alignment by classification’ (ABC) approach, where similar images in similar orientations are grouped by unsupervised classification. Some fundamental differences between X-ray crystallography versus single-particle cryo-EM data collection and data processing are discussed. The structure of the giant haemoglobin from Lumbricus terrestris at a global resolution of ∼3.8 Å is presented as an example of the use of the ABC-4D procedure. PMID:28989723
Urogenital tuberculosis: definition and classification.

PubMed

Kulchavenya, Ekaterina

2014-10-01

To improve the approach to the diagnosis and management of urogenital tuberculosis (UGTB), we need a clear and unique classification. UGTB remains an important problem, especially in developing countries, but it is often an overlooked disease. As with any other infection, UGTB should be cured by antibacterial therapy, but because of late diagnosis it may often require surgery. Scientific literature dedicated to this problem was critically analyzed and juxtaposed with the author's own more than 30 years' experience in tuberculosis urology. The concepts, terms and definitions were consolidated into one system; a stage-by-stage classification as well as complications are presented. Classification of any disease includes division into forms and stages and exact definitions for each stage. Clinical features and symptoms vary significantly between different forms and stages of UGTB. A simple diagnostic algorithm was constructed. UGTB is a multivariant disease and a standard unified approach to it is impossible. A clear definition as well as a unique classification are necessary for a realistic estimation of epidemiology and the optimization of therapy. The term 'UGTB' alone carries insufficient information to estimate therapy, surgery and prognosis, or to evaluate the epidemiology.
Use of the Analysis of the Volatile Faecal Metabolome in Screening for Colorectal Cancer

PubMed Central

2015-01-01

Diagnosis of colorectal cancer requires colonoscopy, an invasive and expensive procedure that is usually carried out after a positive screening test. Unfortunately, existing screening tests lack specificity and sensitivity, hence many unnecessary colonoscopies are performed. Here we report on a potential new screening test for colorectal cancer based on the analysis of volatile organic compounds (VOCs) in the headspace of faecal samples. Faecal samples were obtained from subjects who had a positive faecal occult blood test (FOBT). Subjects subsequently had colonoscopies performed to classify them into low risk (non-cancer) and high risk (colorectal cancer) groups. Volatile organic compounds were analysed by selected ion flow tube mass spectrometry (SIFT-MS) and then data were analysed using both univariate and multivariate statistical methods. Ions most likely from hydrogen sulphide, dimethyl sulphide and dimethyl disulphide are statistically significantly higher in samples from high risk rather than low risk subjects. Results using multivariate methods show that the test gives a correct classification of 75% with 78% specificity and 72% sensitivity on FOBT positive samples, offering a potentially effective alternative to FOBT. PMID:26086914
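How correct classification, specificity and sensitivity relate to one another can be shown with a short sketch. The confusion-matrix counts below are purely illustrative assumptions, chosen only so that the arithmetic lands on the percentages quoted in the abstract; the study's actual group sizes are not given here.

```python
def screening_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp/fn: high-risk subjects flagged / missed by the test
    tn/fp: low-risk subjects cleared / wrongly flagged by the test
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts: 50 high-risk and 50 low-risk subjects (NOT the study's data).
sens, spec, acc = screening_metrics(tp=36, fn=14, tn=39, fp=11)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} accuracy={acc:.2f}")
```

With equally sized groups the overall accuracy is simply the mean of sensitivity and specificity; with unbalanced groups it is pulled towards the metric of the larger group.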
Buried landmine detection using multivariate normal clustering

NASA Astrophysics Data System (ADS)

Duston, Brian M.

2001-10-01

A Bayesian classification algorithm is presented for discriminating buried land mines from buried and surface clutter in Ground Penetrating Radar (GPR) signals. This algorithm is based on multivariate normal (MVN) clustering, where feature vectors are used to identify populations (clusters) of mines and clutter objects. The features are extracted from two-dimensional images created from ground penetrating radar scans. MVN clustering is used to determine the number of clusters in the data and to create probability density models for target and clutter populations, producing the MVN clustering classifier (MVNCC). The Bayesian Information Criteria (BIC) is used to evaluate each model to determine the number of clusters in the data. An extension of the MVNCC allows the model to adapt to local clutter distributions by treating each of the MVN cluster components as a Poisson process and adaptively estimating the intensity parameters. The algorithm is developed using data collected by the Mine Hunter/Killer Close-In Detector (MH/K CID) at prepared mine lanes. The Mine Hunter/Killer is a prototype mine detecting and neutralizing vehicle developed for the U.S. Army to clear roads of anti-tank mines.
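The use of BIC to choose the number of clusters can be sketched as follows. This is a generic illustration of the criterion, not the MVNCC implementation: the log-likelihood values are made up, and a mixture of full-covariance multivariate normals is assumed when counting free parameters.

```python
import math


def mvn_mixture_param_count(k, d):
    """Free parameters of a k-component, d-dimensional full-covariance normal mixture."""
    means = k * d
    covariances = k * d * (d + 1) // 2
    weights = k - 1  # mixing weights sum to one
    return means + covariances + weights


def bic(log_likelihood, n_params, n_samples):
    """Bayesian Information Criterion; lower is better."""
    return n_params * math.log(n_samples) - 2.0 * log_likelihood


def select_n_clusters(log_likelihoods, d, n_samples):
    """Pick the cluster count with the smallest BIC.

    log_likelihoods maps a candidate cluster count k to the maximised
    log-likelihood of the fitted k-cluster model.
    """
    scores = {k: bic(ll, mvn_mixture_param_count(k, d), n_samples)
              for k, ll in log_likelihoods.items()}
    return min(scores, key=scores.get), scores


# Hypothetical fits of 1, 2 and 3 clusters to 200 two-dimensional feature vectors.
best_k, scores = select_n_clusters({1: -900.0, 2: -820.0, 3: -815.0}, d=2, n_samples=200)
print(best_k, scores)
```

In this toy case the third cluster improves the likelihood only slightly, so the penalty on its extra parameters makes the two-cluster model the preferred one.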
Evaluating effects of methylphenidate on brain activity in cocaine addiction: a machine-learning approach

NASA Astrophysics Data System (ADS)

Rish, Irina; Bashivan, Pouya; Cecchi, Guillermo A.; Goldstein, Rita Z.

2016-03-01

The objective of this study is to investigate effects of methylphenidate on brain activity in individuals with cocaine use disorder (CUD) using functional MRI (fMRI). Methylphenidate hydrochloride (MPH) is an indirect dopamine agonist commonly used for treating attention deficit/hyperactivity disorders; it was also shown to have some positive effects on CUD subjects, such as improved stop signal reaction times associated with better control/inhibition [1], as well as normalized task-related brain activity [2] and resting-state functional connectivity in specific areas [3]. While prior fMRI studies of MPH in CUDs have focused on mass-univariate statistical hypothesis testing, this paper evaluates multivariate, whole-brain effects of MPH as captured by the generalization (prediction) accuracy of different classification techniques applied to features extracted from resting-state functional networks (e.g., node degrees). Our multivariate predictive results based on resting-state data from [3] suggest that MPH tends to normalize network properties such as voxel degrees in CUD subjects, thus providing additional evidence for potential benefits of MPH in treating cocaine addiction.
Analysis of Big Data in Gait Biomechanics: Current Trends and Future Directions.

PubMed

Phinyomark, Angkoon; Petri, Giovanni; Ibáñez-Marcelo, Esther; Osis, Sean T; Ferber, Reed

2018-01-01

The increasing amount of data in biomechanics research has greatly increased the importance of developing advanced multivariate analysis and machine learning techniques, which are better able to handle "big data". Consequently, advances in data science methods will expand the knowledge for testing new hypotheses about biomechanical risk factors associated with walking and running gait-related musculoskeletal injury. This paper begins with a brief introduction to an automated three-dimensional (3D) biomechanical gait data collection system: 3D GAIT, followed by how the studies in the field of gait biomechanics fit the quantities in the 5 V's definition of big data: volume, velocity, variety, veracity, and value. Next, we provide a review of recent research and development in multivariate and machine learning methods-based gait analysis that can be applied to big data analytics. These modern biomechanical gait analysis methods include several main modules such as initial input features, dimensionality reduction (feature selection and extraction), and learning algorithms (classification and clustering). Finally, a promising big data exploration tool called "topological data analysis" and directions for future research are outlined and discussed.
Metabolome based volatiles profiling in 13 date palm fruit varieties from Egypt via SPME GC-MS and chemometrics.

PubMed

Khalil, Mohammed N A; Fekry, Mostafa I; Farag, Mohamed A

2017-02-15

Dates (Phoenix dactylifera L.) are distributed worldwide as major food complement providing a source of sugars and dietary fiber as well as macro- and micronutrients. Although phytochemical analyses of date fruit non-volatile metabolites have been reported, much less is known about the aroma given off by the fruit, which is critical for dissecting sensory properties and quality traits. Volatile constituents from 13 date varieties grown in Egypt were profiled using SPME-GCMS coupled to multivariate data analysis to explore date fruit aroma composition and investigate potential future uses by food industry. A total of 89 volatiles were identified where lipid-derived volatiles and phenylpropanoid derivatives were the major components of date fruit aroma. Multivariate data analyses revealed that 2,3-butanediol, hexanal, hexanol and cinnamaldehyde contributed the most to classification of different varieties. This study provides the most complete map of volatiles in Egyptian date fruit, with Siwi and Sheshi varieties exhibiting the most distinct aroma among studied date varieties. Copyright © 2016 Elsevier Ltd. All rights reserved.
Identifying ADHD children using hemodynamic responses during a working memory task measured by functional near-infrared spectroscopy.

PubMed

Gu, Yue; Miao, Shuo; Han, Junxia; Liang, Zhenhu; Ouyang, Gaoxiang; Yang, Jian; Li, Xiaoli

2018-06-01

Attention-deficit/hyperactivity disorder (ADHD) is a neurodevelopmental disorder affecting children and adults. Previous studies found that functional near-infrared spectroscopy (fNIRS) can reveal significant group differences in several brain regions between ADHD children and healthy controls during working memory tasks. This study aimed to use fNIRS activation patterns to identify ADHD children from healthy controls. FNIRS signals from 25 ADHD children and 25 healthy controls performing the n-back task were recorded; then, multivariate pattern analysis was used to discriminate ADHD individuals from healthy controls, and classification performance was evaluated for significance by the permutation test. The results showed that 86.0% ([Formula: see text]) of participants can be correctly classified in leave-one-out cross-validation. The most discriminative brain regions included the bilateral dorsolateral prefrontal cortex, inferior medial prefrontal cortex, right posterior prefrontal cortex, and right temporal cortex. This study demonstrated that, in a small sample, multivariate pattern analysis can effectively identify ADHD children from healthy controls based on fNIRS signals, which argues for the potential utility of fNIRS in future assessments.
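Leave-one-out cross-validation combined with a permutation test, as described in this abstract, can be sketched in a few lines. The nearest-centroid classifier is a stand-in assumption (the abstract does not name the classifier used), and the two-feature toy data are invented.

```python
import random


def nearest_centroid_predict(train_x, train_y, x):
    """Assign x to the class whose mean training pattern is closest."""
    centroids = {}
    for label in sorted(set(train_y)):
        rows = [r for r, y in zip(train_x, train_y) if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def sqdist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    return min(centroids, key=lambda label: sqdist(centroids[label], x))


def loo_accuracy(xs, ys):
    """Hold out each sample once, train on the rest, and score the held-out sample."""
    hits = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        hits += nearest_centroid_predict(train_x, train_y, xs[i]) == ys[i]
    return hits / len(xs)


def permutation_p_value(xs, ys, n_perm=200, seed=0):
    """Share of label shufflings that reach at least the observed accuracy."""
    rng = random.Random(seed)
    observed = loo_accuracy(xs, ys)
    as_good = 0
    for _ in range(n_perm):
        shuffled = ys[:]
        rng.shuffle(shuffled)
        as_good += loo_accuracy(xs, shuffled) >= observed
    # +1 in numerator and denominator counts the observed labelling itself.
    return observed, (as_good + 1) / (n_perm + 1)


# Hypothetical activation features: 4 controls (0) and 4 patients (1).
xs = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2],
      [1.0, 1.1], [1.2, 0.9], [0.9, 1.3], [1.1, 1.0]]
ys = [0, 0, 0, 0, 1, 1, 1, 1]

accuracy, p_value = permutation_p_value(xs, ys)
print(accuracy, p_value)
```

The permutation test asks how often a classifier trained on randomly relabelled participants does as well as the real one, which guards against over-interpreting accuracy in a small sample.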
Classification of broiler breast fillets according to storage and to freeze-thaw treatment using near infrared spectroscopy and multivariate analysis

USDA-ARS's Scientific Manuscript database

Visible/near-infrared (NIR) spectroscopy has shown potential for successfully classifying broiler breast fillets according to their texture properties. Freshness and shelf life are also important quality characteristics of boneless skinless chicken breast products in the marketplace. This study deal...
Interpretable Early Classification of Multivariate Time Series

ERIC Educational Resources Information Center

Ghalwash, Mohamed F.

2013-01-01

Recent advances in technology have led to an explosion in data collection over time rather than in a single snapshot. For example, microarray technology allows us to measure gene expression levels in different conditions over time. Such temporal data grants the opportunity for data miners to develop algorithms to address domain-related problems,…
Molecular Classification of Grade 3 Endometrioid Endometrial Cancers Identifies Distinct Prognostic Subgroups.

PubMed

Bosse, Tjalling; Nout, Remi A; McAlpine, Jessica N; McConechy, Melissa K; Britton, Heidi; Hussein, Yaser R; Gonzalez, Carlene; Ganesan, Raji; Steele, Jane C; Harrison, Beth T; Oliva, Esther; Vidal, August; Matias-Guiu, Xavier; Abu-Rustum, Nadeem R; Levine, Douglas A; Gilks, C Blake; Soslow, Robert A

2018-05-01

Our aim was to investigate whether molecular classification can be used to refine prognosis in grade 3 endometrial endometrioid carcinomas (EECs). Grade 3 EECs were classified into 4 subgroups: p53 abnormal, based on mutant-like immunostaining (p53abn); MMR deficient, based on loss of mismatch repair protein expression (MMRd); presence of POLE exonuclease domain hotspot mutation (POLE); no specific molecular profile (NSMP), in which none of these aberrations were present. Overall survival (OS) and recurrence-free survival (RFS) rates were compared using the Kaplan-Meier method (Log-rank test) and univariable and multivariable Cox proportional hazard models. In total, 381 patients were included. The median age was 66 years (range, 33 to 96 y). Federation Internationale de Gynecologie et d'Obstetrique stages (2009) were as follows: IA, 171 (44.9%); IB, 120 (31.5%); II, 24 (6.3%); III, 50 (13.1%); IV, 11 (2.9%). There were 49 (12.9%) POLE, 79 (20.7%) p53abn, 115 (30.2%) NSMP, and 138 (36.2%) MMRd tumors. Median follow-up of patients was 6.1 years (range, 0.2 to 17.0 y). Compared to patients with NSMP, patients with POLE mutant grade 3 EEC (OS: hazard ratio [HR], 0.36 [95% confidence interval, 0.18-0.70]; P=0.003; RFS: HR, 0.17 [0.05-0.54]; P=0.003) had a significantly better prognosis; patients with p53abn tumors had a significantly worse RFS (HR, 1.73 [1.09-2.74]; P=0.021); patients with MMRd tumors showed a trend toward better RFS. Estimated 5-year OS rates were as follows: POLE 89%, MMRd 75%, NSMP 69%, p53abn 55% (Log rank P=0.001). Five-year RFS rates were as follows: POLE 96%, MMRd 77%, NSMP 64%, p53abn 47% (P=0.000001), respectively. In a multivariable Cox model that included age and Federation Internationale de Gynecologie et d'Obstetrique stage, POLE and MMRd status remained independent prognostic factors for better RFS; p53 status was an independent prognostic factor for worse RFS. 
Molecular classification of grade 3 EECs reveals that these tumors are a mixture of molecular subtypes of endometrial carcinoma, rather than a homogeneous group. The addition of molecular markers identifies prognostic subgroups, with potential therapeutic implications.
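The Kaplan-Meier method used for the survival comparisons can be illustrated with a minimal product-limit estimator. The follow-up times and event flags below are invented for illustration and have no connection to the cohort in this record.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  follow-up duration per patient
    events: 1 if the event (e.g. death or recurrence) was observed, 0 if censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients who share this follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1.0 - n_events / n_at_risk
            curve.append((t, survival))
        # Censored and event patients both leave the risk set after time t.
        n_at_risk -= n_leaving
    return curve


# Hypothetical follow-up in years for six patients.
times = [2, 3, 3, 5, 7, 8]
events = [1, 1, 0, 1, 0, 1]
for t, s in kaplan_meier(times, events):
    print(t, round(s, 4))
```

Censored patients contribute to the number at risk up to their last follow-up but never cause a drop in the curve, which is what distinguishes this estimate from a naive proportion of survivors.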
Prediction of Gestational Diabetes through NMR Metabolomics of Maternal Blood.

PubMed

Pinto, Joana; Almeida, Lara M; Martins, Ana S; Duarte, Daniela; Barros, António S; Galhano, Eulália; Pita, Cristina; Almeida, Maria do Céu; Carreira, Isabel M; Gil, Ana M

2015-06-05

Metabolic biomarkers of pre- and postdiagnosis gestational diabetes mellitus (GDM) were sought, using nuclear magnetic resonance (NMR) metabolomics of maternal plasma and corresponding lipid extracts. Metabolite differences between controls and disease were identified through multivariate analysis of variable selected (1)H NMR spectra. For postdiagnosis GDM, partial least squares regression identified metabolites with higher dependence on normal gestational age evolution. Variable selection of NMR spectra produced good classification models for both pre- and postdiagnostic GDM. Prediagnosis GDM was accompanied by cholesterol increase and minor increases in lipoproteins (plasma), fatty acids, and triglycerides (extracts). Small metabolite changes comprised variations in glucose (up regulated), amino acids, betaine, urea, creatine, and metabolites related to gut microflora. Most changes were enhanced upon GDM diagnosis, in addition to newly observed changes in low-Mw compounds. GDM prediction seems possible exploiting multivariate profile changes rather than a set of univariate changes. Postdiagnosis GDM is successfully classified using a 26-resonance plasma biomarker. Plasma and extracts display comparable classification performance, the former enabling direct and more rapid analysis. Results and putative biochemical hypotheses require further confirmation in larger cohorts of distinct ethnicities.
Multivariate cross-classification: applying machine learning techniques to characterize abstraction in neural representations

PubMed Central

Kaplan, Jonas T.; Man, Kingson; Greening, Steven G.

2015-01-01

Here we highlight an emerging trend in the use of machine learning classifiers to test for abstraction across patterns of neural activity. When a classifier algorithm is trained on data from one cognitive context, and tested on data from another, conclusions can be drawn about the role of a given brain region in representing information that abstracts across those cognitive contexts. We call this kind of analysis Multivariate Cross-Classification (MVCC), and review several domains where it has recently made an impact. MVCC has been important in establishing correspondences among neural patterns across cognitive domains, including motor-perception matching and cross-sensory matching. It has been used to test for similarity between neural patterns evoked by perception and those generated from memory. Other work has used MVCC to investigate the similarity of representations for semantic categories across different kinds of stimulus presentation, and in the presence of different cognitive demands. We use these examples to demonstrate the power of MVCC as a tool for investigating neural abstraction and discuss some important methodological issues related to its application. PMID:25859202
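The core idea of cross-classification, training on one cognitive context and testing on another, can be sketched as below. The nearest-centroid classifier and the toy "voxel" patterns are assumptions for illustration; the review covers studies that use a variety of classifiers.

```python
def fit_centroids(xs, ys):
    """Mean activity pattern per class label."""
    centroids = {}
    for label in sorted(set(ys)):
        rows = [x for x, y in zip(xs, ys) if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Label of the closest class centroid (squared Euclidean distance)."""
    def sqdist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    return min(centroids, key=lambda label: sqdist(centroids[label], x))


def cross_classification_accuracy(train_x, train_y, test_x, test_y):
    """Train in one context, test in a different one."""
    centroids = fit_centroids(train_x, train_y)
    hits = sum(predict(centroids, x) == y for x, y in zip(test_x, test_y))
    return hits / len(test_y)


# Hypothetical patterns: train on perception trials, test on memory trials.
perception_x = [[1.0, 0.1, 0.0], [0.9, 0.2, 0.1], [0.1, 1.0, 0.0], [0.0, 0.9, 0.2]]
perception_y = ["face", "face", "house", "house"]
memory_x = [[0.6, 0.2, 0.3], [0.2, 0.7, 0.3], [0.5, 0.6, 0.1]]
memory_y = ["face", "house", "face"]

print(cross_classification_accuracy(perception_x, perception_y, memory_x, memory_y))
```

Above-chance accuracy on the second context is what licenses the inference that the patterns share information across contexts; the training and test sets must never mix the two.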
A method to relate chemical accident properties and expert judgements in order to derive useful information for the development of Environment-Accident Index.

PubMed

Scott Andersson, Asa; Tysklind, Mats; Fängmark, Ingrid

2007-08-17

The environment consists of a variety of different compartments and processes that act together in a complex system, which complicates environmental risk assessment after a chemical accident. The Environment-Accident Index (EAI) is an example of a tool based on a strategy of joining the properties of a chemical with site-specific properties to facilitate this assessment and to be used in the planning process. In the development of the EAI it is necessary to make an unbiased judgement of relevant variables to include in the formula and to estimate their relative importance. The development of EAI has so far included the assimilation of chemical accidents, the selection of a representative set of chemical accidents, and the development of response values (representing effects in the environment after a chemical accident) by means of an expert panel. The developed responses were then related to the chemical and site-specific properties, through a mathematical model based on multivariate modelling (PLS), to create an improved EAI model. This resulted in EAI(new), a PLS based EAI model connected to a new classification scale. The advantages of EAI(new) compared to the old EAI (EAI(old)) are that it can be calculated without the use of tables, it can estimate the effects for all included responses, and it can make a rough classification of chemical accidents according to the new classification scale. Finally, EAI(new) is a more stable model than EAI(old), built on a valid base of accident scenarios, which makes it more reliable to use for a variety of chemicals and situations as it covers a broader spectrum of accident scenarios. EAI(new) can be expressed as a regression model to facilitate the calculation of the index for persons who do not have access to PLS. Future work could include an external validation of EAI(new), completion of the formula structure, adjustment of the classification scale, and a real-life evaluation of EAI(new).
Landsat TM inventory and assessment of waterbird habitat in the southern altiplano of South America

USGS Publications Warehouse

Boyle, T.P.; Caziani, S.M.; Waltermire, R.G.

2004-01-01

The diverse set of wetlands in the southern altiplano of South America supports a number of endemic and migratory waterbirds. These species include endangered endemic flamingos and shorebirds that nest in North America and winter in the altiplano. This research developed maps from nine Landsat Thematic Mapper (TM) images (254,300 km²) to provide an inventory of aquatic waterbird habitats. Image processing software was used to produce a map with a classification of wetlands according to the habitat requirements of different types of waterbirds. A hierarchical procedure was used. First, the bodies of water were isolated within the TM image. Second, an unsupervised classification was executed on the subsetted image to produce 300 signatures of cover types, which were further subdivided as necessary. Third, each of the classifications was examined in the light of field data and personal experience for relevance to the determination of the various habitat types. Finally, the signatures were applied to the entire image and other adjacent images to yield a map depicting the location of the various waterbird habitats in the southern altiplano. Data sets referenced with a global positioning system receiver were used to test the classification system. Multivariate analysis of the bird communities censused at each lake by individual habitats indicated that a salinity gradient, and then water depth, separated the birds. Multivariate analysis of the chemical and physical data from the lakes showed that the variation among lakes was significantly associated with differences in depth, transparency, latitude, elevation, and pH. The presence of gravel bottoms was also one of the qualities distinguishing a group of lakes. This information will be directly useful to the Flamingo Census Project and serve as an element for risk assessment for future development.
Multivariate decoding of brain images using ordinal regression.

PubMed

Doyle, O M; Ashburner, J; Zelaya, F O; Williams, S C R; Mehta, M A; Marquand, A F

2013-11-01

Neuroimaging data are increasingly being used to predict potential outcomes or groupings, such as clinical severity, drug dose response, and transitional illness states. In these examples, the variable (target) we want to predict is ordinal in nature. Conventional classification schemes assume that the targets are nominal and hence ignore their ranked nature, whereas parametric and/or non-parametric regression models enforce a metric notion of distance between classes. Here, we propose a novel, alternative multivariate approach that overcomes these limitations - whole brain probabilistic ordinal regression using a Gaussian process framework. We applied this technique to two data sets of pharmacological neuroimaging data from healthy volunteers. The first study was designed to investigate the effect of ketamine on brain activity and its subsequent modulation with two compounds - lamotrigine and risperidone. The second study investigates the effect of scopolamine on cerebral blood flow and its modulation using donepezil. We compared ordinal regression to multi-class classification schemes and metric regression. Considering the modulation of ketamine with lamotrigine, we found that ordinal regression significantly outperformed multi-class classification and metric regression in terms of accuracy and mean absolute error. However, for risperidone ordinal regression significantly outperformed metric regression but performed similarly to multi-class classification both in terms of accuracy and mean absolute error. For the scopolamine data set, ordinal regression was found to outperform both multi-class and metric regression techniques considering the regional cerebral blood flow in the anterior cingulate cortex. Ordinal regression was thus the only method that performed well in all cases. Our results indicate the potential of an ordinal regression approach for neuroimaging data while providing a fully probabilistic framework with elegant approaches for model selection. 
Copyright © 2013. Published by Elsevier Inc.
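The two evaluation measures mentioned in this abstract, accuracy and mean absolute error, treat ordinal targets differently. A minimal sketch with invented dose levels shows that two sets of predictions with the same accuracy can differ sharply in how far off their errors are.

```python
def accuracy(y_true, y_pred):
    """Share of exactly correct class predictions (ignores class ordering)."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Average distance in ordinal ranks between prediction and target."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical ordinal targets, e.g. dose levels 0 (placebo) to 3 (highest dose).
y_true = [0, 1, 2, 3]
pred_near_miss = [0, 1, 2, 2]  # one error, off by a single rank
pred_far_miss = [0, 1, 2, 0]   # one error, off by three ranks

print(accuracy(y_true, pred_near_miss), mean_absolute_error(y_true, pred_near_miss))
print(accuracy(y_true, pred_far_miss), mean_absolute_error(y_true, pred_far_miss))
```

Accuracy scores both predictions identically because it treats the classes as nominal, while mean absolute error penalises the far miss more heavily; reporting both gives a fuller picture for ranked targets.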
Estimating global distribution of boreal, temperate, and tropical tree plant functional types using clustering techniques

NASA Astrophysics Data System (ADS)

Wang, Audrey; Price, David T.

2007-03-01

A simple integrated algorithm was developed to relate global climatology to distributions of tree plant functional types (PFT). Multivariate cluster analysis was performed to analyze the statistical homogeneity of the climate space occupied by individual tree PFTs. Forested regions identified from the satellite-based GLC2000 classification were separated into tropical, temperate, and boreal sub-PFTs for use in the Canadian Terrestrial Ecosystem Model (CTEM). Global data sets of monthly minimum temperature, growing degree days, an index of climatic moisture, and estimated PFT cover fractions were then used as variables in the cluster analysis. The statistical results for individual PFT clusters were found consistent with other global-scale classifications of dominant vegetation. As an improvement of the quantification of the climatic limitations on PFT distributions, the results also demonstrated overlapping of PFT cluster boundaries that reflected vegetation transitions, for example, between tropical and temperate biomes. The resulting global database should provide a better basis for simulating the interaction of climate change and terrestrial ecosystem dynamics using global vegetation models.

Rapid classification of pharmaceutical ingredients with Raman spectroscopy using compressive detection strategy with PLS-DA multivariate filters.

PubMed

Cebeci Maltaş, Derya; Kwok, Kaho; Wang, Ping; Taylor, Lynne S; Ben-Amotz, Dor

2013-06-01

Identifying pharmaceutical ingredients is a routine procedure required during industrial manufacturing. Here we show that a recently developed Raman compressive detection strategy can be employed to classify various widely used pharmaceutical materials using a hybrid supervised/unsupervised strategy in which only two ingredients are used for training and yet six other ingredients can also be distinguished. More specifically, our liquid crystal spatial light modulator (LC-SLM) based compressive detection instrument is trained using only the active ingredient, tadalafil, and the excipient, lactose, but is tested using these and various other excipients; microcrystalline cellulose, magnesium stearate, titanium (IV) oxide, talc, sodium lauryl sulfate and hydroxypropyl cellulose. Partial least squares discriminant analysis (PLS-DA) is used to generate the compressive detection filters necessary for fast chemical classification. Although the filters used in this study are trained on only lactose and tadalafil, we show that all the pharmaceutical ingredients mentioned above can be differentiated and classified using PLS-DA compressive detection filters with an accumulation time of 10ms per filter. Copyright © 2013 Elsevier B.V. All rights reserved.
Metabolomics approach of infant formula for the evaluation of contamination and degradation using hydrophilic interaction liquid chromatography coupled with mass spectrometry.

PubMed

Inoue, Koichi; Tanada, Chihiro; Sakamoto, Tasuku; Tsutsui, Haruhito; Akiba, Takashi; Min, Jun Zhe; Todoroki, Kenichiro; Yamano, Yutaka; Toyo'oka, Toshimasa

2015-08-15

In this study, a metabolomics approach for food is proposed in which untargeted compounds are evaluated using HILIC-ESI/TOF/MS and multivariate statistical analysis to assess the classification, contamination and degradation of infant formula. HILIC mode was used because it detects more features in infant formulas in the ESI-positive scan mode than the reversed phase mode. The repeatability of the non-targeted contents from 4 kinds of infant formulas based on PCA was less than 15% relative standard deviation in all groups. The PCA patterns obtained using HILIC-ESI/TOF/MS showed significant differences by type and origin, by melamine contamination, and by degradation over one week. In the S-plot from the degradation test, two markers were identified by comparison with standards as nicotinic acid and nicotinamide. With this strategy, the differences from the untargeted compounds could be utilized for quality and safety assessment of infant formula. Copyright © 2015 Elsevier Ltd. All rights reserved.
Detection of Anomalies in Citrus Leaves Using Laser-Induced Breakdown Spectroscopy (LIBS).

PubMed

Sankaran, Sindhuja; Ehsani, Reza; Morgan, Kelly T

2015-08-01

Nutrient assessment and management are important to maintain productivity in citrus orchards. In this study, laser-induced breakdown spectroscopy (LIBS) was applied for rapid and real-time detection of citrus anomalies. LIBS spectra were collected from citrus leaves with anomalies such as diseases (Huanglongbing, citrus canker) and nutrient deficiencies (iron, manganese, magnesium, zinc), and compared with those of healthy leaves. Baseline correction, wavelet multivariate denoising, and normalization techniques were applied to the LIBS spectra before analysis. After spectral pre-processing, features were extracted using principal component analysis and classified using two models, quadratic discriminant analysis and support vector machine (SVM). The SVM resulted in a high average classification accuracy of 97.5%, with high average canker classification accuracy (96.5%). Of the 11 peaks found in all the samples, LIBS peak analysis indicated high intensities at 229.7, 247.9, 280.3, 393.5, 397.0, and 769.8 nm. Future studies using controlled experiments with variable nutrient applications are required for quantification of foliar nutrients by using LIBS-based sensing.
Examining the Association Between School Vending Machines and Children's Body Mass Index by Socioeconomic Status.

PubMed

O'Hara, Jeffrey K; Haynes-Maslow, Lindsey

2015-01-01

To examine the association between vending machine availability in schools and body mass index (BMI) among subgroups of children based on gender, race/ethnicity, and socioeconomic status classifications. First-difference multivariate regressions were estimated using longitudinal fifth- and eighth-grade data from the Early Childhood Longitudinal Study. The specifications were disaggregated by gender, race/ethnicity, and family socioeconomic status classifications. Vending machine availability had a positive association (P < .10) with BMI among Hispanic male children and low-income Hispanic children. Living in an urban location (P < .05) and hours watching television (P < .05) were also positively associated with BMI for these subgroups. Supplemental Nutrition Assistance Program enrollment was negatively associated with BMI for low-income Hispanic students (P < .05). These findings were not statistically significant when using Bonferroni adjusted critical values. The results suggest that the school food environment could reinforce health disparities that exist for Hispanic male children and low-income Hispanic children. Copyright © 2015 Society for Nutrition Education and Behavior. Published by Elsevier Inc. All rights reserved.
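A first-difference regression removes any child-specific characteristic that does not change between the two survey waves. The sketch below uses a single regressor and invented numbers; the variable names are hypothetical and the study's actual specifications include further covariates.

```python
def first_difference_slope(x_wave1, x_wave2, y_wave1, y_wave2):
    """OLS slope (with intercept) of the change in y on the change in x."""
    dx = [b - a for a, b in zip(x_wave1, x_wave2)]
    dy = [b - a for a, b in zip(y_wave1, y_wave2)]
    mean_dx = sum(dx) / len(dx)
    mean_dy = sum(dy) / len(dy)
    sxx = sum((d - mean_dx) ** 2 for d in dx)
    sxy = sum((a - mean_dx) * (b - mean_dy) for a, b in zip(dx, dy))
    return sxy / sxx


# Hypothetical data for four children: vending access (0/1) and BMI in two waves.
# BMI is built as child baseline + 0.5 * access, plus a common +1.0 trend in wave 2.
vending_g5 = [0, 0, 1, 0]
vending_g8 = [1, 0, 1, 1]
bmi_g5 = [20.0, 22.0, 18.5, 25.0]
bmi_g8 = [21.5, 23.0, 19.5, 26.5]

print(first_difference_slope(vending_g5, vending_g8, bmi_g5, bmi_g8))
```

Differencing cancels each child's fixed baseline, the intercept absorbs the common trend between waves, and the slope recovers the 0.5 effect that was built into the toy data.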
Exploration of computational methods for classification of movement intention during human voluntary movement from single trial EEG.

PubMed

Bai, Ou; Lin, Peter; Vorbach, Sherry; Li, Jiang; Furlani, Steve; Hallett, Mark

2007-12-01

To explore effective combinations of computational methods for the prediction of movement intention preceding the production of self-paced right and left hand movements from single trial scalp electroencephalogram (EEG). Twelve naïve subjects performed self-paced movements consisting of three key strokes with either hand. EEG was recorded from 128 channels. The exploration was performed offline on single trial EEG data. We proposed that a successful computational procedure for classification would consist of spatial filtering, temporal filtering, feature selection, and pattern classification. A systematic investigation was performed with combinations of spatial filtering using principal component analysis (PCA), independent component analysis (ICA), common spatial patterns analysis (CSP), and surface Laplacian derivation (SLD); temporal filtering using power spectral density estimation (PSD) and discrete wavelet transform (DWT); pattern classification using linear Mahalanobis distance classifier (LMD), quadratic Mahalanobis distance classifier (QMD), Bayesian classifier (BSC), multi-layer perceptron neural network (MLP), probabilistic neural network (PNN), and support vector machine (SVM). A robust multivariate feature selection strategy using a genetic algorithm was employed. The combinations of spatial filtering using ICA and SLD, temporal filtering using PSD and DWT, and classification methods using LMD, QMD, BSC and SVM provided higher performance than those of other combinations. Utilizing one of the better combinations of ICA, PSD and SVM, the discrimination accuracy was as high as 75%. Further feature analysis showed that beta band EEG activity of the channels over right sensorimotor cortex was most appropriate for discrimination of right and left hand movement intention. Effective combinations of computational methods provide possible classification of human movement intention from single trial EEG. Such a method could be the basis for a potential brain-computer interface based on human natural movement, which might reduce the requirement of long-term training. Effective combinations of computational methods can classify human movement intention from single trial EEG with reasonable accuracy.
Rapid differentiation of Chinese hop varieties (Humulus lupulus) using volatile fingerprinting by HS-SPME-GC-MS combined with multivariate statistical analysis.

PubMed

Liu, Zechang; Wang, Liping; Liu, Yumei

2018-01-18

Hops impart flavor to beer, with the volatile components characterizing the various hop varieties and qualities. Fingerprinting, especially flavor fingerprinting, is often used to identify 'flavor products' because inconsistencies in the description of flavor may lead to an incorrect definition of beer quality. Compared to flavor fingerprinting, volatile fingerprinting is simpler and easier. We performed volatile fingerprinting using head space-solid phase micro-extraction gas chromatography-mass spectrometry combined with similarity analysis and principal component analysis (PCA) for evaluating and distinguishing between three major Chinese hops. Eighty-four volatiles were identified, which were classified into seven categories. Volatile fingerprinting based on similarity analysis did not yield any obvious result. By contrast, hop varieties and qualities were identified using volatile fingerprinting based on PCA. The potential variables explained the variance in the three hop varieties. In addition, the dendrogram and principal component score plot described the differences and classifications of hops. Volatile fingerprinting plus multivariate statistical analysis can rapidly differentiate between the different varieties and qualities of the three major Chinese hops. Furthermore, this method can be used as a reference in other fields. © 2018 Society of Chemical Industry.
Characterization of the volatile components in green tea by IRAE-HS-SPME/GC-MS combined with multivariate analysis.

PubMed

Yang, Yan-Qin; Yin, Hong-Xu; Yuan, Hai-Bo; Jiang, Yong-Wen; Dong, Chun-Wang; Deng, Yu-Liang

2018-01-01

In the present work, a novel infrared-assisted extraction coupled to headspace solid-phase microextraction (IRAE-HS-SPME) followed by gas chromatography-mass spectrometry (GC-MS) was developed for rapid determination of the volatile components in green tea. The extraction parameters such as fiber type, sample amount, infrared power, extraction time, and infrared lamp distance were optimized by orthogonal experimental design. Under optimum conditions, a total of 82 volatile compounds in 21 green tea samples from different geographical origins were identified. Compared with classical water-bath heating, the proposed technique offers the advantages of considerably shorter analysis time and high efficiency. In addition, an effective classification of green teas based on their volatile profiles was achieved by partial least squares-discriminant analysis (PLS-DA) and hierarchical clustering analysis (HCA). Furthermore, the application of a dual criterion based on the variable importance in the projection (VIP) values of the PLS-DA models and on the category from one-way univariate analysis (ANOVA) allowed the identification of 12 potential volatile markers, which were considered to make the most important contribution to the discrimination of the samples. The results suggest that the IRAE-HS-SPME/GC-MS technique combined with multivariate analysis offers a valuable tool to assess geographical traceability of different tea varieties. PMID:29494626
Formalized classification of moss litters in swampy spruce forests of intermontane depressions of Kuznetsk Alatau

NASA Astrophysics Data System (ADS)

Efremova, T. T.; Avrova, A. F.; Efremov, S. P.

2016-09-01

The approaches of multivariate statistics have been used for the numerical classification of morphogenetic types of moss litters in swampy spruce forests according to their physicochemical properties (the ash content, decomposition degree, bulk density, pH, mass, and thickness). Three clusters of moss litters— peat, peaty, and high-ash peaty—have been specified. The functions of classification for identification of new objects have been calculated and evaluated. The degree of decomposition and the ash content are the main classification parameters of litters, though all other characteristics are also statistically significant. The final prediction accuracy of the assignment of a litter to a particular cluster is 86%. Two leading factors participating in the clustering of litters have been determined. The first factor—the degree of transformation of plant remains (quality)—specifies 49% of the total variance, and the second factor—the accumulation rate (quantity)— specifies 26% of the total variance. The morphogenetic structure and physicochemical properties of the clusters of moss litters are characterized.
Risk prediction for myocardial infarction via generalized functional regression models.

PubMed

Ieva, Francesca; Paganoni, Anna M

2016-08-01

In this paper, we propose a generalized functional linear regression model for a binary outcome indicating the presence/absence of a cardiac disease with multivariate functional data among the relevant predictors. In particular, the motivating aim is the analysis of electrocardiographic traces of patients whose pre-hospital electrocardiogram (ECG) has been sent to the 118 Dispatch Center of Milan (118 being the Italian toll-free number for emergencies) by life support personnel of the basic rescue units. The statistical analysis starts with a preprocessing of ECGs treated as multivariate functional data. The signals are reconstructed from noisy observations. The biological variability is then removed by a nonlinear registration procedure based on landmarks. Then, in order to perform a data-driven dimensional reduction, a multivariate functional principal component analysis is carried out on the variance-covariance matrix of the reconstructed and registered ECGs and their first derivatives. We use the scores of the Principal Components decomposition as covariates in a generalized linear model to predict the presence of the disease in a new patient. Hence, a new semi-automatic diagnostic procedure is proposed to estimate the risk of infarction (in the case of interest, the probability of being affected by Left Bundle Branch Block). The performance of this classification method is evaluated and compared with other methods proposed in literature. Finally, the robustness of the procedure is checked via leave-j-out techniques. © The Author(s) 2013.
Fast classification and compositional analysis of cornstover fractions using Fourier transform near-infrared techniques.

PubMed

Philip Ye, X; Liu, Lu; Hayes, Douglas; Womac, Alvin; Hong, Kunlun; Sokhansanj, Shahab

2008-10-01

The objectives of this research were to determine the variation of chemical composition across botanical fractions of cornstover, and to probe the potential of Fourier transform near-infrared (FT-NIR) techniques in qualitatively classifying separated cornstover fractions and in quantitatively analyzing chemical compositions of cornstover by developing calibration models to predict chemical compositions of cornstover based on FT-NIR spectra. Large variations of cornstover chemical composition for wide calibration ranges, which are required by a reliable calibration model, were achieved by manually separating the cornstover samples into six botanical fractions. Their chemical compositions were determined by conventional wet chemical analyses, which showed that chemical composition varies significantly among different botanical fractions of cornstover. The botanical fractions, in descending order of total saccharide content, are husk, sheath, pith, rind, leaf, and node. Based on FT-NIR spectra acquired on the biomass, classification by Soft Independent Modeling of Class Analogy (SIMCA) was employed to conduct qualitative classification of cornstover fractions, and partial least squares (PLS) regression was used for quantitative chemical composition analysis. SIMCA was successfully demonstrated in classifying botanical fractions of cornstover. The developed PLS model yielded root mean square error of prediction (RMSEP %w/w) of 0.92, 1.03, 0.17, 0.27, 0.21, 1.12, and 0.57 for glucan, xylan, galactan, arabinan, mannan, lignin, and ash, respectively. The results showed the potential of FT-NIR techniques in combination with multivariate analysis to be utilized by biomass feedstock suppliers, bioethanol manufacturers, and bio-power producers in order to better manage bioenergy feedstocks and enhance bioconversion.
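The root mean square error of prediction (RMSEP) reported in this abstract is the square root of the mean squared difference between reference (wet chemistry) values and model predictions on a prediction set. The sketch below shows only that formula; the function name and the sample values are hypothetical and are not data from the study.

```python
import math


def rmsep(reference, predicted):
    """Root mean square error of prediction over paired values."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_error = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    return math.sqrt(squared_error / len(reference))


# Hypothetical glucan contents in %w/w (illustrative values only).
reference_glucan = [35.0, 36.5, 38.0, 33.5]
predicted_glucan = [36.0, 36.5, 37.0, 33.5]
error = rmsep(reference_glucan, predicted_glucan)
print(round(error, 3))
```

Because RMSEP carries the units of the measured quantity (%w/w here), values for different constituents are best compared against each constituent's own concentration range.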
Diagnosis of rheumatoid arthritis: multivariate analysis of biomarkers.

PubMed

Wild, Norbert; Karl, Johann; Grunert, Veit P; Schmitt, Raluca I; Garczarek, Ursula; Krause, Friedemann; Hasler, Fritz; van Riel, Piet L C M; Bayer, Peter M; Thun, Matthias; Mattey, Derek L; Sharif, Mohammed; Zolg, Werner

2008-02-01

To test if a combination of biomarkers can increase the classification power of autoantibodies to cyclic citrullinated peptides (anti-CCP) in the diagnosis of rheumatoid arthritis (RA) depending on the diagnostic situation. Biomarkers were subject to three inclusion/exclusion criteria (discrimination between RA patients and healthy blood donors, ability to identify anti-CCP-negative RA patients, specificity in a panel with major non-rheumatological diseases) before univariate ranking and multivariate analysis was carried out using a modelling panel (n = 906). To enable the evaluation of the classification power in different diagnostic settings the disease controls (n = 542) were weighted according to the admission rates in rheumatology clinics modelling a clinic panel or according to the relative prevalences of musculoskeletal disorders in the general population seen by general practitioners modelling a GP panel. Out of 131 biomarkers considered originally, we evaluated 32 biomarkers in this study, of which only seven passed the three inclusion/exclusion criteria and were combined by multivariate analysis using four different mathematical models. In the modelled clinic panel, anti-CCP was the lead marker with a sensitivity of 75.8% and a specificity of 94.0%. Due to the lack in specificity of the markers other than anti-CCP in this diagnostic setting, any gain in sensitivity by any marker combination is off-set by a corresponding loss in specificity. In the modelled GP panel, the best marker combination of anti-CCP and interleukin (IL)-6 resulted in a sensitivity gain of 7.6% (85.9% vs. 78.3%) at a minor loss in specificity of 1.6% (90.3% vs. 91.9%) compared with anti-CCP as the best single marker. Depending on the composition of the sample panel, anti-CCP alone or anti-CCP in combination with IL-6 has the highest classification power for the diagnosis of established RA.
The Model-Based Study of the Effectiveness of Reporting Lists of Small Feature Sets Using RNA-Seq Data.

PubMed

Kim, Eunji; Ivanov, Ivan; Hua, Jianping; Lampe, Johanna W; Hullar, Meredith Aj; Chapkin, Robert S; Dougherty, Edward R

2017-01-01

Ranking feature sets for phenotype classification based on gene expression is a challenging issue in cancer bioinformatics. When the number of samples is small, all feature selection algorithms are known to be unreliable, producing significant error, and error estimators suffer from different degrees of imprecision. The problem is compounded by the fact that the accuracy of classification depends on the manner in which the phenomena are transformed into data by the measurement technology. Because next-generation sequencing technologies amount to a nonlinear transformation of the actual gene or RNA concentrations, they can potentially produce less discriminative data relative to the actual gene expression levels. In this study, we compare the performance of ranking feature sets derived from a model of RNA-Seq data with that of a multivariate normal model of gene concentrations using 3 measures: (1) ranking power, (2) length of extensions, and (3) Bayes features. This is the model-based study to examine the effectiveness of reporting lists of small feature sets using RNA-Seq data and the effects of different model parameters and error estimators. The results demonstrate that the general trends of the parameter effects on the ranking power of the underlying gene concentrations are preserved in the RNA-Seq data, whereas the power of finding a good feature set becomes weaker when gene concentrations are transformed by the sequencing machine.
Ante mortem identification of BSE from serum using infrared spectroscopy

NASA Astrophysics Data System (ADS)

Schmitt, Jürgen; Lasch, Peter; Beekes, Michael; Udelhoven, Thomas; Eiden, Michael; Fabian, Heinz; Petrich, Wolfgang H.; Naumann, Dieter

2004-07-01

In our former studies a diagnostic approach for the detection of transmissible spongiform encephalopathies (TSE) based on FT-IR spectroscopy in combination with artificial neural networks was described, based on a controlled animal study with terminally ill Syrian hamsters and control animals. As a consequence of the bovine spongiform encephalopathy (BSE) crisis in Europe, the development of a diagnostic ante mortem test for cattle has become a matter of great scientific importance and public interest. Since 1986 more than 180,000 clinical cases of BSE have been observed in the UK alone. Most of these cases were confirmed by post mortem examination of brain tissue. However, BSE-related risk assessment and risk-management would greatly benefit from ante mortem testing on living animals. For example, a serum-based test could allow for screening of the cattle population; thus, even a BSE eradication program would be conceivable. Here we report on a novel method for ante mortem BSE testing, which combines infrared spectroscopy of serum samples with multivariate pattern recognition analysis. A classification algorithm was trained using infrared spectra of sera from more than 800 animals from a field study (including BSE positive, healthy controls and animals suffering from viral or bacterial infections). In two validation studies sensitivities of 85% and 87% and specificities of 84% and 91% were achieved, respectively. The combination of classification algorithms increased sensitivity and specificity to 96% and 92%, respectively.
The impact of joint responses of devices in an airport security system.

PubMed

Nie, Xiaofeng; Batta, Rajan; Drury, Colin G; Lin, Li

2009-02-01

In this article, we consider a model for an airport security system in which the declaration of a threat is based on the joint responses of inspection devices. This is in contrast to the typical system in which each check station independently declares a passenger as having a threat or not having a threat. In our framework the declaration of threat/no-threat is based upon the passenger scores at the check stations he/she goes through. To do this we use concepts from classification theory in the field of multivariate statistics analysis and focus on the main objective of minimizing the expected cost of misclassification. The corresponding correct classification and misclassification probabilities can be obtained by using a simulation-based method. After computing the overall false alarm and false clear probabilities, we compare our joint response system with two other independently operated systems. A model that groups passengers in a manner that minimizes the false alarm probability while maintaining the false clear probability within specifications set by a security authority is considered. We also analyze the staffing needs at each check station for such an inspection scheme. An illustrative example is provided along with sensitivity analysis on key model parameters. A discussion is provided on some implementation issues, on the various assumptions made in the analysis, and on potential drawbacks of the approach.
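The objective named in this abstract, minimizing the expected cost of misclassification, leads in classical two-population classification theory to a likelihood-ratio rule: declare a threat when the ratio of the score densities is at least (cost of false alarm / cost of false clear) × (prior of no threat / prior of threat). The sketch below is a simplified illustration under stated assumptions and is not the authors' model: it uses a single combined score with normal densities, and all names, means, standard deviations, priors, and costs are hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a univariate normal distribution."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def declare_threat(score, prior_threat, cost_false_clear, cost_false_alarm,
                   threat_mean=2.0, threat_sd=1.0,
                   clear_mean=0.0, clear_sd=1.0):
    """Rule minimizing expected cost of misclassification for two classes.

    Assumes (hypothetically) that scores are normal in each class.
    """
    density_ratio = (normal_pdf(score, threat_mean, threat_sd)
                     / normal_pdf(score, clear_mean, clear_sd))
    threshold = ((cost_false_alarm / cost_false_clear)
                 * ((1.0 - prior_threat) / prior_threat))
    return density_ratio >= threshold


# A costly false clear lowers the score needed to declare a threat.
print(declare_threat(0.5, prior_threat=0.01,
                     cost_false_clear=1000.0, cost_false_alarm=1.0))
print(declare_threat(0.5, prior_threat=0.01,
                     cost_false_clear=1.0, cost_false_alarm=1.0))
```

The same score is flagged under the asymmetric costs but cleared under equal costs, which shows how the cost ratio and the prior shift the decision boundary.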
An application of LANDSAT multispectral imagery for the classification of hydrobiological systems, Shark River Slough, Everglades National Park, Florida

NASA Technical Reports Server (NTRS)

Rose, P. W.; Rosendahl, P. C. (Principal Investigator)

1979-01-01

Multivariant hydrologic parameters over the Shark River Slough were investigated. Ground truth was established utilizing U-2 infrared photography and comprehensive field data to define a control network which represented all hydrobiological systems in the slough. These data were then applied to LANDSAT imagery utilizing an interactive multispectral processor which generated hydrographic maps through classification of the slough and defined the multispectral surface radiance characteristics of the wetlands areas in the park. The spectral response of each hydrobiological zone was determined and plotted to formulate multispectral relationships between the emittent energy from the slough in order to determine the best possible multispectral wavelength combinations to enhance classification results. The extent of each hydrobiological zone in the slough was determined, and flow vectors for water movement throughout the slough were established.
Crohn's disease in a southern European country: Montreal classification and clinical activity.

PubMed

Magro, Fernando; Portela, Francisco; Lago, Paula; Ramos de Deus, João; Vieira, Ana; Peixe, Paula; Cremers, Isabelle; Cotter, José; Cravo, Marília; Tavares, Lourdes; Reis, Jorge; Gonçalves, Raquel; Lopes, Horácio; Caldeira, Paulo; Ministro, Paula; Carvalho, Laura; Azevedo, Luis; da Costa-Pereira, Altamiro

2009-09-01

Given the heterogeneous nature of Crohn's disease (CD), our aim was to apply the Montreal Classification to a large cohort of Portuguese patients with CD in order to identify potential predictive factors regarding the need for medical and/or surgical treatment. A cross-sectional study was used based on data from an on-line registry of patients with CD. Of the 1692 patients with 5 or more years of disease, 747 (44%) were male and 945 (56%) female. On multivariate analysis the A2 group was an independent risk factor of the need for steroids (odds ratio [OR] 1.6, 95% confidence interval [CI] 1.1-2.3) and the A1 and A2 groups for immunosuppressants (OR 2.2; CI 1.2-3.8; OR 1.4; CI 1.0-2.0, respectively). An L3+L3(4) and L(4) location were risk factors for immunosuppression (OR 1.9; CI 1.5-2.4), whereas an L1 location was significantly associated with the need for abdominal surgery (P < 0.001). After 20 years of disease, less than 10% of patients persisted without steroids, immunosuppression, or surgery. The Montreal Classification allowed us to identify different groups of disease severity: A1 were more immunosuppressed without surgery, most of A2 patients were submitted to surgery, and 52% of L1+L1(4) patients were operated without immunosuppressants. Stratifying patients according to the Montreal Classification may prove useful in identifying different phenotypes with different therapies and severity. Most of our patients have severe disease.
Why do pathological stage IA lung adenocarcinomas vary from prognosis?: a clinicopathologic study of 176 patients with pathological stage IA lung adenocarcinoma based on the IASLC/ATS/ERS classification.

PubMed

Zhang, Jie; Wu, Jie; Tan, Qiang; Zhu, Lei; Gao, Wen

2013-09-01

Patients with pathological stage IA adenocarcinoma (AC) have a variable prognosis, even if treated in the same way. The postoperative treatment of pathological stage IA patients is also controversial. We identified 176 patients with pathological stage IA AC who had undergone a lobectomy and mediastinal lymph node dissection at the Shanghai Chest Hospital, Shanghai, China, between 2000 and 2006. No patient had preoperative treatment. The histologic subtypes of all patients were classified according to the 2011 International Association for the Study of Lung Cancer (IASLC)/American Thoracic Society (ATS)/European Respiratory Society (ERS) international multidisciplinary lung AC classification. Patients' 5-year overall survival (OS) and 5-year disease-free survival (DFS) were calculated using Kaplan-Meier and Cox regression analyses. One hundred seventy-six patients with pathological stage IA AC had an 86.6% 5-year OS and 74.6% 5-year DFS. The 10 patients with micropapillary predominant subtype had the lowest 5-year DFS (40.0%). The 12 patients with solid predominant with mucin production subtype had the lowest 5-year OS (66.7%). Univariate and multivariate analysis showed that sex and prognostic groups of the IASLC/ATS/ERS histologic classification were significantly associated with 5-year DFS of pathological stage IA AC. Our study revealed that sex was an independent prognostic factor of pathological stage IA AC. The IASLC/ATS/ERS classification of lung AC identifies histologic categories with prognostic differences that could be helpful in clinical therapy.
Development of a neural-based forecasting tool to classify recreational water quality using fecal indicator organisms.

PubMed

Motamarri, Srinivas; Boccelli, Dominic L

2012-09-15

Users of recreational waters may be exposed to elevated pathogen levels through various point/non-point sources. Typical daily notifications rely on microbial analysis of indicator organisms (e.g., Escherichia coli) that require 18, or more, hours to provide an adequate response. Modeling approaches, such as multivariate linear regression (MLR) and artificial neural networks (ANN), have been utilized to provide quick predictions of microbial concentrations for classification purposes, but generally suffer from high false negative rates. This study introduces the use of learning vector quantization (LVQ)--a direct classification approach--for comparison with MLR and ANN approaches and integrates input selection for model development with respect to primary and secondary water quality standards within the Charles River Basin (Massachusetts, USA) using meteorologic, hydrologic, and microbial explanatory variables. Integrating input selection into model development showed that discharge variables were the most important explanatory variables while antecedent rainfall and time since previous events were also important. With respect to classification, all three models adequately represented the non-violated samples (>90%). The MLR approach had the highest false negative rates associated with classifying violated samples (41-62% vs 13-43% (ANN) and <16% (LVQ)) when using five or more explanatory variables. The ANN performance was more similar to LVQ when a larger number of explanatory variables were utilized, but the ANN performance degraded toward MLR performance as explanatory variables were removed. Overall, the use of LVQ as a direct classifier provided the best overall classification ability with respect to violated/non-violated samples for both standards. Copyright © 2012 Elsevier Ltd. All rights reserved.
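Learning vector quantization, described in this abstract as a direct classification approach, represents each class by one or more prototype vectors and labels a sample by its nearest prototype. The sketch below implements the basic LVQ1 update (move the winning prototype toward a sample of the same class, away from a sample of a different class). It is an illustration only, not the authors' implementation: the two features, their scaling, the learning rate, and all data values are hypothetical.

```python
def nearest_prototype(prototypes, x):
    """Index of the prototype closest to x (squared Euclidean distance)."""
    best_index, best_dist = None, float("inf")
    for i, (vector, _label) in enumerate(prototypes):
        dist = sum((a - b) ** 2 for a, b in zip(vector, x))
        if dist < best_dist:
            best_index, best_dist = i, dist
    return best_index


def train_lvq1(samples, labels, prototypes, learning_rate=0.1, epochs=20):
    """Basic LVQ1: attract the winner if labels match, repel otherwise."""
    protos = [(list(vector), label) for vector, label in prototypes]
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            i = nearest_prototype(protos, x)
            vector, proto_label = protos[i]
            sign = 1.0 if proto_label == y else -1.0
            updated = [w + sign * learning_rate * (a - w)
                       for w, a in zip(vector, x)]
            protos[i] = (updated, proto_label)
    return protos


def predict(prototypes, x):
    """Label of the nearest prototype."""
    return prototypes[nearest_prototype(prototypes, x)][1]


# Hypothetical scaled features: [discharge, antecedent rainfall].
samples = [[0.1, 0.2], [0.2, 0.1], [0.15, 0.15],
           [0.8, 0.9], [0.9, 0.8], [0.85, 0.85]]
labels = ["non-violated"] * 3 + ["violated"] * 3
initial = [([0.0, 0.0], "non-violated"), ([1.0, 1.0], "violated")]

model = train_lvq1(samples, labels, initial)
print(predict(model, [0.12, 0.18]), predict(model, [0.95, 0.7]))
```

Unlike a regression model whose predicted concentration is thresholded afterwards, this kind of classifier outputs the class label directly, which is the distinction the abstract draws between LVQ and the MLR and ANN approaches.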
Predictive value of hippocampal MR imaging-based high-dimensional mapping in mesial temporal epilepsy: preliminary findings.

PubMed

Hogan, R E; Wang, L; Bertrand, M E; Willmore, L J; Bucholz, R D; Nassif, A S; Csernansky, J G

2006-01-01

We objectively assessed surface structural changes of the hippocampus in mesial temporal sclerosis (MTS) and assessed the ability of large-deformation high-dimensional mapping (HDM-LD) to demonstrate hippocampal surface symmetry and predict group classification of MTS in right and left MTS groups compared with control subjects. Using eigenvector field analysis of HDM-LD segmentations of the hippocampus, we compared the symmetry of changes in the right and left MTS groups with a group of 15 matched controls. To assess the ability of HDM-LD to predict group classification, eigenvectors were selected by a logistic regression procedure when comparing the MTS group with control subjects. Multivariate analysis of variance on the coefficients from the first 9 eigenvectors accounted for 75% of the total variance between groups. The first 3 eigenvectors showed the largest differences between the control group and each of the MTS groups, but with eigenvector 2 showing the greatest difference in the MTS groups. Reconstruction of the hippocampal deformation vector fields due solely to eigenvector 2 shows symmetrical patterns in the right and left MTS groups. A "leave-one-out" (jackknife) procedure correctly predicted group classification in 14 of 15 (93.3%) left MTS subjects and all 15 right MTS subjects. Analysis of principal dimensions of hippocampal shape change suggests that MTS, after accounting for normal right-left asymmetries, affects the right and left hippocampal surface structure very symmetrically. Preliminary analysis using HDM-LD shows it can predict group classification of MTS and control hippocampi in this well-defined population of patients with MTS and mesial temporal lobe epilepsy (MTLE).
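The "leave-one-out" (jackknife) procedure mentioned in this abstract holds out each subject in turn, fits the classifier on the remaining subjects, and checks whether the held-out subject is assigned to the correct group. The sketch below illustrates the procedure only. It swaps in a simple nearest-centroid classifier rather than the logistic regression used in the study, and the feature values (standing in for eigenvector coefficients) are hypothetical.

```python
def centroid(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(column) / n for column in zip(*rows)]


def nearest_centroid_label(train_x, train_y, x):
    """Assign x to the class whose training centroid is closest."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        members = [row for row, y in zip(train_x, train_y) if y == label]
        center = centroid(members)
        dist = sum((a - b) ** 2 for a, b in zip(center, x))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def leave_one_out_accuracy(features, labels):
    """Fraction of samples classified correctly when each is held out."""
    correct = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if nearest_centroid_label(train_x, train_y, features[i]) == labels[i]:
            correct += 1
    return correct / len(features)


# Hypothetical coefficients; the last MTS subject resembles the controls.
features = [[0.0, 0.1], [0.1, 0.0], [0.05, 0.05],
            [1.0, 0.9], [0.9, 1.0], [0.95, 0.95], [0.2, 0.1]]
labels = ["control"] * 3 + ["MTS"] * 4
accuracy = leave_one_out_accuracy(features, labels)
print(accuracy)
```

Because the held-out subject never contributes to the centroids used to classify it, the resulting accuracy is a less optimistic estimate than one computed on the training data itself.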

Development and Psychometric Evaluation of the Brief Adolescent Gambling Screen (BAGS)

PubMed Central

Stinchfield, Randy; Wynne, Harold; Wiebe, Jamie; Tremblay, Joel

2017-01-01

The purpose of this study was to develop and evaluate the initial reliability, validity and classification accuracy of a new brief screen for adolescent problem gambling. The three-item Brief Adolescent Gambling Screen (BAGS) was derived from the nine-item Gambling Problem Severity Subscale (GPSS) of the Canadian Adolescent Gambling Inventory (CAGI) using a secondary analysis of existing CAGI data. The sample of 105 adolescents included 49 females and 56 males from Canada who completed the CAGI, a self-administered measure of DSM-IV diagnostic criteria for Pathological Gambling, and a clinician-administered diagnostic interview including the DSM-IV diagnostic criteria for Pathological Gambling (both of which were adapted to yield DSM-5 Gambling Disorder diagnosis). A stepwise multivariate discriminant function analysis selected three GPSS items as the best predictors of a diagnosis of Gambling Disorder. The BAGS demonstrated satisfactory estimates of reliability, validity and classification accuracy and was equivalent to the nine-item GPSS of the CAGI and the BAGS was more accurate than the SOGS-RA. The BAGS estimates of classification accuracy include hit rate = 0.95, sensitivity = 0.88, specificity = 0.98, false positive rate = 0.02, and false negative rate = 0.12. Since these classification estimates are preliminary, derived from a relatively small sample size, and based upon the same sample from which the items were selected, it will be important to cross-validate the BAGS with larger and more diverse samples. The BAGS should be evaluated for use as a screening tool in both clinical and school settings as well as epidemiological surveys. PMID:29312064
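The classification accuracy estimates listed in this abstract are all derived from the four cells of a 2×2 table of screen result against diagnosis. The sketch below shows those definitions; note that the false positive rate equals 1 − specificity and the false negative rate equals 1 − sensitivity, consistent with the reported values. The function name and the counts are hypothetical and are not the study's data.

```python
def screening_metrics(tp, fp, tn, fn):
    """Standard accuracy estimates from a 2x2 screen-vs-diagnosis table."""
    total = tp + fp + tn + fn
    return {
        "hit_rate": (tp + tn) / total,          # overall proportion correct
        "sensitivity": tp / (tp + fn),          # true cases detected
        "specificity": tn / (tn + fp),          # non-cases correctly cleared
        "false_positive_rate": fp / (fp + tn),  # equals 1 - specificity
        "false_negative_rate": fn / (fn + tp),  # equals 1 - sensitivity
    }


# Hypothetical counts chosen for round numbers (NOT from the study).
metrics = screening_metrics(tp=40, fp=10, tn=90, fn=10)
for name, value in metrics.items():
    print(name, round(value, 3))
```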
Nonparametric Coupled Bayesian Dictionary and Classifier Learning for Hyperspectral Classification.

PubMed

Akhtar, Naveed; Mian, Ajmal

2017-10-03

We present a principled approach to learn a discriminative dictionary along with a linear classifier for hyperspectral classification. Our approach places Gaussian Process priors over the dictionary to account for the relative smoothness of the natural spectra, whereas the classifier parameters are sampled from multivariate Gaussians. We employ two Beta-Bernoulli processes to jointly infer the dictionary and the classifier. These processes are coupled under the same sets of Bernoulli distributions. In our approach, these distributions signify the frequency of the dictionary atom usage in representing class-specific training spectra, which also makes the dictionary discriminative. Due to the coupling between the dictionary and the classifier, the popularity of the atoms for representing different classes gets encoded into the classifier. This helps in predicting the class labels of test spectra that are first represented over the dictionary by solving a simultaneous sparse optimization problem. The labels of the spectra are predicted by feeding the resulting representations to the classifier. Our approach exploits the nonparametric Bayesian framework to automatically infer the dictionary size, the key parameter in discriminative dictionary learning. Moreover, it also has the desirable property of adaptively learning the association between the dictionary atoms and the class labels by itself. We use Gibbs sampling to infer the posterior probability distributions over the dictionary and the classifier under the proposed model, for which we derive analytical expressions. To establish the effectiveness of our approach, we test it on benchmark hyperspectral images. The classification performance is compared with the state-of-the-art dictionary learning-based classification methods.
Tumors with unmethylated MLH1 and the CpG island methylator phenotype are associated with a poor prognosis in stage II colorectal cancer patients

PubMed Central

Fu, Tao; Liu, Yanliang; Li, Kai; Wan, Weiwei; Pappou, Emmanouil P.; Iacobuzio-Donahue, Christine A.; Kerner, Zachary; Baylin, Stephen B.; Wolfgang, Christopher L.; Ahuja, Nita

2016-01-01

We previously developed a novel tumor subtype classification model for duodenal adenocarcinomas based on a combination of the CpG island methylator phenotype (CIMP) and MLH1 methylation status. Here, we tested the prognostic value of this model in stage II colorectal cancer (CRC) patients. Tumors were assigned to CIMP+/MLH1-unmethylated (MLH1-U), CIMP+/MLH1-methylated (MLH1-M), CIMP−/MLH1-U, or CIMP−/MLH1-M groups. Age, tumor location, lymphovascular invasion, and mucin production differed among the four patient subgroups, and CIMP+/MLH1-U tumors were more likely to have lymphovascular invasion and mucin production. Kaplan-Meier analyses revealed differences in both disease-free survival (DFS) and overall survival (OS) among the four groups. In a multivariate analysis, CIMP/MLH1 methylation status was predictive of both DFS and OS, and DFS and OS were shortest in CIMP+/MLH1-U stage II CRC patients. These results suggest that tumor subtype classification based on the combination of CIMP and MLH1 methylation status is informative in stage II CRC patients, and that CIMP+/MLH1-U tumors exhibit aggressive features and are associated with poor clinical outcomes. PMID:27880934
Tumors with unmethylated MLH1 and the CpG island methylator phenotype are associated with a poor prognosis in stage II colorectal cancer patients.

PubMed

Fu, Tao; Liu, Yanliang; Li, Kai; Wan, Weiwei; Pappou, Emmanouil P; Iacobuzio-Donahue, Christine A; Kerner, Zachary; Baylin, Stephen B; Wolfgang, Christopher L; Ahuja, Nita

2016-12-27

We previously developed a novel tumor subtype classification model for duodenal adenocarcinomas based on a combination of the CpG island methylator phenotype (CIMP) and MLH1 methylation status. Here, we tested the prognostic value of this model in stage II colorectal cancer (CRC) patients. Tumors were assigned to CIMP+/MLH1-unmethylated (MLH1-U), CIMP+/MLH1-methylated (MLH1-M), CIMP-/MLH1-U, or CIMP-/MLH1-M groups. Age, tumor location, lymphovascular invasion, and mucin production differed among the four patient subgroups, and CIMP+/MLH1-U tumors were more likely to have lymphovascular invasion and mucin production. Kaplan-Meier analyses revealed differences in both disease-free survival (DFS) and overall survival (OS) among the four groups. In a multivariate analysis, CIMP/MLH1 methylation status was predictive of both DFS and OS, and DFS and OS were shortest in CIMP+/MLH1-U stage II CRC patients. These results suggest that tumor subtype classification based on the combination of CIMP and MLH1 methylation status is informative in stage II CRC patients, and that CIMP+/MLH1-U tumors exhibit aggressive features and are associated with poor clinical outcomes.
Classification of type 2 diabetes rats based on urine amino acids metabolic profiling by liquid chromatography coupled with tandem mass spectrometry.

PubMed

Wang, Chunyan; Zhu, Hongbin; Pi, Zifeng; Song, Fengrui; Liu, Zhiqiang; Liu, Shuying

2013-09-15

An analytical method for quantifying underivatized amino acids (AAs) in urine samples of rats was developed by using liquid chromatography coupled with tandem mass spectrometry (LC-MS/MS). Classification of type 2 diabetes rats was based on urine amino acid metabolic profiling. LC-MS/MS analysis was performed with chromatographic separation and multiple reaction monitoring (MRM) transitions. Multivariate profile-wide predictive models were constructed using partial least squares discriminant analysis (PLS-DA) in the SIMCA-P 11.5 software package and hierarchical cluster analysis (HCA) in SPSS 18.0. Some amino acids in the urine of the rats changed significantly. The results of the present study show that this method can quantify free AAs in rat urine by LC-MS/MS. In summary, the PLS-DA and HCA analyses in our research were suitable for differentiating healthy rats from type 2 diabetes rats by the quantification of AAs in their urine samples. In addition, the seven amino acids that were increased in the urine of type 2 diabetes rats compared with the healthy group returned to normal under treatment with acarbose. Copyright © 2013 Elsevier B.V. All rights reserved.
Classification of buildings mold threat using electronic nose

NASA Astrophysics Data System (ADS)

Łagód, Grzegorz; Suchorab, Zbigniew; Guz, Łukasz; Sobczuk, Henryk

2017-07-01

Mold is considered to be one of the most important features of Sick Building Syndrome and is an important problem in the current building industry. In many cases it is caused by rising moisture at the surface of building envelopes and excessive humidity of indoor air. In historical buildings it is mostly caused by outdated building techniques, among them the absence of horizontal insulation against moisture and the use of hygroscopic construction materials. Recent buildings also suffer from mold risk, which is caused in many cases by hermetization leading to improper performance of gravitational ventilation systems, creating suitable conditions for mold development. Based on our research, we propose a method of classifying the mold threat of buildings using an electronic nose based on a gas sensor array consisting of MOS (metal oxide semiconductor) sensors. The device used is frequently applied for air quality assessment in environmental engineering. The presented results show the interpretation of e-nose readouts of indoor air sampled in rooms threatened with mold development in comparison with clean reference rooms and synthetic air. The obtained multivariate data were processed, visualized and classified using PCA (Principal Component Analysis) and ANN (Artificial Neural Network) methods. The investigation confirmed that an electronic nose (a gas sensor array supported by data processing) can classify air samples taken from different rooms affected by mold.
A nonlinear heartbeat dynamics model approach for personalized emotion recognition.

PubMed

Valenza, Gaetano; Citi, Luca; Lanatà, Antonio; Scilingo, Enzo Pasquale; Barbieri, Riccardo

2013-01-01

Emotion recognition based on autonomic nervous system signs is one of the ambitious goals of affective computing. It is well accepted that standard signal processing techniques require relatively long time series of multivariate records to ensure reliability and robustness of recognition and classification algorithms. In this work, we present a novel methodology able to assess cardiovascular dynamics during short-time (i.e. < 10 seconds) affective stimuli, thus overcoming some of the limitations of current emotion recognition approaches. We developed a personalized, fully parametric probabilistic framework based on point-process theory where heartbeat events are modelled using a second-order nonlinear autoregressive integrative structure in order to achieve effective performance in short-time affective assessment. Experimental results show a comprehensive emotional characterization of 4 subjects undergoing a passive affective elicitation using a sequence of standardized images gathered from the international affective picture system (IAPS). Each picture was identified by the IAPS arousal and valence scores as well as by a self-reported emotional label associating a subjective positive or negative emotion. Results show a clear classification of two defined levels of arousal, valence and self-emotional state using features coming from the instantaneous spectrum and bispectrum of the considered RR intervals, reaching up to 90% recognition accuracy.
Raman spectral post-processing for oral tissue discrimination – a step for an automatized diagnostic system

PubMed Central

Carvalho, Luis Felipe C. S.; Nogueira, Marcelo Saito; Neto, Lázaro P. M.; Bhattacharjee, Tanmoy T.; Martin, Airton A.

2017-01-01

Most oral injuries are diagnosed by histopathological analysis of a biopsy, which is an invasive procedure and does not give immediate results. On the other hand, Raman spectroscopy is a real time and minimally invasive analytical tool with potential for the diagnosis of diseases. The potential for diagnostics can be improved by data post-processing. Hence, this study aims to evaluate the performance of preprocessing steps and multivariate analysis methods for the classification of normal tissues and pathological oral lesion spectra. A total of 80 spectra acquired from normal and abnormal tissues using optical fiber Raman-based spectroscopy (OFRS) were subjected to PCA preprocessing in the z-scored data set, and the KNN (K-nearest neighbors), J48 (unpruned C4.5 decision tree), RBF (radial basis function), RF (random forest), and MLP (multilayer perceptron) classifiers at WEKA software (Waikato environment for knowledge analysis), after area normalization or maximum intensity normalization. Our results suggest the best classification was achieved by using maximum intensity normalization followed by MLP. Based on these results, software for automated analysis can be generated and validated using larger data sets. This would aid quick comprehension of spectroscopic data and easy diagnosis by medical practitioners in clinical settings. PMID:29188115
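The two normalization options and the z-scoring step named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline (the study used WEKA). It assumes each spectrum is a list of non-negative intensities on a uniformly spaced wavenumber axis, so that the sum of intensities stands in for the area. All function names are hypothetical.

```python
from statistics import mean, pstdev

def area_normalize(spectrum):
    """Scale a spectrum so its intensities sum to 1 (unit area on a uniform axis)."""
    area = sum(spectrum)
    return [v / area for v in spectrum]

def max_normalize(spectrum):
    """Scale a spectrum so its maximum intensity is 1."""
    peak = max(spectrum)
    return [v / peak for v in spectrum]

def zscore_columns(spectra):
    """Z-score each spectral channel (column) across all spectra."""
    columns = list(zip(*spectra))
    mus = [mean(c) for c in columns]
    sigmas = [pstdev(c) or 1.0 for c in columns]  # avoid dividing by zero on constant channels
    return [[(v - m) / s for v, m, s in zip(row, mus, sigmas)] for row in spectra]

# Toy spectra (made-up intensities) to show the call order.
raw = [[2.0, 4.0, 10.0, 4.0], [1.0, 3.0, 6.0, 2.0]]
prepared = zscore_columns([max_normalize(s) for s in raw])
```

Area normalization preserves relative band areas, whereas maximum intensity normalization pins the strongest band to 1, so the two can rank weak bands differently before PCA and classification.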
Raman spectral post-processing for oral tissue discrimination - a step for an automatized diagnostic system.

PubMed

Carvalho, Luis Felipe C S; Nogueira, Marcelo Saito; Neto, Lázaro P M; Bhattacharjee, Tanmoy T; Martin, Airton A

2017-11-01

Most oral injuries are diagnosed by histopathological analysis of a biopsy, which is an invasive procedure and does not give immediate results. On the other hand, Raman spectroscopy is a real time and minimally invasive analytical tool with potential for the diagnosis of diseases. The potential for diagnostics can be improved by data post-processing. Hence, this study aims to evaluate the performance of preprocessing steps and multivariate analysis methods for the classification of normal tissues and pathological oral lesion spectra. A total of 80 spectra acquired from normal and abnormal tissues using optical fiber Raman-based spectroscopy (OFRS) were subjected to PCA preprocessing in the z-scored data set, and the KNN (K-nearest neighbors), J48 (unpruned C4.5 decision tree), RBF (radial basis function), RF (random forest), and MLP (multilayer perceptron) classifiers at WEKA software (Waikato environment for knowledge analysis), after area normalization or maximum intensity normalization. Our results suggest the best classification was achieved by using maximum intensity normalization followed by MLP. Based on these results, software for automated analysis can be generated and validated using larger data sets. This would aid quick comprehension of spectroscopic data and easy diagnosis by medical practitioners in clinical settings.
A Novel Hyperspectral Microscopic Imaging System for Evaluating Fresh Degree of Pork.

PubMed

Xu, Yi; Chen, Quansheng; Liu, Yan; Sun, Xin; Huang, Qiping; Ouyang, Qin; Zhao, Jiewen

2018-04-01

This study proposed a rapid microscopic examination method for pork freshness evaluation by using a self-assembled hyperspectral microscopic imaging (HMI) system with the help of a feature extraction algorithm and pattern recognition methods. Pork samples were stored for periods ranging from 0 to 5 days, and the freshness of the samples was divided into three levels determined by total volatile basic nitrogen (TVB-N) content. Meanwhile, hyperspectral microscopic images of the samples were acquired by the HMI system and processed by the following steps for further analysis. Firstly, characteristic hyperspectral microscopic images were extracted by using principal component analysis (PCA), and then texture features were selected based on the gray level co-occurrence matrix (GLCM). Next, the dimensionality of the feature data was reduced by Fisher discriminant analysis (FDA) before building the classification model. Finally, compared with the linear discriminant analysis (LDA) and support vector machine (SVM) models, the back propagation artificial neural network (BP-ANN) model achieved the best freshness classification, with 100% accuracy based on the extracted data. The results confirm that the fabricated HMI system combined with multivariate algorithms is able to evaluate the freshness of pork accurately at the microscopic level, which plays an important role in animal food quality control.
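To make the GLCM texture step concrete, the sketch below builds a normalized gray level co-occurrence matrix for one pixel offset and derives three commonly used texture descriptors. The abstract does not say which GLCM features, offsets or gray-level quantization the authors used. The function names, the 4-level toy image and the choice of contrast, energy and homogeneity are therefore assumptions for illustration.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized gray level co-occurrence matrix for the pixel offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(map(sum, counts))
    return [[n / total for n in row] for row in counts]

def texture_features(p):
    """Contrast, energy and homogeneity (one common definition) of a normalized GLCM."""
    cells = [(i, j, v) for i, row in enumerate(p) for j, v in enumerate(row)]
    return {
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "energy": sum(v * v for _, _, v in cells),
        "homogeneity": sum(v / (1 + abs(i - j)) for i, j, v in cells),
    }

# Toy 4-level image; a real pipeline would first quantize a PCA score image.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
features = texture_features(glcm(image, levels=4))
```

Feature vectors of this kind, computed per image, are what a dimensionality reduction step such as FDA and a classifier such as LDA, SVM or BP-ANN would then consume.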
A Novel Hyperspectral Microscopic Imaging System for Evaluating Fresh Degree of Pork

PubMed Central

Xu, Yi; Chen, Quansheng; Liu, Yan; Sun, Xin; Huang, Qiping; Ouyang, Qin; Zhao, Jiewen

2018-01-01

This study proposed a rapid microscopic examination method for pork freshness evaluation by using a self-assembled hyperspectral microscopic imaging (HMI) system with the help of a feature extraction algorithm and pattern recognition methods. Pork samples were stored for periods ranging from 0 to 5 days, and the freshness of the samples was divided into three levels determined by total volatile basic nitrogen (TVB-N) content. Meanwhile, hyperspectral microscopic images of the samples were acquired by the HMI system and processed by the following steps for further analysis. Firstly, characteristic hyperspectral microscopic images were extracted by using principal component analysis (PCA), and then texture features were selected based on the gray level co-occurrence matrix (GLCM). Next, the dimensionality of the feature data was reduced by Fisher discriminant analysis (FDA) before building the classification model. Finally, compared with the linear discriminant analysis (LDA) and support vector machine (SVM) models, the back propagation artificial neural network (BP-ANN) model achieved the best freshness classification, with 100% accuracy based on the extracted data. The results confirm that the fabricated HMI system combined with multivariate algorithms is able to evaluate the freshness of pork accurately at the microscopic level, which plays an important role in animal food quality control. PMID:29805285
Agricultural Land Cover from Multitemporal C-Band SAR Data

NASA Astrophysics Data System (ADS)

Skriver, H.

2013-12-01

Henning Skriver, DTU Space, Technical University of Denmark, Ørsteds Plads, Building 348, DK-2800 Lyngby; e-mail: hs@space.dtu.dk. Problem description: This paper focuses on determining land cover type from SAR data using high-revisit acquisitions, including single- and dual-polarisation and fully polarimetric data, at C-band. The data set was acquired during an ESA-supported campaign, AgriSAR09, with the Radarsat-2 system. Ground surveys to obtain detailed land cover maps were performed during the campaign. Classification methods using single- and dual-polarisation data and fully polarimetric data are applied to multitemporal data with short revisit time. Results for airborne campaigns have previously been reported in Skriver et al. (2011) and Skriver (2012). In this paper, the short-revisit satellite SAR data are used to assess the trade-off between fully polarimetric SAR data and single- or dual-polarisation SAR data. This is particularly important in relation to the future GMES Sentinel-1 SAR satellites, where two satellites with a relatively wide swath will ensure a short revisit time globally. The questions addressed are: what accuracy can we expect from a mission like Sentinel-1, what improvement does polarimetric SAR offer over single- or dual-polarisation SAR, and what is the optimum number of acquisitions needed? Methodology: The data have a sufficient number of looks for the Gaussian assumption to be valid for the backscatter coefficients of the individual polarisations. The classification method used for these data is therefore the standard Bayesian classification method for multivariate Gaussian statistics. For the fully polarimetric cases two classification methods have been applied: the standard ML Wishart classifier, and a method based on a reversible transform of the covariance matrix into backscatter intensities.
The following pre-processing steps were performed on both data sets: the scattering matrix data in the form of SLC products were coregistered, converted to covariance matrix format and multilooked to a specific equivalent number of looks. Results: The multitemporal data significantly improve the classification results, and single-acquisition data cannot provide the necessary classification performance. The multitemporal data are especially important for the single- and dual-polarisation data, but less important for the fully polarimetric data. The satellite data set produces realistic classification results based on about 2000 fields. The best classification results for the single-polarised mode give classification errors in the mid-twenties. Using the dual-polarised mode reduces the classification error by about 5 percentage points, whereas the polarimetric mode reduces it by about 10 percentage points. These results show that it will be possible to obtain reasonable results with relatively simple systems with short revisit time. This very important result shows that systems like the Sentinel-1 mission will be able to produce fairly good results for global land cover classification. References: Skriver, H. et al., 2011, 'Crop Classification using Short-Revisit Multitemporal SAR Data', IEEE J. Sel. Topics in Appl. Earth Obs. Rem. Sens., vol. 4, pp. 423-431. Skriver, H., 2012, 'Crop classification by multitemporal C- and L-band single- and dual-polarization and fully polarimetric SAR', IEEE Trans. Geosc. Rem. Sens., vol. 50, pp. 2138-2149.
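The Bayesian classification rule for multivariate Gaussian statistics mentioned under Methodology can be sketched for the simplest multichannel case. This is a minimal illustration restricted to two channels (for example, the two channels of a dual-polarisation acquisition), so that the 2×2 covariance can be inverted in closed form. It assumes equal class priors, and the class names and sample values are made up.

```python
import math

def fit_gaussian(samples):
    """Estimate the mean vector and 2x2 covariance from (x, y) samples of one class."""
    n = len(samples)
    mx = sum(s[0] for s in samples) / n
    my = sum(s[1] for s in samples) / n
    sxx = sum((s[0] - mx) ** 2 for s in samples) / (n - 1)
    syy = sum((s[1] - my) ** 2 for s in samples) / (n - 1)
    sxy = sum((s[0] - mx) * (s[1] - my) for s in samples) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def log_likelihood(x, model):
    """Log density of a bivariate Gaussian at the point x."""
    (mx, my), (sxx, sxy, syy) = model
    det = sxx * syy - sxy * sxy
    dx, dy = x[0] - mx, x[1] - my
    # Squared Mahalanobis distance via the closed-form inverse of a 2x2 matrix.
    maha = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return -0.5 * (maha + math.log(det)) - math.log(2 * math.pi)

def classify(x, models):
    """Assign x to the class with the highest likelihood (equal priors assumed)."""
    return max(models, key=lambda label: log_likelihood(x, models[label]))

# Hypothetical two-channel training samples per class; not campaign data.
training = {
    "wheat": [(-12.0, -19.5), (-11.2, -18.9), (-12.6, -20.3), (-11.8, -19.0)],
    "beets": [(-8.1, -15.2), (-7.4, -14.1), (-8.9, -15.9), (-7.9, -14.4)],
}
models = {label: fit_gaussian(samples) for label, samples in training.items()}
print(classify((-11.9, -19.4), models))
```

With multitemporal data the feature vector simply grows to one entry per channel and acquisition date, which calls for a general matrix inverse instead of the 2×2 shortcut used here.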
Multimodal Feature Integration in the Angular Gyrus during Episodic and Semantic Retrieval

PubMed Central

Bonnici, Heidi M.; Richter, Franziska R.; Yazar, Yasemin

2016-01-01

Much evidence from distinct lines of investigation indicates the involvement of angular gyrus (AnG) in the retrieval of both episodic and semantic information, but the region's precise function and whether that function differs across episodic and semantic retrieval have yet to be determined. We used univariate and multivariate fMRI analysis methods to examine the role of AnG in multimodal feature integration during episodic and semantic retrieval. Human participants completed episodic and semantic memory tasks involving unimodal (auditory or visual) and multimodal (audio-visual) stimuli. Univariate analyses revealed the recruitment of functionally distinct AnG subregions during the retrieval of episodic and semantic information. Consistent with a role in multimodal feature integration during episodic retrieval, significantly greater AnG activity was observed during retrieval of integrated multimodal episodic memories compared with unimodal episodic memories. Multivariate classification analyses revealed that individual multimodal episodic memories could be differentiated in AnG, with classification accuracy tracking the vividness of participants' reported recollections, whereas distinct unimodal memories were represented in sensory association areas only. In contrast to episodic retrieval, AnG was engaged to a statistically equivalent degree during retrieval of unimodal and multimodal semantic memories, suggesting a distinct role for AnG during semantic retrieval. Modality-specific sensory association areas exhibited corresponding activity during both episodic and semantic retrieval, which mirrored the functional specialization of these regions during perception. The results offer new insights into the integrative processes subserved by AnG and its contribution to our subjective experience of remembering. 
SIGNIFICANCE STATEMENT Using univariate and multivariate fMRI analyses, we provide evidence that functionally distinct subregions of angular gyrus (AnG) contribute to the retrieval of episodic and semantic memories. Our multivariate pattern classifier could distinguish episodic memory representations in AnG according to whether they were multimodal (audio-visual) or unimodal (auditory or visual) in nature, whereas statistically equivalent AnG activity was observed during retrieval of unimodal and multimodal semantic memories. Classification accuracy during episodic retrieval scaled with the trial-by-trial vividness with which participants experienced their recollections. Therefore, the findings offer new insights into the integrative processes subserved by AnG and how its function may contribute to our subjective experience of remembering. PMID:27194327
Multimodal Feature Integration in the Angular Gyrus during Episodic and Semantic Retrieval.

PubMed

Bonnici, Heidi M; Richter, Franziska R; Yazar, Yasemin; Simons, Jon S

2016-05-18

Much evidence from distinct lines of investigation indicates the involvement of angular gyrus (AnG) in the retrieval of both episodic and semantic information, but the region's precise function and whether that function differs across episodic and semantic retrieval have yet to be determined. We used univariate and multivariate fMRI analysis methods to examine the role of AnG in multimodal feature integration during episodic and semantic retrieval. Human participants completed episodic and semantic memory tasks involving unimodal (auditory or visual) and multimodal (audio-visual) stimuli. Univariate analyses revealed the recruitment of functionally distinct AnG subregions during the retrieval of episodic and semantic information. Consistent with a role in multimodal feature integration during episodic retrieval, significantly greater AnG activity was observed during retrieval of integrated multimodal episodic memories compared with unimodal episodic memories. Multivariate classification analyses revealed that individual multimodal episodic memories could be differentiated in AnG, with classification accuracy tracking the vividness of participants' reported recollections, whereas distinct unimodal memories were represented in sensory association areas only. In contrast to episodic retrieval, AnG was engaged to a statistically equivalent degree during retrieval of unimodal and multimodal semantic memories, suggesting a distinct role for AnG during semantic retrieval. Modality-specific sensory association areas exhibited corresponding activity during both episodic and semantic retrieval, which mirrored the functional specialization of these regions during perception. The results offer new insights into the integrative processes subserved by AnG and its contribution to our subjective experience of remembering. 
Using univariate and multivariate fMRI analyses, we provide evidence that functionally distinct subregions of angular gyrus (AnG) contribute to the retrieval of episodic and semantic memories. Our multivariate pattern classifier could distinguish episodic memory representations in AnG according to whether they were multimodal (audio-visual) or unimodal (auditory or visual) in nature, whereas statistically equivalent AnG activity was observed during retrieval of unimodal and multimodal semantic memories. Classification accuracy during episodic retrieval scaled with the trial-by-trial vividness with which participants experienced their recollections. Therefore, the findings offer new insights into the integrative processes subserved by AnG and how its function may contribute to our subjective experience of remembering. Copyright © 2016 Bonnici, Richter, et al.
Characterization of agricultural land using singular value decomposition

NASA Astrophysics Data System (ADS)

Herries, Graham M.; Danaher, Sean; Selige, Thomas

1995-11-01

A method is defined and tested for the characterization of agricultural land from multi-spectral imagery, based on singular value decomposition (SVD) and key vector analysis. The SVD technique, which bears a close resemblance to multivariate statistical techniques, has previously been successfully applied to problems of signal extraction for marine data and forestry species classification. In this study the SVD technique is used as a classifier for agricultural regions, using airborne Daedalus ATM data with 1 m resolution. The specific region chosen is an experimental research farm in Bavaria, Germany. This farm has a large number of crops within a very small region and hence is not amenable to existing techniques. There are a number of other significant factors which render existing techniques such as the maximum likelihood algorithm less suitable for this area. These include a very dynamic terrain and tessellated pattern soil differences, which together cause large variations in the growth characteristics of the crops. The SVD technique is applied to this data set using a multi-stage classification approach, removing unwanted land-cover classes one step at a time. Typical classification accuracies for SVD are of the order of 85-100%. Preliminary results indicate that it is a fast and efficient classifier with the ability to differentiate between crop types such as wheat, rye, potatoes and clover. The results of characterizing 3 sub-classes of Winter Wheat are also shown.
Geographic identification of Boletus mushrooms by data fusion of FT-IR and UV spectroscopies combined with multivariate statistical analysis

NASA Astrophysics Data System (ADS)

Yao, Sen; Li, Tao; Li, JieQing; Liu, HongGao; Wang, YuanZhong

2018-06-01

Boletus griseus and Boletus edulis are two well-known wild-grown edible mushrooms with high nutritional value, delicious flavor and high economic value that are distributed in Yunnan Province. In this study, a rapid method using Fourier transform infrared (FT-IR) and ultraviolet (UV) spectroscopies coupled with data fusion was established for the discrimination of Boletus mushrooms from seven different geographical origins with pattern recognition methods. Initially, the spectra of 332 mushroom samples obtained from the two spectroscopic techniques were analyzed individually, and then the classification performance based on the data fusion strategy was investigated. Meanwhile, the latent variables (LVs) of the FT-IR and UV spectra were extracted by partial least squares discriminant analysis (PLS-DA) and the two datasets were concatenated into a new matrix for data fusion. Then, the fusion matrix was further analyzed by support vector machine (SVM). Compared with a single spectroscopic technique, the data fusion strategy can improve the classification performance effectively. In particular, the classification accuracies of the SVM model in the training and test sets were 99.10% and 100.00%, respectively. The results demonstrated that data fusion of FT-IR and UV spectra can provide a stronger synergistic effect for the discrimination of different geographical origins of Boletus mushrooms, which may benefit further authentication and quality assessment of edible mushrooms.
Use of genetic algorithm for the selection of EEG features

NASA Astrophysics Data System (ADS)

Asvestas, P.; Korda, A.; Kostopoulos, S.; Karanasiou, I.; Ouzounoglou, A.; Sidiropoulos, K.; Ventouras, E.; Matsopoulos, G.

2015-09-01

Genetic Algorithm (GA) is a popular optimization technique that can detect the global optimum of a multivariable function containing several local optima. GA has been widely used in the field of biomedical informatics, especially in the context of designing decision support systems that classify biomedical signals or images into classes of interest. The aim of this paper is to present a methodology, based on GA, for the selection of the optimal subset of features that can be used for the efficient classification of Event Related Potentials (ERPs), which are recorded during the observation of correct or incorrect actions. In our experiment, ERP recordings were acquired from sixteen (16) healthy volunteers who observed correct or incorrect actions of other subjects. The brain electrical activity was recorded at 47 locations on the scalp. The GA was formulated as a combinatorial optimizer for the selection of the combination of electrodes that maximizes the performance of the Fuzzy C Means (FCM) classification algorithm. In particular, during the evolution of the GA, for each candidate combination of electrodes, the well-known (Σ, Φ, Ω) features were calculated and were evaluated by means of the FCM method. The proposed methodology provided a combination of 8 electrodes, with classification accuracy 93.8%. Thus, GA can be the basis for the selection of features that discriminate ERP recordings of observations of correct or incorrect actions.
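The combinatorial search described here, in which each candidate is a subset of the 47 electrodes encoded as a bit mask, can be sketched as below. This is a generic GA under stated assumptions, not the authors' implementation. The fitness function `toy_fitness` is a placeholder (the study scores each subset by the performance of the FCM classifier), and the "informative" electrode indices, population size, selection scheme and mutation rate are all hypothetical.

```python
import random

N_ELECTRODES = 47  # number of scalp locations reported in the abstract

def toy_fitness(mask):
    """Placeholder score; the study instead evaluates each subset with FCM classification."""
    informative = {3, 8, 15, 21, 27, 33, 40, 45}  # hypothetical electrode indices
    hits = sum(1 for i in informative if mask[i])
    return hits - 0.1 * sum(mask)  # reward useful electrodes, penalize large subsets

def evolve(fitness, generations=60, pop_size=40, mutation_rate=0.02, seed=0):
    """Evolve electrode bit masks and return the fittest one found."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(N_ELECTRODES)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]         # truncation selection
        children = [ranked[0][:]]                 # elitism: carry the best mask over unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, N_ELECTRODES)  # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ (rng.random() < mutation_rate) for bit in child]  # bit-flip mutation
            children.append(child)
        population = children
    return max(population, key=fitness)

best = evolve(toy_fitness)
selected = [i for i, bit in enumerate(best) if bit]
print("selected electrodes:", selected)
```

Swapping `toy_fitness` for a function that extracts features from the selected electrodes and returns a cross-validated classification score would turn this into a wrapper-style feature selector of the kind the abstract describes.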
Effect of phenotype on health care costs in Crohn's disease: A European study using the Montreal classification.

PubMed

Odes, Selwyn; Vardi, Hillel; Friger, Michael; Wolters, Frank; Hoie, Ole; Moum, Bjørn; Bernklev, Tomm; Yona, Hagit; Russel, Maurice; Munkholm, Pia; Langholz, Ebbe; Riis, Lene; Politi, Patrizia; Bondini, Paolo; Tsianos, Epameinondas; Katsanos, Kostas; Clofent, Juan; Vermeire, Severine; Freitas, João; Mouzas, Iannis; Limonard, Charles; O'Morain, Colm; Monteiro, Estela; Fornaciari, Giovanni; Vatn, Morten; Stockbrugger, Reinhold

2007-12-01

Crohn's disease (CD) is a chronic inflammation of the gastrointestinal tract associated with life-long high health care costs. We aimed to determine the effect of disease phenotype on cost. Clinical and economic data of a community-based CD cohort with 10-year follow-up were analyzed retrospectively in relation to Montreal classification phenotypes. In 418 patients, mean total costs of health care for the behavior phenotypes were: nonstricturing-nonpenetrating 1690, stricturing 2081, penetrating 3133 and penetrating-with-perianal-fistula 3356 €/patient-phenotype-year (P<0.001), and mean costs of surgical hospitalization 215, 751, 1293 and 1275 €/patient-phenotype-year respectively (P<0.001). Penetrating-with-perianal-fistula patients incurred significantly greater expenses than penetrating patients for total care, diagnosis and drugs, but not surgical hospitalization. Total costs were similar in the location phenotypes: ileum 1893, colon 1748, ileo-colonic 2010 and upper gastrointestinal tract 1758 €/patient-phenotype-year, but surgical hospitalization costs differed significantly, 558, 209, 492 and 542 €/patient-phenotype-year respectively (P<0.001). By multivariate analysis, the behavior phenotype significantly impacted total, medical and surgical hospitalization costs, whereas the location phenotype affected only surgical costs. Younger age at diagnosis predicted greater surgical expenses. Behavior is the dominant phenotype driving health care cost. Use of the Montreal classification permits detection of cost differences caused by perianal fistula.
Geographic identification of Boletus mushrooms by data fusion of FT-IR and UV spectroscopies combined with multivariate statistical analysis.

PubMed

Yao, Sen; Li, Tao; Li, JieQing; Liu, HongGao; Wang, YuanZhong

2018-06-05

Boletus griseus and Boletus edulis are two well-known wild-grown edible mushrooms with high nutritional value, delicious flavor and high economic value that are distributed in Yunnan Province. In this study, a rapid method using Fourier transform infrared (FT-IR) and ultraviolet (UV) spectroscopies coupled with data fusion was established for the discrimination of Boletus mushrooms from seven different geographical origins with pattern recognition methods. Initially, the spectra of 332 mushroom samples obtained from the two spectroscopic techniques were analyzed individually, and then the classification performance based on the data fusion strategy was investigated. Meanwhile, the latent variables (LVs) of the FT-IR and UV spectra were extracted by partial least squares discriminant analysis (PLS-DA) and the two datasets were concatenated into a new matrix for data fusion. Then, the fusion matrix was further analyzed by support vector machine (SVM). Compared with a single spectroscopic technique, the data fusion strategy can improve the classification performance effectively. In particular, the classification accuracies of the SVM model in the training and test sets were 99.10% and 100.00%, respectively. The results demonstrated that data fusion of FT-IR and UV spectra can provide a stronger synergistic effect for the discrimination of different geographical origins of Boletus mushrooms, which may benefit further authentication and quality assessment of edible mushrooms. Copyright © 2018 Elsevier B.V. All rights reserved.
Development of a channel classification to evaluate potential for cottonwood restoration, lower segments of the Middle Missouri River, South Dakota and Nebraska

USGS Publications Warehouse

Jacobson, Robert B.; Elliott, Caroline M.; Huhmann, Brittany L.

2010-01-01

This report documents development of a spatially explicit river and flood-plain classification to evaluate potential for cottonwood restoration along the Sharpe and Fort Randall segments of the Middle Missouri River. This project involved evaluating existing topographic, water-surface elevation, and soils data to determine if they were sufficient to create a classification similar to the Land Capability Potential Index (LCPI) developed by Jacobson and others (U.S. Geological Survey Scientific Investigations Report 2007–5256) and developing a geomorphically based classification to apply to evaluating restoration potential. Existing topographic, water-surface elevation, and soils data for the Middle Missouri River were not sufficient to replicate the LCPI. The 1/3-arc-second National Elevation Dataset delineated most of the topographic complexity and produced cumulative frequency distributions similar to a high-resolution 5-meter topographic dataset developed for the Lower Missouri River. However, lack of bathymetry in the National Elevation Dataset produces a potentially critical bias in evaluation of frequently flooded surfaces close to the river. High-resolution soils data alone were insufficient to replace the information content of the LCPI. In test reaches in the Lower Missouri River, soil drainage classes from the Soil Survey Geographic Database correctly classified 0.8–98.9 percent of the flood-plain area at or below the 5-year return interval flood stage depending on state of channel incision; on average for river miles 423–811, soil drainage class correctly classified only 30.2 percent of the flood-plain area at or below the 5-year return interval flood stage. Lack of congruence between soil characteristics and present-day hydrology results from relatively rapid incision and aggradation of segments of the Missouri River resulting from impoundments and engineering.
The most sparsely available data in the Middle Missouri River were water-surface elevations. Whereas hydraulically modeled water-surface elevations were available at 1.6-kilometer intervals in the Lower Missouri River, water-surface elevations in the Middle Missouri River had to be interpolated between streamflow-gaging stations spaced 3–116 kilometers apart. Lack of high-resolution water-surface elevation data precludes development of LCPI-like classification maps. A hierarchical river classification framework is proposed to provide structure for a multiscale river classification. The segment-scale classification presented in this report is deductive and based on presumed effects of dams, significant tributaries, and geological (and engineered) channel constraints. An inductive reach-scale classification, nested within the segment scale, is based on multivariate statistical clustering of geomorphic data collected at 500-meter intervals along the river. Cluster-based classifications delineate reaches of the river with similar channel and flood-plain geomorphology, and presumably, similar geomorphic and hydrologic processes. The dominant variables in the clustering process were channel width (Fort Randall) and valley width (Sharpe), followed by braiding index (both segments). Clusters with multithread and highly sinuous channels are likely to be associated with dynamic channel migration and deposition of fresh, bare sediment conducive to natural cottonwood germination. However, restoration potential within these reaches is likely to be mitigated by interaction of cottonwood life stages with the highly altered flow regime.

The Pathways for Intelligible Speech: Multivariate and Univariate Perspectives

PubMed Central

Evans, S.; Kyong, J.S.; Rosen, S.; Golestani, N.; Warren, J.E.; McGettigan, C.; Mourão-Miranda, J.; Wise, R.J.S.; Scott, S.K.

2014-01-01

An anterior pathway, concerned with extracting meaning from sound, has been identified in nonhuman primates. An analogous pathway has been suggested in humans, but controversy exists concerning the degree of lateralization and the precise location where responses to intelligible speech emerge. We have demonstrated that the left anterior superior temporal sulcus (STS) responds preferentially to intelligible speech (Scott SK, Blank CC, Rosen S, Wise RJS. 2000. Identification of a pathway for intelligible speech in the left temporal lobe. Brain. 123:2400–2406.). A functional magnetic resonance imaging study in Cerebral Cortex used equivalent stimuli and univariate and multivariate analyses to argue for the greater importance of bilateral posterior when compared with the left anterior STS in responding to intelligible speech (Okada K, Rong F, Venezia J, Matchin W, Hsieh IH, Saberi K, Serences JT, Hickok G. 2010. Hierarchical organization of human auditory cortex: evidence from acoustic invariance in the response to intelligible speech. 20:2486–2495.). Here, we also replicate our original study, demonstrating that the left anterior STS exhibits the strongest univariate response and, in decoding using the bilateral temporal cortex, contains the most informative voxels showing an increased response to intelligible speech. In contrast, in classifications using local “searchlights” and a whole brain analysis, we find greater classification accuracy in posterior rather than anterior temporal regions. Thus, we show that the precise nature of the multivariate analysis used will emphasize different response profiles associated with complex sound to speech processing. PMID:23585519
A Multivariate Analytic Approach to the Differential Diagnosis of Apraxia of Speech

ERIC Educational Resources Information Center

Basilakos, Alexandra; Yourganov, Grigori; den Ouden, Dirk-Bart; Fogerty, Daniel; Rorden, Chris; Feenaughty, Lynda; Fridriksson, Julius

2017-01-01

Purpose: Apraxia of speech (AOS) is a consequence of stroke that frequently co-occurs with aphasia. Its study is limited by difficulties with its perceptual evaluation and dissociation from co-occurring impairments. This study examined the classification accuracy of several acoustic measures for the differential diagnosis of AOS in a sample of…
Feature combinations and the divergence criterion

NASA Technical Reports Server (NTRS)

Decell, H. P., Jr.; Mayekar, S. M.

1976-01-01

Classifying large quantities of multidimensional remotely sensed agricultural data requires efficient and effective classification techniques and the construction of certain transformations of a dimension reducing, information preserving nature. The construction of transformations that minimally degrade information (i.e., class separability) is described. Linear dimension reducing transformations for multivariate normal populations are presented. Information content is measured by divergence.
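For reference, the divergence between two multivariate normal populations is commonly written in the form below. The abstract itself gives no formula, so the symbols (means \mu_i, covariances \Sigma_i, and a k x n transformation matrix B) are assumptions introduced here for illustration only.

```latex
% Divergence between N(\mu_1, \Sigma_1) and N(\mu_2, \Sigma_2)
D(1,2) = \tfrac{1}{2}\operatorname{tr}\!\left[(\Sigma_1-\Sigma_2)
         \left(\Sigma_2^{-1}-\Sigma_1^{-1}\right)\right]
       + \tfrac{1}{2}\operatorname{tr}\!\left[\left(\Sigma_1^{-1}+\Sigma_2^{-1}\right)
         (\mu_1-\mu_2)(\mu_1-\mu_2)^{\mathsf T}\right]

% After a linear dimension-reducing map y = Bx (B is k x n, k < n, full rank),
% each population remains normal with
\mu_i \mapsto B\mu_i, \qquad \Sigma_i \mapsto B\,\Sigma_i\,B^{\mathsf T},
% and the divergence D_B computed from these transformed parameters satisfies
D_B(1,2) \le D(1,2).
```

Under this reading, a transformation that "minimally degrades information" is one whose D_B stays as close as possible to D.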
An unconventional approach to ecosystem unit classification in western North Carolina, USA

Treesearch

W. Henry McNab; Sara A. Browning; Steven A. Simon; Penelope E. Fouts

1999-01-01

The authors used an unconventional combination of data transformation and multivariate analyses to reduce subjectivity in identification of ecosystem units in a mountainous region of western North Carolina, USA. Vegetative cover and environmental variables were measured on 79 stratified, randomly located, 0.1 ha sample plots in a 4000 ha watershed. Binary...
Supervision of Ethylene Propylene Diene M-Class (EPDM) Rubber Vulcanization and Recovery Processes Using Attenuated Total Reflection Fourier Transform Infrared (ATR FT-IR) Spectroscopy and Multivariate Analysis.

PubMed

Riba Ruiz, Jordi-Roger; Canals, Trini; Cantero, Rosa

2017-01-01

Ethylene propylene diene monomer (EPDM) rubber is widely used in diverse applications, such as the automotive, industrial and construction sectors, among others. Due to its appealing features, the consumption of vulcanized EPDM rubber is growing significantly. However, environmental issues are forcing the application of devulcanization processes to facilitate recovery, which has led rubber manufacturers to implement strict quality controls. Consequently, it is important to develop methods for supervising the vulcanizing and recovery processes of such products. This paper deals with the supervision process of EPDM compounds by means of Fourier transform mid-infrared (FT-IR) spectroscopy and suitable multivariate statistical methods. An expedited and nondestructive classification approach was applied to a sufficient number of EPDM samples with different applied processes, that is, with and without application of vulcanizing agents, vulcanized samples, and microwave treated samples. First, the FT-IR spectra of the samples are acquired; next, they are processed by applying suitable feature extraction methods, i.e., principal component analysis and canonical variate analysis, to obtain the latent variables to be used for classifying test EPDM samples. Finally, the k nearest neighbor algorithm was used in the classification stage. Experimental results prove the accuracy of the proposed method and the potential of FT-IR spectroscopy in this area, since the classification accuracy can be as high as 100%.
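The final classification stage named in this abstract, k nearest neighbors applied to latent variables, can be sketched as follows. This is a generic illustration under assumed inputs: the two-dimensional scores, class labels, and the choice k = 3 are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Majority vote among the k training points closest to the query."""
    # Euclidean distance from the query to every training sample.
    dists = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Hypothetical latent-variable scores (e.g. two canonical variates per sample).
train_X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
           [1.0, 1.1], [1.2, 0.9], [0.9, 1.0]]
train_y = ["unvulcanized"] * 3 + ["vulcanized"] * 3

label_a = knn_predict(train_X, train_y, [0.15, 0.10])
label_b = knn_predict(train_X, train_y, [1.05, 1.00])
```

In practice k and the distance metric would be chosen by validation on held-out samples.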
Periodontal inflamed surface area as a novel numerical variable describing periodontal conditions

PubMed Central

2017-01-01

Purpose: A novel index, the periodontal inflamed surface area (PISA), represents the sum of the periodontal pocket depth of bleeding on probing (BOP)-positive sites. In the present study, we evaluated correlations between PISA and periodontal classifications, and examined PISA as an index integrating the discrete conventional periodontal indexes. Methods: This study was a cross-sectional subgroup analysis of data from a prospective cohort study investigating the association between chronic periodontitis and the clinical features of ankylosing spondylitis. Data from 84 patients without systemic diseases (the control group in the previous study) were analyzed in the present study. Results: PISA values were positively correlated with conventional periodontal classifications (Spearman correlation coefficient=0.52; P<0.01) and with periodontal indexes, such as BOP and the plaque index (PI) (r=0.94; P<0.01 and r=0.60; P<0.01, respectively; Pearson correlation test). Porphyromonas gingivalis (P. gingivalis) expression and the presence of serum P. gingivalis antibodies were significant factors affecting PISA values in a simple linear regression analysis, together with periodontal classification, PI, bleeding index, and smoking, but not in the multivariate analysis. In the multivariate linear regression analysis, PISA values were positively correlated with the quantity of current smoking, PI, and severity of periodontal disease. Conclusions: PISA integrates multiple periodontal indexes, such as probing pocket depth, BOP, and PI into a numerical variable. PISA is advantageous for quantifying periodontal inflammation and plaque accumulation. PMID:29093989
What's in a title? An assessment of whether randomized controlled trial in a title means that it is one.

PubMed

Koletsi, Despina; Pandis, Nikolaos; Polychronopoulou, Argy; Eliades, Theodore

2012-06-01

In this study, we aimed to investigate whether studies published in orthodontic journals and titled as randomized clinical trials are truly randomized clinical trials. A second objective was to explore the association of journal type and other publication characteristics with correct classification. American Journal of Orthodontics and Dentofacial Orthopedics, European Journal of Orthodontics, Angle Orthodontist, Journal of Orthodontics, Orthodontics and Craniofacial Research, World Journal of Orthodontics, Australian Orthodontic Journal, and Journal of Orofacial Orthopedics were hand searched for clinical trials labeled in the title as randomized from 1979 to July 2011. The data were analyzed by using descriptive statistics, and univariable and multivariable examinations of statistical associations via ordinal logistic regression modeling (proportional odds model). One hundred twelve trials were identified. Of the included trials, 33 (29.5%) were randomized clinical trials, 52 (46.4%) had an unclear status, and 27 (24.1%) were not randomized clinical trials. In the multivariable analysis, journal type, year of publication, number of authors, multicenter trial status, and involvement of a statistician were significant predictors of correctly classifying a study as a randomized clinical trial vs unclear and not a randomized clinical trial. Of 112 clinical trials in the orthodontic literature labeled as randomized clinical trials, only 29.5% were identified as randomized clinical trials based on clear descriptions of appropriate random number generation and allocation concealment. The type of journal, involvement of a statistician, multicenter trials, greater numbers of authors, and publication year were associated with correct clinical trial classification. This study indicates the need for clear and accurate reporting of clinical trials and the need for educating investigators on randomized clinical trial methodology.
Copyright © 2012 American Association of Orthodontists. Published by Mosby, Inc. All rights reserved.
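For readers unfamiliar with the proportional odds model named in the abstract above, one common parameterization is given below. The abstract does not state the form used, so the notation (cut-points \alpha_j, coefficient vector \beta, and the coding of the three outcome categories) is an assumption for illustration, and sign conventions differ between software packages.

```latex
% Ordinal outcome Y with J = 3 ordered categories
% (1 = not an RCT, 2 = unclear, 3 = RCT) and covariate vector x
\operatorname{logit} P(Y \le j \mid x)
  = \log\frac{P(Y \le j \mid x)}{1 - P(Y \le j \mid x)}
  = \alpha_j - \beta^{\mathsf T} x, \qquad j = 1, \dots, J-1

% Proportional odds assumption: \beta is shared across all cut-points j,
% so e^{\beta_m} is a single odds ratio for covariate m at every threshold.
```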
DOE Office of Scientific and Technical Information (OSTI.GOV)

Kowalchik, Kristin V.; Vallow, Laura A., E-mail: vallow.laura@mayo.edu; McDonough, Michelle

Purpose: To study the utility of preoperative breast MRI for partial breast irradiation (PBI) patient selection, using multivariable analysis of significant risk factors to create a classification rule. Methods and Materials: Between 2002 and 2009, 712 women with newly diagnosed breast cancer underwent preoperative bilateral breast MRI at Mayo Clinic Florida. Of this cohort, 566 were retrospectively deemed eligible for PBI according to the National Surgical Adjuvant Breast and Bowel Project Protocol B-39 inclusion criteria using physical examination, mammogram, and/or ultrasound. Magnetic resonance images were then reviewed to determine their impact on patient eligibility. The patient and tumor characteristics were evaluated to determine risk factors for altered PBI eligibility after MRI and to create a classification rule. Results: Of the 566 patients initially eligible for PBI, 141 (25%) were found ineligible because of pathologically proven MRI findings. Magnetic resonance imaging detected additional ipsilateral breast cancer in 118 (21%). Of these, 62 (11%) had more extensive disease than originally noted before MRI, and 64 (11%) had multicentric disease. Contralateral breast cancer was detected in 28 (5%). Four characteristics were found to be significantly associated with PBI ineligibility after MRI on multivariable analysis: premenopausal status (P=.021), detection by palpation (P<.001), first-degree relative with a history of breast cancer (P=.033), and lobular histology (P=.002). Risk factors were assigned a score of 0-2. The risk of altered PBI eligibility from MRI based on number of risk factors was 0:18%; 1:22%; 2:42%; 3:65%. Conclusions: Preoperative bilateral breast MRI altered the PBI recommendations for 25% of women.
Women who may undergo PBI should be considered for breast MRI, especially those with lobular histology or with 2 or more of the following risk factors: premenopausal, detection by palpation, and first-degree relative with a history of breast cancer.
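The percentages quoted in this abstract all appear to use the 566 initially eligible patients as the denominator. The short check below recomputes them from the reported counts; the dictionary keys are descriptive labels chosen here, not terms from the paper.

```python
# Recompute the reported percentages from the counts given in the abstract,
# assuming the 566 initially PBI-eligible patients are the denominator.
eligible = 566
counts = {
    "ineligible_after_mri": 141,
    "additional_ipsilateral": 118,
    "more_extensive_disease": 62,
    "multicentric_disease": 64,
    "contralateral": 28,
}

# Percentages rounded to whole numbers, as in the abstract.
percentages = {name: round(100 * n / eligible) for name, n in counts.items()}
```

Rounded to whole numbers these give 25, 21, 11, 11 and 5 percent, matching the figures in the text.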
The ecological status of Karavasta Lagoon (Albania): Closing the stable door before the horse has bolted?

PubMed

Munari, Cristina; Tessari, Umberto; Rossi, Remigio; Mistri, Michele

2010-02-01

Karavasta is the widest and most important lagoon in Albania. This study aimed to assess the ecological quality status of the lagoon, acquire knowledge of a natural environment which might be exploited for aquaculture, and give management hints on the basis of anthropogenic impact and ecological conditions. A sampling campaign was carried out in 2008: at six stations, benthic fauna, water, and sediment parameters were considered. Statistical analyses were carried out through multivariate procedures (PCA, classification-clustering, SIMPER, RDA, DISTLM, PERMANOVA). Ecological quality was assessed through the AZTI Marine Biotic Index (AMBI), the multivariate AMBI (M-AMBI) and the Benthic Index based on Taxonomic Sufficiency (BITS). Sediment characteristics (percent organic matter, %OM; redox potential discontinuity layer depth, RPDL; particle size composition) and salinity represented contributory influences on lagoon communities. It was possible to distinguish and characterise a confined area, and benthic communities, from a marine-influenced area and its biota. The number of species was quite low when compared with other open Adriatic lagoons. The M-AMBI and BITS classifications gave quite similar results, which seemed consistent with the ecological conditions of the lagoon, that is a distinction in the ecological quality between the seaward and landward stations, with higher ecological quality (EcoQ) at the seaward stations. Given the pressures and the ecological condition of Karavasta, an intensification of aquaculture activities must be considered with caution, since the lagoon seems at significant risk of serious hypereutrophication. This situation is made worse by the limited water exchange with the marine environment due to the irregular dredging of the communication channels. 2009 Elsevier Ltd. All rights reserved.
Prognosticators and risk grouping in patients with lung metastasis from nasopharyngeal carcinoma: a more accurate and appropriate assessment of prognosis.

PubMed

Cao, Xun; Luo, Rong-Zhen; He, Li-Ru; Li, Yong; Lin, Wen-Qian; Chen, You-Fang; Wen, Zhe-Sheng

2011-08-26

Lung metastases arising from nasopharyngeal carcinomas (NPC) have a relatively favourable prognosis. The purpose of this study was to identify the prognostic factors and to establish a risk grouping in patients with lung metastases from NPC. A total of 198 patients who developed lung metastases from NPC after primary therapy were retrospectively recruited from January 1982 to December 2000. Univariate and multivariate analyses of clinical variables were performed using Cox proportional hazards regression models. Actuarial survival rates were plotted against time using the Kaplan-Meier method, and log-rank testing was used to compare the differences between the curves. The median overall survival (OS) period and the lung metastasis survival (LMS) period were 51.5 and 20.9 months, respectively. After univariate and multivariate analyses of the clinical variables, age, T classification, N classification, site of metastases, secondary metastases and disease-free interval (DFI) correlated with OS, whereas age, VCA-IgA titre, number of metastases and secondary metastases were related to LMS. The prognoses of the low- (score 0-1), intermediate- (score 2-3) and high-risk (score 4-8) subsets based on these factors were significantly different. The 3-, 5- and 10-year survival rates of the low-, intermediate- and high-risk subsets, respectively (P < 0.001) were as follows: 77.3%, 60% and 59%; 52.3%, 30% and 27.8%; and 20.5%, 7% and 0%. In this study, clinical variables provided prognostic indicators of survival in NPC patients with lung metastases. Risk subsets would help in a more accurate assessment of a patient's prognosis in the clinical setting and could facilitate the establishment of patient-tailored medical strategies and supports.
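The score bands given in this abstract (0-1, 2-3, 4-8) translate directly into a lookup, sketched below. How points are assigned to each prognostic factor is not stated in the abstract, so the function takes an already computed total score; the name `risk_group` is hypothetical.

```python
def risk_group(score):
    """Map a total prognostic score (0-8) to the risk subset named in the abstract."""
    if not 0 <= score <= 8:
        raise ValueError("score must be between 0 and 8")
    if score <= 1:
        return "low"
    if score <= 3:
        return "intermediate"
    return "high"

# Subset assigned to every possible total score.
groups = [risk_group(s) for s in range(9)]
```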
Overweight and Obesity Prevalence Among School-Aged Nunavik Inuit Children According to Three Body Mass Index Classification Systems.

PubMed

Medehouenou, Thierry Comlan Marc; Ayotte, Pierre; St-Jean, Audray; Meziou, Salma; Roy, Cynthia; Muckle, Gina; Lucas, Michel

2015-07-01

Little is known about the suitability of three commonly used body mass index (BMI) classification systems for Indigenous children. This study aims to estimate overweight and obesity prevalence among school-aged Nunavik Inuit children according to International Obesity Task Force (IOTF), Centers for Disease Control and Prevention (CDC), and World Health Organization (WHO) BMI classification systems, to measure agreement between those classification systems, and to investigate whether BMI status as defined by these classification systems is associated with levels of metabolic and inflammatory biomarkers. Data were collected on 290 school-aged children (aged 8-14 years; 50.7% girls) from the Nunavik Child Development Study with data collected in 2005-2010. Anthropometric parameters were measured and blood sampled. Participants were classified as normal weight, overweight, and obese according to BMI classification systems. Weighted kappa (κw) statistics assessed agreement between different BMI classification systems, and multivariate analysis of variance ascertained their relationship with metabolic and inflammatory biomarkers. The combined prevalence rate of overweight/obesity was 26.9% (with 6.6% obesity) with IOTF, 24.1% (11.0%) with CDC, and 40.4% (12.8%) with WHO classification systems. Agreement was the highest between IOTF and CDC (κw = .87) classifications, and substantial for IOTF and WHO (κw = .69) and for CDC and WHO (κw = .73). Insulin and high-sensitivity C-reactive protein plasma levels were significantly higher from normal weight to obesity, regardless of classification system. Among obese subjects, higher insulin level was observed with IOTF. Compared with other systems, IOTF classification appears to be more specific in identifying overweight and obesity in Inuit children. Copyright © 2015 Society for Adolescent Health and Medicine. Published by Elsevier Inc. All rights reserved.
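The weighted kappa statistic used in this abstract to compare classification systems can be computed as in the sketch below. The abstract does not say whether linear or quadratic weights were used, so linear weights are an assumed default here, and the category coding (0 = normal weight, 1 = overweight, 2 = obese) and example labels are made up.

```python
def weighted_kappa(rater_a, rater_b, n_categories, weights="linear"):
    """Weighted kappa for two label sequences over ordered categories 0..k-1."""
    n = len(rater_a)
    k = n_categories
    # Observed joint proportions of (category by A, category by B).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        d = abs(i - j) / (k - 1)
        return d if weights == "linear" else d * d

    # Weighted observed disagreement versus that expected by chance.
    num = sum(disagreement(i, j) * observed[i][j]
              for i in range(k) for j in range(k))
    den = sum(disagreement(i, j) * row[i] * col[j]
              for i in range(k) for j in range(k))
    return 1.0 - num / den

# Made-up labels from two hypothetical classification systems.
system_1 = [0, 1, 2, 0, 1, 2]
system_2 = [0, 1, 2, 1, 1, 2]
kappa = weighted_kappa(system_1, system_2, n_categories=3)
```

The sketch does not guard against the degenerate case in which every sample falls into a single category, where the denominator is zero.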
Multivariate Protein Signatures of Pre-Clinical Alzheimer's Disease in the Alzheimer's Disease Neuroimaging Initiative (ADNI) Plasma Proteome Dataset

PubMed Central

Johnstone, Daniel; Milward, Elizabeth A.; Berretta, Regina; Moscato, Pablo

2012-01-01

Background: Recent Alzheimer's disease (AD) research has focused on finding biomarkers to identify disease at the pre-clinical stage of mild cognitive impairment (MCI), allowing treatment to be initiated before irreversible damage occurs. Many studies have examined brain imaging or cerebrospinal fluid but there is also growing interest in blood biomarkers. The Alzheimer's Disease Neuroimaging Initiative (ADNI) has generated data on 190 plasma analytes in 566 individuals with MCI, AD or normal cognition. We conducted independent analyses of this dataset to identify plasma protein signatures predicting pre-clinical AD. Methods and Findings: We focused on identifying signatures that discriminate cognitively normal controls (n = 54) from individuals with MCI who subsequently progress to AD (n = 163). Based on p value, apolipoprotein E (APOE) showed the strongest difference between these groups (p = 2.3×10^−13). We applied a multivariate approach based on combinatorial optimization ((α,β)-k Feature Set Selection), which retains information about individual participants and maintains the context of interrelationships between different analytes, to identify the optimal set of analytes (signature) to discriminate these two groups. We identified 11-analyte signatures achieving values of sensitivity and specificity between 65% and 86% for both MCI and AD groups, depending on whether APOE was included and other factors. Classification accuracy was improved by considering “meta-features,” representing the difference in relative abundance of two analytes, with an 8-meta-feature signature consistently achieving sensitivity and specificity both over 85%. Generating signatures based on longitudinal rather than cross-sectional data further improved classification accuracy, returning sensitivities and specificities of approximately 90%.
Conclusions: Applying these novel analysis approaches to the powerful and well-characterized ADNI dataset has identified sets of plasma biomarkers for pre-clinical AD. While studies of independent test sets are required to validate the signatures, these analyses provide a starting point for developing a cost-effective and minimally invasive test capable of diagnosing AD in its pre-clinical stages. PMID:22485168
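The "meta-features" described in this abstract, differences in relative abundance between pairs of analytes, can be generated as in the following sketch. The analyte names and values are placeholders, and the function name `meta_features` is hypothetical; the study's actual selection procedure is not reproduced here.

```python
from itertools import combinations

def meta_features(sample):
    """Return the pairwise differences a - b for every pair of analytes.

    `sample` maps analyte name -> relative abundance for one participant.
    """
    return {
        (a, b): sample[a] - sample[b]
        for a, b in combinations(sorted(sample), 2)
    }

# Placeholder abundances for one hypothetical participant.
sample = {"analyte_A": 2.0, "analyte_B": 0.5, "analyte_C": 1.25}
pairs = meta_features(sample)
```

With p analytes this produces p(p-1)/2 candidate meta-features, which is why a feature selection step is needed before building a signature.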
Age-specific discrimination of blood plasma samples of healthy and ovarian cancer prone mice using laser-induced breakdown spectroscopy

NASA Astrophysics Data System (ADS)

Melikechi, Noureddine; Markushin, Yuri; Connolly, Denise C.; Lasue, Jeremie; Ewusi-Annan, Ebo; Makrogiannis, Sokratis

2016-09-01

Epithelial ovarian cancer (EOC) mortality rates are strongly correlated with the stage at which it is diagnosed. Detection of EOC prior to its dissemination from the site of origin is known to significantly improve the patient outcome. However, there are currently no effective methods for early detection of the most common and lethal subtype of EOC. We sought to determine whether laser-induced breakdown spectroscopy (LIBS) and classification techniques such as linear discriminant analysis (LDA) and random forest (RF) could classify and differentiate blood plasma specimens from transgenic mice with ovarian carcinoma and wild type control mice. Herein we report results using this approach to distinguish blood plasma samples obtained from serially bled (at 8, 12, and 16 weeks) tumor-bearing TgMISIIR-TAg transgenic and wild type cancer-free littermate control mice. We have calculated the age-specific accuracy of classification using 18,000 laser-induced breakdown spectra of the blood plasma samples from tumor-bearing mice and wild type controls. When the analysis is performed in the spectral range 250 nm to 680 nm using LDA, the accuracies are 76.7 (± 2.6)%, 71.2 (± 1.3)%, and 73.1 (± 1.4)% at 8, 12, and 16 weeks, respectively. When the RF classifier is used, we obtain values of 78.5 (± 2.3)%, 76.9 (± 2.1)% and 75.4 (± 2.0)% in the spectral range of 250 nm to 680 nm, and 81.0 (± 1.8)%, 80.4 (± 2.1)% and 79.6 (± 3.5)% in 220 nm to 850 nm. In addition, we report the positive and negative predictive values of the classification of the two classes of blood plasma samples. The approach used in this study is rapid, requires only 5 μL of blood plasma, and is based on the use of unsupervised and widely accepted multivariate analysis algorithms. These findings suggest that LIBS and multivariate analysis may be a novel approach for detecting EOC.
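The positive and negative predictive values mentioned in this abstract follow from a two-class confusion matrix, as in the sketch below. The counts are invented for illustration and are unrelated to the reported results.

```python
def predictive_values(tp, fp, tn, fn):
    """Return (PPV, NPV, accuracy) from confusion-matrix counts."""
    ppv = tp / (tp + fp)          # fraction of positive calls that are correct
    npv = tn / (tn + fn)          # fraction of negative calls that are correct
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return ppv, npv, accuracy

# Invented counts: 80 true positives, 20 false positives,
# 70 true negatives, 30 false negatives.
ppv, npv, accuracy = predictive_values(tp=80, fp=20, tn=70, fn=30)
```

Unlike sensitivity and specificity, both predictive values depend on the proportion of tumor-bearing samples in the evaluated set.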
Potential application of machine vision technology to saffron (Crocus sativus L.) quality characterization.

PubMed

Kiani, Sajad; Minaei, Saeid

2016-12-01

Saffron quality characterization is an important issue in the food industry and of interest to the consumers. This paper proposes an expert system based on the application of machine vision technology for characterization of saffron and shows how it can be employed in practical usage. There is a correlation between saffron color and its geographic location of production and some chemical attributes which could be properly used for characterization of saffron quality and freshness. This may be accomplished by employing image processing techniques coupled with multivariate data analysis for quantification of saffron properties. Expert algorithms can be made available for prediction of saffron characteristics such as color as well as for product classification. Copyright © 2016. Published by Elsevier Ltd.
G-mode analysis of the reflection spectra of 84 asteroids.

NASA Astrophysics Data System (ADS)

Birlan, M.; Barucci, M. A.; Fulchignoni, M.

1996-01-01

A revised version of the G-mode multivariate statistics (Coradini et al. 1977) has been used to analyse a sample of 84 asteroids. This sample of asteroids is described by 29 variables, namely 23 colours between 0.9 and 2.35 microns obtained from the data base collected by Bell et al. (private communication), 5 colours between 0.3 and 0.85 microns from the ECAS survey (Zellner et al. 1985) and the revised IRAS albedo (Tedesco et al. 1992). The G-mode method allows the user to obtain an automatic classification of the asteroids in spectrally homogeneous groups. The role of the IR colours in separating the various groups is outlined, particularly with regard to the fine subdivision of S and C taxonomical types.
Symposium on Machine Processing of Remotely Sensed Data, Purdue University, West Lafayette, Ind., June 29-July 1, 1976, Proceedings

NASA Technical Reports Server (NTRS)

1976-01-01

Papers are presented on the applicability of Landsat data to water management and control needs, IBIS, a geographic information system based on digital image processing and image raster datatype, and the Image Data Access Method (IDAM) for the Earth Resources Interactive Processing System. Attention is also given to the Prototype Classification and Mensuration System (PROCAMS) applied to agricultural data, the use of Landsat for water quality monitoring in North Carolina, and the analysis of geophysical remote sensing data using multivariate pattern recognition. The Illinois crop-acreage estimation experiment, the Pacific Northwest Resources Inventory Demonstration, and the effects of spatial misregistration on multispectral recognition are also considered. Individual items are announced in this issue.
Toward a hyperspectral optical signature of extra virgin olive oil

NASA Astrophysics Data System (ADS)

Mignani, A. G.; Ciaccheri, L.; Thienpont, H.; Ottevaere, H.; Attilio, C.; Cimato, A.

2007-05-01

Italian extra virgin olive oils bearing labels of certified area of origin were considered. Their multispectral digital signature was measured by means of absorption spectroscopy in the 200-1700 nm spectral range. The instrumentation was a cheap, compact, fiber optic-based device. The spectral data were processed by means of multivariate analysis and plotted on a 2D classification map. The map showed sharp clusters according to the geographical origin of the oils, thus demonstrating the potential of UV-VIS-NIR spectroscopy for optical fingerprinting. Then, the spectral data were correlated to the content of the most important fatty acids. The good fit achieved demonstrated that optical fingerprinting can also be used for predicting nutritional and chemical parameters.
Customized oligonucleotide microarray gene expression-based classification of neuroblastoma patients outperforms current clinical risk stratification.

PubMed

Oberthuer, André; Berthold, Frank; Warnat, Patrick; Hero, Barbara; Kahlert, Yvonne; Spitz, Rüdiger; Ernestus, Karen; König, Rainer; Haas, Stefan; Eils, Roland; Schwab, Manfred; Brors, Benedikt; Westermann, Frank; Fischer, Matthias

2006-11-01

To develop a gene expression-based classifier for neuroblastoma patients that reliably predicts courses of the disease. Two hundred fifty-one neuroblastoma specimens were analyzed using a customized oligonucleotide microarray comprising 10,163 probes for transcripts with differential expression in clinical subgroups of the disease. Subsequently, the prediction analysis for microarrays (PAM) was applied to a first set of patients with maximally divergent clinical courses (n = 77). The classification accuracy was estimated by a complete 10-times-repeated 10-fold cross validation, and a 144-gene predictor was constructed from this set. This classifier's predictive power was evaluated in an independent second set (n = 174) by comparing results of the gene expression-based classification with those of risk stratification systems of current trials from Germany, Japan, and the United States. The first set of patients was accurately predicted by PAM (cross-validated accuracy, 99%). Within the second set, the PAM classifier significantly separated cohorts with distinct courses (3-year event-free survival [EFS] 0.86 +/- 0.03 [favorable; n = 115] v 0.52 +/- 0.07 [unfavorable; n = 59] and 3-year overall survival 0.99 +/- 0.01 v 0.84 +/- 0.05; both P < .0001) and separated risk groups of current neuroblastoma trials into subgroups with divergent outcome (NB2004: low-risk 3-year EFS 0.86 +/- 0.04 v 0.25 +/- 0.15, P < .0001; intermediate-risk 1.00 v 0.57 +/- 0.19, P = .018; high-risk 0.81 +/- 0.10 v 0.56 +/- 0.08, P = .06). In a multivariate Cox regression model, the PAM predictor classified patients of the second set more accurately than risk stratification of current trials from Germany, Japan, and the United States (P < .001; hazard ratio, 4.756 [95% CI, 2.544 to 8.893]). Integration of gene expression-based class prediction of neuroblastoma patients may improve risk estimation of current neuroblastoma trials.
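The "10-times-repeated 10-fold cross validation" in this abstract can be pictured with the index-splitting sketch below. It only generates train/test index sets for a training cohort of 77 samples; model fitting and gene selection are omitted, and the function name, seed, and shuffling scheme are assumptions made for illustration.

```python
import random

def repeated_kfold(n_samples, n_folds=10, n_repeats=10, seed=0):
    """Yield (repeat, train_idx, test_idx) for repeated k-fold cross-validation."""
    rng = random.Random(seed)
    for repeat in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)                      # new random partition per repeat
        folds = [indices[i::n_folds] for i in range(n_folds)]
        for held_out in range(n_folds):
            test_idx = folds[held_out]
            train_idx = [i for f, fold in enumerate(folds)
                         if f != held_out for i in fold]
            yield repeat, train_idx, test_idx

# 10 repeats x 10 folds = 100 train/test splits over 77 samples.
splits = list(repeated_kfold(77))
```

For the accuracy estimate to be unbiased, any feature or gene selection has to be redone inside each training split rather than once on the full set.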
Multivariate pattern analysis reveals subtle brain anomalies relevant to the cognitive phenotype in neurofibromatosis type 1.

PubMed

Duarte, João V; Ribeiro, Maria J; Violante, Inês R; Cunha, Gil; Silva, Eduardo; Castelo-Branco, Miguel

2014-01-01

Neurofibromatosis Type 1 (NF1) is a common genetic condition associated with cognitive dysfunction. However, the pathophysiology of the NF1 cognitive deficits is not well understood. Abnormal brain structure, including increased total brain volume and white matter (WM) and grey matter (GM) abnormalities, has been reported in the NF1 brain. These previous studies employed univariate model-driven methods preventing detection of subtle and spatially distributed differences in brain anatomy. Multivariate pattern analysis allows the combination of information from multiple spatial locations yielding a discriminative power beyond that of single voxels. Here we investigated for the first time subtle anomalies in the NF1 brain, using a multivariate data-driven classification approach. We used support vector machines (SVM) to classify whole-brain GM and WM segments of structural T1-weighted MRI scans from 39 participants with NF1 and 60 non-affected individuals, divided into child/adolescent and adult groups. We also employed voxel-based morphometry (VBM) as a univariate gold standard to study brain structural differences. SVM classifiers correctly classified 94% of cases (sensitivity 92%; specificity 96%) revealing the existence of brain structural anomalies that discriminate NF1 individuals from controls. Accordingly, VBM analysis revealed structural differences in agreement with the SVM weight maps representing the most relevant brain regions for group discrimination. These included the hippocampus, basal ganglia, thalamus, and visual cortex. This multivariate data-driven analysis thus identified subtle anomalies in brain structure in the absence of visible pathology. Our results provide further insight into the neuroanatomical correlates of known features of the cognitive phenotype of NF1. Copyright © 2012 Wiley Periodicals, Inc.
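The sensitivity, specificity, and accuracy figures reported above are all derived from a confusion matrix. The sketch below shows how they relate. The labels and predictions are made-up toy values, not the study's data, and `binary_metrics` is a hypothetical helper name.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Sensitivity, specificity and accuracy from paired true/predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return {
        "sensitivity": tp / (tp + fn),   # fraction of patients detected
        "specificity": tn / (tn + fp),   # fraction of controls cleared
        "accuracy": (tp + tn) / len(pairs),
    }

# Toy example (illustrative only): 4 patients (1) and 6 controls (0),
# with one missed patient and one false alarm.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

With unequal group sizes, as in the study (39 patients, 60 controls), accuracy is a weighted mix of the two rates, which is why both are reported alongside it.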
Molecular Classification Substitutes for the Prognostic Variables Stage, Age, and MYCN Status in Neuroblastoma Risk Assessment.

PubMed

Rosswog, Carolina; Schmidt, Rene; Oberthuer, André; Juraeva, Dilafruz; Brors, Benedikt; Engesser, Anne; Kahlert, Yvonne; Volland, Ruth; Bartenhagen, Christoph; Simon, Thorsten; Berthold, Frank; Hero, Barbara; Faldum, Andreas; Fischer, Matthias

2017-12-01

Current risk stratification systems for neuroblastoma patients consider clinical, histopathological, and genetic variables, and additional prognostic markers have been proposed in recent years. We here sought to select highly informative covariates in a multistep strategy based on consecutive Cox regression models, resulting in a risk score that integrates hazard ratios of prognostic variables. A cohort of 695 neuroblastoma patients was divided into a discovery set (n=75) for multigene predictor generation, a training set (n=411) for risk score development, and a validation set (n=209). Relevant prognostic variables were identified by stepwise multivariable L1-penalized least absolute shrinkage and selection operator (LASSO) Cox regression, followed by backward selection in multivariable Cox regression, and then integrated into a novel risk score. The variables stage, age, MYCN status, and two multigene predictors, NB-th24 and NB-th44, were selected as independent prognostic markers by LASSO Cox regression analysis. Following backward selection, only the multigene predictors were retained in the final model. Integration of these classifiers in a risk scoring system distinguished three patient subgroups that differed substantially in their outcome. The scoring system discriminated patients with diverging outcome in the validation cohort (5-year event-free survival, 84.9±3.4 vs 63.6±14.5 vs 31.0±5.4; P<.001), and its prognostic value was validated by multivariable analysis. We here propose a translational strategy for developing risk assessment systems based on hazard ratios of relevant prognostic variables. Our final neuroblastoma risk score comprised two multigene predictors only, supporting the notion that molecular properties of the tumor cells strongly impact clinical courses of neuroblastoma patients. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.

A Temporal Pattern Mining Approach for Classifying Electronic Health Record Data

PubMed Central

Batal, Iyad; Valizadegan, Hamed; Cooper, Gregory F.; Hauskrecht, Milos

2013-01-01

We study the problem of learning classification models from complex multivariate temporal data encountered in electronic health record systems. The challenge is to define a good set of features that are able to represent well the temporal aspect of the data. Our method relies on temporal abstractions and temporal pattern mining to extract the classification features. Temporal pattern mining usually returns a large number of temporal patterns, most of which may be irrelevant to the classification task. To address this problem, we present the Minimal Predictive Temporal Patterns framework to generate a small set of predictive and non-spurious patterns. We apply our approach to the real-world clinical task of predicting patients who are at risk of developing heparin induced thrombocytopenia. The results demonstrate the benefit of our approach in efficiently learning accurate classifiers, which is a key step for developing intelligent clinical monitoring systems. PMID:25309815
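A temporal abstraction, in the sense used above, replaces a raw numeric series with labeled intervals that a pattern miner can work with. The sketch below shows one simple form, a value abstraction into low/normal/high states. The thresholds, the platelet-like values, and the name `value_abstraction` are illustrative assumptions; the Minimal Predictive Temporal Patterns framework itself is not implemented here.

```python
def value_abstraction(times, values, low, high):
    """Turn a numeric series into (state, start, end) intervals.

    States: "L" (below low), "N" (within range), "H" (above high).
    Consecutive measurements in the same state are merged into one interval.
    """
    intervals = []
    for t, v in zip(times, values):
        state = "L" if v < low else "H" if v > high else "N"
        if intervals and intervals[-1][0] == state:
            # Same state as the previous measurement: extend the interval.
            intervals[-1] = (state, intervals[-1][1], t)
        else:
            intervals.append((state, t, t))
    return intervals

# Hypothetical daily lab values with hypothetical thresholds.
days = [0, 1, 2, 3, 4]
counts = [250, 240, 140, 120, 260]
states = value_abstraction(days, counts, low=150, high=400)
print(states)
```

Temporal patterns can then be expressed as relations between such intervals (for example, one state occurring before another), and those patterns serve as binary classification features.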
Subdivision of Holocene Baltic Sea sediments by their physical properties [Gliederung holozäner Ostseesedimente nach physikalischen Eigenschaften]

USGS Publications Warehouse

Harff, Jan; Bohling, Geoffrey C.; Endler, R.; Davis, J.C.; Olea, R.A.

1999-01-01

The Holocene sediment sequence of a core taken within the centre of the Eastern Gotland Basin was subdivided into 12 lithostratigraphic units based on MSCL-data (sound velocity, wet bulk density, magnetic susceptibility) using a multivariate classification method. The lower 6 units embrace the sediments until the Litorina transgression, and the upper 6 units subdivide the brackish-marine Litorina- and post-Litorina sediments. The upper lithostratigraphic units reflect a change of anoxic (laminated) and oxic (non-laminated) sediments. By application of a numerical stratigraphic correlation method the zonation was extended laterally onto contiguous sediment cores within the central basin. Consequently the change of anoxic and oxic sediments can be used for a general lithostratigraphic subdivision of sediments of the Gotland Basin. A quantitative criterion based on the sediment-physical lithofacies is added to existing subdivisions of the Holocene in the Baltic Sea.
Development and implementation of a low cost micro computer system for LANDSAT analysis and geographic data base applications

NASA Technical Reports Server (NTRS)

Faust, N.; Jordon, L.

1981-01-01

Since the implementation of the GRID and IMGRID computer programs for multivariate spatial analysis in the early 1970's, geographic data analysis subsequently moved from large computers to minicomputers and now to microcomputers with radical reduction in the costs associated with planning analyses. Programs designed to process LANDSAT data to be used as one element in a geographic data base were used once NIMGRID (new IMGRID), a raster oriented geographic information system, was implemented on the microcomputer. Programs for training field selection, supervised and unsupervised classification, and image enhancement were added. Enhancements to the color graphics capabilities of the microsystem allow display of three channels of LANDSAT data in color infrared format. The basic microcomputer hardware needed to perform NIMGRID and most LANDSAT analyses is listed as well as the software available for LANDSAT processing.
A Java-based fMRI processing pipeline evaluation system for assessment of univariate general linear model and multivariate canonical variate analysis-based pipelines.

PubMed

Zhang, Jing; Liang, Lichen; Anderson, Jon R; Gatewood, Lael; Rottenberg, David A; Strother, Stephen C

2008-01-01

As functional magnetic resonance imaging (fMRI) becomes widely used, the demand for evaluation of fMRI processing pipelines and validation of fMRI analysis results is increasing rapidly. The current NPAIRS package, an IDL-based fMRI processing pipeline evaluation framework, lacks system interoperability and the ability to evaluate general linear model (GLM)-based pipelines using prediction metrics. Thus, it cannot fully evaluate fMRI analytical software modules such as FSL.FEAT and NPAIRS.GLM. In order to overcome these limitations, a Java-based fMRI processing pipeline evaluation system was developed. It integrated YALE (a machine learning environment) into Fiswidgets (an fMRI software environment) to obtain system interoperability and applied an algorithm to measure GLM prediction accuracy. The results demonstrated that the system can evaluate fMRI processing pipelines with univariate GLM and multivariate canonical variates analysis (CVA)-based models on real fMRI data based on prediction accuracy (classification accuracy) and statistical parametric image (SPI) reproducibility. In addition, a preliminary study was performed where four fMRI processing pipelines with GLM and CVA modules such as FSL.FEAT and NPAIRS.CVA were evaluated with the system. The results indicated that (1) the system can compare different fMRI processing pipelines with heterogeneous models (NPAIRS.GLM, NPAIRS.CVA and FSL.FEAT) and rank their performance by automatic performance scoring, and (2) the rank of pipeline performance is highly dependent on the preprocessing operations. These results suggest that the system will be of value for the comparison, validation, standardization and optimization of functional neuroimaging software packages and fMRI processing pipelines.
SELECTION AND TRAINING, A SURVEY OF IOWA MANUFACTURING FIRMS. MONOGRAPH SERIES NO. 4.

ERIC Educational Resources Information Center

SHERIFF, DON R.; AND OTHERS

INFORMATION ON EMPLOYEE SELECTION AND TRAINING ACTIVITIES WAS SECURED FROM QUESTIONNAIRES RETURNED BY 215 OF 283 FIRMS EMPLOYING AT LEAST 100 PERSONS. DATA FROM 207 SEPARATE ITEMS FOR EACH FIRM WERE KEY PUNCHED AND TABULATED INTO MULTIVARIATE CROSS-CLASSIFICATIONS. OVER 60 PERCENT OF THE FIRMS WERE IN CITIES HAVING OVER 25,000 POPULATION, 40…
Proceedings of the Third Annual Symposium on Mathematical Pattern Recognition and Image Analysis

NASA Technical Reports Server (NTRS)

Guseman, L. F., Jr.

1985-01-01

Topics addressed include: multivariate spline method; normal mixture analysis applied to remote sensing; image data analysis; classifications in spatially correlated environments; probability density functions; graphical nonparametric methods; subpixel registration analysis; hypothesis integration in image understanding systems; rectification of satellite scanner imagery; spatial variation in remotely sensed images; smooth multidimensional interpolation; and optimal frequency domain textural edge detection filters.
Simulation techniques for estimating error in the classification of normal patterns

NASA Technical Reports Server (NTRS)

Whitsitt, S. J.; Landgrebe, D. A.

1974-01-01

Methods of efficiently generating and classifying samples with specified multivariate normal distributions were discussed. Conservative confidence tables for sample sizes are given for selective sampling. Simulation results are compared with classified training data. Techniques for comparing error and separability measure for two normal patterns are investigated and used to display the relationship between the error and the Chernoff bound.
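The relationship between simulated classification error and the Chernoff bound can be illustrated in the simplest setting: two univariate normal classes with unit variance, equal priors, and means separated by `delta`. This is a minimal sketch under those assumptions, not the report's procedure; all function names are hypothetical. For equal variances the Chernoff bound is tightest at s = 1/2, where it coincides with the Bhattacharyya bound used here.

```python
import math
import random

def simulate_error(delta, n=20000, seed=1):
    """Monte Carlo error of the Bayes rule for N(0,1) vs N(delta,1), equal priors."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(n):
        label = rng.random() < 0.5            # True -> class with mean delta
        x = rng.gauss(delta if label else 0.0, 1.0)
        decided = x > delta / 2               # midpoint threshold is Bayes-optimal here
        errors += decided != label
    return errors / n

def exact_error(delta):
    """Closed-form Bayes error, Phi(-delta / 2)."""
    return 0.5 * math.erfc(delta / (2 * math.sqrt(2)))

def bhattacharyya_bound(delta):
    """Chernoff bound at s = 1/2: sqrt(P1 * P2) * exp(-delta**2 / 8)."""
    return 0.5 * math.exp(-delta ** 2 / 8)

delta = 2.0
sim = simulate_error(delta)
print(sim, exact_error(delta), bhattacharyya_bound(delta))
```

The simulated error should track the closed-form value closely, while the bound stays above both; the gap between error and bound is what a separability-versus-error plot makes visible.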
Distributed effects of methylphenidate on the network structure of the resting brain: a connectomic pattern classification analysis.

PubMed

Sripada, Chandra Sekhar; Kessler, Daniel; Welsh, Robert; Angstadt, Michael; Liberzon, Israel; Phan, K Luan; Scott, Clayton

2013-11-01

Methylphenidate is a psychostimulant medication that produces improvements in functions associated with multiple neurocognitive systems. To investigate the potentially distributed effects of methylphenidate on the brain's intrinsic network architecture, we coupled resting state imaging with multivariate pattern classification. In a within-subject, double-blind, placebo-controlled, randomized, counterbalanced, cross-over design, 32 healthy human volunteers received either methylphenidate or placebo prior to two fMRI resting state scans separated by approximately one week. Resting state connectomes were generated by placing regions of interest at regular intervals throughout the brain, and these connectomes were submitted for support vector machine analysis. We found that methylphenidate produces a distributed, reliably detected, multivariate neural signature. Methylphenidate effects were evident across multiple resting state networks, especially visual, somatomotor, and default networks. Methylphenidate reduced coupling within visual and somatomotor networks. In addition, default network exhibited decoupling with several task positive networks, consistent with methylphenidate modulation of the competitive relationship between these networks. These results suggest that connectivity changes within and between large-scale networks are potentially involved in the mechanisms by which methylphenidate improves attention functioning. Copyright © 2013 Elsevier Inc. All rights reserved.
Gap Shape Classification using Landscape Indices and Multivariate Statistics

PubMed Central

Wu, Chih-Da; Cheng, Chi-Chuan; Chang, Che-Chang; Lin, Chinsu; Chang, Kun-Cheng; Chuang, Yung-Chung

2016-01-01

This study proposed a novel methodology to classify the shape of gaps using landscape indices and multivariate statistics. Patch-level indices were used to collect the qualified shape and spatial configuration characteristics for canopy gaps in the Lienhuachih Experimental Forest in Taiwan in 1998 and 2002. Non-hierarchical cluster analysis was used to assess the optimal number of gap clusters and canonical discriminant analysis was used to generate the discriminant functions for canopy gap classification. The gaps for the two periods were optimally classified into three categories. In general, gap type 1 had a more complex shape, gap type 2 was more elongated and gap type 3 had the largest gaps that were more regular in shape. The results were evaluated using Wilks’ lambda as satisfactory (p < 0.001). The agreement rate of confusion matrices exceeded 96%. Differences in gap characteristics between the classified gap types that were determined using a one-way ANOVA showed a statistical significance in all patch indices (p = 0.00), except for the Euclidean nearest neighbor distance (ENN) in 2002. Taken together, these results demonstrated the feasibility and applicability of the proposed methodology to classify the shape of a gap. PMID:27901127
Gap Shape Classification using Landscape Indices and Multivariate Statistics.

PubMed

Wu, Chih-Da; Cheng, Chi-Chuan; Chang, Che-Chang; Lin, Chinsu; Chang, Kun-Cheng; Chuang, Yung-Chung

2016-11-30

This study proposed a novel methodology to classify the shape of gaps using landscape indices and multivariate statistics. Patch-level indices were used to collect the qualified shape and spatial configuration characteristics for canopy gaps in the Lienhuachih Experimental Forest in Taiwan in 1998 and 2002. Non-hierarchical cluster analysis was used to assess the optimal number of gap clusters and canonical discriminant analysis was used to generate the discriminant functions for canopy gap classification. The gaps for the two periods were optimally classified into three categories. In general, gap type 1 had a more complex shape, gap type 2 was more elongated and gap type 3 had the largest gaps that were more regular in shape. The results were evaluated using Wilks' lambda as satisfactory (p < 0.001). The agreement rate of confusion matrices exceeded 96%. Differences in gap characteristics between the classified gap types that were determined using a one-way ANOVA showed a statistical significance in all patch indices (p = 0.00), except for the Euclidean nearest neighbor distance (ENN) in 2002. Taken together, these results demonstrated the feasibility and applicability of the proposed methodology to classify the shape of a gap.
Application of machine learning methods to describe the effects of conjugated equine estrogens therapy on region-specific brain volumes.

PubMed

Casanova, Ramon; Espeland, Mark A; Goveas, Joseph S; Davatzikos, Christos; Gaussoin, Sarah A; Maldjian, Joseph A; Brunner, Robert L; Kuller, Lewis H; Johnson, Karen C; Mysiw, W Jerry; Wagner, Benjamin; Resnick, Susan M

2011-05-01

Use of conjugated equine estrogens (CEE) has been linked to smaller regional brain volumes in women aged ≥65 years; however, it is unknown whether this results in a broad-based characteristic pattern of effects. Structural magnetic resonance imaging was used to assess regional volumes of normal tissue and ischemic lesions among 513 women who had been enrolled in a randomized clinical trial of CEE therapy for an average of 6.6 years, beginning at ages 65-80 years. A multivariate pattern analysis, based on a machine learning technique that combined Random Forest and logistic regression with L(1) penalty, was applied to identify patterns among regional volumes associated with therapy and whether patterns discriminate between treatment groups. The multivariate pattern analysis detected smaller regional volumes of normal tissue within the limbic and temporal lobes among women that had been assigned to CEE therapy. Mean decrements ranged as high as 7% in the left entorhinal cortex and 5% in the left perirhinal cortex, which exceeded the effect sizes reported previously in frontal lobe and hippocampus. Overall accuracy of classification based on these patterns, however, was projected to be only 54.5%. Prescription of CEE therapy for an average of 6.6 years is associated with lower regional brain volumes, but it does not induce a characteristic spatial pattern of changes in brain volumes of sufficient magnitude to discriminate users and nonusers. Copyright © 2011 Elsevier Inc. All rights reserved.
A data fusion-based drought index

NASA Astrophysics Data System (ADS)

Azmi, Mohammad; Rüdiger, Christoph; Walker, Jeffrey P.

2016-03-01

Drought and water stress monitoring plays an important role in the management of water resources, especially during periods of extreme climate conditions. Here, a data fusion-based drought index (DFDI) has been developed and analyzed for three different locations of varying land use and climate regimes in Australia. The proposed index comprehensively considers all types of drought through a selection of indices and proxies associated with each drought type. In deriving the proposed index, weekly data from three different data sources (OzFlux Network, Asia-Pacific Water Monitor, and MODIS-Terra satellite) were employed to first derive commonly used individual standardized drought indices (SDIs), which were then grouped using an advanced clustering method. Next, three different multivariate methods (principal component analysis, factor analysis, and independent component analysis) were utilized to aggregate the SDIs located within each group. For the two clusters in which the grouped SDIs best reflected the water availability and vegetation conditions, the variables were aggregated based on an averaging between the standardized first principal components of the different multivariate methods. Then, considering those two aggregated indices as well as the classifications of months (dry/wet months and active/non-active months), the proposed DFDI was developed. Finally, the symbolic regression method was used to derive mathematical equations for the proposed DFDI. The results presented here show that the proposed index has revealed new aspects in water stress monitoring which previous indices were not able to, by simultaneously considering both hydrometeorological and ecological concepts to define the real water stress of the study areas.
Application of machine learning methods to describe the effects of conjugated equine estrogens therapy on region-specific brain volumes

PubMed Central

Casanova, Ramon; Espeland, Mark A.; Goveas, Joseph S.; Davatzikos, Christos; Gaussoin, Sarah A.; Maldjian, Joseph A.; Brunner, Robert L.; Kuller, Lewis H.; Johnson, Karen C.; Mysiw, W. Jerry; Wagner, Benjamin; Resnick, Susan M.

2011-01-01

Use of conjugated equine estrogens (CEE) has been linked to smaller regional brain volumes in women aged ≥65 years, however it is unknown whether this results in a broad-based characteristic pattern of effects. Structural MRI was used to assess regional volumes of normal tissue and ischemic lesions among 513 women who had been enrolled in a randomized clinical trial of CEE therapy for an average of 6.6 years, beginning at ages 65-80 years. A multivariate pattern analysis, based on a machine learning technique that combined Random Forest and logistic regression with L1 penalty, was applied to identify patterns among regional volumes associated with therapy and whether patterns discriminate between treatment groups. The multivariate pattern analysis detected smaller regional volumes of normal tissue within the limbic and temporal lobes among women that had been assigned to CEE therapy. Mean decrements ranged as high as 7% in the left entorhinal cortex and 5% in the left perirhinal cortex, which exceeded the effect sizes reported previously in frontal lobe and hippocampus. Overall accuracy of classification based on these patterns, however, was projected to be only 54.5%. Prescription of CEE therapy for an average of 6.6 years is associated with lower regional brain volumes, but it does not induce a characteristic spatial pattern of changes in brain volumes of sufficient magnitude to discriminate users and non-users. PMID:21292420
Chemical discrimination of lubricant marketing types using direct analysis in real time time-of-flight mass spectrometry.

PubMed

Maric, Mark; Harvey, Lauren; Tomcsak, Maren; Solano, Angelique; Bridge, Candice

2017-06-30

In comparison to other violent crimes, sexual assaults suffer from very low prosecution and conviction rates, especially in the absence of DNA evidence. As a result, the forensic community needs to utilize other forms of trace contact evidence, like lubricant evidence, in order to provide a link between the victim and the assailant. In this study, 90 personal bottled and condom lubricants from the three main marketing types, silicone-based, water-based and condoms, were characterized by direct analysis in real time time-of-flight mass spectrometry (DART-TOFMS). The instrumental data was analyzed by multivariate statistics including hierarchical cluster analysis, principal component analysis, and linear discriminant analysis. By interpreting the mass spectral data with multivariate statistics, 12 discrete groupings were identified, indicating inherent chemical diversity not only between but within the three main marketing groups. A number of unique chemical markers, both major and minor, were identified, other than the three main chemical components (i.e. PEG, PDMS and nonoxynol-9) currently used for lubricant classification. The data was validated by a stratified 20% withheld cross-validation which demonstrated that there was minimal overlap between the groupings. Based on the groupings identified and unique features of each group, a highly discriminating statistical model was then developed that aims to provide the foundation for the development of a forensic lubricant database that may eventually be applied to casework. Copyright © 2017 John Wiley & Sons, Ltd.
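A stratified 20% withheld validation, as described above, holds out the same fraction of samples from every class so that each grouping is represented in the test set. The sketch below is an illustration only: `stratified_holdout` is a hypothetical helper, and the equal 30/30/30 split across marketing types is an assumption, since the abstract gives only the total of 90 samples.

```python
import random
from collections import defaultdict

def stratified_holdout(labels, test_fraction=0.2, seed=0):
    """Split sample indices so each class contributes ~test_fraction to the test set."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, lab in enumerate(labels):
        by_class[lab].append(idx)
    train, test = [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        n_test = max(1, round(test_fraction * len(idxs)))  # at least one per class
        test.extend(idxs[:n_test])
        train.extend(idxs[n_test:])
    return sorted(train), sorted(test)

# Hypothetical class balance: 30 samples per marketing type.
labels = ["silicone"] * 30 + ["water"] * 30 + ["condom"] * 30
train_idx, test_idx = stratified_holdout(labels)
print(len(train_idx), len(test_idx))
```

The model is fitted on the training indices only and then scored on the withheld ones, so overlap between groupings shows up as misclassified test samples.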
Toward a definition of blueprint of virgin olive oil by comprehensive two-dimensional gas chromatography.

PubMed

Purcaro, Giorgia; Cordero, Chiara; Liberto, Erica; Bicchi, Carlo; Conte, Lanfranco S

2014-03-21

This study investigates the applicability of an iterative approach aimed at defining a chemical blueprint of virgin olive oil volatiles to be correlated to the product sensory quality. The proposed investigation strategy makes it possible to fully exploit the informative content of a comprehensive multidimensional gas chromatography (GC×GC) coupled to a mass spectrometry (MS) data set. Olive oil samples (19), including 5 reference standards, obtained from the International Olive Oil Council, and commercial samples, were submitted to a sensory evaluation by a Panel test, before being analyzed in two laboratories using different instrumentation, column set, and software elaboration packages in view of a cross-validation of the entire methodology. A first classification of samples based on untargeted peak features information was obtained on raw data from two different column combinations (apolar×polar and polar×apolar) by applying unsupervised multivariate analysis (i.e., principal component analysis-PCA). However, to improve effectiveness and specificity of this classification, peak features were reliably identified (261 compounds), on the basis of the MS spectrum and linear retention index matching, and subjected to successive pair-wise comparisons based on 2D patterns, which revealed peculiar distribution of chemicals correlated with samples sensory classification. The most informative compounds were thus identified and collected in a "blueprint" of specific defects (or combination of defects) successively adopted to discriminate Extra Virgin from defective oils (i.e., lampante oil) with the aid of a supervised approach, i.e., partial least squares-discriminant analysis (PLS-DA). In this last step, the principles of sensomics, which assign higher information potential to analytes with lower odor thresholds, proved to be successful, and a much more powerful discrimination of samples was obtained in view of a sensory quality assessment. Copyright © 2014 Elsevier B.V. All rights reserved.
Toward literature-based feature selection for diagnostic classification: a meta-analysis of resting-state fMRI in depression.

PubMed

Sundermann, Benedikt; Olde Lütke Beverborg, Mona; Pfleiderer, Bettina

2014-01-01

Information derived from functional magnetic resonance imaging (fMRI) during wakeful rest has been introduced as a candidate diagnostic biomarker in unipolar major depressive disorder (MDD). Multiple reports of resting state fMRI in MDD describe group effects. Such prior knowledge can be adopted to pre-select potentially discriminating features for diagnostic classification models with the aim to improve diagnostic accuracy. The purpose of this analysis was to consolidate spatial information about alterations of spontaneous brain activity in MDD, primarily to serve as feature selection for multivariate pattern analysis techniques (MVPA). Thirty-two studies were included in final analyses. Coordinates extracted from the original reports were assigned to two categories based on directionality of findings. Meta-analyses were calculated using the non-additive activation likelihood estimation approach with coordinates organized by subject group to account for non-independent samples. Converging evidence revealed a distributed pattern of brain regions with increased or decreased spontaneous activity in MDD. The most distinct finding was hyperactivity/hyperconnectivity presumably reflecting the interaction of cortical midline structures (posterior default mode network components including the precuneus and neighboring posterior cingulate cortices associated with self-referential processing and the subgenual anterior cingulate and neighboring medial frontal cortices) with lateral prefrontal areas related to externally-directed cognition. Other areas of hyperactivity/hyperconnectivity include the left lateral parietal cortex, right hippocampus and right cerebellum whereas hypoactivity/hypoconnectivity was observed mainly in the left temporal cortex, the insula, precuneus, superior frontal gyrus, lentiform nucleus and thalamus. Results are made available in two different data formats to be used as spatial hypotheses in future studies, particularly for diagnostic classification by MVPA.
Landslide susceptibility mapping using decision-tree based CHi-squared automatic interaction detection (CHAID) and Logistic regression (LR) integration

NASA Astrophysics Data System (ADS)

Althuwaynee, Omar F.; Pradhan, Biswajeet; Ahmad, Noordin

2014-06-01

This article uses a methodology based on chi-squared automatic interaction detection (CHAID), a multivariate method with an automatic classification capacity, to analyse large numbers of landslide conditioning factors. This new algorithm was developed to overcome the subjectivity of the manual categorization of scale data of landslide conditioning factors, and to predict a rainfall-induced landslide susceptibility map for Kuala Lumpur city and surrounding areas using a geographic information system (GIS). The main objective of this article is to use the CHAID method to find the best classification fit for each conditioning factor and then to combine it with logistic regression (LR). The LR model was used to find the corresponding coefficients of the best-fitting function that assess the optimal terminal nodes. A cluster pattern of landslide locations was extracted in a previous study using the nearest neighbor index (NNI) and was then used to identify the range of clustered landslide locations. Clustered locations were used as model training data with 14 landslide conditioning factors, such as topographically derived parameters, lithology, NDVI, and land use and land cover maps. The Pearson chi-squared value was used to find the best classification fit between the dependent variable and the conditioning factors. Finally, the relationships between conditioning factors were assessed and the landslide susceptibility map (LSM) was produced. The area under the curve (AUC) was used to test the model reliability and prediction capability with the training and validation landslide locations, respectively. This study demonstrated the efficiency and reliability of the decision tree (DT) model in landslide susceptibility mapping. It also provided a valuable scientific basis for spatial decision making in planning and urban management studies.
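The area under the ROC curve used above to test the model can be computed directly as the probability that a randomly chosen landslide location receives a higher susceptibility score than a randomly chosen non-landslide location. The sketch below uses that pairwise definition. The scores are made-up toy values and `auc` is a hypothetical helper name, not part of any GIS package.

```python
def auc(scores_pos, scores_neg):
    """AUC as P(score of a positive > score of a negative), ties counted as 1/2."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Toy susceptibility scores (illustrative only).
landslide_cells = [0.9, 0.8, 0.4]
stable_cells = [0.7, 0.3, 0.2]
value = auc(landslide_cells, stable_cells)
print(value)
```

Computing this once on the training locations and once on the validation locations gives the two figures usually reported as success rate and prediction rate. The pairwise loop is quadratic, so a rank-based formula would be preferable for large rasters.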
Bayesian Integration and Classification of Composition C-4 Plastic Explosives Based on Time-of-Flight-Secondary Ion Mass Spectrometry and Laser Ablation-Inductively Coupled Plasma Mass Spectrometry.

PubMed

Mahoney, Christine M; Kelly, Ryan T; Alexander, Liz; Newburn, Matt; Bader, Sydney; Ewing, Robert G; Fahey, Albert J; Atkinson, David A; Beagley, Nathaniel

2016-04-05

Time-of-flight-secondary ion mass spectrometry (TOF-SIMS) and laser ablation-inductively coupled plasma mass spectrometry (LA-ICPMS) were used for characterization and identification of unique signatures from a series of 18 Composition C-4 plastic explosives. The samples were obtained from various commercial and military sources around the country. Positive and negative ion TOF-SIMS data were acquired directly from the C-4 residue on Si surfaces, where the positive ion mass spectra obtained were consistent with the major composition of organic additives, and the negative ion mass spectra were more consistent with explosive content in the C-4 samples. Each series of mass spectra was subjected to partial least squares-discriminant analysis (PLS-DA), a multivariate statistical analysis approach which serves to first find the areas of maximum variance within different classes of C-4 and subsequently to classify unknown samples based on correlations between the unknown data set and the original data set (often referred to as a training data set). This method was able to successfully classify test samples of C-4, though with a limited degree of certainty. The classification accuracy of the method was further improved by integrating the positive and negative ion data using a Bayesian approach. The TOF-SIMS data was combined with a second analytical method, LA-ICPMS, which was used to analyze elemental signatures in the C-4. The integrated data were able to classify test samples with a high degree of certainty. Results indicate that this Bayesian integrated approach constitutes a robust classification method that should be employable even in dirty samples collected in the field.
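The abstract does not spell out how the Bayesian integration was done. One common way to combine two classifiers is to multiply their class posteriors, assuming the two data sources are conditionally independent given the class. The sketch below shows that scheme. It is an assumption, not the authors' method, and the class names, probabilities, and `fuse_posteriors` helper are all hypothetical.

```python
def fuse_posteriors(post_a, post_b, prior=None):
    """Fuse two classifiers' class posteriors, assuming conditional independence.

    fused(c) is proportional to post_a(c) * post_b(c) / prior(c).
    """
    classes = list(post_a)
    if prior is None:
        prior = {c: 1.0 / len(classes) for c in classes}  # uniform prior
    unnorm = {c: post_a[c] * post_b[c] / prior[c] for c in classes}
    total = sum(unnorm.values())
    return {c: v / total for c, v in unnorm.items()}

# Hypothetical posteriors for one test sample from two separate models.
positive_ion = {"source_A": 0.5, "source_B": 0.3, "source_C": 0.2}
negative_ion = {"source_A": 0.6, "source_B": 0.1, "source_C": 0.3}
fused = fuse_posteriors(positive_ion, negative_ion)
print(fused)
```

When both models lean toward the same class, the fused posterior is sharper than either input, which is consistent with the improved certainty the abstract reports after integration.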
Diagnostic classification of macular ganglion cell and retinal nerve fiber layer analysis: differentiation of false-positives from glaucoma.

PubMed

Kim, Ko Eun; Jeoung, Jin Wook; Park, Ki Ho; Kim, Dong Myung; Kim, Seok Hwan

2015-03-01

To investigate the rate and associated factors of false-positive diagnostic classification of ganglion cell analysis (GCA) and retinal nerve fiber layer (RNFL) maps, and characteristic false-positive patterns on optical coherence tomography (OCT) deviation maps. Prospective, cross-sectional study. A total of 104 healthy eyes of 104 normal participants. All participants underwent peripapillary and macular spectral-domain (Cirrus-HD, Carl Zeiss Meditec Inc, Dublin, CA) OCT scans. False-positive diagnostic classification was defined as yellow or red color-coded areas for GCA and RNFL maps. Univariate and multivariate logistic regression analyses were used to determine associated factors. Eyes with abnormal OCT deviation maps were categorized on the basis of the shape and location of abnormal color-coded area. Differences in clinical characteristics among the subgroups were compared. (1) The rate and associated factors of false-positive OCT maps; (2) patterns of false-positive, color-coded areas on the GCA deviation map and associated clinical characteristics. Of the 104 healthy eyes, 42 (40.4%) and 32 (30.8%) showed abnormal diagnostic classifications on any of the GCA and RNFL maps, respectively. Multivariate analysis revealed that false-positive GCA diagnostic classification was associated with longer axial length and larger fovea-disc angle, whereas longer axial length and smaller disc area were associated with abnormal RNFL maps. Eyes with abnormal GCA deviation map were categorized as group A (donut-shaped round area around the inner annulus), group B (island-like isolated area), and group C (diffuse, circular area with an irregular inner margin in either). The axial length showed a significant increasing trend from group A to C (P=0.001), and likewise, the refractive error was more myopic in group C than in groups A (P=0.015) and B (P=0.014). 
Group C had thinner average ganglion cell-inner plexiform layer thickness compared with other groups (group A=B>C, P=0.004). Abnormal OCT diagnostic classification should be interpreted with caution, especially in eyes with long axial lengths, large fovea-disc angles, and small optic discs. Our findings suggest that the characteristic patterns of OCT deviation map can provide useful clues to distinguish glaucomatous changes from false-positive findings. Copyright © 2015 American Academy of Ophthalmology. Published by Elsevier Inc. All rights reserved.
Predictors of clinical-pathologic stage discrepancy in oral cavity squamous cell carcinoma: A National Cancer Database study.

PubMed

Kılıç, Sarah S; Kılıç, Suat; Crippen, Meghan M; Varughese, Denny; Eloy, Jean Anderson; Baredes, Soly; Mahmoud, Omar M; Park, Richard Chan Woo

2018-04-01

Few studies have examined the frequency and survival implications of clinicopathologic stage discrepancy in oral cavity squamous cell carcinoma (SCC). Oral cavity SCC cases with full pathologic staging information were identified in the National Cancer Database (NCDB). Clinical and pathologic stages were compared. Multivariate logistic regressions were performed to identify factors associated with stage discrepancy. There were 9110 cases identified, of which 67.3% of the cases were stage concordant, 19.9% were upstaged, and 12.8% were downstaged. The N classification discordance (28.5%) was more common than T classification discordance (27.6%). In cases of T classification discordance, downstaging is more common than upstaging (15.4% vs 12.1% of cases), but in cases of N classification discordance, the reverse is true; upstaging is much more common than downstaging (20.1 vs 8.4% of cases). Clinicopathologic stage discrepancy in oral cavity SCC is a common phenomenon that is associated with a number of clinical factors and has survival implications. © 2018 Wiley Periodicals, Inc.

Identifying ADHD children using hemodynamic responses during a working memory task measured by functional near-infrared spectroscopy

NASA Astrophysics Data System (ADS)

Gu, Yue; Miao, Shuo; Han, Junxia; Liang, Zhenhu; Ouyang, Gaoxiang; Yang, Jian; Li, Xiaoli

2018-06-01

Objective. Attention-deficit/hyperactivity disorder (ADHD) is a neurodevelopmental disorder affecting children and adults. Previous studies found that functional near-infrared spectroscopy (fNIRS) can reveal significant group differences in several brain regions between ADHD children and healthy controls during working memory tasks. This study aimed to use fNIRS activation patterns to distinguish ADHD children from healthy controls. Approach. fNIRS signals from 25 ADHD children and 25 healthy controls performing the n-back task were recorded; then, multivariate pattern analysis was used to discriminate ADHD individuals from healthy controls, and classification performance was evaluated for significance by the permutation test. Main results. The results showed that 86.0% (p<0.001) of participants can be correctly classified in leave-one-out cross-validation. The most discriminative brain regions included the bilateral dorsolateral prefrontal cortex, inferior medial prefrontal cortex, right posterior prefrontal cortex, and right temporal cortex. Significance. This study demonstrated that, in a small sample, multivariate pattern analysis can effectively distinguish ADHD children from healthy controls based on fNIRS signals, which argues for the potential utility of fNIRS in future assessments.
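The evaluation scheme named in this abstract, leave-one-out cross-validation with a permutation test for significance, can be sketched in plain Python. The nearest-centroid classifier, the synthetic two-feature data, and the 200 permutations are illustrative assumptions; the abstract does not say which classifier or permutation count the authors used.

```python
import random


def nearest_centroid_predict(train_x, train_y, x):
    """Return the label whose class mean is closest to x (squared distance)."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [r for r, y in zip(train_x, train_y) if y == label]
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loo_accuracy(xs, ys):
    """Hold out each sample once, train on the rest, and score the held-out one."""
    hits = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        hits += nearest_centroid_predict(train_x, train_y, xs[i]) == ys[i]
    return hits / len(xs)


def permutation_p_value(xs, ys, n_perm=200, seed=0):
    """Estimate how often shuffled labels match or beat the observed accuracy."""
    rng = random.Random(seed)
    observed = loo_accuracy(xs, ys)
    at_least_as_good = 0
    for _ in range(n_perm):
        shuffled = ys[:]
        rng.shuffle(shuffled)
        if loo_accuracy(xs, shuffled) >= observed:
            at_least_as_good += 1
    # The +1 terms count the observed labelling as one of the permutations.
    return observed, (at_least_as_good + 1) / (n_perm + 1)


# Synthetic, well-separated groups (hypothetical stand-ins for activation features).
controls = [[0.1 * i, 1.0] for i in range(10)]
patients = [[5.0 + 0.1 * i, 2.0] for i in range(10)]
xs = controls + patients
ys = [0] * 10 + [1] * 10

accuracy, p_value = permutation_p_value(xs, ys)
```

The smallest p-value this estimator can report is 1/(n_perm + 1), so reporting p<0.001 requires at least 1000 permutations.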
The role of the human leukocyte antigen system in retinopathy of prematurity: a pilot study.

PubMed

Flor-de-Lima, Filipa; Rocha, Gustavo; Proença, Elisa; Tafulo, Sandra; Freitas, Fátima; Guimarães, Hercília

2013-12-01

To assess the association between the human leukocyte antigen system and retinopathy of prematurity. Neonates of <32 weeks of gestational age, born at two level III neonatal intensive care units from January 2000 to December 2001 and from January 2006 to June 2009, were included in the study. Demographic and clinical data were recorded, and retinopathy was classified according to the International Classification. Epithelial cells were collected from the oral cavity and the HLA were studied using the PCR/SSO method. Univariate and multivariate analyses were performed using SPSS® v.18. We evaluated 156 neonates, including 82 (52.6%) males. Median gestational age was 29 (23-31) weeks, and median birth weight was 1030 (525-1935) grams. Seventy (44.9%) of the neonates developed retinopathy. Alleles HLA-B*38, HLA-Cw*12, HLA-DRB1*09, HLA-DRB1*14 (univariate analysis) and HLA-A*68 and HLA-Cw*12 were associated to retinopathy (multivariate analysis). The results suggest that the HLA system may be associated with the development of retinopathy of prematurity. A large-scale population-based study should be performed to clarify this association. ©2013 Foundation Acta Paediatrica. Published by John Wiley & Sons Ltd.
Development of Raman microspectroscopy for automated detection and imaging of basal cell carcinoma

NASA Astrophysics Data System (ADS)

Larraona-Puy, Marta; Ghita, Adrian; Zoladek, Alina; Perkins, William; Varma, Sandeep; Leach, Iain H.; Koloydenko, Alexey A.; Williams, Hywel; Notingher, Ioan

2009-09-01

We investigate the potential of Raman microspectroscopy (RMS) for automated evaluation of excised skin tissue during Mohs micrographic surgery (MMS). The main aim is to develop an automated method for imaging and diagnosis of basal cell carcinoma (BCC) regions. Selected Raman bands responsible for the largest spectral differences between BCC and normal skin regions and linear discriminant analysis (LDA) are used to build a multivariate supervised classification model. The model is based on 329 Raman spectra measured on skin tissue obtained from 20 patients. BCC is discriminated from healthy tissue with 90 ± 9% sensitivity and 85 ± 9% specificity in a 70% to 30% split cross-validation algorithm. This multivariate model is then applied on tissue sections from new patients to image tumor regions. The RMS images show excellent correlation with the gold standard of histopathology sections, BCC being detected in all positive sections. We demonstrate the potential of RMS as an automated objective method for tumor evaluation during MMS. The replacement of current histopathology during MMS by a "generalization" of the proposed technique may improve the feasibility and efficacy of MMS, leading to a wider use according to clinical need.
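For readers unfamiliar with the two figures of merit quoted above, the following minimal sketch computes sensitivity and specificity from true and predicted labels. The label strings and the ten example predictions are hypothetical and are not taken from the study.

```python
def sensitivity_specificity(y_true, y_pred, positive="BCC"):
    """Return (sensitivity, specificity) for a two-class problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    # Sensitivity: fraction of true positives found; specificity: fraction of negatives kept.
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical held-out spectra: 4 tumour, 6 normal.
y_true = ["BCC"] * 4 + ["normal"] * 6
y_pred = ["BCC", "BCC", "BCC", "normal"] + ["normal"] * 5 + ["BCC"]

sens, spec = sensitivity_specificity(y_true, y_pred)
```

Repeating such a calculation over many random 70%/30% splits and taking the mean and standard deviation would give values in the "mean ± spread" form reported in the abstract.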
Human immunotoxicologic markers of chemical exposures: preliminary validation studies.

PubMed

Wartenberg, D; Laskin, D; Kipen, H

1993-01-01

The circulating cells of the immune system are sensitive to environmental contaminants, and effects are often manifested as changes in the cell surface differentiation antigens of affected populations of cells, particularly lymphocytes. In this investigation, we explore the likelihood that variation in the expression of the surface markers of immune cells can be used as an index of exposure to toxic chemicals. We recruited 38 healthy New Jersey men to study pesticides effects: 19 orchard farmers (high exposure); 13 berry farmers (low exposure); and 6 hardware store owners (no exposure). Immunophenotyping was performed assaying the following cell surface antigens: CD2, CD4, CD8, CD14, CD20, CD26, CD29, CD45R, CD56, and PMN. Data were analyzed using univariate and multivariate methods. There were no significant differences among the groups with respect to routine medical histories, physical examinations, or routine laboratory parameters. No striking differences between groups were seen in univariate tests. Multivariate tests suggested some differences among groups and limited ability to correctly classify individuals based on immunophenotyping results. Immunophenotyping represents a fruitful area of research for improved exposure classification. Work is needed both on mechanistic understanding of the patterns observed and on the statistical interpretation of these patterns.
Recurrent Neural Networks for Multivariate Time Series with Missing Values.

PubMed

Che, Zhengping; Purushotham, Sanjay; Cho, Kyunghyun; Sontag, David; Liu, Yan

2018-04-17

Multivariate time series data in practical applications, such as health care, geoscience, and biology, are characterized by a variety of missing values. In time series prediction and other related tasks, it has been noted that missing values and their missing patterns are often correlated with the target labels, a.k.a., informative missingness. There is very limited work on exploiting the missing patterns for effective imputation and improving prediction performance. In this paper, we develop novel deep learning models, namely GRU-D, as one of the early attempts. GRU-D is based on Gated Recurrent Unit (GRU), a state-of-the-art recurrent neural network. It takes two representations of missing patterns, i.e., masking and time interval, and effectively incorporates them into a deep model architecture so that it not only captures the long-term temporal dependencies in time series, but also utilizes the missing patterns to achieve better prediction results. Experiments of time series classification tasks on real-world clinical datasets (MIMIC-III, PhysioNet) and synthetic datasets demonstrate that our models achieve state-of-the-art performance and provide useful insights for better understanding and utilization of missing values in time series analysis.
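The two missing-pattern representations mentioned in this abstract, masking and time interval, can be illustrated for a single variable as follows. The recurrence for the interval follows the usual description of GRU-D (time since the variable was last observed) and is an assumption here, since the abstract does not spell it out; the timestamps and values are hypothetical.

```python
def mask_and_delta(timestamps, values):
    """Build the masking vector and time-interval vector for one variable.

    values holds a number where the variable was observed and None where missing.
    """
    mask = [0 if v is None else 1 for v in values]
    delta = [0.0]  # no elapsed time at the first step
    for t in range(1, len(values)):
        gap = timestamps[t] - timestamps[t - 1]
        if mask[t - 1] == 1:
            # Previous step was observed: the interval restarts from it.
            delta.append(gap)
        else:
            # Previous step was missing: keep accumulating elapsed time.
            delta.append(gap + delta[t - 1])
    return mask, delta


# Hypothetical irregularly sampled series with two consecutive missing values.
timestamps = [0, 1, 3, 6, 7]
values = [5.0, None, None, 4.0, 3.0]

mask, delta = mask_and_delta(timestamps, values)
```

Both vectors have the same length as the input series, so they can be supplied to a recurrent model alongside the (imputed) values at each time step.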
Exploring Geographical Differentiation of the Hoelen Medicinal Mushroom, Wolfiporia extensa (Agaricomycetes), Using Fourier-Transform Infrared Spectroscopy Combined with Multivariate Analysis.

PubMed

Li, Yan; Zhang, Ji; Zhao, Yanli; Liu, Honggao; Wang, Yuanzhong; Jin, Hang

2016-01-01

In this study the geographical differentiation of dried sclerotia of the medicinal mushroom Wolfiporia extensa, obtained from different regions in Yunnan Province, China, was explored using Fourier-transform infrared (FT-IR) spectroscopy coupled with multivariate data analysis. The FT-IR spectra of 97 samples were obtained for wavenumbers ranging from 4000 to 400 cm⁻¹. Then, the fingerprint region of 1800-600 cm⁻¹ of the FT-IR spectrum, rather than the full spectrum, was analyzed. Different pretreatments were applied to the spectra, and a discriminant analysis model based on the Mahalanobis distance was developed to select an optimal pretreatment combination. Two unsupervised pattern recognition procedures, principal component analysis and hierarchical cluster analysis, were applied to enhance the authenticity of discrimination of the specimens. The results showed that excellent classification could be obtained after optimizing spectral pretreatment. The tested samples were successfully discriminated according to their geographical locations. The chemical properties of dried sclerotia of W. extensa were clearly dependent on the mushroom's geographical origins. Furthermore, an interesting finding implied that the elevations of collection areas may have effects on the chemical components of wild W. extensa sclerotia. Overall, this study highlights the feasibility of FT-IR spectroscopy combined with multivariate data analysis in particular for exploring the distinction of different regional W. extensa sclerotia samples. This research could also serve as a basis for the exploitation and utilization of medicinal mushrooms.
A comparison of different chemometrics approaches for the robust classification of electronic nose data.

PubMed

Gromski, Piotr S; Correa, Elon; Vaughan, Andrew A; Wedge, David C; Turner, Michael L; Goodacre, Royston

2014-11-01

Accurate detection of certain chemical vapours is important, as these may be diagnostic for the presence of weapons, drugs of misuse or disease. In order to achieve this, chemical sensors could be deployed remotely. However, the readout from such sensors is a multivariate pattern, and this needs to be interpreted robustly using powerful supervised learning methods. Therefore, in this study, we compared the classification accuracy of four pattern recognition algorithms which include linear discriminant analysis (LDA), partial least squares-discriminant analysis (PLS-DA), random forests (RF) and support vector machines (SVM) which employed four different kernels. For this purpose, we have used electronic nose (e-nose) sensor data (Wedge et al., Sensors Actuators B Chem 143:365-372, 2009). In order to allow direct comparison between our four different algorithms, we employed two model validation procedures based on either 10-fold cross-validation or bootstrapping. The results show that LDA (91.56% accuracy) and SVM with a polynomial kernel (91.66% accuracy) were very effective at analysing these e-nose data. These two models gave superior prediction accuracy, sensitivity and specificity in comparison to the other techniques employed. With respect to the e-nose sensor data studied here, our findings recommend that SVM with a polynomial kernel should be favoured as a classification method over the other statistical models that we assessed. SVM with non-linear kernels have the advantage that they can be used for classifying non-linear as well as linear mapping from analytical data space to multi-group classifications and would thus be a suitable algorithm for the analysis of most e-nose sensor data.
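The two validation procedures compared in this abstract differ mainly in how samples are divided into training and test sets. The sketch below generates the index splits for each; the function names, the sample count of 25, and the seed are illustrative assumptions, and no classifier is fitted.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Split n sample indices into k (train, test) pairs; each sample is tested once."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    return [(sorted(set(indices) - set(fold)), sorted(fold)) for fold in folds]


def bootstrap_indices(n, seed=0):
    """Draw n training indices with replacement; unsampled indices form the test set."""
    rng = random.Random(seed)
    train = [rng.randrange(n) for _ in range(n)]
    test = sorted(set(range(n)) - set(train))
    return train, test


splits = k_fold_indices(25, 10)
boot_train, boot_test = bootstrap_indices(25)
```

In k-fold cross-validation every sample is tested exactly once, whereas a bootstrap round tests only the samples that were not drawn, so bootstrapping is normally repeated many times and the results averaged.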
Proposal for a new T-stage classification system for distal cholangiocarcinoma: a 10-institution study from the U.S. Extrahepatic Biliary Malignancy Consortium.

PubMed

Postlewait, Lauren M; Ethun, Cecilia G; Le, Nina; Pawlik, Timothy M; Buettner, Stefan; Poultsides, George; Tran, Thuy; Idrees, Kamran; Isom, Chelsea A; Fields, Ryan C; Krasnick, Bradley; Weber, Sharon M; Salem, Ahmed; Martin, Robert C G; Scoggins, Charles; Shen, Perry; Mogal, Harveshp D; Schmidt, Carl; Beal, Eliza; Hatzaras, Ioannis; Vitiello, Gerardo; Cardona, Kenneth; Maithel, Shishir K

2016-10-01

Seventh AJCC distal cholangiocarcinoma T-stage classification inadequately separates patients by survival. This retrospective study aimed to define a novel T-stage system to better stratify patients after resection. Curative-intent pancreaticoduodenectomies for distal cholangiocarcinoma (1/2000-5/2015) at 10 US institutions were included. Relationships between tumor characteristics and overall survival (OS) were assessed and incorporated into a novel T-stage classification. 176 patients (median follow-up: 24mo) were included. Current AJCC T-stage was not associated with OS (T1: 23mo, T2: 20mo, T3: 25mo, T4: 12mo; p = 0.355). Tumor size ≥3 cm and presence of lymphovascular invasion (LVI) were associated with decreased OS on univariate and multivariable analyses. Patients were stratified into 3 groups [T1: size <3 cm and (-)LVI (n = 69; 39.2%); T2: size ≥3 cm and (-)LVI or size <3 cm and (+)LVI (n = 82; 46.6%); and T3: size ≥3 cm and (+)LVI (n = 25; 14.2%)]. Each progressive proposed T-stage was associated with decreased median OS (T1: 35mo; T2: 20mo; T3: 8mo; p = 0.002). Current AJCC distal cholangiocarcinoma T-stage does not adequately stratify patients by survival. This proposed T-stage classification, based on tumor size and LVI, better differentiates patient outcomes after resection and could be considered for incorporation into the next AJCC distal cholangiocarcinoma staging system. Copyright © 2016 International Hepato-Pancreato-Biliary Association Inc. Published by Elsevier Ltd. All rights reserved.
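The proposed three-group stratification described above amounts to counting how many of the two adverse features are present. A minimal sketch follows; the function and parameter names are hypothetical, while the 3 cm threshold and the group definitions come from the abstract.

```python
def proposed_t_stage(size_cm, lvi_positive):
    """Map tumor size and lymphovascular invasion (LVI) to the proposed T-stage.

    T1: size < 3 cm and LVI absent
    T2: exactly one of (size >= 3 cm, LVI present)
    T3: size >= 3 cm and LVI present
    """
    adverse_features = int(size_cm >= 3.0) + int(bool(lvi_positive))
    return ("T1", "T2", "T3")[adverse_features]


stages = [
    proposed_t_stage(2.0, False),
    proposed_t_stage(3.0, False),
    proposed_t_stage(2.9, True),
    proposed_t_stage(4.5, True),
]
```

Treating the stage as a count of adverse features makes it explicit that the two T2 definitions (large tumor without LVI, small tumor with LVI) are handled identically.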
Incorporation of support vector machines in the LIBS toolbox for sensitive and robust classification amidst unexpected sample and system variability

PubMed Central

Dingari, Narahara Chari; Barman, Ishan; Myakalwar, Ashwin Kumar; Tewari, Surya P.; Kumar, G. Manoj

2012-01-01

Despite the intrinsic elemental analysis capability and lack of sample preparation requirements, laser-induced breakdown spectroscopy (LIBS) has not been extensively used for real-world applications, e.g., quality assurance and process monitoring. Specifically, variability in sample, system, and experimental parameters in LIBS studies presents a substantive hurdle for robust classification, even when standard multivariate chemometric techniques are used for analysis. Considering pharmaceutical sample investigation as an example, we propose the use of support vector machines (SVM) as a non-linear classification method over conventional linear techniques such as soft independent modeling of class analogy (SIMCA) and partial least-squares discriminant analysis (PLS-DA) for discrimination based on LIBS measurements. Using over-the-counter pharmaceutical samples, we demonstrate that application of SVM enables statistically significant improvements in prospective classification accuracy (sensitivity), due to its ability to address variability in LIBS sample ablation and plasma self-absorption behavior. Furthermore, our results reveal that SVM provides nearly 10% improvement in correct allocation rate and a concomitant reduction in misclassification rates of 75% (cf. PLS-DA) and 80% (cf. SIMCA) when measurements from samples not included in the training set are incorporated in the test data, highlighting its robustness. While further studies on a wider matrix of sample types performed using different LIBS systems are needed to fully characterize the capability of SVM to provide superior predictions, we anticipate that the improved sensitivity and robustness observed here will facilitate application of the proposed LIBS-SVM toolbox for screening drugs and detecting counterfeit samples as well as in related areas of forensic and biological sample analysis. PMID: 22292496
The applicability of new TNM classification for human papillomavirus-related oropharyngeal cancer in the 8th edition of the AJCC/UICC TNM staging system in Japan: A single-centre study.

PubMed

Sano, Daisuke; Yabuki, Kenichiro; Arai, Yasuhiro; Tanabe, Teruhiko; Chiba, Yoshihiro; Nishimura, Goshi; Takahashi, Hideaki; Yamanaka, Shoji; Oridate, Nobuhiko

2018-06-01

The purpose of this study is to validate the applicability of the new TNM classification for human papillomavirus (HPV)-related oropharyngeal cancer (OPC) in the 8th edition of the American Joint Committee on Cancer (AJCC)/Union for International Cancer Control (UICC) TNM staging system in Japan. A total of 91 OPC patients treated with radiation-based therapy between November 2001 and July 2015 were analyzed retrospectively in this study. HPV infection status was evaluated using tumor p16 expression. Forty OPC patients (44.0%) had HPV-positive disease in this study. The distribution of disease stage of HPV-positive OPC patients changed dramatically from the 7th edition to the 8th edition of the AJCC/UICC TNM classification. However, neither the 8th edition nor the 7th edition of the AJCC/UICC TNM staging system could adequately predict outcomes of HPV-positive OPC patients in our patient series. On the other hand, our multivariate analysis indicated that matted nodes and age ≥63 were independent prognostic factors for progression-free survival. In addition, HPV-positive OPC patients with stage I without matted nodes showed significantly better overall and progression-free survival compared with those with stage I with matted nodes and stages II and III in the 8th edition of the AJCC/UICC TNM staging system (P=0.008 and P=0.043, respectively). Our results suggest that matted nodes should additionally be examined in HPV-positive OPC patients when applying the 8th edition of the AJCC/UICC TNM classification, in order to predict their outcomes more adequately. Copyright © 2017 Elsevier B.V. All rights reserved.
Evaluation of thyroid eye disease: quality-of-life questionnaire (TED-QOL) in Korean patients.

PubMed

Son, Byeong Jae; Lee, Sang Yeul; Yoon, Jin Sook

2014-04-01

To assess impaired quality of life (QOL) of Korean patients with thyroid eye disease (TED) using the TED-QOL questionnaire, to evaluate the adaptability of the questionnaire, and to assess the correlation between TED-QOL and scales of disease severity. Prospective, cross-sectional study. Total of 90 consecutive adult patients with TED and Graves' disease were included in this study. TED-QOL was translated into Korean and administered to the patients. The results were compared with clinical severity scores (clinical activity score, VISA (vision loss (optic neuropathy); inflammation; strabismus/motility; appearance/exposure) classification, modified NOSPECS (no signs or symptoms; only signs; soft tissue; proptosis; extraocular muscle; cornea; sight loss) score, Gorman diplopia scale, and European Group of Graves' Orbitopathy Classification). Clinical scores indicating inflammation and strabismus in patients with TED were positively correlated with overall and visual function-related QOL (Spearman coefficient 0.21-0.38, p < 0.05). Clinical scores associated with appearance were positively correlated with appearance-related QOL (Spearman coefficient 0.26-0.27, p < 0.05). In multivariate analysis, age, soft-tissue inflammation, motility disorder of modified NOSPECS, and motility disorder of VISA classification had positive correlation with overall and function-related QOL. Sex, soft-tissue inflammation, proptosis of modified NOSPECS, and appearance of VISA classification had correlation with appearance-related QOL. In addition, validity of TED-QOL was proved sufficient based on the outcomes of patient interviews and correlation between the subscales of TED-QOL. TED-QOL showed significant correlations with various objective clinical parameters of TED. TED-QOL was a simple and useful tool for rapid evaluation of QOL in daily outpatient clinics, which could be readily translated into different languages to be widely applicable to various populations. 
Copyright © 2014 Canadian Ophthalmological Society. Published by Elsevier Inc. All rights reserved.
Incorporation of support vector machines in the LIBS toolbox for sensitive and robust classification amidst unexpected sample and system variability.

PubMed

Dingari, Narahara Chari; Barman, Ishan; Myakalwar, Ashwin Kumar; Tewari, Surya P; Kumar Gundawar, Manoj

2012-03-20

Despite the intrinsic elemental analysis capability and lack of sample preparation requirements, laser-induced breakdown spectroscopy (LIBS) has not been extensively used for real-world applications, e.g., quality assurance and process monitoring. Specifically, variability in sample, system, and experimental parameters in LIBS studies presents a substantive hurdle for robust classification, even when standard multivariate chemometric techniques are used for analysis. Considering pharmaceutical sample investigation as an example, we propose the use of support vector machines (SVM) as a nonlinear classification method over conventional linear techniques such as soft independent modeling of class analogy (SIMCA) and partial least-squares discriminant analysis (PLS-DA) for discrimination based on LIBS measurements. Using over-the-counter pharmaceutical samples, we demonstrate that the application of SVM enables statistically significant improvements in prospective classification accuracy (sensitivity), because of its ability to address variability in LIBS sample ablation and plasma self-absorption behavior. Furthermore, our results reveal that SVM provides nearly 10% improvement in correct allocation rate and a concomitant reduction in misclassification rates of 75% (cf. PLS-DA) and 80% (cf. SIMCA) when measurements from samples not included in the training set are incorporated in the test data, highlighting its robustness. While further studies on a wider matrix of sample types performed using different LIBS systems are needed to fully characterize the capability of SVM to provide superior predictions, we anticipate that the improved sensitivity and robustness observed here will facilitate application of the proposed LIBS-SVM toolbox for screening drugs and detecting counterfeit samples, as well as in related areas of forensic and biological sample analysis.
A web-based system for neural network based classification in temporomandibular joint osteoarthritis.

PubMed

de Dumast, Priscille; Mirabel, Clément; Cevidanes, Lucia; Ruellas, Antonio; Yatabe, Marilia; Ioshida, Marcos; Ribera, Nina Tubau; Michoud, Loic; Gomes, Liliane; Huang, Chao; Zhu, Hongtu; Muniz, Luciana; Shoukri, Brandon; Paniagua, Beatriz; Styner, Martin; Pieper, Steve; Budin, Francois; Vimort, Jean-Baptiste; Pascal, Laura; Prieto, Juan Carlos

2018-07-01

The purpose of this study is to describe the methodological innovations of a web-based system for storage, integration and computation of biomedical data, using a training imaging dataset to remotely compute a deep neural network classifier of temporomandibular joint osteoarthritis (TMJOA). This study imaging dataset consisted of three-dimensional (3D) surface meshes of mandibular condyles constructed from cone beam computed tomography (CBCT) scans. The training dataset consisted of 259 condyles, 105 from control subjects and 154 from patients with diagnosis of TMJ OA. For the image analysis classification, 34 right and left condyles from 17 patients (39.9 ± 11.7 years), who experienced signs and symptoms of the disease for less than 5 years, were included as the testing dataset. For the integrative statistical model of clinical, biological and imaging markers, the sample consisted of the same 17 test OA subjects and 17 age and sex matched control subjects (39.4 ± 15.4 years), who did not show any sign or symptom of OA. For these 34 subjects, a standardized clinical questionnaire, blood and saliva samples were also collected. The technological methodologies in this study include a deep neural network classifier of 3D condylar morphology (ShapeVariationAnalyzer, SVA), and a flexible web-based system for data storage, computation and integration (DSCI) of high dimensional imaging, clinical, and biological data. The DSCI system trained and tested the neural network, indicating 5 stages of structural degenerative changes in condylar morphology in the TMJ with 91% close agreement between the clinician consensus and the SVA classifier. The DSCI remotely ran with a novel application of a statistical analysis, the Multivariate Functional Shape Data Analysis, that computed high dimensional correlations between shape 3D coordinates, clinical pain levels and levels of biological markers, and then graphically displayed the computation results. 
The findings of this study demonstrate a comprehensive phenotypic characterization of TMJ health and disease at clinical, imaging and biological levels, using novel flexible and versatile open-source tools for a web-based system that provides advanced shape statistical analysis and a neural network based classification of temporomandibular joint osteoarthritis. Published by Elsevier Ltd.
Mini-DIAL system measurements coupled with multivariate data analysis to identify TIC and TIM simulants: preliminary absorption database analysis.

NASA Astrophysics Data System (ADS)

Gaudio, P.; Malizia, A.; Gelfusa, M.; Martinelli, E.; Di Natale, C.; Poggi, L. A.; Bellecci, C.

2017-01-01

Toxic Industrial Components (TICs) and Toxic Industrial Materials (TIMs) are among the most dangerous and widespread sources of contamination in urban and industrial areas. Academia, industry, and the military are working on innovative solutions to monitor the diffusion of such pollutants in the atmosphere. At present the most common commercial sensors are based on “point detection” technology, but such instruments cannot satisfy the needs of smart cities. The new challenge is to develop stand-off systems that continuously monitor the atmosphere. The Quantum Electronics and Plasma Physics (QEP) research group has long experience in laser system development and has built two demonstrators based on DIAL (Differential Absorption of Light) technology that could identify chemical agents in the atmosphere. In this work the authors present one of those DIAL systems, the miniaturized one, together with the preliminary results of an experimental campaign conducted on TIC and TIM simulants in a cell, with the aim of using the resulting absorption database for further atmospheric analysis with the same DIAL system. The experimental results are analysed with a standard multivariate data analysis technique, Principal Component Analysis (PCA), to develop a classification model aimed at identifying organic chemical compounds in the atmosphere. The preliminary absorption coefficients of some chemical compounds are shown together with the pre-PCA analysis.
A multivariate study of mangrove morphology (Rhizophora mangle) using both above and below-water plant architecture

USGS Publications Warehouse

Brooks, R.A.; Bell, S.S.

2005-01-01

A descriptive study of the architecture of the red mangrove, Rhizophora mangle L., habitat of Tampa Bay, FL, was conducted to assess if plant architecture could be used to discriminate overwash from fringing forest type. Seven above-water (e.g., tree height, diameter at breast height, and leaf area) and 10 below-water (e.g., root density, root complexity, and maximum root order) architectural features were measured in eight mangrove stands. A multivariate technique (discriminant analysis) was used to test the ability of different models comprising above-water, below-water, or whole tree architecture to classify forest type. Root architectural features appear to be better than classical forestry measurements at discriminating between fringing and overwash forests but, regardless of the features loaded into the model, misclassification rates were high as forest type was only correctly classified in 66% of the cases. Based upon habitat architecture, the results of this study do not support a sharp distinction between overwash and fringing red mangrove forests in Tampa Bay but rather indicate that the two are architecturally indistinguishable. Therefore, within this northern portion of the geographic range of red mangroves, a more appropriate classification system based upon architecture may be one in which overwash and fringing forest types are combined into a single, "tide dominated" category. © 2005 Elsevier Ltd. All rights reserved.
Decoding Multiple Sound Categories in the Human Temporal Cortex Using High Resolution fMRI

PubMed Central

Zhang, Fengqing; Wang, Ji-Ping; Kim, Jieun; Parrish, Todd; Wong, Patrick C. M.

2015-01-01

Perception of sound categories is an important aspect of auditory perception. The extent to which the brain’s representation of sound categories is encoded in specialized subregions or distributed across the auditory cortex remains unclear. Recent studies using multivariate pattern analysis (MVPA) of brain activations have provided important insights into how the brain decodes perceptual information. In the large existing literature on brain decoding using MVPA methods, relatively few studies have been conducted on multi-class categorization in the auditory domain. Here, we investigated the representation and processing of auditory categories within the human temporal cortex using high resolution fMRI and MVPA methods. More importantly, we considered decoding multiple sound categories simultaneously through multi-class support vector machine-recursive feature elimination (MSVM-RFE) as our MVPA tool. Results show that for all classifications the model MSVM-RFE was able to learn the functional relation between the multiple sound categories and the corresponding evoked spatial patterns and classify the unlabeled sound-evoked patterns significantly above chance. This indicates the feasibility of decoding multiple sound categories not only within but across subjects. However, the across-subject variation affects classification performance more than the within-subject variation, as the across-subject analysis has significantly lower classification accuracies. Sound category-selective brain maps were identified based on multi-class classification and revealed distributed patterns of brain activity in the superior temporal gyrus and the middle temporal gyrus. This is in accordance with previous studies, indicating that information in the spatially distributed patterns may reflect a more abstract perceptual level of representation of sound categories. 
Further, we show that the across-subject classification performance can be significantly improved by averaging the fMRI images over items, because the irrelevant variations between different items of the same sound category are reduced and in turn the proportion of signals relevant to sound categorization increases. PMID:25692885
Decoding multiple sound categories in the human temporal cortex using high resolution fMRI.

PubMed

Zhang, Fengqing; Wang, Ji-Ping; Kim, Jieun; Parrish, Todd; Wong, Patrick C M

2015-01-01

Perception of sound categories is an important aspect of auditory perception. The extent to which the brain's representation of sound categories is encoded in specialized subregions or distributed across the auditory cortex remains unclear. Recent studies using multivariate pattern analysis (MVPA) of brain activations have provided important insights into how the brain decodes perceptual information. In the large existing literature on brain decoding using MVPA methods, relatively few studies have been conducted on multi-class categorization in the auditory domain. Here, we investigated the representation and processing of auditory categories within the human temporal cortex using high resolution fMRI and MVPA methods. More importantly, we considered decoding multiple sound categories simultaneously through multi-class support vector machine-recursive feature elimination (MSVM-RFE) as our MVPA tool. Results show that for all classifications the model MSVM-RFE was able to learn the functional relation between the multiple sound categories and the corresponding evoked spatial patterns and classify the unlabeled sound-evoked patterns significantly above chance. This indicates the feasibility of decoding multiple sound categories not only within but across subjects. However, the across-subject variation affects classification performance more than the within-subject variation, as the across-subject analysis has significantly lower classification accuracies. Sound category-selective brain maps were identified based on multi-class classification and revealed distributed patterns of brain activity in the superior temporal gyrus and the middle temporal gyrus. This is in accordance with previous studies, indicating that information in the spatially distributed patterns may reflect a more abstract perceptual level of representation of sound categories. 
Further, we show that the across-subject classification performance can be significantly improved by averaging the fMRI images over items, because the irrelevant variations between different items of the same sound category are reduced and in turn the proportion of signals relevant to sound categorization increases.
Is the Definition of Roma an Important Matter? The Parallel Application of Self and External Classification of Ethnicity in a Population-Based Health Interview Survey.

PubMed

Janka, Eszter Anna; Vincze, Ferenc; Ádány, Róza; Sándor, János

2018-02-16

The Roma population is typified by a poor and, due to difficulties in ethnicity assessment, poorly documented health status. We aimed to compare the usefulness of self-reporting and observer-reporting in Roma classification for surveys investigating differences between Roma and non-Roma populations. Both self-reporting and observer-reporting of Roma ethnicity were applied in a population-based health interview survey. A questionnaire was completed by 1849 people aged 18-64 years; this questionnaire provided information on 52 indicators (morbidity, functionality, lifestyle, social capital, accidents, healthcare use). Multivariate logistic regression models controlling for age, sex, education and employment were used to produce indicators for differences between the self-reported Roma (N = 124) and non-Roma (N = 1725) populations, as well as between observer-reported Roma (N = 179) and non-Roma populations (N = 1670). Differences between interviewer-reported and self-reported individuals of Roma ethnicity in statistical inferences were observed for only seven indicators. The self-reporting approach was more sensitive for two indicators, and the observer-reported assessment for five indicators. Based on our results, the self-reported identity can be considered as a useful approach, and the application of observer-reporting cannot considerably increase the usefulness of a survey, because the differences between Roma and non-Roma individuals are much bigger than the differences between indicators produced by self-reported or observer-reported data on individuals of Roma ethnicity.
Sparse representation based biomarker selection for schizophrenia with integrated analysis of fMRI and SNPs.

PubMed

Cao, Hongbao; Duan, Junbo; Lin, Dongdong; Shugart, Yin Yao; Calhoun, Vince; Wang, Yu-Ping

2014-11-15

Integrative analysis of multiple data types can take advantage of their complementary information and therefore may provide higher power to identify potential biomarkers that would be missed using individual data analysis. Because data modalities differ in nature, data integration is challenging. Here we address the data integration problem by developing a generalized sparse model (GSM) using weighting factors to integrate multi-modality data for biomarker selection. As an example, we applied the GSM model to a joint analysis of two types of schizophrenia data sets: 759,075 SNPs and 153,594 functional magnetic resonance imaging (fMRI) voxels in 208 subjects (92 cases/116 controls). To solve this small-sample-large-variable problem, we developed a novel sparse representation based variable selection (SRVS) algorithm, with the primary aim to identify biomarkers associated with schizophrenia. To validate the effectiveness of the selected variables, we performed multivariate classification followed by a ten-fold cross validation. We compared our proposed SRVS algorithm with an earlier sparse model based variable selection algorithm for integrated analysis. In addition, we compared it with traditional statistical methods for univariate data analysis (Chi-squared test for SNP data and ANOVA for fMRI data). Results showed that our proposed SRVS method can identify novel biomarkers that show stronger capability in distinguishing schizophrenia patients from healthy controls. Moreover, better classification ratios were achieved using biomarkers from both types of data, suggesting the importance of integrative analysis. Copyright © 2014 Elsevier Inc. All rights reserved.
A hybrid PCA-CART-MARS-based prognostic approach of the remaining useful life for aircraft engines.

PubMed

Sánchez Lasheras, Fernando; García Nieto, Paulino José; de Cos Juez, Francisco Javier; Mayo Bayón, Ricardo; González Suárez, Victor Manuel

2015-03-23

Prognostics is an engineering discipline that predicts the future health of a system. In this research work, a data-driven approach for prognostics is proposed. Indeed, the present paper describes a data-driven hybrid model for the successful prediction of the remaining useful life of aircraft engines. The approach combines the multivariate adaptive regression splines (MARS) technique with the principal component analysis (PCA), dendrograms and classification and regression trees (CARTs). Elements extracted from sensor signals are used to train this hybrid model, representing different levels of health for aircraft engines. In this way, this hybrid algorithm is used to predict the trends of these elements. Based on this fitting, one can determine the future health state of a system and estimate its remaining useful life (RUL) with accuracy. To evaluate the proposed approach, a test was carried out using aircraft engine signals collected from physical sensors (temperature, pressure, speed, fuel flow, etc.). Simulation results show that the PCA-CART-MARS-based approach can forecast faults long before they occur and can predict the RUL. The proposed hybrid model presents as its main advantage the fact that it does not require information about the previous operation states of the input variables of the engine. The performance of this model was compared with those obtained by other benchmark models (multivariate linear regression and artificial neural networks) also applied in recent years for the modeling of remaining useful life. Therefore, the PCA-CART-MARS-based approach is very promising in the field of prognostics of the RUL for aircraft engines.

A Hybrid PCA-CART-MARS-Based Prognostic Approach of the Remaining Useful Life for Aircraft Engines

PubMed Central

Lasheras, Fernando Sánchez; Nieto, Paulino José García; de Cos Juez, Francisco Javier; Bayón, Ricardo Mayo; Suárez, Victor Manuel González

2015-01-01

Prognostics is an engineering discipline that predicts the future health of a system. In this research work, a data-driven approach for prognostics is proposed. Indeed, the present paper describes a data-driven hybrid model for the successful prediction of the remaining useful life of aircraft engines. The approach combines the multivariate adaptive regression splines (MARS) technique with the principal component analysis (PCA), dendrograms and classification and regression trees (CARTs). Elements extracted from sensor signals are used to train this hybrid model, representing different levels of health for aircraft engines. In this way, this hybrid algorithm is used to predict the trends of these elements. Based on this fitting, one can determine the future health state of a system and estimate its remaining useful life (RUL) with accuracy. To evaluate the proposed approach, a test was carried out using aircraft engine signals collected from physical sensors (temperature, pressure, speed, fuel flow, etc.). Simulation results show that the PCA-CART-MARS-based approach can forecast faults long before they occur and can predict the RUL. The proposed hybrid model presents as its main advantage the fact that it does not require information about the previous operation states of the input variables of the engine. The performance of this model was compared with those obtained by other benchmark models (multivariate linear regression and artificial neural networks) also applied in recent years for the modeling of remaining useful life. Therefore, the PCA-CART-MARS-based approach is very promising in the field of prognostics of the RUL for aircraft engines. PMID:25806876
Predictive factors in patients with hepatocellular carcinoma receiving sorafenib therapy using time-dependent receiver operating characteristic analysis.

PubMed

Nishikawa, Hiroki; Nishijima, Norihiro; Enomoto, Hirayuki; Sakamoto, Azusa; Nasu, Akihiro; Komekado, Hideyuki; Nishimura, Takashi; Kita, Ryuichi; Kimura, Toru; Iijima, Hiroko; Nishiguchi, Shuhei; Osaki, Yukio

2017-01-01

To investigate variables before sorafenib therapy on the clinical outcomes in hepatocellular carcinoma (HCC) patients receiving sorafenib and to further assess and compare the predictive performance of continuous parameters using time-dependent receiver operating characteristics (ROC) analysis. A total of 225 HCC patients were analyzed. We retrospectively examined factors related to overall survival (OS) and progression free survival (PFS) using univariate and multivariate analyses. Subsequently, we performed time-dependent ROC analysis of continuous parameters which were significant in the multivariate analysis in terms of OS and PFS. Total sum of area under the ROC in all time points (defined as TAAT score) in each case was calculated. Our cohort included 175 male and 50 female patients (median age, 72 years) and included 158 Child-Pugh A and 67 Child-Pugh B patients. The median OS time was 0.68 years, while the median PFS time was 0.24 years. On multivariate analysis, gender, body mass index (BMI), Child-Pugh classification, extrahepatic metastases, tumor burden, aspartate aminotransferase (AST) and alpha-fetoprotein (AFP) were identified as significant predictors of OS and ECOG-performance status, Child-Pugh classification and extrahepatic metastases were identified as significant predictors of PFS. Among three continuous variables (i.e., BMI, AST and AFP), AFP had the highest TAAT score for the entire cohort. In subgroup analyses, AFP had the highest TAAT score except for Child-Pugh B and female among three continuous variables. In continuous variables, AFP could have higher predictive accuracy for survival in HCC patients undergoing sorafenib therapy.
ATLS Hypovolemic Shock Classification by Prediction of Blood Loss in Rats Using Regression Models.

PubMed

Choi, Soo Beom; Choi, Joon Yul; Park, Jee Soo; Kim, Deok Won

2016-07-01

In our previous study, our input data set consisted of 78 rats, the blood loss in percent as a dependent variable, and 11 independent variables (heart rate, systolic blood pressure, diastolic blood pressure, mean arterial pressure, pulse pressure, respiration rate, temperature, perfusion index, lactate concentration, shock index, and new index (lactate concentration/perfusion)). The machine learning methods for multicategory classification were applied to a rat model in acute hemorrhage to predict the four Advanced Trauma Life Support (ATLS) hypovolemic shock classes for triage in our previous study. However, multicategory classification is much more difficult and complicated than binary classification. We introduce a simple approach for classifying ATLS hypovolemic shock class by predicting blood loss in percent using support vector regression and multivariate linear regression (MLR). We also compared the performance of the classification models using absolute and relative vital signs. The accuracies of support vector regression and MLR models with relative values by predicting blood loss in percent were 88.5% and 84.6%, respectively. These were better than the best accuracy of 80.8% of the direct multicategory classification using the support vector machine one-versus-one model in our previous study for the same validation data set. Moreover, the simple MLR models with both absolute and relative values could form the basis of a future clinical decision support system for ATLS classification. The perfusion index and new index were more appropriate with relative changes than absolute values.
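The regression-then-threshold idea in this abstract can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names and the example values are hypothetical, the cut-offs (15%, 30% and 40% blood loss) are the commonly cited ATLS class boundaries, and the handling of values that fall exactly on a boundary is an assumption.

```python
def atls_class(blood_loss_percent: float) -> int:
    """Map a blood loss estimate (percent of blood volume) to an ATLS class 1-4.

    Boundary handling (which class gets exactly 15, 30 or 40) is an assumption.
    """
    if blood_loss_percent < 15:
        return 1
    if blood_loss_percent < 30:
        return 2
    if blood_loss_percent <= 40:
        return 3
    return 4


def class_accuracy(predicted_loss, true_loss) -> float:
    """Fraction of cases whose predicted and true blood loss fall in the same class."""
    hits = sum(atls_class(p) == atls_class(t) for p, t in zip(predicted_loss, true_loss))
    return hits / len(true_loss)


# Hypothetical regression outputs and measured values (percent blood loss).
predicted = [10.0, 22.5, 33.0, 47.0]
measured = [12.0, 31.0, 35.0, 45.0]
print(class_accuracy(predicted, measured))
```

Any regressor (support vector regression or MLR in the abstract) can supply the predicted percentages; only the thresholding step is shown here.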
FT-IR spectroscopy and multivariate analysis as an auxiliary tool for diagnosis of mental disorders: Bipolar and schizophrenia cases

NASA Astrophysics Data System (ADS)

Ogruc Ildiz, G.; Arslan, M.; Unsalan, O.; Araujo-Andrade, C.; Kurt, E.; Karatepe, H. T.; Yilmaz, A.; Yalcinkaya, O. B.; Herken, H.

2016-01-01

In this study, a methodology based on Fourier-transform infrared spectroscopy and principal component analysis and partial least square methods is proposed for the analysis of blood plasma samples in order to identify spectral changes correlated with some biomarkers associated with schizophrenia and bipolarity. Our main goal was to use the spectral information for the calibration of statistical models to discriminate and classify blood plasma samples belonging to bipolar and schizophrenic patients. IR spectra were collected from 30 blood plasma samples from each of the bipolar, schizophrenic and healthy control groups. The results obtained from principal component analysis (PCA) show a clear discrimination between the bipolar (BP), schizophrenic (SZ) and control group (CG) blood samples and also make it possible to identify three main regions that show the major differences correlated with both mental disorders (biomarkers). Furthermore, a model for the classification of the blood samples was calibrated using partial least square discriminant analysis (PLS-DA), allowing the correct classification of BP, SZ and CG samples. The results obtained applying this methodology suggest that it can be used as a complementary diagnostic tool for the detection and discrimination of these mental diseases.
Accurate Identification of MCI Patients via Enriched White-Matter Connectivity Network

NASA Astrophysics Data System (ADS)

Wee, Chong-Yaw; Yap, Pew-Thian; Brownyke, Jeffery N.; Potter, Guy G.; Steffens, David C.; Welsh-Bohmer, Kathleen; Wang, Lihong; Shen, Dinggang

Mild cognitive impairment (MCI), often a prodromal phase of Alzheimer's disease (AD), is frequently considered to be a good target for early diagnosis and therapeutic interventions of AD. The recent emergence of reliable network characterization techniques has made understanding neurological disorders at a whole brain connectivity level possible. Accordingly, we propose a network-based multivariate classification algorithm, using a collection of measures derived from white-matter (WM) connectivity networks, to accurately identify MCI patients from normal controls. An enriched description of WM connections, utilizing six physiological parameters, i.e., fiber penetration count, fractional anisotropy (FA), mean diffusivity (MD), and principal diffusivities (λ1, λ2, λ3), results in six connectivity networks for each subject to account for the connection topology and the biophysical properties of the connections. Upon parcellating the brain into 90 regions-of-interest (ROIs), the average statistics of each ROI in relation to the remaining ROIs are extracted as features for classification. These features are then sieved to select the most discriminant subset of features for building an MCI classifier via support vector machines (SVMs). Cross-validation results indicate better diagnostic power of the proposed enriched WM connection description than simple description with any single physiological parameter.
Intraoperative Raman Spectroscopy of Soft Tissue Sarcomas

PubMed Central

Nguyen, John Q.; Gowani, Zain S.; O’Connor, Maggie; Pence, Isaac J.; Nguyen, The-Quyen; Holt, Ginger E.; Schwartz, Herbert S.; Halpern, Jennifer L.; Mahadevan-Jansen, Anita

2017-01-01

Background and Objective: Soft tissue sarcomas (STS) are a rare and heterogeneous group of malignant tumors that are often treated through surgical resection. Current intraoperative margin assessment methods are limited and highlight the need for an improved approach with respect to time and specificity. Here we investigate the potential of near-infrared Raman spectroscopy for the intraoperative differentiation of STS from surrounding normal tissue. Materials and Methods: In vivo Raman measurements at 785 nm excitation were intraoperatively acquired from subjects undergoing STS resection using a probe based spectroscopy system. A multivariate classification algorithm was developed in order to automatically identify spectral features that can be used to differentiate STS from the surrounding normal muscle and fat. The classification algorithm was subsequently tested using leave-one-subject-out cross-validation. Results: With the exclusion of well-differentiated liposarcomas, the algorithm was able to classify STS from the surrounding normal muscle and fat with a sensitivity and specificity of 89.5% and 96.4%, respectively. Conclusion: These results suggest that single point near-infrared Raman spectroscopy could be utilized as a rapid and non-destructive surgical guidance tool for identifying abnormal tissue margins in need of further excision. PMID:27454580
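Leave-one-subject-out cross-validation, as named in this abstract, keeps all measurements from one subject together in the test fold so that spectra from the same patient never appear in both training and test data. The sketch below only shows the splitting step; the function name and the subject labels are hypothetical, and the classifier itself is not specified in the abstract.

```python
from collections import defaultdict


def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices), one fold per subject."""
    by_subject = defaultdict(list)
    for index, subject in enumerate(subject_ids):
        by_subject[subject].append(index)
    for held_out, test_indices in by_subject.items():
        # Training data: every measurement that does not belong to the held-out subject.
        train_indices = [
            i for subject, indices in by_subject.items() if subject != held_out for i in indices
        ]
        yield held_out, train_indices, test_indices


# Hypothetical example: six spectra acquired from three subjects.
subjects = ["s1", "s1", "s2", "s3", "s3", "s3"]
folds = list(leave_one_subject_out(subjects))
for held_out, train, test in folds:
    print(held_out, train, test)
```

Splitting by subject rather than by spectrum avoids the optimistic bias that arises when correlated measurements from one patient are spread across folds.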
Intraoperative Raman spectroscopy of soft tissue sarcomas.

PubMed

Nguyen, John Q; Gowani, Zain S; O'Connor, Maggie; Pence, Isaac J; Nguyen, The-Quyen; Holt, Ginger E; Schwartz, Herbert S; Halpern, Jennifer L; Mahadevan-Jansen, Anita

2016-10-01

Soft tissue sarcomas (STS) are a rare and heterogeneous group of malignant tumors that are often treated through surgical resection. Current intraoperative margin assessment methods are limited and highlight the need for an improved approach with respect to time and specificity. Here we investigate the potential of near-infrared Raman spectroscopy for the intraoperative differentiation of STS from surrounding normal tissue. In vivo Raman measurements at 785 nm excitation were intraoperatively acquired from subjects undergoing STS resection using a probe based spectroscopy system. A multivariate classification algorithm was developed in order to automatically identify spectral features that can be used to differentiate STS from the surrounding normal muscle and fat. The classification algorithm was subsequently tested using leave-one-subject-out cross-validation. With the exclusion of well-differentiated liposarcomas, the algorithm was able to classify STS from the surrounding normal muscle and fat with a sensitivity and specificity of 89.5% and 96.4%, respectively. These results suggest that single point near-infrared Raman spectroscopy could be utilized as a rapid and non-destructive surgical guidance tool for identifying abnormal tissue margins in need of further excision. Lasers Surg. Med. 48:774-781, 2016. © 2016 Wiley Periodicals, Inc.
Chemometrics Methods for Specificity, Authenticity and Traceability Analysis of Olive Oils: Principles, Classifications and Applications.

PubMed

Messai, Habib; Farman, Muhammad; Sarraj-Laabidi, Abir; Hammami-Semmar, Asma; Semmar, Nabil

2016-11-17

Olive oils (OOs) show high chemical variability due to several factors of genetic, environmental and anthropic types. Genetic and environmental factors are responsible for natural compositions and polymorphic diversification resulting in different varietal patterns and phenotypes. Anthropic factors, however, are at the origin of different blends' preparation leading to normative, labelled or adulterated commercial products. Control of complex OO samples requires their (i) characterization by specific markers; (ii) authentication by fingerprint patterns; and (iii) monitoring by traceability analysis. These quality control and management aims require the use of several multivariate statistical tools: specificity highlighting requires ordination methods; authentication checking calls for classification and pattern recognition methods; traceability analysis implies the use of network-based approaches able to separate or extract mixed information and memorized signals from complex matrices. This chapter presents a review of different chemometrics methods applied for the control of OO variability from metabolic and physical-chemical measured characteristics. The different chemometrics methods are illustrated by different study cases on monovarietal and blended OO originated from different countries. Chemometrics tools offer multiple ways for quantitative evaluations and qualitative control of complex chemical variability of OO in relation to several intrinsic and extrinsic factors.
Evaluation of a multi-fibre needle Raman probe for tissue analysis

NASA Astrophysics Data System (ADS)

Fullwood, Leanne M.; Iping Petterson, Ingeborg E.; Dudgeon, Alexander P.; Lloyd, Gavin R.; Kendall, Catherine; Hall, Charlie; Day, John C. C.; Stone, Nick

2016-03-01

Raman spectroscopy is a rapid technique for the identification of cancers. Its coupling with a hypodermic needle provides a minimally invasive instrument with the potential to aid real time assessment of suspicious lesions in vivo and guide surgery. A fibre optic Raman needle probe was utilised in this study to evaluate the classification ability of the instrument as a diagnostic tool together with multivariate analysis, through measurements of tissues from different animal species as well as various different porcine tissue types. Cross validation was performed and preliminary classification accuracies were calculated as 100% for the identification of tissue type and 97.5% for the identification of animal species. A lymph node sample was also measured using the needle probe to assess the use of the technique for human tissue and hence its efficiency as a clinical instrument. This needle probe has been demonstrated to have the capabilities to classify tissue samples based on their biochemical components. The Raman needle probe also has the potential to act as a diagnostic and surgical tool to delineate cancerous from non-cancerous cells in real time, thus assisting complete removal of a tumour.
Postoperative chemoradiotherapy in patients with head and neck cancer aged 70 or older with positive margins or extranodal extension and the influence of nodal classification.

PubMed

Yoshida, Emi J; Luu, Michael; David, John M; Kim, Sungjin; Mita, Alain; Scher, Kevin; Shiao, Stephen L; Tighiouart, Mourad; Ho, Allen S; Zumsteg, Zachary S

2018-06-01

Postoperative concomitant chemoradiotherapy (CRT) improves outcomes for younger adults with head and neck squamous cell carcinoma (HNSCC) and positive margins or extranodal extension (ENE), but its benefit for older adults is not well established. Patients from the National Cancer Data Base (NCDB) with HNSCC undergoing curative-intent resection, neck dissection, and postoperative radiation with positive margins or ENE were identified. This analysis included 1199 patients aged ≥ 70 years with median follow-up of 42.6 months. Postoperative concurrent CRT was associated with improved overall survival (OS; hazard ratio [HR] 0.752; 95% confidence interval [CI] 0.638-0.886) compared to radiation alone in multivariable analysis. Three-year OS was 52.4% with CRT versus 43.4% with radiation (P = .012) in propensity-score matched cohorts. The survival impact of CRT varied by N classification (P = .002 for interaction), with benefit seen only in those with N2 to N3 disease. Postoperative concurrent CRT may benefit older patients with HNSCC with positive margins or ENE, particularly those with higher nodal burden. © 2018 Wiley Periodicals, Inc.
Parsing the roles of the frontal lobes and basal ganglia in task control using multivoxel pattern analysis

PubMed Central

Kehagia, Angie A.; Ye, Rong; Joyce, Dan W.; Doyle, Orla M.; Rowe, James B.; Robbins, Trevor W.

2017-01-01

Cognitive control has traditionally been associated with the prefrontal cortex, based on observations of deficits in patients with frontal lesions. However, evidence from patients with Parkinson’s disease (PD) indicates that subcortical regions also contribute to control under certain conditions. We scanned 17 healthy volunteers while they performed a task switching paradigm that previously dissociated performance deficits arising from frontal lesions in comparison with PD, as a function of the abstraction of the rules that are switched. From a multivoxel pattern analysis by Gaussian Process Classification (GPC), we then estimated the forward (generative) model to infer regional patterns of activity that predict Switch / Repeat behaviour between rule conditions. At 1000 permutations, Switch / Repeat classification accuracy for concrete rules was significant in the basal ganglia, but at chance in the frontal lobe. The inverse pattern was obtained for abstract rules, whereby the conditions were successfully discriminated in the frontal lobe but not in the basal ganglia. This double dissociation highlights the difference between cortical and subcortical contributions to cognitive control and demonstrates the utility of multivariate approaches in investigations of functions that rely on distributed and overlapping neural substrates. PMID:28387585
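The significance test "at 1000 permutations" mentioned in this abstract can be illustrated with a label-permutation sketch. It is a simplification under stated assumptions: a full analysis would normally retrain the classifier on every permuted labelling, whereas here fixed predictions are compared against shuffled labels; the function names, labels and predictions are all hypothetical.

```python
import random


def accuracy(predictions, labels):
    """Fraction of predictions that match the labels."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def permutation_p_value(predictions, labels, n_permutations=1000, seed=0):
    """Return (observed accuracy, p-value) from a label-permutation null distribution."""
    rng = random.Random(seed)
    observed = accuracy(predictions, labels)
    shuffled = list(labels)
    at_least_as_good = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break the link between predictions and labels
        if accuracy(predictions, shuffled) >= observed:
            at_least_as_good += 1
    # The +1 terms count the observed labelling itself, so the p-value is never zero.
    return observed, (at_least_as_good + 1) / (n_permutations + 1)


# Hypothetical data: 20 trials, two of which are misclassified.
labels = ["switch", "repeat"] * 10
predictions = list(labels)
predictions[0], predictions[1] = "repeat", "switch"
observed, p_value = permutation_p_value(predictions, labels)
print(observed, p_value)
```

With 1000 permutations the smallest attainable p-value is 1/1001, which bounds the resolution of the test.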
Ensemble support vector machine classification of dementia using structural MRI and mini-mental state examination.

PubMed

Sørensen, Lauge; Nielsen, Mads

2018-05-15

The International Challenge for Automated Prediction of MCI from MRI data offered independent, standardized comparison of machine learning algorithms for multi-class classification of normal control (NC), mild cognitive impairment (MCI), converting MCI (cMCI), and Alzheimer's disease (AD) using brain imaging and general cognition. We proposed to use an ensemble of support vector machines (SVMs) that combined bagging without replacement and feature selection. SVM is the most commonly used algorithm in multivariate classification of dementia, and it was therefore valuable to evaluate the potential benefit of ensembling this type of classifier. The ensemble SVM, using either a linear or a radial basis function (RBF) kernel, achieved multi-class classification accuracies of 55.6% and 55.0% in the challenge test set (60 NC, 60 MCI, 60 cMCI, 60 AD), resulting in a third place in the challenge. Similar feature subset sizes were obtained for both kernels, and the most frequently selected MRI features were the volumes of the two hippocampal subregions left presubiculum and right subiculum. Post-challenge analysis revealed that enforcing a minimum number of selected features and increasing the number of ensemble classifiers improved classification accuracy up to 59.1%. The ensemble SVM outperformed single SVM classifications consistently in the challenge test set. Ensemble methods using bagging and feature selection can improve the performance of the commonly applied SVM classifier in dementia classification. This resulted in competitive classification accuracies in the International Challenge for Automated Prediction of MCI from MRI data. Copyright © 2018 Elsevier B.V. All rights reserved.
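The ensemble idea in this abstract (training each member on a subsample drawn without replacement and combining members by voting) can be sketched as below. To stay self-contained, a toy one-feature nearest-centroid learner stands in for the SVM, and the feature selection step is omitted; all function names, parameter values and data are hypothetical.

```python
import random
from collections import Counter
from statistics import mean


def fit_nearest_centroid(xs, ys):
    """Toy base learner (stand-in for an SVM): one centroid per class on one feature."""
    centroids = {label: mean(x for x, y in zip(xs, ys) if y == label) for label in set(ys)}
    return lambda x: min(centroids, key=lambda label: abs(x - centroids[label]))


def fit_subsample_ensemble(xs, ys, n_models=11, fraction=0.7, seed=0):
    """Train n_models base learners, each on a subsample drawn without replacement."""
    rng = random.Random(seed)
    size = max(2, int(fraction * len(xs)))
    models = []
    while len(models) < n_models:
        indices = rng.sample(range(len(xs)), size)  # sampling without replacement
        sub_y = [ys[i] for i in indices]
        if len(set(sub_y)) < len(set(ys)):
            continue  # skip subsamples that miss a class entirely
        models.append(fit_nearest_centroid([xs[i] for i in indices], sub_y))
    return models


def predict_majority(models, x):
    """Combine the ensemble members by majority vote."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]


# Hypothetical one-dimensional feature for two diagnostic groups.
xs = [1.0, 1.2, 0.8, 1.1, 3.0, 3.2, 2.9, 3.1]
ys = ["NC"] * 4 + ["AD"] * 4
ensemble = fit_subsample_ensemble(xs, ys)
print(predict_majority(ensemble, 1.0), predict_majority(ensemble, 3.05))
```

An odd number of members avoids tied votes in the two-class case; the abstract reports that increasing the number of ensemble classifiers improved accuracy, which corresponds to raising `n_models` here.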
Sex estimation standards for medieval and contemporary Croats

PubMed Central

Bašić, Željana; Kružić, Ivana; Jerković, Ivan; Anđelinović, Deny; Anđelinović, Šimun

2017-01-01

Aim: To develop discriminant functions for sex estimation in a medieval Croatian population and test their application to a contemporary Croatian population. Methods: From a total of 519 skeletons, we chose 84 excellently preserved adult skeletons free of antemortem and postmortem changes and took all standard measurements. Sex was estimated/determined using standard anthropological procedures and ancient DNA (amelogenin analysis) where the pelvis was insufficiently preserved or where morphological sex indicators were not consistent. We explored which measurements showed sexual dimorphism and used them for developing univariate and multivariate discriminant functions for sex estimation. We included only those functions that reached an accuracy rate ≥80%. We tested the applicability of the developed functions on a modern Croatian sample (n = 37). Results: Of the 69 standard skeletal measurements used in this study, 56 showed statistically significant sexual dimorphism (74.7%). We developed five univariate discriminant functions with classification rates of 80.6%-85.2% and seven multivariate discriminant functions with accuracy rates of 81.8%-93.0%. When tested on the modern population, the functions showed classification rates of 74.1%-100%, and ten of them reached the target accuracy rate. Females showed higher classification rates in the medieval populations, whereas males were better classified in the modern populations. Conclusion: The developed discriminant functions are sufficiently accurate for reliable sex estimation in both the medieval Croatian population and modern Croatian samples and may be used in forensic settings. The methodological issues that emerged regarding the importance of considering external factors in the development and application of discriminant functions for sex estimation should be further explored. PMID:28613039
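A univariate discriminant function of the kind described in this abstract can be illustrated with a sectioning point. The sketch assumes equal group variances and equal prior probabilities, in which case the cut-off is the midpoint between the male and female means, and it assumes males have the larger mean. The measurement values and function names are hypothetical and are not taken from the study.

```python
from statistics import mean


def fit_sectioning_point(male_values, female_values):
    """Midpoint between group means (equal variances and priors assumed)."""
    return (mean(male_values) + mean(female_values)) / 2


def estimate_sex(value, sectioning_point):
    """Classify as male above the cut-off; assumes males have the larger mean."""
    return "male" if value > sectioning_point else "female"


def classification_rate(values, sexes, sectioning_point):
    """Fraction of individuals whose estimated sex matches the known sex."""
    hits = sum(estimate_sex(v, sectioning_point) == s for v, s in zip(values, sexes))
    return hits / len(values)


# Hypothetical reference sample: one skeletal measurement in millimetres.
males = [47.0, 49.0, 48.0, 46.0]
females = [41.0, 42.0, 44.0, 43.0]
cutoff = fit_sectioning_point(males, females)

# Hypothetical test sample with known sex, analogous to testing on a second population.
test_values = [46.5, 44.0, 45.5, 41.0]
test_sexes = ["male", "female", "female", "female"]
print(cutoff, classification_rate(test_values, test_sexes, cutoff))
```

Fitting the cut-off on one sample and scoring it on another mirrors the abstract's point that classification rates can shift when a function is applied to a different population.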
Prognostic factors of non-functioning pancreatic neuroendocrine tumor revisited: The value of WHO 2010 classification.

PubMed

Bu, Jiyoung; Youn, Sangmin; Kwon, Wooil; Jang, Kee Taek; Han, Sanghyup; Han, Sunjong; You, Younghun; Heo, Jin Seok; Choi, Seong Ho; Choi, Dong Wook

2018-02-01

Various factors have been reported as prognostic factors of non-functional pancreatic neuroendocrine tumors (NF-pNETs). There remains some controversy as to the factors which might actually serve to successfully prognosticate future manifestation and diagnosis of NF-pNETs. As well, consensus regarding management strategy has never been achieved. The aim of this study is to further investigate potential prognostic factors using a large single-center cohort to help determine the management strategy of NF-pNETs. During the time period 1995 through 2013, 166 patients with NF-pNETs who underwent surgery in Samsung Medical Center were entered in a prospective database, and those factors thought to represent predictors of prognosis were tested in uni- and multivariate models. The median follow-up time was 46.5 months; there was a maximum follow-up period of 217 months. The five-year overall survival and disease-free survival rates were 88.5% and 77.0%, respectively. The 2010 WHO classification was found to be the only prognostic factor which affects overall survival and disease-free survival in multivariate analysis. Also, pathologic tumor size and preoperative image tumor size correlated strongly with the WHO grades (p < 0.001 and p < 0.001). Our study demonstrates that 2010 WHO classification represents a valuable prognostic factor of NF-pNETs and tumor size on preoperative image correlated with WHO grade. In view of the foregoing, the preoperative image size is thought to represent a reasonable reference with regard to determination and development of treatment strategy of NF-pNETs.
Preparation and infrared/raman classification of 630 spectroscopically encoded styrene copolymers.

PubMed

Fenniri, Hicham; Chun, Sangki; Terreau, Owen; Bravo-Vasquez, Juan-Pablo

2008-01-01

The barcoded resins (BCRs) were introduced recently as a platform for encoded combinatorial chemistry. One of the main challenges yet to be overcome is the demonstration that a large number of BCRs could be generated and classified with high confidence. Here, we describe the synthesis and classification of 630 polystyrene-based copolymers prepared from the combinatorial association of 15 spectroscopically active styrene monomers. Each of the 630 copolymers displayed a unique vibrational fingerprint (infrared and Raman), which was converted into a spectral vector. To each of the 630 copolymers, a vector of the known (reference) composition was assigned. Unknown (prediction) vectors were decoded using multivariate data analysis. From the inner product of the reference and prediction vectors, a correlation map comparing 396 900 copolymer pairs (630 x 630) was generated. In 100% of the cases, the highest correlation was obtained for polymer pairs in which the reference and prediction vectors correspond to copolymers prepared from identical styrene monomers, thus demonstrating the high reliability of this encoding strategy. We have also established that the spectroscopic barcodes generated from the Raman and infrared spectra are independent of the copolymers' morphology (beaded versus bulk polymers). Besides the demonstration of the generality of the polymer barcoding strategy, the analytical methods developed here could in principle be extended to the investigation of the composition and purity of any other synthetic polymer and biopolymer library, or even scaffold-based combinatorial libraries.
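The decoding step described above (inner products between reference and prediction vectors, then picking the highest correlation) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the unit-length normalization, and the three toy "barcodes" are all assumptions made for the example.

```python
import math

def normalize(v):
    """Scale a vector to unit length so inner products behave like correlations."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def correlation_map(reference, prediction):
    """Return M where M[i][j] is the inner product of prediction i and reference j."""
    ref = [normalize(r) for r in reference]
    pred = [normalize(p) for p in prediction]
    return [[sum(a * b for a, b in zip(p, r)) for r in ref] for p in pred]

def decode(reference, prediction):
    """Assign each prediction vector to the reference with the highest correlation."""
    cmap = correlation_map(reference, prediction)
    return [max(range(len(row)), key=row.__getitem__) for row in cmap]

# Hypothetical toy data: three reference barcodes over four spectral channels.
reference = [[1, 0, 1, 0], [0, 1, 1, 0], [1, 1, 0, 1]]
# Hypothetical noisy measurements of the same three copolymers.
prediction = [[0.9, 0.1, 1.1, 0.0], [0.1, 1.0, 0.8, 0.1], [1.0, 0.9, 0.1, 1.1]]
decoded = decode(reference, prediction)
```

With 630 reference and 630 prediction vectors the same map would hold 630 x 630 = 396 900 entries, matching the count given in the abstract.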
EuroFlow antibody panels for standardized n-dimensional flow cytometric immunophenotyping of normal, reactive and malignant leukocytes

PubMed Central

van Dongen, J J M; Lhermitte, L; Böttcher, S; Almeida, J; van der Velden, V H J; Flores-Montero, J; Rawstron, A; Asnafi, V; Lécrevisse, Q; Lucio, P; Mejstrikova, E; Szczepański, T; Kalina, T; de Tute, R; Brüggemann, M; Sedek, L; Cullen, M; Langerak, A W; Mendonça, A; Macintyre, E; Martin-Ayuso, M; Hrusak, O; Vidriales, M B; Orfao, A

2012-01-01

Most consensus leukemia & lymphoma antibody panels consist of lists of markers based on expert opinions, but they have not been validated. Here we present the validated EuroFlow 8-color antibody panels for immunophenotyping of hematological malignancies. The single-tube screening panels and multi-tube classification panels fit into the EuroFlow diagnostic algorithm with entries defined by clinical and laboratory parameters. The panels were constructed in 2–7 sequential design–evaluation–redesign rounds, using novel Infinicyt software tools for multivariate data analysis. Two groups of markers are combined in each 8-color tube: (i) backbone markers to identify distinct cell populations in a sample, and (ii) markers for characterization of specific cell populations. In multi-tube panels, the backbone markers were optimally placed at the same fluorochrome position in every tube, to provide identical multidimensional localization of the target cell population(s). The characterization markers were positioned according to the diagnostic utility of the combined markers. Each proposed antibody combination was tested against reference databases of normal and malignant cells from healthy subjects and WHO-based disease entities, respectively. The EuroFlow studies resulted in validated and flexible 8-color antibody panels for multidimensional identification and characterization of normal and aberrant cells, optimally suited for immunophenotypic screening and classification of hematological malignancies. PMID:22552007
Using color histograms and SPA-LDA to classify bacteria.

PubMed

de Almeida, Valber Elias; da Costa, Gean Bezerra; de Sousa Fernandes, David Douglas; Gonçalves Dias Diniz, Paulo Henrique; Brandão, Deysiane; de Medeiros, Ana Claudia Dantas; Véras, Germano

2014-09-01

In this work, a new approach is proposed to verify the differentiating characteristics of five bacteria (Escherichia coli, Enterococcus faecalis, Streptococcus salivarius, Streptococcus oralis, and Staphylococcus aureus) by using digital images obtained with a simple webcam and variable selection by the Successive Projections Algorithm associated with Linear Discriminant Analysis (SPA-LDA). In this sense, color histograms in the red-green-blue (RGB), hue-saturation-value (HSV), and grayscale channels and their combinations were used as input data, and statistically evaluated by using different multivariate classifiers (Soft Independent Modeling by Class Analogy (SIMCA), Principal Component Analysis-Linear Discriminant Analysis (PCA-LDA), Partial Least Squares Discriminant Analysis (PLS-DA) and Successive Projections Algorithm-Linear Discriminant Analysis (SPA-LDA)). The bacteria strains were cultivated in a nutritive blood agar base layer for 24 h by following the Brazilian Pharmacopoeia, maintaining the status of cell growth and the nature of nutrient solutions under the same conditions. The best result in classification was obtained by using RGB and SPA-LDA, which reached 94 and 100 % of classification accuracy in the training and test sets, respectively. This result is extremely positive from the viewpoint of routine clinical analyses, because it avoids bacterial identification based on phenotypic identification of the causative organism using Gram staining, culture, and biochemical proofs. Therefore, the proposed method presents inherent advantages, promoting a simpler, faster, and low-cost alternative for bacterial identification.
Probability of identification: adulteration of American Ginseng with Asian Ginseng.

PubMed

Harnly, James; Chen, Pei; Harrington, Peter De B

2013-01-01

The AOAC INTERNATIONAL guidelines for validation of botanical identification methods were applied to the detection of Asian Ginseng [Panax ginseng (PG)] as an adulterant for American Ginseng [P. quinquefolius (PQ)] using spectral fingerprints obtained by flow injection mass spectrometry (FIMS). Samples of 100% PQ and 100% PG were physically mixed to provide 90, 80, and 50% PQ. The multivariate FIMS fingerprint data were analyzed using soft independent modeling of class analogy (SIMCA) based on 100% PQ. The Q statistic, a measure of the degree of non-fit of the test samples with the calibration model, was used as the analytical parameter. FIMS was able to discriminate between 100% PQ and 100% PG, and between 100% PQ and 90, 80, and 50% PQ. The probability of identification (POI) curve was estimated based on the SD of 90% PQ. A digital model of adulteration, obtained by mathematically summing the experimentally acquired spectra of 100% PQ and 100% PG in the desired ratios, agreed well with the physical data and provided an easy and more accurate method for constructing the POI curve. Two chemometric modeling methods, SIMCA and fuzzy optimal associative memories, and two classification methods, partial least squares-discriminant analysis and fuzzy rule-building expert systems, were applied to the data. The modeling methods correctly identified the adulterated samples; the classification methods did not.
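The "digital model of adulteration" mentioned above, in which spectra of the pure materials are summed in the desired ratios, might look like the sketch below. The function name, the three-point spectra, and the intensity values are hypothetical; real FIMS fingerprints would have many more channels.

```python
def mix_spectra(pq, pg, pq_fraction):
    """Weighted sum of two spectra; pq_fraction is the share of 100% PQ (0 to 1)."""
    if not 0.0 <= pq_fraction <= 1.0:
        raise ValueError("pq_fraction must lie between 0 and 1")
    if len(pq) != len(pg):
        raise ValueError("both spectra must share the same axis")
    return [pq_fraction * a + (1.0 - pq_fraction) * b for a, b in zip(pq, pg)]

# Hypothetical intensities for the two pure materials.
pq_spectrum = [10.0, 0.0, 4.0]
pg_spectrum = [0.0, 8.0, 4.0]

# Simulated 90, 80 and 50% PQ samples, mirroring the physical mixtures in the abstract.
mixtures = {pct: mix_spectra(pq_spectrum, pg_spectrum, pct / 100) for pct in (90, 80, 50)}
```

A simulated series like this can be generated at any ratio, which is presumably why the abstract calls it an easier route to the POI curve than preparing physical mixtures.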
A graduated food addiction classifications approach significantly differentiates depression, anxiety and stress among people with type 2 diabetes.

PubMed

Raymond, Karren-Lee; Kannis-Dymand, Lee; Lovell, Geoff P

2017-10-01

To examine differences in depression, anxiety, and stress across people with type 2 diabetes mellitus (t2d) classified according to a four-level processed food addiction (PFA) severity indicator dichotomy. Four hundred and eight participants with a t2d diagnosis completed an online survey including the Yale Food Addiction Scale (YFAS) and the DASS-21. Based on YFAS symptom counts, participants were classified as either: non-PFA; mild-PFA; moderate-PFA; or severe-PFA. Multivariate, λ=0.422, F(9,978.51)=46.286, p<0.001, ηp²=0.250, and univariate analyses of variance demonstrated that depression F(3,408)=159.891, p<0.001, ηp²=0.543, anxiety F(3,408)=127.419, p<0.001, ηp²=0.486, and stress scores F(3,408)=129.714, p<0.001, ηp²=0.491, significantly and meaningfully increased from one PFA classification level to the next. Furthermore, the proportions of participants with more severe classifications of depression χ²(12)=297.820, p<0.001, anxiety χ²(12)=271.805, p<0.001, and stress χ²(12)=240.875, p<0.001, were significantly higher in the more severe PFA groupings. For people with t2d, PFA is an important and meaningful associate of depression, anxiety, and stress, and the adopted four-level PFA severity indicator dichotomy is valid and useful. Copyright © 2017 Elsevier B.V. All rights reserved.
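The partial eta squared effect sizes quoted above can be related to the F statistics through the standard identity: partial eta squared = (F x df_effect) / (F x df_effect + df_error). The sketch below applies it to the univariate F values from the abstract. One loud assumption: the error degrees of freedom are taken as 404 (408 participants minus four groups), whereas the abstract prints F(3,408).

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# F values as printed in the abstract.
reported_f = {"depression": 159.891, "anxiety": 127.419, "stress": 129.714}

# ASSUMPTION: df_error = 404 (408 participants - 4 PFA groups), not stated in the source.
effect_sizes = {
    name: round(partial_eta_squared(f, 3, 404), 3) for name, f in reported_f.items()
}
```

Under that assumption the computed values round to the 0.543, 0.486 and 0.491 given in the abstract.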
Hyperspectral image reconstruction using RGB color for foodborne pathogen detection on agar plates

NASA Astrophysics Data System (ADS)

Yoon, Seung-Chul; Shin, Tae-Sung; Park, Bosoon; Lawrence, Kurt C.; Heitschmidt, Gerald W.

2014-03-01

This paper reports the latest development of a color vision technique for detecting colonies of foodborne pathogens grown on agar plates with a hyperspectral image classification model that was developed using full hyperspectral data. The hyperspectral classification model depended on reflectance spectra measured in the visible and near-infrared spectral range from 400 to 1,000 nm (473 narrow spectral bands). Multivariate regression methods were used to estimate and predict hyperspectral data from RGB color values. The six representative non-O157 Shiga-toxin producing Escherichia coli (STEC) serogroups (O26, O45, O103, O111, O121, and O145) were grown on Rainbow agar plates. A line-scan pushbroom hyperspectral image sensor was used to scan 36 agar plates, each grown with pure STEC colonies. The 36 hyperspectral images of the agar plates were divided in half to create training and test sets. The mean R-squared value for hyperspectral image estimation was about 0.98 in the spectral range between 400 and 700 nm for linear, quadratic and cubic polynomial regression models, and the detection accuracy of the hyperspectral image classification model with principal component analysis and k-nearest neighbors for the test set was up to 92% (99% with the original hyperspectral images). Thus, the results of the study suggested that color-based detection may be viable as a multispectral imaging solution without much loss of prediction accuracy compared to hyperspectral imaging.

Pancreatic abnormalities detected by endoscopic ultrasound (EUS) in patients without clinical signs of pancreatic disease: any difference between standard and Rosemont classification scoring?

PubMed

Petrone, Maria Chiara; Terracciano, Fulvia; Perri, Francesco; Carrara, Silvia; Cavestro, Giulia Martina; Mariani, Alberto; Testoni, Pier Alberto; Arcidiacono, Paolo Giorgio

2014-01-01

The prevalence of nine EUS features of chronic pancreatitis (CP) according to the standard Wiersema classification has been investigated in 489 patients undergoing EUS for an indication not related to pancreatico-biliary disease. We showed that 82 subjects (16.8%) had at least one ductular or parenchymal abnormality. Among them, 18 (3.7% of study population) had ≥3 Wiersema criteria suggestive of CP. Recently, a new classification (Rosemont) of EUS findings consistent, suggestive or indeterminate for CP has been proposed. To stratify healthy subjects into different subgroups on the basis of EUS features of CP according to the Wiersema and Rosemont classifications and to evaluate the agreement in the diagnosis of CP with the two scoring systems. Weighted kappa statistics was computed to evaluate the strength of agreement between the two scoring systems. Univariate and multivariate analysis between any EUS abnormality and habits were performed. Eighty-two EUS videos were reviewed. Using the Wiersema classification, 18 subjects showed ≥3 EUS features suggestive of CP. The EUS diagnosis of CP in these 18 subjects was considered as consistent in only one patient, according to Rosemont classification. Weighted Kappa statistics was 0.34 showing that the strength of agreement was 'fair'. Alcohol use and smoking were identified as risk factors for having pancreatic abnormalities on EUS. The prevalence of EUS features consistent or suggestive of CP in healthy subjects according to the Rosemont classification is lower than that assessed by Wiersema criteria. In that regard the Rosemont classification seems to be more accurate in excluding clinically relevant CP. Overall agreement between the two classifications is fair. Copyright © 2014 IAP and EPC. Published by Elsevier B.V. All rights reserved.
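The agreement measure used above, weighted kappa, can be sketched as below. The abstract does not say which weighting scheme was applied, so linear disagreement weights are an assumption here, as are the function name and the toy ratings (ordinal codes 0, 1, 2 standing in for hypothetical severity grades).

```python
def weighted_kappa(rater_a, rater_b, n_categories):
    """Linearly weighted Cohen's kappa for ordinal codes 0..n_categories-1."""
    n = len(rater_a)
    k = n_categories
    # Cross-tabulate the two ratings.
    counts = [[0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        counts[a][b] += 1
    row = [sum(counts[i]) for i in range(k)]
    col = [sum(counts[i][j] for i in range(k)) for j in range(k)]

    def weight(i, j):
        # Linear disagreement weight: 0 on the diagonal, 1 for the extreme mismatch.
        return abs(i - j) / (k - 1)

    cells = [(i, j) for i in range(k) for j in range(k)]
    observed = sum(weight(i, j) * counts[i][j] for i, j in cells) / n
    expected = sum(weight(i, j) * row[i] * col[j] for i, j in cells) / (n * n)
    return 1.0 - observed / expected

# Hypothetical ratings of six subjects by two scoring systems.
system_1 = [0, 1, 2, 0, 1, 2]
system_2 = [0, 1, 2, 1, 2, 0]
kappa = weighted_kappa(system_1, system_2, 3)
```

A value of 1 means perfect agreement and 0 means agreement no better than chance; the 0.34 reported in the abstract falls in the range commonly labelled "fair".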
Melanocytoma-like melanoma may be the missing link between benign and malignant uveal melanocytic lesions in humans and dogs: a comparative study.

PubMed

Zoroquiain, Pablo; Mayo-Goldberg, Erin; Alghamdi, Sarah; Alhumaid, Sulaiman; Perlmann, Eduardo; Barros, Paulo; Mayo, Nancy; Burnier, Miguel N

2016-12-01

The cutoff presented in the current classification of canine melanocytic lesions by Wilcock and Pfeiffer is based on the clinical outcome rather than morphological concepts. Classification of tumors based on morphology or molecular signatures is the key to identifying new therapies or prognostic factors. Therefore, the aim of this study was to analyze morphological findings in canine melanocytic lesions based on classic malignant morphologic principles of neoplasia and to compare these features with human uveal melanoma (HUM) samples. In total, 64 canine and 111 human morphologically malignant melanocytic lesions were classified into two groups (melanocytoma-like or classic melanoma) based on the presence or absence of M cells, respectively. Histopathological characteristics were compared between the two groups using the χ²-test, t-test, and multivariate discriminant analysis. Among the 64 canine tumors, 28 (43.7%) were classic and 36 (56.3%) were melanocytoma-like melanomas. Smaller tumor size, a higher degree of pigmentation, and lower mitotic activity distinguished melanocytoma-like from classic tumors with an accuracy of 100% for melanocytoma-like lesions. From the human series, only one case showed melanocytoma-like features and had a low risk for metastasis characteristics. Canine uveal melanoma showed a morphological spectrum with features similar to the HUM counterpart (classic melanoma) and overlapped features between uveal melanoma and melanocytoma (melanocytoma-like melanoma). Recognition that the subgroup of melanocytoma-like melanoma may represent the missing link between benign and malignant lesions could help explain the progression of uveal melanoma in dogs; these findings can potentially be translated to HUM.
Vegetation monitoring and classification using NOAA/AVHRR satellite data

NASA Technical Reports Server (NTRS)

Greegor, D. H., Jr.; Norwine, J. R.

1983-01-01

A vegetation gradient model, based on a new surface hydrologic index and NOAA/AVHRR meteorological satellite data, has been analyzed along a 1300 km east-west transect across the state of Texas. The model was developed to test the potential usefulness of such low-resolution data for vegetation stratification and monitoring. Normalized Difference values (ratio of AVHRR bands 1 and 2, considered to be an index of greenness) were determined and evaluated against climatological and vegetation characteristics at 50 sample locations (regular intervals of 0.25 deg longitude) along the transect on five days in 1980. Statistical treatment of the data indicate that a multivariate model incorporating satellite-measured spectral greenness values and a surface hydrologic factor offer promise as a new technique for regional-scale vegetation stratification and monitoring.
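The Normalized Difference value referred to above is conventionally computed from AVHRR band 1 (visible red) and band 2 (near-infrared) as (band 2 - band 1) / (band 2 + band 1). A small sketch, with a hypothetical function name and made-up reflectance values:

```python
def normalized_difference(band1_red, band2_nir):
    """Normalized difference of near-infrared and red reflectance, in [-1, 1]."""
    total = band2_nir + band1_red
    if total == 0:
        raise ValueError("both bands are zero; the index is undefined")
    return (band2_nir - band1_red) / total

# Hypothetical reflectances: dense green vegetation versus sparse cover.
green_site = normalized_difference(0.05, 0.45)
sparse_site = normalized_difference(0.20, 0.25)
```

Healthy vegetation reflects strongly in the near-infrared and absorbs red light, so greener sites give values closer to 1, which is why the index serves as a greenness measure along the transect.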
Decoding Spontaneous Emotional States in the Human Brain

PubMed Central

Kragel, Philip A.; Knodt, Annchen R.; Hariri, Ahmad R.; LaBar, Kevin S.

2016-01-01

Pattern classification of human brain activity provides unique insight into the neural underpinnings of diverse mental states. These multivariate tools have recently been used within the field of affective neuroscience to classify distributed patterns of brain activation evoked during emotion induction procedures. Here we assess whether neural models developed to discriminate among distinct emotion categories exhibit predictive validity in the absence of exteroceptive emotional stimulation. In two experiments, we show that spontaneous fluctuations in human resting-state brain activity can be decoded into categories of experience delineating unique emotional states that exhibit spatiotemporal coherence, covary with individual differences in mood and personality traits, and predict on-line, self-reported feelings. These findings validate objective, brain-based models of emotion and show how emotional states dynamically emerge from the activity of separable neural systems. PMID:27627738
Linear Discriminant Analysis Achieves High Classification Accuracy for the BOLD fMRI Response to Naturalistic Movie Stimuli

PubMed Central

Mandelkow, Hendrik; de Zwart, Jacco A.; Duyn, Jeff H.

2016-01-01

Naturalistic stimuli like movies evoke complex perceptual processes, which are of great interest in the study of human cognition by functional MRI (fMRI). However, conventional fMRI analysis based on statistical parametric mapping (SPM) and the general linear model (GLM) is hampered by a lack of accurate parametric models of the BOLD response to complex stimuli. In this situation, statistical machine-learning methods, a.k.a. multivariate pattern analysis (MVPA), have received growing attention for their ability to generate stimulus response models in a data-driven fashion. However, machine-learning methods typically require large amounts of training data as well as computational resources. In the past, this has largely limited their application to fMRI experiments involving small sets of stimulus categories and small regions of interest in the brain. By contrast, the present study compares several classification algorithms known as Nearest Neighbor (NN), Gaussian Naïve Bayes (GNB), and (regularized) Linear Discriminant Analysis (LDA) in terms of their classification accuracy in discriminating the global fMRI response patterns evoked by a large number of naturalistic visual stimuli presented as a movie. Results show that LDA regularized by principal component analysis (PCA) achieved high classification accuracies, above 90% on average for single fMRI volumes acquired 2 s apart during a 300 s movie (chance level 0.7% = 2 s/300 s). The largest source of classification errors were autocorrelations in the BOLD signal compounded by the similarity of consecutive stimuli. All classifiers performed best when given input features from a large region of interest comprising around 25% of the voxels that responded significantly to the visual stimulus. Consistent with this, the most informative principal components represented widespread distributions of co-activated brain regions that were similar between subjects and may represent functional networks. 
In light of these results, the combination of naturalistic movie stimuli and classification analysis in fMRI experiments may prove to be a sensitive tool for the assessment of changes in natural cognitive processes under experimental manipulation. PMID:27065832
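The chance level quoted above (0.7% = 2 s/300 s) follows from treating every fMRI volume as its own class: a 300 s movie sampled every 2 s gives 150 classes, so a random guess is right 1 time in 150. A short worked check, with a hypothetical function name:

```python
def chance_level(volume_spacing_s, movie_length_s):
    """Probability of guessing the correct volume when each volume is one class."""
    n_classes = movie_length_s / volume_spacing_s
    return 1.0 / n_classes

n_volumes = 300 // 2                      # number of distinct classes
chance_percent = round(100 * chance_level(2, 300), 1)
```

Against that baseline, the reported average accuracy above 90% is more than a hundredfold improvement over guessing.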
Photos for estimating fuel loadings before and after prescribed burning in the upper coastal plain of the southeast

Treesearch

Eric R. Scholl; Thomas A. Waldrop

1999-01-01

Although prescribed burning is common in the Southeastern United States, most fuel models apply to only western forests. This paper documents a fuel classification system that was developed for plantations of loblolly and longleaf pines for the Upper Coastal Plain region. Multivariate analysis of variance and discriminant function analysis were used to confirm eight...
Optimizing Functional Network Representation of Multivariate Time Series

NASA Astrophysics Data System (ADS)

Zanin, Massimiliano; Sousa, Pedro; Papo, David; Bajo, Ricardo; García-Prieto, Juan; Pozo, Francisco Del; Menasalvas, Ernestina; Boccaletti, Stefano

2012-09-01

By combining complex network theory and data mining techniques, we provide objective criteria for optimization of the functional network representation of generic multivariate time series. In particular, we propose a method for the principled selection of the threshold value for functional network reconstruction from raw data, and for proper identification of the network's indicators that unveil the most discriminative information on the system for classification purposes. We illustrate our method by analysing networks of functional brain activity of healthy subjects, and patients suffering from Mild Cognitive Impairment, an intermediate stage between the expected cognitive decline of normal aging and the more pronounced decline of dementia. We discuss extensions of the scope of the proposed methodology to network engineering purposes, and to other data mining tasks.
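To make the role of the threshold concrete, the sketch below turns a correlation matrix into a binary functional network at two candidate thresholds and reports the resulting link density. It does not reproduce the selection criterion proposed in the paper; the function names, the 4 x 4 matrix, and the threshold values are hypothetical.

```python
def threshold_network(corr, tau):
    """Binary adjacency matrix: link i-j exists when |corr[i][j]| >= tau (no self-links)."""
    n = len(corr)
    return [
        [1 if i != j and abs(corr[i][j]) >= tau else 0 for j in range(n)]
        for i in range(n)
    ]

def link_density(adj):
    """Fraction of possible undirected links that are present."""
    n = len(adj)
    links = sum(adj[i][j] for i in range(n) for j in range(i + 1, n))
    return links / (n * (n - 1) / 2)

# Hypothetical correlations between four time series (e.g. four sensors).
corr = [
    [1.0, 0.8, 0.2, 0.1],
    [0.8, 1.0, 0.6, 0.3],
    [0.2, 0.6, 1.0, 0.7],
    [0.1, 0.3, 0.7, 1.0],
]
densities = {tau: link_density(threshold_network(corr, tau)) for tau in (0.5, 0.75)}
```

Because network indicators change with the threshold, one plausible reading of the abstract is that candidate thresholds are scored by how well the resulting indicators separate the groups in a classification task.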
Optimizing Functional Network Representation of Multivariate Time Series

PubMed Central

Zanin, Massimiliano; Sousa, Pedro; Papo, David; Bajo, Ricardo; García-Prieto, Juan; Pozo, Francisco del; Menasalvas, Ernestina; Boccaletti, Stefano

2012-01-01

By combining complex network theory and data mining techniques, we provide objective criteria for optimization of the functional network representation of generic multivariate time series. In particular, we propose a method for the principled selection of the threshold value for functional network reconstruction from raw data, and for proper identification of the network's indicators that unveil the most discriminative information on the system for classification purposes. We illustrate our method by analysing networks of functional brain activity of healthy subjects, and patients suffering from Mild Cognitive Impairment, an intermediate stage between the expected cognitive decline of normal aging and the more pronounced decline of dementia. We discuss extensions of the scope of the proposed methodology to network engineering purposes, and to other data mining tasks. PMID:22953051
Recent applications of multivariate data analysis methods in the authentication of rice and the most analyzed parameters: A review.

PubMed

Maione, Camila; Barbosa, Rommel Melgaço

2018-01-24

Rice is one of the most important staple foods around the world. Authentication of rice is one of the most addressed concerns in the present literature, which includes recognition of its geographical origin and variety, certification of organic rice and many other issues. Good results have been achieved by multivariate data analysis and data mining techniques when combined with specific parameters for ascertaining authenticity and many other useful characteristics of rice, such as quality, yield and others. This paper brings a review of the recent research projects on discrimination and authentication of rice using multivariate data analysis and data mining techniques. We found that data obtained from image processing, molecular and atomic spectroscopy, elemental fingerprinting, genetic markers, molecular content and others are promising sources of information regarding geographical origin, variety and other aspects of rice, being widely used combined with multivariate data analysis techniques. Principal component analysis and linear discriminant analysis are the preferred methods, but several other data classification techniques such as support vector machines, artificial neural networks and others are also frequently present in some studies and show high performance for discrimination of rice.
The Decoding Toolbox (TDT): a versatile software package for multivariate analyses of functional imaging data

PubMed Central

Hebart, Martin N.; Görgen, Kai; Haynes, John-Dylan

2015-01-01

The multivariate analysis of brain signals has recently sparked a great amount of interest, yet accessible and versatile tools to carry out decoding analyses are scarce. Here we introduce The Decoding Toolbox (TDT) which represents a user-friendly, powerful and flexible package for multivariate analysis of functional brain imaging data. TDT is written in Matlab and equipped with an interface to the widely used brain data analysis package SPM. The toolbox allows running fast whole-brain analyses, region-of-interest analyses and searchlight analyses, using machine learning classifiers, pattern correlation analysis, or representational similarity analysis. It offers automatic creation and visualization of diverse cross-validation schemes, feature scaling, nested parameter selection, a variety of feature selection methods, multiclass capabilities, and pattern reconstruction from classifier weights. While basic users can implement a generic analysis in one line of code, advanced users can extend the toolbox to their needs or exploit the structure to combine it with external high-performance classification toolboxes. The toolbox comes with an example data set which can be used to try out the various analysis methods. Taken together, TDT offers a promising option for researchers who want to employ multivariate analyses of brain activity patterns. PMID:25610393
Characteristics and Classification of Least Altered Streamflows in Massachusetts

USGS Publications Warehouse

Armstrong, David S.; Parker, Gene W.; Richards, Todd A.

2008-01-01

Streamflow records from 85 streamflow-gaging stations at which streamflows were considered to be least altered were used to characterize natural streamflows within southern New England. Period-of-record streamflow data were used to determine annual hydrographs of median monthly flows. The shapes and magnitudes of annual hydrographs of median monthly flows, normalized by drainage area, differed among stations in different geographic areas of southern New England. These differences were gradational across southern New England and were attributed to differences in basin and climate characteristics. Period-of-record streamflow data were also used to analyze the statistical properties of daily streamflows at 61 stations across southern New England by using L-moment ratios. An L-moment ratio diagram of L-skewness and L-kurtosis showed a continuous gradation in these properties between stations and indicated differences between base-flow dominated and runoff-dominated rivers. Streamflow records from a concurrent period (1960-2004) for 61 stations were used in a multivariate statistical analysis to develop a hydrologic classification of rivers in southern New England. Missing records from 46 of these stations were extended by using a Maintenance of Variation Extension technique. The concurrent-period streamflows were used in the Indicators of Hydrologic Alteration and Hydrologic Index Tool programs to determine 224 hydrologic indices for the 61 stations. Principal-components analysis (PCA) was used to reduce the number of hydrologic indices to 20 that provided nonredundant information. The PCA also indicated that the major patterns of variability in the dataset are related to differences in flow variability and low-flow magnitude among the stations. Hierarchical cluster analysis was used to classify stations into groups with similar hydrologic properties. 
The cluster analysis classified rivers in southern New England into two broad groups: (1) base-flow dominated rivers, whose statistical properties indicated less flow variability and high magnitudes of low flow, and (2) runoff-dominated rivers, whose statistical properties indicated greater flow variability and lower magnitudes of low flow. A four-cluster classification further classified the runoff-dominated streams into three groups that varied in gradient, elevation, and differences in winter streamflow conditions: high-gradient runoff-dominated rivers, northern runoff-dominated rivers, and southern runoff-dominated rivers. A nine-cluster division indicated that basin size also becomes a distinguishing factor among basins at finer levels of classification. Smaller basins (less than 10 square miles) were classified into different groups than larger basins. A comparison of station classifications indicated that a classification based on multiple hydrologic indices that represent different aspects of the flow regime did not result in the same classification of stations as a classification based on a single type of statistic such as a monthly median. River basins identified by the cluster analysis as having similar hydrologic properties tended to have similar basin and climate characteristics and to be in close proximity to one another. Stations were not classified in the same cluster on the basis of geographic location alone; as a result, boundaries cannot be drawn between geographic regions with similar streamflow characteristics. Rivers with different basin and climate characteristics were classified in different clusters, even if they were in adjacent basins or upstream and downstream within the same basin.
Perception of olive oils sensory defects using a potentiometric taste device.

PubMed

Veloso, Ana C A; Silva, Lucas M; Rodrigues, Nuno; Rebello, Ligia P G; Dias, Luís G; Pereira, José A; Peres, António M

2018-01-01

The capability of perceiving olive oils' sensory defects and intensities plays a key role in olive oil quality grade classification, since olive oils can only be classified as extra-virgin if no defect can be perceived by a trained human sensory panel. Otherwise, olive oils may be classified as virgin or lampante depending on the median intensity of the defect predominantly perceived and on the physicochemical levels. However, sensory analysis is time-consuming and requires an official sensory panel, which can only evaluate a low number of samples per day. In this work, the potential use of an electronic tongue as a taste sensor device to identify the defect predominantly perceived in olive oils was evaluated. The potentiometric profiles recorded showed that intra- and inter-day signal drifts could be neglected (i.e., relative standard deviations lower than 25%): the effect of the analysis day on the overall recorded E-tongue sensor fingerprints was not statistically significant (P-value = 0.5715, for multivariate analysis of variance using Pillai's trace test), whereas the fingerprints differed significantly according to the olive oils' sensory defect (P-value = 0.0084, for multivariate analysis of variance using Pillai's trace test). Thus, a linear discriminant model based on 19 potentiometric signal sensors, selected by the simulated annealing algorithm, could be established to correctly predict the olive oil main sensory defect (fusty, rancid, wet-wood or winey-vinegary) with an average sensitivity of 75 ± 3% and specificity of 73 ± 4% (repeated K-fold cross-validation variant: 4 folds×10 repeats). Similarly, a linear discriminant model, based on 24 selected sensors, correctly classified 92 ± 3% of the olive oils as virgin or lampante, with an average specificity of 93 ± 3%. 
The overall satisfactory predictive performances strengthen the feasibility of the developed taste sensor device as a complementary methodology for olive oils' defects analysis and subsequent quality grade classification. Furthermore, the capability of identifying the type of sensory defect of an olive oil may allow establishing helpful insights regarding bad practices of olives or olive oils production, harvesting, transport and storage. Copyright © 2017 Elsevier B.V. All rights reserved.
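The validation scheme named in this abstract (repeated K-fold cross-validation, 4 folds × 10 repeats) and the two reported metrics can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `repeated_kfold_indices` and `sensitivity_specificity` are hypothetical, the folds are not stratified by class, and the discriminant model itself is left out.

```python
import random

def repeated_kfold_indices(n_samples, n_folds=4, n_repeats=10, seed=0):
    """Yield (train_idx, test_idx) for every fold of every repeat."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)  # a fresh random partition for each repeat
        folds = [idx[k::n_folds] for k in range(n_folds)]
        for k in range(n_folds):
            test = sorted(folds[k])
            train = sorted(i for j, f in enumerate(folds) if j != k for i in f)
            yield train, test

def sensitivity_specificity(y_true, y_pred, positive):
    """Sensitivity and specificity for one class treated as 'positive'.

    Assumes at least one positive and one negative sample in y_true.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)
```

With 4 folds and 10 repeats the generator yields 40 train/test splits; averaging a metric over them gives a mean and a spread of the kind reported above as "75 ± 3%".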
Electroencephalogram-based decoding cognitive states using convolutional neural network and likelihood ratio based score fusion.

PubMed

Zafar, Raheel; Dass, Sarat C; Malik, Aamir Saeed

2017-01-01

Decoding human brain activity from the electroencephalogram (EEG) is challenging, owing to the low spatial resolution of EEG. However, EEG is an important technique, especially for brain-computer interface applications. In this study, a novel algorithm is proposed to decode brain activity associated with different types of images. In this hybrid algorithm, a convolutional neural network is modified for the extraction of features, a t-test is used for the selection of significant features, and likelihood ratio-based score fusion is used for the prediction of brain activity. The proposed algorithm takes input data from multichannel EEG time-series, an approach also known as multivariate pattern analysis. Comprehensive analysis was conducted using data from 30 participants. The results from the proposed method are compared with currently recognized feature extraction and classification/prediction techniques. The wavelet transform-support vector machine method, the most popular feature extraction and prediction method in current use, showed an accuracy of 65.7%, whereas the proposed method predicts novel data with an improved accuracy of 79.9%. In conclusion, the proposed algorithm outperformed the current feature extraction and prediction method.
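The t-test feature selection step mentioned in this abstract can be illustrated with a short sketch. The abstract does not say which t-test variant or selection threshold was used, so the Welch statistic and the top-k rule here are assumptions, and the names `welch_t` and `select_features` are hypothetical.

```python
import numpy as np

def welch_t(a, b):
    """Welch t-statistic per feature (columns) for two groups of trials (rows)."""
    mean_a, mean_b = a.mean(axis=0), b.mean(axis=0)
    var_a, var_b = a.var(axis=0, ddof=1), b.var(axis=0, ddof=1)
    return (mean_a - mean_b) / np.sqrt(var_a / len(a) + var_b / len(b))

def select_features(a, b, k):
    """Indices of the k features with the largest absolute t-statistic."""
    t = np.abs(welch_t(a, b))
    return np.argsort(t)[::-1][:k]
```

To avoid optimistic accuracy estimates, a selection step of this kind would normally be run on training trials only and the chosen indices then applied to held-out trials.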
An Automated Algorithm to Screen Massive Training Samples for a Global Impervious Surface Classification

NASA Technical Reports Server (NTRS)

Tan, Bin; Brown de Colstoun, Eric; Wolfe, Robert E.; Tilton, James C.; Huang, Chengquan; Smith, Sarah E.

2012-01-01

An algorithm is developed to automatically screen outliers from the massive training samples for the Global Land Survey - Imperviousness Mapping Project (GLS-IMP). GLS-IMP is to produce a global 30 m spatial resolution impervious cover data set for years 2000 and 2010 based on the Landsat Global Land Survey (GLS) data set. This unprecedented high resolution impervious cover data set is not only significant to urbanization studies but also desired by global carbon, hydrology, and energy balance research. A supervised classification method, regression tree, is applied in this project. A set of accurate training samples is the key to supervised classification. Here we developed global scale training samples from fine resolution (about 1 m) satellite data (Quickbird and Worldview2), and then aggregated the fine resolution impervious cover map to 30 m resolution. In order to improve the classification accuracy, the training samples should be screened before being used to train the regression tree. It is impossible to manually screen 30 m resolution training samples collected globally. For example, in Europe alone, there are 174 training sites. The size of the sites ranges from 4.5 km by 4.5 km to 8.1 km by 3.6 km. The number of training samples is over six million. Therefore, we developed this automated, statistics-based algorithm to screen the training samples at two levels: the site level and the scene level. At the site level, all the training samples are divided into 10 groups according to the percentage of impervious surface within a sample pixel. The samples falling within each 10% interval form one group. For each group, both univariate and multivariate outliers are detected and removed. Then the screening process escalates to the scene level. A similar screening process, but with a looser threshold, is applied at the scene level to allow for possible variance due to site differences. We do not perform the screening process across scenes because scenes might vary due to phenology, solar-view geometry, atmospheric conditions, and similar factors rather than actual land cover differences. Finally, we will compare the classification results from screened and unscreened training samples to assess the improvement achieved by cleaning up the training samples.
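The site-level screening described in this abstract (ten groups by impervious percentage, then univariate and multivariate outlier removal within each group) could look roughly like the sketch below. The abstract does not name the outlier statistics or thresholds, so the z-score test, the Mahalanobis distance test, the cut-offs `z_max` and `d2_max`, the `min_group` size, and both function names are assumptions made for illustration.

```python
import numpy as np

def screen_group(X, z_max=3.0, d2_max=16.0):
    """Boolean mask of samples kept in one group (rows = samples)."""
    X = np.asarray(X, dtype=float)
    std = X.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # constant features never flag a sample
    # Univariate test (assumed): any feature more than z_max SDs from the mean.
    z = np.abs(X - X.mean(axis=0)) / std
    keep = (z <= z_max).all(axis=1)
    # Multivariate test (assumed): squared Mahalanobis distance to the
    # centre of the samples that survived the univariate test.
    core = X[keep]
    diff = X - core.mean(axis=0)
    inv_cov = np.linalg.pinv(np.atleast_2d(np.cov(core, rowvar=False)))
    d2 = np.einsum("ij,jk,ik->i", diff, inv_cov, diff)
    return keep & (d2 <= d2_max)

def screen_site(X, impervious_pct, min_group=20, **thresholds):
    """Screen one site: split into ten 10% groups, then screen each group."""
    X = np.asarray(X, dtype=float)
    pct = np.asarray(impervious_pct, dtype=float)
    group = np.minimum((pct // 10).astype(int), 9)  # 100% joins the last group
    keep = np.ones(len(X), dtype=bool)
    for g in range(10):
        members = np.flatnonzero(group == g)
        if len(members) >= min_group:  # too-small groups are left unscreened
            keep[members] = screen_group(X[members], **thresholds)
    return keep
```

A scene-level pass with a looser threshold, as the abstract describes, could reuse `screen_group` with larger `z_max` and `d2_max` values on the pooled samples of all sites in a scene.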
Rapid characterization of transgenic and non-transgenic soybean oils by chemometric methods using NIR spectroscopy

NASA Astrophysics Data System (ADS)

Luna, Aderval S.; da Silva, Arnaldo P.; Pinho, Jéssica S. A.; Ferré, Joan; Boqué, Ricard

Near infrared (NIR) spectroscopy and multivariate classification were applied to discriminate soybean oil samples into non-transgenic and transgenic. Principal Component Analysis (PCA) was applied to extract relevant features from the spectral data and to remove the anomalous samples. The best results were obtained with Support Vector Machine-Discriminant Analysis (SVM-DA) and Partial Least Squares-Discriminant Analysis (PLS-DA) after mean centering plus multiplicative scatter correction. For SVM-DA, the percentage of successful classification was 100% for the training group, and 100% and 90% in the validation group for non-transgenic and transgenic soybean oil samples, respectively. For PLS-DA, the percentage of successful classification was 95% and 100% in the training group for non-transgenic and transgenic soybean oil samples, respectively, and 100% and 80% in the validation group for non-transgenic and transgenic samples, respectively. The results demonstrate that NIR spectroscopy can provide a rapid, nondestructive and reliable method to distinguish non-transgenic and transgenic soybean oils.
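The two preprocessing steps named in this abstract, multiplicative scatter correction (MSC) and mean centering, can be sketched as follows. This is a generic textbook form, not the authors' pipeline; the function names are hypothetical, and the choice of the mean training spectrum as the MSC reference is an assumption.

```python
import numpy as np

def msc(spectra, reference=None):
    """Multiplicative scatter correction.

    Each spectrum s is regressed on a reference spectrum r (s ≈ a + b*r)
    and replaced by (s - a) / b. Rows are samples, columns are wavelengths.
    """
    spectra = np.asarray(spectra, dtype=float)
    ref = spectra.mean(axis=0) if reference is None else reference
    corrected = np.empty_like(spectra)
    for i, s in enumerate(spectra):
        slope, intercept = np.polyfit(ref, s, 1)
        corrected[i] = (s - intercept) / slope
    return corrected, ref

def mean_center(X, mean=None):
    """Subtract the column means (computed on training data) from X."""
    mean = X.mean(axis=0) if mean is None else mean
    return X - mean, mean
```

For validation samples, the reference spectrum and column means returned from the training set would be passed back in through `reference` and `mean`, so that no information from the validation group leaks into the preprocessing.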
Application of multivariate statistics to vestibular testing: discriminating between Meniere's disease and migraine associated dizziness

NASA Technical Reports Server (NTRS)

Dimitri, P. S.; Wall, C. 3rd; Oas, J. G.; Rauch, S. D.

2001-01-01

Meniere's disease (MD) and migraine associated dizziness (MAD) are two disorders that can have similar symptomatologies, but differ vastly in treatment. Vestibular testing is sometimes used to help differentiate between these disorders, but the inefficiency of a human interpreter analyzing a multitude of variables independently decreases its utility. Our hypothesis was that we could objectively discriminate between patients with MD and those with MAD using select variables from the vestibular test battery. Sinusoidal harmonic acceleration test variables were reduced to three vestibulo-ocular reflex physiologic parameters: gain, time constant, and asymmetry. A combination of these parameters plus a measurement of reduced vestibular response from caloric testing allowed us to achieve a joint classification rate of 91% using an independent quadratic classification algorithm. Data from posturography were not useful for this type of differentiation. Overall, our classification function can be used as an unbiased assistant to discriminate between MD and MAD and gave us insight into the pathophysiologic differences between the two disorders.
A computer analysis of ERTS data of the Lake Gregory area of South Australia with particular emphasis on its role in terrain classification for engineering. M.S. Thesis

NASA Technical Reports Server (NTRS)

Lodwick, G. D. (Principal Investigator)

1976-01-01

A digital computer and multivariate statistical techniques were used to analyze 4-band multispectral data. A representation of the original data for each of the four bands allows a certain degree of terrain interpretation; however, variations in appearance of sites within and between bands, without additional criteria for deciding which representation should be preferred, create difficulties for classification. Investigation of the video data groups produced by principal components analysis and cluster analysis techniques shows that effective correlations with classifications of terrain produced by conventional methods could be carried out. The analyses also highlighted underlying relationships between the various elements. The approach used allows large areas (185 km by 185 km) to be classified into fundamental units within a matter of hours and can be applied to those parts of the Earth where facilities for conventional studies are poor or lacking.
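Principal components analysis of 4-band pixel data, as mentioned in this abstract, can be sketched with a singular value decomposition. The sketch is a modern, generic formulation and makes no claim about how the 1976 thesis computed it; the function name `pca` and the array layout (rows are pixels, columns are bands) are assumptions.

```python
import numpy as np

def pca(X, n_components):
    """PCA via SVD. Returns scores, component loadings and explained variance ratios."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)  # centre each band
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    explained = S**2 / np.sum(S**2)
    return scores, Vt[:n_components], explained[:n_components]
```

Because multispectral bands are usually strongly correlated, the first one or two components often carry most of the variance, and the component scores can then be passed to a clustering step.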
Multiple-factor classification of a human-modified forest landscape in the Hsuehshan Mountain Range, Taiwan.

PubMed

Berg, Kevan J; Icyeh, Lahuy; Lin, Yih-Ren; Janz, Arnold; Newmaster, Steven G

2016-12-01

Human actions drive landscape heterogeneity, yet most ecosystem classifications omit the role of human influence. This study explores land use history to inform a classification of forestland of the Tayal Mrqwang indigenous people of Taiwan. Our objective was to determine the extent to which human action drives landscape heterogeneity. We used interviews, field sampling, and multivariate analysis to relate vegetation patterns to environmental gradients and human modification across 76 sites. We identified eleven forest classes. In total, around 70% of plots were at lower elevations and had a history of shifting cultivation, terrace farming, and settlement that resulted in alder, laurel, oak, pine, and bamboo stands. Higher elevation mixed conifer forests were least disturbed. Arboriculture and selective harvesting were drivers of other conspicuous forest patterns. The findings show that past land uses play a key role in shaping forests, which is important to consider when setting targets to guide forest management.
Analysis of failure in patients with adenoid cystic carcinoma of the head and neck. An international collaborative study.

PubMed

Amit, Moran; Binenbaum, Yoav; Sharma, Kanika; Ramer, Naomi; Ramer, Ilana; Agbetoba, Abib; Miles, Brett; Yang, Xinjie; Lei, Delin; Bjøerndal, Kristine; Godballe, Christian; Mücke, Thomas; Wolff, Klaus-Dietrich; Fliss, Dan; Eckardt, André M; Copelli, Chiara; Sesenna, Enrico; Palmer, Frank; Patel, Snehal; Gil, Ziv

2014-07-01

Adenoid cystic carcinoma (ACC) is a locally aggressive tumor with a high prevalence of distant metastases. The purpose of this study was to identify independent predictors of outcome and to characterize the patterns of failure. An international retrospective review was conducted of 489 patients with ACC treated between 1985 and 2011 in 9 cancer centers worldwide. Five-year overall-survival (OS), disease-specific survival (DSS), and disease-free survival (DFS) were 76%, 80%, and 68%, respectively. Independent predictors of OS and DSS were: age, site, N classification, and presence of distant metastases. N classification, age, and bone invasion were associated with DFS on multivariate analysis. Age, tumor site, orbital invasion, and N classification were independent predictors of distant metastases. The clinical course of ACC is slow but persistent. Paranasal sinus origin is associated with the lowest distant metastases rate but with the poorest outcome. These prognostic estimates should be considered when tailoring treatment for patients with ACC. Copyright © 2013 Wiley Periodicals, Inc.
Prognostic value of the new International Association for the Study of Lung Cancer/American Thoracic Society/European Respiratory Society classification in stage IB lung adenocarcinoma.

PubMed

Xu, C-h; Wang, W; Wei, Y; Hu, H-d; Zou, J; Yan, J; Yu, L-k; Yang, R-s; Wang, Y

2015-10-01

Patients with pathological stage IB lung adenocarcinoma have a variable prognosis, even when they receive the same treatment. This study investigated the prognostic value of the new International Association for the Study of Lung Cancer, American Thoracic Society, and European Respiratory Society (IASLC/ATS/ERS) lung adenocarcinoma classification in resected stage IB lung adenocarcinoma. We identified 276 patients with pathological stage IB adenocarcinoma who had undergone surgical resection at the Nanjing Chest Hospital between 2005 and 2010. The histological subtypes of all patients were classified according to the 2011 IASLC/ATS/ERS international multidisciplinary lung adenocarcinoma classification. Kaplan-Meier and Cox regression analyses were used to analyze the correlation between the IASLC/ATS/ERS classification and patients' prognosis. The 276 patients with pathological stage IB adenocarcinoma had an 86.2% 5-year overall survival (OS) and 80.4% 5-year disease-free survival (DFS). Patients with micropapillary and solid predominant tumors had a significantly worse OS and DFS than those with tumors in which other subtypes predominated (p = 0.003 and 0.001). Multivariate analysis revealed that the new classification was an independent prognostic factor for both OS and DFS of pathological stage IB adenocarcinoma (p = 0.009 and 0.003). Our study revealed that the new IASLC/ATS/ERS classification was an independent prognostic factor of pathological stage IB adenocarcinoma. This new classification is valuable for identifying high-risk patients who should receive postoperative adjuvant therapy. Copyright © 2015. Published by Elsevier Ltd.

Classification of broiler breast filets according to deboning time using near infrared spectroscopy and multivariate analysis

USDA-ARS's Scientific Manuscript database

Chicken breast filets were deboned and NIR spectra were collected after 2, 4, and 24 hours. The deboning was performed on pairs of filets to minimize differences due only to the meat and not the deboning time (i.e. right at 2 hours, left at 24; right at 2, left at 4; right at 4, left at 24 hrs). The...
Use of the Köppen-Trewartha climate classification to evaluate climatic refugia in statistically derived ecoregions for the People’s Republic of China

Treesearch

B. Baker; Henry Diaz; William Hargrove; Forrest Hoffman

2010-01-01

Changes in climate as projected by state-of-the-art climate models are likely to result in novel combinations of climate and topo-edaphic factors that will have substantial impacts on the distribution and persistence of natural vegetation and animal species. We have used multivariate techniques to quantify some of these changes; the...
Nonparametric analysis of Minnesota spruce and aspen tree data and LANDSAT data

NASA Technical Reports Server (NTRS)

Scott, D. W.; Jee, R.

1984-01-01

The application of nonparametric methods in data-intensive problems faced by NASA is described. The theoretical development of efficient multivariate density estimators and the novel use of color graphics workstations are reviewed. The use of nonparametric density estimates for data representation and for Bayesian classification is described and illustrated. Progress in building a data analysis system in a workstation environment is reviewed and preliminary runs presented.
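Bayesian classification with nonparametric density estimates, as described in this abstract, can be illustrated with a Gaussian kernel density estimator per class. The kernel, the single fixed bandwidth, and the function names are assumptions for the sketch; the report's own estimators are not specified here.

```python
import numpy as np

def gaussian_kde_logpdf(train, x, bandwidth):
    """Log of a Gaussian-kernel density estimate built on `train`, evaluated at `x`."""
    d = train.shape[1]
    diff = (x[:, None, :] - train[None, :, :]) / bandwidth
    log_k = -0.5 * np.sum(diff**2, axis=2) - d * np.log(bandwidth * np.sqrt(2 * np.pi))
    # log of the mean kernel value, computed stably (log-mean-exp)
    m = log_k.max(axis=1, keepdims=True)
    return (m + np.log(np.exp(log_k - m).mean(axis=1, keepdims=True))).ravel()

def kde_bayes_classify(class_samples, x, bandwidth=0.5):
    """Assign each row of x to the class with the largest prior * estimated density."""
    labels = list(class_samples)
    n_total = sum(len(s) for s in class_samples.values())
    scores = np.column_stack([
        np.log(len(class_samples[c]) / n_total)
        + gaussian_kde_logpdf(class_samples[c], x, bandwidth)
        for c in labels
    ])
    return [labels[i] for i in scores.argmax(axis=1)]
```

Here the class priors are taken from the training sample sizes, which is only one possible choice; the bandwidth controls how smooth each estimated density is and would normally be tuned.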
Biometrics from the carbon isotope ratio analysis of amino acids in human hair.

PubMed

Jackson, Glen P; An, Yan; Konstantynova, Kateryna I; Rashaid, Ayat H B

2015-01-01

This study compares and contrasts the ability to classify individuals into different grouping factors through either bulk isotope ratio analysis or amino-acid-specific isotope ratio analysis of human hair. Using LC-IRMS, we measured the isotope ratios of 14 amino acids in hair proteins independently, and leucine/isoleucine as a co-eluting pair, to provide 15 variables for classification. Multivariate analysis confirmed that the essential amino acids and non-essential amino acids were mostly independent variables in the classification rules, thereby enabling the separation of dietary factors of isotope intake from intrinsic or phenotypic factors of isotope fractionation. Multivariate analysis revealed at least two potential sources of non-dietary factors influencing the carbon isotope ratio values of the amino acids in human hair: body mass index (BMI) and age. These results provide evidence that compound-specific isotope ratio analysis has the potential to go beyond region-of-origin or geospatial movements of individuals (obtainable through bulk isotope measurements) to the provision of physical and characteristic traits about the individuals, such as age and BMI. Further development and refinement, for example to genetic, metabolic, disease and hormonal factors, could ultimately be of great assistance in forensic and clinical casework. Copyright © 2014 Forensic Science Society. Published by Elsevier Ireland Ltd. All rights reserved.
Classification of edible oils by employing 31P and 1H NMR spectroscopy in combination with multivariate statistical analysis. A proposal for the detection of seed oil adulteration in virgin olive oils.

PubMed

Vigli, Georgia; Philippidis, Angelos; Spyros, Apostolos; Dais, Photis

2003-09-10

A combination of 1H NMR and 31P NMR spectroscopy and multivariate statistical analysis was used to classify 192 samples from 13 types of vegetable oils, namely, hazelnut, sunflower, corn, soybean, sesame, walnut, rapeseed, almond, palm, groundnut, safflower, coconut, and virgin olive oils from various regions of Greece. 1,2-Diglycerides, 1,3-diglycerides, the ratio of 1,2-diglycerides to total diglycerides, acidity, iodine value, and fatty acid composition determined upon analysis of the respective 1H NMR and 31P NMR spectra were selected as variables to establish a classification/prediction model by employing discriminant analysis. This model, obtained from the training set of 128 samples, resulted in a significant discrimination among the different classes of oils, and 100% correct validated assignments were obtained for the remaining 64 samples. Different artificial mixtures of olive-hazelnut, olive-corn, olive-sunflower, and olive-soybean oils were prepared and analyzed by 1H NMR and 31P NMR spectroscopy. Subsequent discriminant analysis of the data allowed detection of adulteration as low as 5% w/w, provided that fresh virgin olive oil samples were used, as reflected by their high 1,2-diglycerides to total diglycerides ratio (D ≥ 0.90).
Rapid discrimination of sea buckthorn berries from different H. rhamnoides subspecies by multi-step IR spectroscopy coupled with multivariate data analysis

NASA Astrophysics Data System (ADS)

Liu, Yue; Zhang, Ying; Zhang, Jing; Fan, Gang; Tu, Ya; Sun, Suqin; Shen, Xudong; Li, Qingzhu; Zhang, Yi

2018-03-01

As an important ethnic medicine, sea buckthorn has been widely used to prevent and treat various diseases due to its nutritional and medicinal properties. According to the Chinese Pharmacopoeia, sea buckthorn originates from H. rhamnoides, which includes five subspecies distributed in China. Confusion and misidentification often occur due to their similar morphology, especially in dried and powdered forms. Additionally, these five subspecies differ substantially in quality and physiological efficacy. This paper focused on a quick classification and identification method for sea buckthorn berry powders from five H. rhamnoides subspecies using multi-step IR spectroscopy coupled with multivariate data analysis. The holistic chemical compositions revealed by the FT-IR spectra demonstrated that flavonoids, fatty acids and sugars were the main chemical components. Further, the differences in FT-IR spectra regarding their peaks, positions and intensities were used to identify H. rhamnoides subspecies samples. The discrimination was achieved using principal component analysis (PCA) and partial least square-discriminant analysis (PLS-DA). The results showed that the combination of multi-step IR spectroscopy and chemometric analysis offered a simple, fast and reliable method for the classification and identification of the sea buckthorn berry powders from different H. rhamnoides subspecies.
Polarization in Raman spectroscopy helps explain bone brittleness in genetic mouse models

NASA Astrophysics Data System (ADS)

Makowski, Alexander J.; Pence, Isaac J.; Uppuganti, Sasidhar; Zein-Sabatto, Ahbid; Huszagh, Meredith C.; Mahadevan-Jansen, Anita; Nyman, Jeffry S.

2014-11-01

Raman spectroscopy (RS) has been extensively used to characterize bone composition. However, the link between bone biomechanics and RS measures is not well established. Here, we leveraged the sensitivity of RS polarization to organization, thereby assessing whether RS can explain differences in bone toughness in genetic mouse models for which traditional RS peak ratios are not informative. In the selected mutant mice, activating transcription factor 4 (ATF4) or matrix metalloproteinase 9 (MMP9) knock-outs, toughness is reduced but differences in bone strength do not exist between knock-out and corresponding wild-type controls. To incorporate differences in the RS of bone occurring at peak shoulders, a multivariate approach was used. Full spectrum principal components analysis of two paired, orthogonal bone orientations (relative to laser polarization) improved genotype classification and correlation to bone toughness when compared to traditional peak ratios. When applied to femurs from wild-type mice at 8 and 20 weeks of age, the principal components of orthogonal bone orientations improved age classification but not the explanation of the maturation-related increase in strength. Overall, increasing polarization information by collecting spectra from two bone orientations improves the ability of multivariate RS to explain variance in bone toughness, likely due to polarization sensitivity to organizational changes in both mineral and collagen.
Extracting galactic structure parameters from multivariated density estimation

NASA Technical Reports Server (NTRS)

Chen, B.; Creze, M.; Robin, A.; Bienayme, O.

1992-01-01

Multivariate statistical analysis, including cluster analysis (unsupervised classification), discriminant analysis (supervised classification) and principal component analysis (a dimensionality reduction method), together with nonparametric density estimation, has been successfully used to search for meaningful associations in the 5-dimensional space of observables between observed points and the sets of simulated points generated from a synthetic approach to galaxy modelling. These methodologies can be applied as new tools to obtain information about hidden structure that is otherwise unrecognizable, and to place important constraints on the space distribution of various stellar populations in the Milky Way. In this paper, we concentrate on illustrating how to use nonparametric density estimation to substitute for the true densities of both the simulated sample and the real sample in the five-dimensional space. In order to fit model-predicted densities to reality, we derive a set of n equations (where n is the total number of observed points) in m unknown parameters (where m is the number of predefined groups). A least-squares estimation allows us to determine the density law of the different groups and components in the Galaxy. The output from our software, which can be used in many research fields, also gives the systematic error between the model and the observation by a Bayes rule.
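The least-squares step in this abstract (n equations, one per observed point, in m unknown group parameters) can be sketched under one plausible reading: the estimated density of the real sample at each observed point is modelled as a weighted sum of the estimated densities of the m simulated groups at that point. That linear form, the matrix layout, and the function name `fit_group_weights` are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def fit_group_weights(model_density, observed_density):
    """Solve model_density @ w ≈ observed_density in the least-squares sense.

    model_density    : (n, m) array, density of simulated group j at observed point i
    observed_density : (n,) array, density estimate of the real sample at point i
    Returns the m weights and the residual vector.
    """
    A = np.asarray(model_density, dtype=float)
    b = np.asarray(observed_density, dtype=float)
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    return w, b - A @ w
```

With n much larger than m the system is overdetermined, and the residuals give a first indication of where the model and the observations disagree.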
Multivariate pattern analysis of MEG and EEG: A comparison of representational structure in time and space.

PubMed

Cichy, Radoslaw Martin; Pantazis, Dimitrios

2017-09-01

Multivariate pattern analysis of magnetoencephalography (MEG) and electroencephalography (EEG) data can reveal the rapid neural dynamics underlying cognition. However, MEG and EEG have systematic differences in sampling neural activity. This raises the question of to what degree such measurement differences consistently bias the results of multivariate analysis applied to MEG and EEG activation patterns. To investigate, we conducted a concurrent MEG/EEG study while participants viewed images of everyday objects. We applied multivariate classification analyses to MEG and EEG data, and compared the resulting time courses to each other, and to fMRI data for an independent evaluation in space. We found that both MEG and EEG revealed the millisecond spatio-temporal dynamics of visual processing with largely equivalent results. Beyond yielding convergent results, we found that MEG and EEG also captured partly unique aspects of visual representations. Those unique components emerged earlier in time for MEG than for EEG. Identifying the sources of those unique components with fMRI, we found the locus for both MEG and EEG in high-level visual cortex, and in addition for MEG in low-level visual cortex. Together, our results show that multivariate analyses of MEG and EEG data offer a convergent and complementary view on neural processing, and motivate the wider adoption of these methods in both MEG and EEG research. Copyright © 2017 Elsevier Inc. All rights reserved.
Multivariate Pattern Analysis Reveals Category-Related Organization of Semantic Representations in Anterior Temporal Cortex.

PubMed

Malone, Patrick S; Glezer, Laurie S; Kim, Judy; Jiang, Xiong; Riesenhuber, Maximilian

2016-09-28

The neural substrates of semantic representation have been the subject of much controversy. The study of semantic representations is complicated by difficulty in disentangling perceptual and semantic influences on neural activity, as well as in identifying stimulus-driven, "bottom-up" semantic selectivity unconfounded by top-down task-related modulations. To address these challenges, we trained human subjects to associate pseudowords (TPWs) with various animal and tool categories. To decode semantic representations of these TPWs, we used multivariate pattern classification of fMRI data acquired while subjects performed a semantic oddball detection task. Crucially, the classifier was trained and tested on disjoint sets of TPWs, so that the classifier had to use the semantic information from the training set to correctly classify the test set. Animal and tool TPWs were successfully decoded based on fMRI activity in spatially distinct subregions of the left medial anterior temporal lobe (LATL). In addition, tools (but not animals) were successfully decoded from activity in the left inferior parietal lobule. The tool-selective LATL subregion showed greater functional connectivity with left inferior parietal lobule and ventral premotor cortex, indicating that each LATL subregion exhibits distinct patterns of connectivity. Our findings demonstrate category-selective organization of semantic representations in LATL into spatially distinct subregions, continuing the lateral-medial segregation of activation in posterior temporal cortex previously observed in response to images of animals and tools, respectively. Together, our results provide evidence for segregation of processing hierarchies for different classes of objects and the existence of multiple, category-specific semantic networks in the brain. The location and specificity of semantic representations in the brain are still widely debated. 
We trained human participants to associate specific pseudowords with various animal and tool categories, and used multivariate pattern classification of fMRI data to decode the semantic representations of the trained pseudowords. We found that: (1) animal and tool information was organized in category-selective subregions of medial left anterior temporal lobe (LATL); (2) tools, but not animals, were encoded in left inferior parietal lobe; and (3) LATL subregions exhibited distinct patterns of functional connectivity with category-related regions across cortex. Our findings suggest that semantic knowledge in LATL is organized in category-related subregions, providing evidence for the existence of multiple, category-specific semantic representations in the brain. Copyright © 2016 the authors.
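The key design point in this abstract, training and testing the classifier on disjoint sets of pseudowords, can be illustrated with an item-wise split. The abstract does not name the classifier, so the nearest-centroid rule below is only a simple stand-in, and all function names are hypothetical.

```python
import numpy as np

def split_by_item(items, test_items):
    """Boolean train/test masks such that no item appears on both sides."""
    items = np.asarray(items)
    test = np.isin(items, list(test_items))
    return ~test, test

def nearest_centroid_fit(X, y):
    """Store one mean activity pattern (centroid) per category."""
    classes = np.unique(y)
    centroids = np.vstack([X[y == c].mean(axis=0) for c in classes])
    return classes, centroids

def nearest_centroid_predict(model, X):
    """Assign each pattern to the category with the closest centroid."""
    classes, centroids = model
    dist = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[dist.argmin(axis=1)]
```

Because the held-out pseudowords never appear in training, above-chance accuracy on them cannot come from item-specific patterns and has to reflect information shared across items of the same category.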
Grey matter volume patterns in thalamic nuclei are associated with familial risk for schizophrenia.

PubMed

Pergola, Giulio; Trizio, Silvestro; Di Carlo, Pasquale; Taurisano, Paolo; Mancini, Marina; Amoroso, Nicola; Nettis, Maria Antonietta; Andriola, Ileana; Caforio, Grazia; Popolizio, Teresa; Rampino, Antonio; Di Giorgio, Annabella; Bertolino, Alessandro; Blasi, Giuseppe

2017-02-01

Previous evidence suggests reduced thalamic grey matter volume (GMV) in patients with schizophrenia (SCZ). However, it is not considered an intermediate phenotype for schizophrenia, possibly because previous studies did not assess the contribution of individual thalamic nuclei and employed univariate statistics. Here, we hypothesized that multivariate statistics would reveal an association of GMV in different thalamic nuclei with familial risk for schizophrenia. We also hypothesized that accounting for the heterogeneity of thalamic GMV in healthy controls (HC) would improve the detection of subjects at familial risk for the disorder. We acquired MRI scans for 96 clinically stable SCZ, 55 non-affected siblings of patients with schizophrenia (SIB), and 249 HC. The thalamus was parceled into seven regions of interest (ROIs). After a canonical univariate analysis, we used GMV estimates of thalamic ROIs, together with total thalamic GMV and premorbid intelligence, as features in Random Forests to classify HC, SIB, and SCZ. Then, we computed a Misclassification Index for each individual and tested the improvement in SIB detection after excluding a subsample of HC misclassified as patients. Random Forests discriminated SCZ from HC (accuracy=81%) and SIB from HC (accuracy=75%). Left anteromedial thalamic volumes were significantly associated with both multivariate classifications (p<0.05). Excluding HC misclassified as SCZ greatly improved HC vs. SIB classification (Cohen's d=1.39). These findings suggest that multivariate statistics identify a familial background associated with thalamic GMV reduction in SCZ. They also suggest the relevance of inter-individual variability of GMV patterns for the discrimination of individuals at familial risk for the disorder. Copyright © 2016 Elsevier B.V. All rights reserved.
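The effect size quoted in this abstract, Cohen's d, is the difference between two group means divided by a standard deviation. The sketch below uses the common pooled-standard-deviation form; whether the authors used this exact variant is an assumption, and the function name is hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(a, b):
    """Cohen's d with a pooled standard deviation (each group needs >= 2 values)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)
```

By the usual rule of thumb, values around 0.8 are already called large, so a d of 1.39 indicates that the two sets of classification results being compared are well separated.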
Quantifying uncertainty in high-resolution coupled hydrodynamic-ecosystem models

NASA Astrophysics Data System (ADS)

Allen, J. I.; Somerfield, P. J.; Gilbert, F. J.

2007-01-01

Marine ecosystem models are becoming increasingly complex and sophisticated, and are being used to estimate the effects of future changes in the earth system with a view to informing important policy decisions. Despite their potential importance, far too little attention has been, and is generally, paid to model errors and the extent to which model outputs actually relate to real-world processes. With the increasing complexity of the models themselves comes an increasing complexity among model results. If we are to develop useful modelling tools for the marine environment we need to be able to understand and quantify the uncertainties inherent in the simulations. Analysing errors within highly multivariate model outputs, and relating them to even more complex and multivariate observational data, are not trivial tasks. Here we describe the application of a series of techniques, including a 2-stage self-organising map (SOM), non-parametric multivariate analysis, and error statistics, to a complex spatio-temporal model run for the period 1988-1989 in the Southern North Sea, coinciding with the North Sea Project which collected a wealth of observational data. We use model output, large spatio-temporally resolved data sets and a combination of methodologies (SOM, MDS, uncertainty metrics) to simplify the problem and to provide tractable information on model performance. The use of a SOM as a clustering tool allows us to simplify the dimensions of the problem while the use of MDS on independent data grouped according to the SOM classification allows us to validate the SOM. The combination of classification and uncertainty metrics allows us to pinpoint the variables and associated processes which require attention in each region. We recommend the use of this combination of techniques for simplifying complex comparisons of model outputs with real data, and analysis of error distributions.
Identifying HIV associated neurocognitive disorder using large-scale Granger causality analysis on resting-state functional MRI

NASA Astrophysics Data System (ADS)

DSouza, Adora M.; Abidin, Anas Z.; Leistritz, Lutz; Wismüller, Axel

2017-02-01

We investigate the applicability of large-scale Granger Causality (lsGC) for extracting a measure of multivariate information flow between pairs of regional brain activities from resting-state functional MRI (fMRI) and test the effectiveness of these measures for predicting a disease state. Such pairwise multivariate measures of interaction provide high-dimensional representations of connectivity profiles for each subject and are used in a machine learning task to distinguish between healthy controls and individuals presenting with symptoms of HIV Associated Neurocognitive Disorder (HAND). Cognitive impairment in several domains can occur as a result of HIV infection of the central nervous system. The current paradigm for assessing such impairment is through neuropsychological testing. With fMRI data analysis, we aim at non-invasively capturing differences in brain connectivity patterns between healthy subjects and subjects presenting with symptoms of HAND. To classify the extracted interaction patterns among brain regions, we use a prototype-based learning algorithm called Generalized Matrix Learning Vector Quantization (GMLVQ). Our approach of characterizing connectivity using lsGC followed by GMLVQ for subsequent classification yields good prediction results with an accuracy of 87% and an area under the ROC curve (AUC) of up to 0.90. We obtain a statistically significant improvement (p<0.01) over a conventional Granger causality approach (accuracy = 0.76, AUC = 0.74). The high accuracy and AUC values obtained with our multivariate approach to connectivity analysis suggest that it is better able to capture changes in interaction patterns between different brain regions than the conventional Granger causality analysis known from the literature.
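As a reading aid for the AUC figures quoted above, the following minimal sketch computes the area under the ROC curve as the fraction of (patient, control) pairs in which the patient receives the higher classifier score, with ties counted as one half. The function name `roc_auc` and the toy scores are hypothetical; this is not the authors' lsGC or GMLVQ pipeline.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a positive case outscores a negative one.

    scores: classifier outputs (higher means more likely positive).
    labels: 1 for positive (e.g. patient), 0 for negative (e.g. control).
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # correctly ordered pair
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: three of the four positive/negative pairs are ordered correctly.
toy_auc = roc_auc([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0])
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported 0.90 and 0.74 can be compared.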
Intelligent quotient estimation of mental retarded people from different psychometric instruments using artificial neural networks.

PubMed

Di Nuovo, Alessandro G; Di Nuovo, Santo; Buono, Serafino

2012-02-01

The estimation of a person's intelligence quotient (IQ) by means of psychometric tests is indispensable in the application of psychological assessment to several fields. When complex tests such as the Wechsler scales, which are the most commonly used and universally recognized parameter for the diagnosis of degrees of retardation, are not applicable, it is necessary to use other psycho-diagnostic tools more suited to the subject's specific condition. But to ensure a homogeneous diagnosis it is necessary to reach a common metric; thus, the aim of our work is to build models able to estimate the Wechsler IQ accurately and reliably, starting from different psycho-diagnostic tools. Four different psychometric tests (Leiter international performance scale; coloured progressive matrices test; the mental development scale; psycho educational profile), along with the Wechsler scale, were administered to a group of 40 mentally retarded subjects, with various pathologies, and control persons. The obtained database is used to evaluate Wechsler IQ estimation models starting from the scores obtained in the other tests. Five modelling methods, two statistical and three from machine learning, that belong to the family of artificial neural networks (ANNs) are employed to build the estimator. Several error metrics for estimated IQ and for retardation level classification are defined to compare the performance of the various models with univariate and multivariate analyses. Eight empirical studies show that, after ten-fold cross-validation, the best average estimation error is 3.37 IQ points and the best mental retardation level classification error is 7.5%. Furthermore, our experiments prove the superior performance of ANN methods over statistical regression ones, because in all cases considered ANN models show the lowest estimation error (from 0.12 to 0.9 IQ points) and the lowest classification error (from 2.5% to 10%).
Since the estimation performance is better than the confidence interval of the Wechsler scales (five IQ points), we consider the models built to be very accurate and reliable, and they can be used to help clinical diagnosis. Computer software based on the results of our work is therefore currently used in a clinical center, and empirical trials confirm its validity. Furthermore, positive results in our multivariate studies suggest new approaches for clinicians. Copyright © 2011 Elsevier B.V. All rights reserved.
Distinctive channel geometry and riparian vegetation: A geomorphic classification for arid ephemeral streams

NASA Astrophysics Data System (ADS)

Sutfin, N.; Shaw, J. R.; Wohl, E. E.; Cooper, D.

2012-12-01

Interactions between hydrology, channel form, and riparian vegetation along arid ephemeral streams are not thoroughly understood and current stream classifications do not adequately represent variability in channel geometry and associated riparian communities. Relatively infrequent hydrologic disturbances in dryland environments are responsible for creation and maintenance of channel form that supports riparian communities. To investigate the influence of channel characteristics on riparian vegetation in the arid southwestern United States, we develop a geomorphic classification for arid ephemeral streams based on the degree of confinement and the composition of confining material that provide constraints on available moisture. Our conceptual model includes five stream types: 1) bedrock channels entirely confined by exposed bedrock and devoid of persistent alluvium; 2) bedrock with alluvium channels at least partially confined by bedrock but containing enough alluvium to create bedforms that persist through time; 3) incised alluvium channels bound only by unconsolidated alluvial material into which they are incised; 4) braided washes that exhibit multi-thread, braided characteristics regardless of the composition of confining material; and 5) piedmont headwater 0-2nd order streams (Strahler) confined only by unconsolidated alluvium and which initiate as secondary channels on piedmont surfaces. Eighty-six study reaches representing the five stream types were surveyed on the U.S. Army Yuma Proving Ground in the Sonoran Desert of southwestern Arizona. Non-parametric multivariate analysis of variance (PERMANOVA) indicates significant differences between the five stream types with regards to channel geometry (i.e., stream gradient, width-to-depth ratio, the ratio between valley width and channel width (Wv/Wc), shear stress, and unit stream power) and riparian vegetation (i.e., presence and canopy coverage by species, canopy stratum, and life form). 
Discriminant analysis of the physical driving variables is being conducted to produce a model that predicts stream type and resulting riparian vegetation communities based on channel geometry. This model will be tested on a separate set of 15 study reaches surveyed on the Barry M. Goldwater Air Force Range in southern Arizona. The resulting classification will provide a basis for examining relationships between hydrology, channel and watershed characteristics, riparian vegetation and ecosystem sensitivity of ephemeral streams in arid regions of the American Southwest.
Textural Analysis and Substrate Classification in the Nearshore Region of Lake Superior Using High-Resolution Multibeam Bathymetry

NASA Astrophysics Data System (ADS)

Dennison, Andrew G.

Classification of the seafloor substrate can be done with a variety of methods. These methods include visual (dives, drop cameras), mechanical (cores, grab samples), and acoustic (statistical analysis of echosounder returns) approaches. Acoustic methods offer a more powerful and efficient means of collecting useful information about the bottom type. Due to the nature of an acoustic survey, larger areas can be sampled, and combining the collected data with visual and mechanical survey methods provides greater confidence in the classification of a mapped region. During a multibeam sonar survey, both bathymetric and backscatter data are collected. It is well documented that the statistical characteristics of a sonar backscatter mosaic depend on bottom type. While classifying the bottom type on the basis of backscatter alone can accurately predict and map bottom type, i.e., distinguish a muddy area from a rocky area, it lacks the ability to resolve and capture fine textural details, an important factor in many habitat mapping studies. Statistical processing of high-resolution multibeam data can capture the pertinent details about the bottom type that are rich in textural information. Further multivariate statistical processing can then isolate characteristic features and provide the basis for an accurate classification scheme. The development of a new classification method is described here. It is based upon the analysis of textural features in conjunction with ground truth sampling. The processing and classification results for two geologically distinct areas in nearshore regions of Lake Superior, off the Lester River, MN and the Amnicon River, WI, are presented here, using the Minnesota Supercomputer Institute's Mesabi computing cluster for initial processing. Processed data are then calibrated using ground truth samples to conduct an accuracy assessment of the surveyed areas.
From analysis of high-resolution bathymetry data collected at both survey sites it was possible to successfully calculate a series of measures that describe textural information about the lake floor. Further processing suggests that the features calculated capture a significant amount of statistical information about the lake floor terrain as well. Two sources of error, an anomalous heave and a refraction error, significantly deteriorated the quality of the processed data and the resulting validation results. Ground truth samples used to validate the classification methods at both survey sites, however, resulted in accuracy values ranging from 5 to 30 percent at the Amnicon River and from 60 to 70 percent at the Lester River. The final results suggest that this new processing methodology does adequately capture textural information about the lake floor and does provide an acceptable classification in the absence of significant data quality issues.
Smoking prevalence and seizure control in Chinese males with epilepsy.

PubMed

Gao, Hui; Sander, Josemir W; Du, Xudong; Chen, Jiani; Zhu, Cairong; Zhou, Dong

2017-08-01

Smoking has a negative effect on most diseases, yet it is under-investigated in people with epilepsy; thus its role is not clear in the general population with epilepsy. We performed a retrospective pilot study on males with epilepsy to determine the smoking rate and its relationship with seizure control using univariate analysis to calculate odds ratios (ORs) and also used a multi-variate logistic regression model. The smoking rate in our sample of 278 individuals was 25.5%, which is lower than the general Chinese population smoking rate among males of 52.1%. We used two classifications: the first classified epilepsy as generalized, or by presumed topographic origin (temporal, frontal, parietal and occipital). The second classified the dominant seizure type of an individual as generalized tonic clonic seizure (GTCS), myoclonic seizure (MS), complex partial seizure (CPS), simple partial seizure (SPS), and secondary GTCS (sGTCS). The univariable analysis of satisfactory seizure control profile and smoking rate in both classifications showed a trend towards a beneficial effect of smoking although most were not statistically significant. Considering medication is an important confounding factor that would largely influence seizure control, we also conducted multi-variable analysis for both classifications with drug numbers and dosage. The result of our model also suggested that smoking is a protective factor. Our findings seem to suggest that smoking could have a potential role in seizure control although confounders need exploration particularly in view of the potential long term health effects. Replication in a much larger sample is needed as well as case control studies to elucidate this issue. Copyright © 2017 Elsevier Inc. All rights reserved.
Study for Updated Gout Classification Criteria (SUGAR): identification of features to classify gout

PubMed Central

Taylor, William J.; Fransen, Jaap; Jansen, Tim L.; Dalbeth, Nicola; Schumacher, H. Ralph; Brown, Melanie; Louthrenoo, Worawit; Vazquez-Mellado, Janitzia; Eliseev, Maxim; McCarthy, Geraldine; Stamp, Lisa K.; Perez-Ruiz, Fernando; Sivera, Francisca; Ea, Hang-Korng; Gerritsen, Martijn; Scire, Carlo; Cavagna, Lorenzo; Lin, Chingtsai; Chou, Yin-Yi; Tausche, Anne-Kathrin; Vargas-Santos, Ana Beatriz; Janssen, Matthijs; Chen, Jiunn-Horng; Slot, Ole; Cimmino, Marco A.; Uhlig, Till; Neogi, Tuhina

2015-01-01

Objective To determine which clinical, laboratory and imaging features most accurately distinguished gout from non-gout. Methods A cross-sectional study of consecutive rheumatology clinic patients with at least one swollen joint or subcutaneous tophus. Gout was defined by synovial fluid or tophus aspirate microscopy by certified examiners in all patients. The sample was randomly divided into a model development (2/3) and test sample (1/3). Univariate and multivariate association between clinical features and MSU-defined gout was determined using logistic regression modelling. Shrinkage of regression weights was performed to prevent over-fitting of the final model. Latent class analysis was conducted to identify patterns of joint involvement. Results In total, 983 patients were included. Gout was present in 509 (52%). In the development sample (n=653), these features were selected for the final model (multivariate OR) joint erythema (2.13), difficulty walking (7.34), time to maximal pain < 24 hours (1.32), resolution by 2 weeks (3.58), tophus (7.29), MTP1 ever involved (2.30), location of currently tender joints: Other foot/ankle (2.28), MTP1 (2.82), serum urate level > 6 mg/dl (0.36 mmol/l) (3.35), ultrasound double contour sign (7.23), Xray erosion or cyst (2.49). The final model performed adequately in the test set with no evidence of misfit, high discrimination and predictive ability. MTP1 involvement was the most common joint pattern (39.4%) in gout cases. Conclusion Ten key discriminating features have been identified for further evaluation for new gout classification criteria. Ultrasound findings and degree of uricemia add discriminating value, and will significantly contribute to more accurate classification criteria. PMID:25777045
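The multivariate odds ratios listed above are exponentiated logistic regression coefficients, so each feature's contribution is additive on the log-odds scale. The sketch below illustrates only that relationship for three of the reported features. It ignores the model intercept and the shrinkage step, the dictionary keys and function names are hypothetical, and it should not be read as the SUGAR scoring rule.

```python
import math

# Multivariate odds ratios quoted in the abstract (a subset, for illustration).
ODDS_RATIOS = {
    "tophus": 7.29,
    "difficulty_walking": 7.34,
    "ultrasound_double_contour": 7.23,
}


def log_odds_weight(odds_ratio):
    """Logistic regression coefficient implied by an odds ratio: beta = ln(OR)."""
    return math.log(odds_ratio)


def combined_odds_ratio(present_features):
    """Odds multiplier when several features are present together.

    Assumes a logistic model without interaction terms, so the log-odds
    weights add and the odds ratios multiply.
    """
    total_log_odds = sum(log_odds_weight(ODDS_RATIOS[f]) for f in present_features)
    return math.exp(total_log_odds)


# Tophus plus difficulty walking multiplies the odds by 7.29 * 7.34.
example = combined_odds_ratio(["tophus", "difficulty_walking"])
```

Turning such an odds multiplier into a probability would also require the intercept of the fitted model, which the abstract does not report.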
Incorporation of N0 Stage with Insufficient Numbers of Lymph Nodes into N1 Stage in the Seventh Edition of the TNM Classification Improves Prediction of Prognosis in Gastric Cancer: Results of a Single-Institution Study of 1258 Chinese Patients.

PubMed

Li, Bofei; Li, Yuanfang; Wang, Wei; Qiu, Haibo; Seeruttun, Sharvesh Raj; Fang, Cheng; Chen, Yongming; Liang, Yao; Li, Wei; Chen, Yingbo; Sun, Xiaowei; Guan, Yuanxiang; Zhan, Youqing; Zhou, Zhiwei

2016-01-01

This study examined the prognosis of the "node-negative with eLNs ≤ 15" designation and the additional value of incorporating it into the pN1 designation in the seventh edition of the N classification. From January 2000 to September 2010, a total of 1258 gastric cancer patients (patients with eLNs > 15 or node-negative with eLNs ≤ 15) undergoing radical gastric resection were enrolled in this study. We incorporated node-negative patients with eLNs ≤ 15 into pN1 and compared this designation with the current 7th edition UICC N stage for 3- and 5-year overall survival by univariate and multivariate analysis. Homogeneity, discriminatory ability, and monotonicity of gradients in the hypothetical N stage and the UICC N stage were compared using linear trend χ2, likelihood ratio χ2 statistics, and Akaike information criterion (AIC) calculations. Node-negative patients with eLNs ≤ 15 had worse survival compared with those with eLNs > 15. In univariate and multivariate analyses, the hypothetical N stage showed superiority to the 7th edition pN staging. The hypothetical staging system had higher linear trend and likelihood ratio χ2 scores and smaller AIC values compared with those for the TNM system, which represented the optimum prognostic stratification. Node-negative patients with eLNs ≤ 15 can be considered to be incorporated into the pN1 stage in the 7th edition of the TNM classification.
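The AIC comparison mentioned above follows the standard definition AIC = 2k − 2 ln L, where k is the number of fitted parameters and L the maximized likelihood; the staging system with the smaller AIC is preferred. The sketch below shows the arithmetic. The log-likelihoods and parameter counts are hypothetical placeholders, not values from the study.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (smaller is better)."""
    return 2 * n_params - 2 * log_likelihood


def preferred_model(models):
    """Return the name of the model with the smallest AIC.

    models maps a model name to a (log_likelihood, n_params) pair.
    """
    return min(models, key=lambda name: aic(*models[name]))


# Hypothetical fitted survival models for the two staging systems.
candidates = {
    "uicc_7th_n_stage": (-100.0, 3),
    "hypothetical_n_stage": (-98.0, 3),
}
best = preferred_model(candidates)
```

With equal parameter counts the comparison reduces to comparing log-likelihoods, which is why a staging system that stratifies survival better also shows the smaller AIC.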
Tumor invasiveness defined by IASLC/ATS/ERS classification of ground-glass nodules can be predicted by quantitative CT parameters.

PubMed

Zhou, Qian-Jun; Zheng, Zhi-Chun; Zhu, Yong-Qiao; Lu, Pei-Ji; Huang, Jia; Ye, Jian-Ding; Zhang, Jie; Lu, Shun; Luo, Qing-Quan

2017-05-01

To investigate the potential value of CT parameters to differentiate ground-glass nodules between noninvasive adenocarcinoma and invasive pulmonary adenocarcinoma (IPA) as defined by IASLC/ATS/ERS classification. We retrospectively reviewed 211 patients with pathologically proved stage 0-IA lung adenocarcinoma which appeared as subsolid nodules, from January 2012 to January 2013 including 137 pure ground glass nodules (pGGNs) and 74 part-solid nodules (PSNs). Pathological data was classified under the 2011 IASLC/ATS/ERS classification. Both quantitative and qualitative CT parameters were used to determine the tumor invasiveness between noninvasive adenocarcinomas and IPAs. There were 154 noninvasive adenocarcinomas and 57 IPAs. In pGGNs, CT size and area, one-dimensional mean CT value and bubble lucency were significantly different between noninvasive adenocarcinomas and IPAs on univariate analysis. Multivariate regression and ROC analysis revealed that CT size and one-dimensional mean CT value were predictive of noninvasive adenocarcinomas compared to IPAs. Optimal cutoff values were 13.60 mm (sensitivity, 75.0%; specificity, 99.6%) and -583.60 HU (sensitivity, 68.8%; specificity, 66.9%). In PSNs, there were significant differences in CT size and area, solid component area, solid proportion, one-dimensional mean and maximum CT value, and three-dimensional (3D) mean CT value between noninvasive adenocarcinomas and IPAs on univariate analysis. Multivariate and ROC analysis showed that CT size and 3D mean CT value were significant differentiators. Optimal cutoff values were 19.64 mm (sensitivity, 53.7%; specificity, 93.9%) and -571.63 HU (sensitivity, 85.4%; specificity, 75.8%). For pGGNs, CT size and one-dimensional mean CT value are determinants for tumor invasiveness. For PSNs, tumor invasiveness can be predicted by CT size and 3D mean CT value.
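The abstract reports optimal cutoffs with their sensitivity and specificity but does not say how the cutoffs were chosen. One common choice in ROC analysis is to maximize Youden's J = sensitivity + specificity − 1; the sketch below assumes that criterion and uses hypothetical nodule sizes, so it illustrates the idea rather than reproducing the study's analysis.

```python
def sensitivity_specificity(values, labels, cutoff):
    """Sensitivity and specificity when 'value >= cutoff' predicts the positive class."""
    tp = sum(1 for v, y in zip(values, labels) if y == 1 and v >= cutoff)
    fn = sum(1 for v, y in zip(values, labels) if y == 1 and v < cutoff)
    tn = sum(1 for v, y in zip(values, labels) if y == 0 and v < cutoff)
    fp = sum(1 for v, y in zip(values, labels) if y == 0 and v >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def youden_cutoff(values, labels):
    """Scan observed values and return (cutoff, sensitivity, specificity) maximizing J."""
    best = None
    for cutoff in sorted(set(values)):
        sens, spec = sensitivity_specificity(values, labels, cutoff)
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, cutoff, sens, spec)
    return best[1], best[2], best[3]


# Hypothetical CT sizes in mm; label 1 = invasive, 0 = noninvasive.
sizes = [8, 10, 12, 14, 15, 18, 20]
invasive = [0, 0, 0, 1, 0, 1, 1]
cutoff, sens, spec = youden_cutoff(sizes, invasive)
```

For a parameter where lower values indicate invasiveness, the comparison direction would be reversed; the study's own direction for each parameter is not stated in the abstract.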

Land Cover Classification in a Complex Urban-Rural Landscape with Quickbird Imagery

PubMed Central

Moran, Emilio Federico.

2010-01-01

High spatial resolution images have been increasingly used for urban land use/cover classification, but the high spectral variation within the same land cover, the spectral confusion among different land covers, and the shadow problem often lead to poor classification performance based on the traditional per-pixel spectral-based classification methods. This paper explores approaches to improve urban land cover classification with Quickbird imagery. Traditional per-pixel spectral-based supervised classification, incorporation of textural images and multispectral images, spectral-spatial classifier, and segmentation-based classification are examined in a relatively new developing urban landscape, Lucas do Rio Verde in Mato Grosso State, Brazil. This research shows that use of spatial information during the image classification procedure, either through the integrated use of textural and spectral images or through the use of segmentation-based classification method, can significantly improve land cover classification performance. PMID:21643433
Prognostic factors and risk stratification in patients with castration-resistant prostate cancer receiving docetaxel-based chemotherapy.

PubMed

Yamashita, Shimpei; Kohjimoto, Yasuo; Iguchi, Takashi; Koike, Hiroyuki; Kusumoto, Hiroki; Iba, Akinori; Kikkawa, Kazuro; Kodama, Yoshiki; Matsumura, Nagahide; Hara, Isao

2016-03-22

While novel drugs have been developed, docetaxel remains one of the standard initial systemic therapies for castration-resistant prostate cancer (CRPC) patients. Despite the excellent anti-tumor effect of docetaxel, its severe adverse effects sometimes distress patients. Therefore, it would be very helpful to predict the efficacy of docetaxel before treatment. The aims of this study were to evaluate the potential value of patient characteristics in predicting overall survival (OS) and to develop a risk classification for CRPC patients treated with docetaxel-based chemotherapy. This study included 79 patients with CRPC treated with docetaxel. The variables, including patient characteristics at diagnosis and at the start of chemotherapy, were retrospectively collected. Prognostic factors predicting OS were analyzed using the Cox proportional hazard model. Risk stratification for overall survival was determined based on the results of multivariate analysis. PSA response ≥50 % was observed in 55 (69.6 %) of all patients, and the median OS was 22.5 months. The multivariate analysis showed that age, serum PSA level at the start of chemotherapy, and Hb were independent prognostic factors for OS. In addition, ECOG performance status (PS) and the CRP-to-albumin ratio were not significant but were considered possible predictors for OS. Risk stratification according to the number of these risk factors could effectively stratify CRPC patients treated with docetaxel in terms of OS. Age, serum PSA level at the start of chemotherapy, and Hb were identified as independent prognostic factors of OS. ECOG PS and the CRP-to-albumin ratio were not significant, but were considered possible predictors for OS in Japanese CRPC patients treated with docetaxel. Risk stratification based on these factors could be helpful for estimating overall survival.
The incidence and prevalence of pterygium in South Korea: A 10-year population-based Korean cohort study.

PubMed

Rim, Tyler Hyungtaek; Kang, Min Jae; Choi, Moonjung; Seo, Kyoung Yul; Kim, Sung Soo

2017-01-01

Although numerous population-based studies have reported the prevalences and risk factors for pterygium, information regarding the incidence of pterygium is scarce. This population-based cohort study aimed to evaluate the South Korean incidence and prevalence of pterygium. We retrospectively obtained data from a nationally representative sample of 1,116,364 South Koreans in the Korea National Health Insurance Service National Sample Cohort (NHIS-NSC). The associated sociodemographic factors were evaluated using multivariable Cox regression analysis, and the hazard ratios and confidence intervals were calculated. Pterygium was defined based on the Korean Classification of Diseases code, and surgically removed pterygium was defined as cases that required surgical removal. We identified 21,465 pterygium cases and 8,338 surgically removed pterygium cases during the study period. The overall incidences were 2.1 per 1,000 person-years for pterygium and 0.8 per 1,000 person-years for surgically removed pterygium. Among subjects who were ≥40 years old, the incidences were 4.3 per 1,000 person-years for pterygium and 1.7 per 1,000 person-years for surgically removed pterygium. The overall prevalences were 1.9% for pterygium and 0.6% for surgically removed pterygium, and the prevalences increased to 3.8% for pterygium and 1.4% for surgically removed pterygium among subjects who were ≥40 years old. The incidences of pterygium decreased according to year. The incidence and prevalence of pterygium were highest among 60-79-year-old individuals. Increasing age, female sex, and living in a relatively rural area were associated with increased risks of pterygium and surgically removed pterygium in the multivariable Cox regression analysis. Our analyses of South Korean national insurance claims data revealed a decreasing trend in the incidence of pterygium during the study period.
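Incidence figures such as "2.1 per 1,000 person-years" are obtained by dividing the number of new cases by the total follow-up time contributed by the cohort. The sketch below shows that calculation with hypothetical follow-up data; the abstract does not report the cohort's total person-years, so none of the numbers in the code come from the study.

```python
def total_person_years(followup_years):
    """Sum the follow-up time (in years) contributed by each cohort member."""
    return float(sum(followup_years))


def incidence_per_1000_person_years(new_cases, person_years):
    """Incidence rate expressed per 1,000 person-years of follow-up."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return 1000.0 * new_cases / person_years


# Hypothetical cohort: 2,000 people followed for 5 years each, 21 new cases.
person_years = total_person_years([5.0] * 2000)
rate = incidence_per_1000_person_years(21, person_years)
```

Prevalence, by contrast, is a proportion of people with the condition, which is why it is reported in percent rather than per person-year.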
River reach classification for the Greater Mekong Region at high spatial resolution

NASA Astrophysics Data System (ADS)

Ouellet Dallaire, C.; Lehner, B.

2014-12-01

River classifications have been used in river health and ecological assessments as coarse proxies to represent aquatic biodiversity when comprehensive biological and/or species data is unavailable. Currently there are no river classifications or biological data available in a consistent format for the extent of the Greater Mekong Region (GMR; including the Irrawaddy, the Salween, the Chao Praya, the Mekong and the Red River basins). The current project proposes a new river habitat classification for the region, facilitated by the HydroSHEDS (HYDROlogical SHuttle Elevation Derivatives at multiple Scales) database at 500 m pixel resolution. The classification project is based on the Global River Classification framework relying on the creation of multiple sub-classifications based on different disciplines. The resulting classes from the sub-classifications are later combined into final classes to create a holistic river reach classification. For the GMR, a final habitat classification was created based on three sub-classifications: a hydrological sub-classification based only on discharge indices (river size and flow variability); a physio-climatic sub-classification based on large scale indices of climate and elevation (biomes, ecoregions and elevation); and a geomorphological sub-classification based on local morphology (presence of floodplains, reach gradient and sand transport). Key variables and thresholds were identified in collaboration with local experts to ensure that regional knowledge was included. The final classification is composed of 54 unique final classes based on 3 sub-classifications with fewer than 15 classes each. The resulting classifications are driven by abiotic variables and do not include biological data, but they represent a state-of-the-art product based on best available data (mostly global data). The most common river habitat type is the "dry broadleaf, low gradient, very small river".
These classifications could be applied in a wide range of hydro-ecological assessments and could be useful to a variety of stakeholders such as NGOs, governments and researchers.
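The abstract states that the sub-classification results are combined into final classes but does not give the combination rule. A minimal sketch, assuming that every distinct combination of hydrological, physio-climatic and geomorphological labels observed among the reaches becomes one final class, could look as follows. The labels and the function name are hypothetical.

```python
def combine_subclasses(reaches):
    """Assign a final class id to each reach from its tuple of sub-class labels.

    reaches: list of (hydrological, physio_climatic, geomorphological) tuples.
    Returns (final_ids, mapping), where mapping sends each observed
    combination to a class id numbered in order of first appearance.
    """
    mapping = {}
    final_ids = []
    for combo in reaches:
        if combo not in mapping:
            mapping[combo] = len(mapping) + 1
        final_ids.append(mapping[combo])
    return final_ids, mapping


# Hypothetical reaches with labels from the three sub-classifications.
reaches = [
    ("very small river", "dry broadleaf", "low gradient"),
    ("large river", "moist broadleaf", "floodplain"),
    ("very small river", "dry broadleaf", "low gradient"),
]
final_ids, mapping = combine_subclasses(reaches)
```

Under this assumption only combinations that actually occur produce a class, which would be consistent with a final count (54) far below the product of the sub-class counts.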
Missing data exploration: highlighting graphical presentation of missing pattern.

PubMed

Zhang, Zhongheng

2015-12-01

Functions shipped with R base can fulfill many tasks of missing data handling. However, because the data volume of electronic medical record (EMR) system is always very large, more sophisticated methods may be helpful in data management. The article focuses on missing data handling by using advanced techniques. There are three types of missing data, that is, missing completely at random (MCAR), missing at random (MAR) and not missing at random (NMAR). This classification system depends on how missing values are generated. Two packages, Multivariate Imputation by Chained Equations (MICE) and Visualization and Imputation of Missing Values (VIM), provide sophisticated functions to explore missing data pattern. In particular, the VIM package is especially helpful in visual inspection of missing data. Finally, correlation analysis provides information on the dependence of missing data on other variables. Such information is useful in subsequent imputations.
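The article itself works in R; as a language-neutral illustration of what a missing-data pattern table contains, the sketch below counts how often each combination of missing and observed variables occurs in a small set of records. It is written in plain Python, the records and variable names are hypothetical, and it does not reproduce the MICE or VIM interfaces.

```python
from collections import Counter


def missing_patterns(rows, columns):
    """Count rows per missingness pattern.

    rows: list of dicts; a value of None (or an absent key) means missing.
    columns: variables to inspect, in a fixed order.
    Returns a Counter keyed by tuples of booleans (True = missing).
    """
    patterns = Counter()
    for row in rows:
        pattern = tuple(row.get(col) is None for col in columns)
        patterns[pattern] += 1
    return patterns


# Hypothetical EMR-style records.
records = [
    {"age": 60, "lactate": None},
    {"age": None, "lactate": None},
    {"age": 55, "lactate": 2.1},
    {"age": 70, "lactate": None},
]
patterns = missing_patterns(records, ["age", "lactate"])
```

A pattern such as `(False, True)` occurring often suggests that one variable is frequently missing while the other is observed, which is the kind of dependence the correlation analysis mentioned above is meant to examine.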
Classification of reaches in the Missouri and lower Yellowstone Rivers based on flow characteristics

USGS Publications Warehouse

Pegg, Mark A.; Pierce, Clay L.

2002-01-01

Several aspects of flow have been shown to be important determinants of biological community structure and function in streams, yet direct application of this approach to large rivers has been limited. Using a multivariate approach, we grouped flow gauges into hydrologically similar units in the Missouri and lower Yellowstone Rivers and developed a model based on flow variability parameters that could be used to test hypotheses about the role of flow in determining aquatic community structure. This model could also be used for future comparisons as the hydrological regime changes. A suite of hydrological parameters for the recent, post-impoundment period (1 October 1966–30 September 1996) for each of 15 gauges along the Missouri and lower Yellowstone Rivers were initially used. Preliminary graphical exploration identified five variables for use in further multivariate analyses. Six hydrologically distinct units composed of gauges exhibiting similar flow characteristics were then identified using cluster analysis. Discriminant analyses identified the three most influential variables as flow per unit drainage area, coefficient of variation of mean annual flow, and flow constancy. One surprising result was the relative similarity of flow regimes between the two uppermost and three lowermost gauges, despite large differences in magnitude of flow and separation by roughly 3000 km. Our results synthesize, simplify and interpret the complex changes in flow occurring along the Missouri and lower Yellowstone Rivers, and provide an objective grouping for future tests of how these changes may affect biological communities.
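Two of the influential variables named above have simple definitions that a short sketch can make concrete: flow per unit drainage area, and the coefficient of variation of mean annual flow. The example assumes the coefficient of variation is the sample standard deviation divided by the mean; the study's exact formulas and units are not given in the abstract, and all numbers below are hypothetical.

```python
import statistics


def flow_per_unit_area(mean_flow, drainage_area):
    """Mean flow divided by the drainage area upstream of the gauge."""
    if drainage_area <= 0:
        raise ValueError("drainage_area must be positive")
    return mean_flow / drainage_area


def coefficient_of_variation(annual_mean_flows):
    """Sample standard deviation of annual mean flows divided by their mean."""
    return statistics.stdev(annual_mean_flows) / statistics.mean(annual_mean_flows)


# Hypothetical gauge: three years of mean annual flow and a drainage area.
annual_flows = [90.0, 100.0, 110.0]
cv = coefficient_of_variation(annual_flows)
unit_flow = flow_per_unit_area(statistics.mean(annual_flows), 50.0)
```

Normalizing by drainage area or by the mean removes the effect of river size, which may help explain why gauges with very different flow magnitudes can still group together.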
Object-Based Random Forest Classification of Land Cover from Remotely Sensed Imagery for Industrial and Mining Reclamation

NASA Astrophysics Data System (ADS)

Chen, Y.; Luo, M.; Xu, L.; Zhou, X.; Ren, J.; Zhou, J.

2018-04-01

The RF method based on grid-search parameter optimization achieved a classification accuracy of 88.16 % in the classification of images with multiple feature variables. This classification accuracy was higher than that of SVM and ANN under the same feature variables. In terms of efficiency, the RF classification method performed better than SVM and ANN, and it was more capable of handling multidimensional feature variables. Combining the RF method with an object-based analysis approach improved the classification accuracy further. The multiresolution segmentation approach based on ESP scale parameter optimization was used to obtain six scales for image segmentation; when the segmentation scale was 49, the classification accuracy reached its highest value of 89.58 %. The classification accuracy of object-based RF classification was thus 1.42 percentage points higher than that of pixel-based classification (88.16 %). Therefore, the RF classification method combined with an object-based analysis approach could achieve relatively high accuracy in the classification and extraction of land use information for industrial and mining reclamation areas. Moreover, the interpretation of remotely sensed imagery using the proposed method could provide technical support and a theoretical reference for remotely sensed monitoring of land reclamation.
Delineation of estuarine management areas using multivariate geostatistics: the case of Sado Estuary.

PubMed

Caeiro, Sandra; Goovaerts, Pierre; Painho, Marco; Costa, M Helena

2003-09-15

The Sado Estuary is a coastal zone located in the south of Portugal where conflicts between conservation and development exist because of its location near industrialized urban zones and its designation as a natural reserve. The aim of this paper is to evaluate a set of multivariate geostatistical approaches to delineate spatially contiguous regions of sediment structure for Sado Estuary. These areas will be the supporting infrastructure of an environmental management system for this estuary. The boundaries of each homogeneous area were derived from three sediment characterization attributes through three different approaches: (1) cluster analysis of dissimilarity matrix function of geographical separation followed by indicator kriging of the cluster data, (2) discriminant analysis of kriged values of the three sediment attributes, and (3) a combination of methods 1 and 2. Final maximum likelihood classification was integrated into a geographical information system. All methods generated fairly spatially contiguous management areas that reproduce well the environment of the estuary. Map comparison techniques based on kappa statistics showed that the resultant three maps are similar, supporting the choice of any of the methods as appropriate for management of the Sado Estuary. However, the results of method 1 seem to be in better agreement with estuary behavior, assessment of contamination sources, and previous work conducted at this site.
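Map comparison based on kappa statistics measures how much two classified maps agree beyond what chance alone would produce. The sketch below computes Cohen's kappa for two maps given as equal-length lists of cell labels. The abstract does not specify which kappa variant was used, so this plain form and the toy maps are assumptions for illustration.

```python
from collections import Counter


def cohens_kappa(map_a, map_b):
    """Cohen's kappa between two classifications of the same cells."""
    if len(map_a) != len(map_b) or not map_a:
        raise ValueError("maps must be non-empty and of equal length")
    n = len(map_a)
    # Observed agreement: fraction of cells with the same class in both maps.
    observed = sum(1 for a, b in zip(map_a, map_b) if a == b) / n
    # Chance agreement from the marginal class frequencies of each map.
    counts_a = Counter(map_a)
    counts_b = Counter(map_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both maps assign every cell to the same single class
    return (observed - expected) / (1.0 - expected)


# Hypothetical management-area labels for four grid cells in two maps.
kappa = cohens_kappa([1, 1, 2, 2], [1, 1, 2, 1])
```

Values near 1 indicate near-identical maps and values near 0 indicate agreement no better than chance.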
Partial Least Squares for Discrimination in fMRI Data

PubMed Central

Andersen, Anders H.; Rayens, William S.; Liu, Yushu; Smith, Charles D.

2011-01-01

Multivariate methods for discrimination were used in the comparison of brain activation patterns between groups of cognitively normal women who are at either high or low Alzheimer's disease risk based on family history and apolipoprotein-E4 status. Linear discriminant analysis (LDA) was preceded by dimension reduction using either principal component analysis (PCA), partial least squares (PLS), or a new oriented partial least squares (OrPLS) method. The aim was to identify a spatial pattern of functionally connected brain regions that was differentially expressed by the risk groups and yielded optimal classification accuracy. Multivariate dimension reduction is required prior to LDA when the data contains more feature variables than there are observations on individual subjects. Whereas PCA has been commonly used to identify covariance patterns in neuroimaging data, this approach only identifies gross variability and is not capable of distinguishing among-groups from within-groups variability. PLS and OrPLS provide a more focused dimension reduction by incorporating information on class structure and therefore lead to more parsimonious models for discrimination. Performance was evaluated in terms of the cross-validated misclassification rates. The results support the potential of using fMRI as an imaging biomarker or diagnostic tool to discriminate individuals with disease or high risk. PMID:22227352
Saturnius minutus n. sp. and S. dimitrovi n. sp. (Digenea: Hemiuridae) from Mugil cephalus L. (Teleostei: Mugilidae), with a multivariate morphological analysis of the Mediterranean species of Saturnius Manter, 1969.

PubMed

Blasco-Costa, I; Pankov, P; Gibson, D I; Balbuena, J A; Raga, J A; Sarabeev, V L; Kostadinova, A

2006-09-01

Three species of the bunocotyline genus Saturnius Manter, 1969 are described from the stomach lining of mugilid fishes of the Mediterranean and Black Seas. Two of the species are new: S. minutus n. sp. occurs in Mugil cephalus off the Mediterranean coast of Spain; and S. dimitrovi n. sp., a parasite of M. cephalus off the Bulgarian Black Sea coast and the Spanish Mediterranean coast, was originally described as S. papernai by Dimitrov et al. (1998). In addition, S. papernai Overstreet, 1977 is redescribed from M. cephalus off the Spanish Mediterranean coast and from Liza aurata and L. saliens off the Bulgarian Black Sea coast. The three species are distinguished morphometrically using univariate and multivariate analyses. These results were verified using Linear Discriminant Analysis which correctly allocated all specimens to their species designations based on morphology (i.e. 100% successful classification rate) and assigned almost all specimens to the correct population (locality). The following variables were selected for optimal separation between samples: the length of the forebody, ventral sucker and posterior testis, the length and width of the posteriormost pseudosegment, and the width of the muscular flange at ventral sucker level.
Multivariate Analysis as a Method for Evaluating the Conceptual Perceptions of Korean Medicine Students regarding Phlegm Pattern

PubMed Central

Kim, Hyungsuk; Park, Young-Jae; Park, Young-Bae

2013-01-01

Individuals may perceive the concepts in Korean medicine pattern classification differently because classification integrates a variety of information. Analysis of individual perspectives is therefore important for examining the cross-sectional state of how Korean medicine concepts are perceived and for developing both clinical guidelines, including diagnosis, and the curriculum of Korean medicine colleges. Moreover, because this conceptual difference is thought to begin with college education, it is worthwhile to observe students' viewpoints. We therefore proposed multivariate analysis to explore the dimensional structure of Korean medicine students' conceptual perceptions regarding phlegm pattern. We surveyed 326 students divided into 5 groups based on their year of study. Data were analyzed using multidimensional scaling and factor analysis. Within-group difference was the smallest for third-year students, who had received Korean medicine education in full for the first time. With the exception of first-year students, the conceptual map revealed that each group's mean perceptions of phlegm pattern were distributed in an almost linear fashion. To determine the effect of education, we investigated the preference rankings and scores of each symptom. We also extracted factors to identify latent variables and to compare the between-group conceptual characteristics regarding phlegm pattern. PMID:24062789
Land Cover Analysis by Using Pixel-Based and Object-Based Image Classification Method in Bogor

NASA Astrophysics Data System (ADS)

Amalisana, Birohmatin; Rokhmatullah; Hernina, Revi

2017-12-01

Image classification provides information about the earth's surface, such as land cover and its changes over time. Pixel-based image classification is commonly performed with a variety of algorithms, such as minimum distance, parallelepiped, maximum likelihood, and Mahalanobis distance. Land cover classification can also be obtained using object-based image classification, which relies on image segmentation controlled by parameters such as scale, form, colour, smoothness and compactness. This research aims to compare the land cover classification results and the resulting change detection between the parallelepiped pixel-based method and the object-based method. The study area is Bogor, with a 20-year observation range from 1996 to 2016. The region is known as an urban area that changes continuously due to rapid development, which makes time-series land cover information for this region of particular interest.
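The parallelepiped pixel-based method named in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' workflow: each class box is taken as the per-band minimum and maximum of its training pixels (some implementations use mean plus or minus a multiple of the standard deviation instead), overlapping boxes are resolved by first match, and the function names and toy training values are hypothetical.

```python
def fit_boxes(training):
    """Build one box per class from training pixels.

    training: dict mapping class name -> list of pixel vectors
    (one value per spectral band).
    Returns: dict mapping class name -> (per-band minima, per-band maxima).
    """
    boxes = {}
    for label, pixels in training.items():
        bands = list(zip(*pixels))  # regroup values band by band
        boxes[label] = ([min(b) for b in bands], [max(b) for b in bands])
    return boxes


def classify(pixel, boxes, unclassified="unclassified"):
    """Assign the pixel to the first class whose box contains it in every band."""
    for label, (lo, hi) in boxes.items():
        if all(l <= v <= h for v, l, h in zip(pixel, lo, hi)):
            return label
    return unclassified


# Hypothetical two-band training samples, for illustration only.
training = {
    "water": [(10, 5), (14, 8), (12, 6)],
    "urban": [(60, 70), (75, 82), (68, 77)],
}
boxes = fit_boxes(training)
labels = [classify(p, boxes) for p in [(12, 7), (70, 80), (40, 40)]]
```

Pixels that fall outside every box stay unclassified, which is a known property of this simple decision rule.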
Disentangling Environmental and Anthropogenic Impacts on the Distribution of Unintentionally Introduced Invasive Alien Insects in Mainland China

PubMed Central

Zhao, Cai-Yun; Xu, Jing; Liu, Xiao-Yan

2017-01-01

Globalization increases the opportunities for unintentional introduction of invasive alien species, especially insects, and most of these species could damage ecosystems and cause economic loss in China. In this study, we analyzed drivers of the distribution of unintentionally introduced invasive alien insects. Based on the number of unintentionally introduced invasive alien insects and their presence/absence records in each province in mainland China, regression trees were built to elucidate the roles of environmental and anthropogenic factors in the number distribution and the similarity of species composition of these insects. Classification and regression trees indicated that climatic suitability (the mean temperature in January) and human economic activity (sum of total freight) are the primary drivers of the number distribution pattern of unintentionally introduced invasive alien insects at the provincial scale, while multivariate regression trees indicated that only environmental factors (the mean January temperature, the annual precipitation and the areas of provinces) significantly affect the similarity of species composition. PMID:28973576
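To make the regression-tree idea in this abstract concrete, the sketch below finds the single best split of one predictor by minimizing the summed squared error of the two resulting groups, which is the basic step a regression tree repeats recursively. The function names and the toy values (January temperatures and species counts) are hypothetical and are not data from the study.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(x, y):
    """Return (threshold, cost) of the split on x that minimizes total SSE of y."""
    pairs = sorted(zip(x, y))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate identical predictor values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [target for _, target in pairs[:i]]
        right = [target for _, target in pairs[i:]]
        cost = sse(left) + sse(right)
        if best is None or cost < best[1]:
            best = (threshold, cost)
    return best


# Hypothetical provinces: mean January temperature vs. number of species.
january_temp = [-10, -5, 0, 5, 10, 15]
species_count = [2, 3, 2, 10, 11, 12]
threshold, cost = best_split(january_temp, species_count)
```

A multivariate regression tree follows the same logic but sums the squared error over several response variables at once, for example the presence/absence of each species.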
Measurement of the ν_μ energy spectrum with IceCube-79

NASA Astrophysics Data System (ADS)

Aartsen, M. G.; Ackermann, M.; Adams, J.; Aguilar, J. A.; Ahlers, M.; Ahrens, M.; Al Samarai, I.; Altmann, D.; Andeen, K.; Anderson, T.; Ansseau, I.; Anton, G.; Archinger, M.; Argüelles, C.; Auffenberg, J.; Axani, S.; Bagherpour, H.; Bai, X.; Barwick, S. W.; Baum, V.; Bay, R.; Beatty, J. J.; Becker Tjus, J.; Becker, K.-H.; BenZvi, S.; Berley, D.; Bernardini, E.; Besson, D. Z.; Binder, G.; Bindig, D.; Blaufuss, E.; Blot, S.; Bohm, C.; Börner, M.; Bos, F.; Bose, D.; Böser, S.; Botner, O.; Bradascio, F.; Braun, J.; Brayeur, L.; Bretz, H.-P.; Bron, S.; Burgman, A.; Carver, T.; Casier, M.; Cheung, E.; Chirkin, D.; Christov, A.; Clark, K.; Classen, L.; Coenders, S.; Collin, G. H.; Conrad, J. M.; Cowen, D. F.; Cross, R.; Day, M.; de André, J. P. A. M.; De Clercq, C.; del Pino Rosendo, E.; Dembinski, H.; De Ridder, S.; Desiati, P.; de Vries, K. D.; de Wasseige, G.; de With, M.; DeYoung, T.; Díaz-Vélez, J. C.; di Lorenzo, V.; Dujmovic, H.; Dumm, J. P.; Dunkman, M.; Eberhardt, B.; Ehrhardt, T.; Eichmann, B.; Eller, P.; Euler, S.; Evenson, P. A.; Fahey, S.; Fazely, A. R.; Feintzeig, J.; Felde, J.; Filimonov, K.; Finley, C.; Flis, S.; Fösig, C.-C.; Franckowiak, A.; Friedman, E.; Fuchs, T.; Gaisser, T. K.; Gallagher, J.; Gerhardt, L.; Ghorbani, K.; Giang, W.; Gladstone, L.; Glauch, T.; Glüsenkamp, T.; Goldschmidt, A.; Gonzalez, J. G.; Grant, D.; Griffith, Z.; Haack, C.; Hallgren, A.; Halzen, F.; Hansen, E.; Hansmann, T.; Hanson, K.; Hebecker, D.; Heereman, D.; Helbing, K.; Hellauer, R.; Hickford, S.; Hignight, J.; Hill, G. C.; Hoffman, K. D.; Hoffmann, R.; Hoshina, K.; Huang, F.; Huber, M.; Hultqvist, K.; In, S.; Ishihara, A.; Jacobi, E.; Japaridze, G. S.; Jeong, M.; Jero, K.; Jones, B. J. P.; Kang, W.; Kappes, A.; Karg, T.; Karle, A.; Katz, U.; Kauer, M.; Keivani, A.; Kelley, J. L.; Kheirandish, A.; Kim, J.; Kim, M.; Kintscher, T.; Kiryluk, J.; Kittler, T.; Klein, S. R.; Kohnen, G.; Koirala, R.; Kolanoski, H.; Konietz, R.; Köpke, L.; Kopper, C.; Kopper, S.; Koskinen, D. J.; Kowalski, M.; Krings, K.; Kroll, M.; Krückl, G.; Krüger, C.; Kunnen, J.; Kunwar, S.; Kurahashi, N.; Kuwabara, T.; Kyriacou, A.; Labare, M.; Lanfranchi, J. L.; Larson, M. J.; Lauber, F.; Lennarz, D.; Lesiak-Bzdak, M.; Leuermann, M.; Lu, L.; Lünemann, J.; Madsen, J.; Maggi, G.; Mahn, K. B. M.; Mancina, S.; Maruyama, R.; Mase, K.; Maunu, R.; McNally, F.; Meagher, K.; Medici, M.; Meier, M.; Menne, T.; Merino, G.; Meures, T.; Miarecki, S.; Micallef, J.; Momenté, G.; Montaruli, T.; Moulai, M.; Nahnhauer, R.; Naumann, U.; Neer, G.; Niederhausen, H.; Nowicki, S. C.; Nygren, D. R.; Obertacke Pollmann, A.; Olivas, A.; O'Murchadha, A.; Palczewski, T.; Pandya, H.; Pankova, D. V.; Peiffer, P.; Penek, Ö.; Pepper, J. A.; Pérez de los Heros, C.; Pieloth, D.; Pinat, E.; Price, P. B.; Przybylski, G. T.; Quinnan, M.; Raab, C.; Rädel, L.; Rameez, M.; Rawlins, K.; Reimann, R.; Relethford, B.; Relich, M.; Resconi, E.; Rhode, W.; Richman, M.; Riedel, B.; Robertson, S.; Rongen, M.; Rott, C.; Ruhe, T.; Ryckbosch, D.; Rysewyk, D.; Sabbatini, L.; Sanchez Herrera, S. E.; Sandrock, A.; Sandroos, J.; Sarkar, S.; Satalecka, K.; Schlunder, P.; Schmidt, T.; Schoenen, S.; Schöneberg, S.; Schumacher, L.; Seckel, D.; Seunarine, S.; Soldin, D.; Song, M.; Spiczak, G. M.; Spiering, C.; Stachurska, J.; Stanev, T.; Stasik, A.; Stettner, J.; Steuer, A.; Stezelberger, T.; Stokstad, R. G.; Stößl, A.; Ström, R.; Strotjohann, N. L.; Sullivan, G. W.; Sutherland, M.; Taavola, H.; Taboada, I.; Tatar, J.; Tenholt, F.; Ter-Antonyan, S.; Terliuk, A.; Tešić, G.; Tilav, S.; Toale, P. A.; Tobin, M. N.; Toscano, S.; Tosi, D.; Tselengidou, M.; Tung, C. F.; Turcati, A.; Unger, E.; Usner, M.; Vandenbroucke, J.; van Eijndhoven, N.; Vanheule, S.; van Rossem, M.; van Santen, J.; Vehring, M.; Voge, M.; Vogel, E.; Vraeghe, M.; Walck, C.; Wallace, A.; Wallraff, M.; Wandkowsky, N.; Waza, A.; Weaver, Ch.; Weiss, M. J.; Wendt, C.; Westerhoff, S.; Whelan, B. J.; Wickmann, S.; Wiebe, K.; Wiebusch, C. H.; Wille, L.; Williams, D. R.; Wills, L.; Wolf, M.; Wood, T. R.; Woolsey, E.; Woschnagg, K.; Xu, D. L.; Xu, X. W.; Xu, Y.; Yanez, J. P.; Yodh, G.; Yoshida, S.; Zoll, M.

2017-10-01

IceCube is a neutrino observatory deployed in the glacial ice at the geographic South Pole. The ν_μ energy unfolding described in this paper is based on data taken with IceCube in its 79-string configuration. A sample of muon neutrino charged-current interactions with a purity of 99.5% was selected by means of a multivariate classification process based on machine learning. The subsequent unfolding was performed using the software Truee. The resulting spectrum covers an E_ν range of more than four orders of magnitude from 125 GeV to 3.2 PeV. Compared to the Honda atmospheric neutrino flux model, the energy spectrum shows an excess of more than 1.9σ in four adjacent bins for neutrino energies E_ν ≥ 177.8 TeV. The obtained spectrum is fully compatible with previous measurements of the atmospheric neutrino flux and recent IceCube measurements of a flux of high-energy astrophysical neutrinos.
Clinical study of quantitative diagnosis of early cervical cancer based on the classification of acetowhitening kinetics

NASA Astrophysics Data System (ADS)

Wu, Tao; Cheung, Tak-Hong; Yim, So-Fan; Qu, Jianan Y.

2010-03-01

A quantitative colposcopic imaging system for the diagnosis of early cervical cancer is evaluated in a clinical study. This imaging technology based on 3-D active stereo vision and motion tracking extracts diagnostic information from the kinetics of acetowhitening process measured from the cervix of human subjects in vivo. Acetowhitening kinetics measured from 137 cervical sites of 57 subjects are analyzed and classified using multivariate statistical algorithms. Cross-validation methods are used to evaluate the performance of the diagnostic algorithms. The results show that an algorithm for screening precancer produced 95% sensitivity (SE) and 96% specificity (SP) for discriminating normal and human papillomavirus (HPV)-infected tissues from cervical intraepithelial neoplasia (CIN) lesions. For a diagnostic algorithm, 91% SE and 90% SP are achieved for discriminating normal tissue, HPV infected tissue, and low-grade CIN lesions from high-grade CIN lesions. The results demonstrate that the quantitative colposcopic imaging system could provide objective screening and diagnostic information for early detection of cervical cancer.
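The sensitivity (SE) and specificity (SP) figures reported in this abstract are ratios taken from a confusion matrix. The sketch below shows the arithmetic on hypothetical labels; the function name and the toy data are illustrative only and do not reproduce the study's 137 cervical sites.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary labels."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # diseased site correctly flagged
            else:
                fn += 1  # diseased site missed
        else:
            if pred == positive:
                fp += 1  # healthy site wrongly flagged
            else:
                tn += 1  # healthy site correctly cleared
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical labels: 1 = lesion, 0 = normal tissue.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
se, sp = sensitivity_specificity(y_true, y_pred)
```

In a cross-validated evaluation such as the one described, `y_pred` would hold the predictions made for each site while that site was excluded from training.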
Computer-Assisted Decision Support for Student Admissions Based on Their Predicted Academic Performance.

PubMed

Muratov, Eugene; Lewis, Margaret; Fourches, Denis; Tropsha, Alexander; Cox, Wendy C

2017-04-01

Objective. To develop predictive computational models forecasting the academic performance of students in the didactic-rich portion of a doctor of pharmacy (PharmD) curriculum as admission-assisting tools. Methods. All PharmD candidates over three admission cycles were divided into two groups: those who completed the PharmD program with a GPA ≥ 3; and the remaining candidates. The Random Forest machine learning technique was used to develop a binary classification model based on 11 pre-admission parameters. Results. Robust and externally predictive models were developed that had a particularly high overall accuracy of 77% for candidates with high or low academic performance. These multivariate models were highly accurate in predicting these groups compared with models built using undergraduate GPA and composite PCAT scores only. Conclusion. The models developed in this study can be used to improve the admission process as preliminary filters and thus quickly identify candidates who are likely to be successful in the PharmD curriculum.
Measurement of the $$\

DOE Office of Scientific and Technical Information (OSTI.GOV)

Aartsen, M. G.; Ackermann, M.; Adams, J.

IceCube is a neutrino observatory deployed in the glacial ice at the geographic South Pole. The ν_μ energy unfolding described in this paper is based on data taken with IceCube in its 79-string configuration. A sample of muon neutrino charged-current interactions with a purity of 99.5% was selected by means of a multivariate classification process based on machine learning. The subsequent unfolding was performed using the software Truee. The resulting spectrum covers an E_ν range of more than four orders of magnitude from 125 GeV to 3.2 PeV. Compared to the Honda atmospheric neutrino flux model, the energy spectrum shows an excess of more than 1.9σ in four adjacent bins for neutrino energies E_ν ≥ 177.8 TeV. The obtained spectrum is fully compatible with previous measurements of the atmospheric neutrino flux and recent IceCube measurements of a flux of high-energy astrophysical neutrinos.
Measurement of the $$\

DOE PAGES

Aartsen, M. G.; Ackermann, M.; Adams, J.; ...

2017-10-20

IceCube is a neutrino observatory deployed in the glacial ice at the geographic South Pole. The ν_μ energy unfolding described in this paper is based on data taken with IceCube in its 79-string configuration. A sample of muon neutrino charged-current interactions with a purity of 99.5% was selected by means of a multivariate classification process based on machine learning. The subsequent unfolding was performed using the software Truee. The resulting spectrum covers an E_ν range of more than four orders of magnitude from 125 GeV to 3.2 PeV. Compared to the Honda atmospheric neutrino flux model, the energy spectrum shows an excess of more than 1.9σ in four adjacent bins for neutrino energies E_ν ≥ 177.8 TeV. The obtained spectrum is fully compatible with previous measurements of the atmospheric neutrino flux and recent IceCube measurements of a flux of high-energy astrophysical neutrinos.
Improved classification accuracy in 1- and 2-dimensional NMR metabolomics data using the variance stabilising generalised logarithm transformation

PubMed Central

Parsons, Helen M; Ludwig, Christian; Günther, Ulrich L; Viant, Mark R

2007-01-01

Background Classifying nuclear magnetic resonance (NMR) spectra is a crucial step in many metabolomics experiments. Since several multivariate classification techniques depend upon the variance of the data, it is important to first minimise any contribution from unwanted technical variance arising from sample preparation and analytical measurements, and thereby maximise any contribution from wanted biological variance between different classes. The generalised logarithm (glog) transform was developed to stabilise the variance in DNA microarray datasets, but has rarely been applied to metabolomics data. In particular, it has not been rigorously evaluated against other scaling techniques used in metabolomics, nor tested on all forms of NMR spectra including 1-dimensional (1D) ¹H, projections of 2D ¹H, ¹H J-resolved (pJRES), and intact 2D J-resolved (JRES). Results Here, the effects of the glog transform are compared against two commonly used variance stabilising techniques, autoscaling and Pareto scaling, as well as unscaled data. The four methods are evaluated in terms of the effects on the variance of NMR metabolomics data and on the classification accuracy following multivariate analysis, the latter achieved using principal component analysis followed by linear discriminant analysis. For two of three datasets analysed, classification accuracies were highest following glog transformation: 100% accuracy for discriminating 1D NMR spectra of hypoxic and normoxic invertebrate muscle, and 100% accuracy for discriminating 2D JRES spectra of fish livers sampled from two rivers. For the third dataset, pJRES spectra of urine from two breeds of dog, the glog transform and autoscaling achieved equal highest accuracies. Additionally we extended the glog algorithm to effectively suppress noise, which proved critical for the analysis of 2D JRES spectra. Conclusion We have demonstrated that the glog and extended glog transforms stabilise the technical variance in NMR metabolomics datasets. This significantly improves the discrimination between sample classes and has resulted in higher classification accuracies compared to unscaled, autoscaled or Pareto scaled data. Additionally we have confirmed the broad applicability of the glog approach using three disparate datasets from different biological samples using 1D NMR spectra, 1D projections of 2D JRES spectra, and intact 2D JRES spectra. PMID:17605789
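The three scaling approaches compared in this abstract can be sketched in a few lines. The glog form used here, ln(x + sqrt(x² + λ)), is one commonly cited form and is an assumption about the exact expression used in the paper; the value of λ would normally be estimated from technical replicates, whereas here it is simply passed in. Function names and sample values are hypothetical.

```python
import math


def glog(x, lam):
    """Generalised logarithm, assumed form: ln(x + sqrt(x^2 + lam)), lam > 0."""
    return math.log(x + math.sqrt(x * x + lam))


def mean_and_sd(values):
    """Sample mean and sample standard deviation (n - 1 denominator)."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(variance)


def autoscale(values):
    """Mean-centre and divide by the standard deviation (unit variance)."""
    mean, sd = mean_and_sd(values)
    return [(v - mean) / sd for v in values]


def pareto_scale(values):
    """Mean-centre and divide by the square root of the standard deviation."""
    mean, sd = mean_and_sd(values)
    return [(v - mean) / math.sqrt(sd) for v in values]


# Hypothetical intensities of one spectral bin across three samples.
intensities = [2.0, 4.0, 6.0]
auto = autoscale(intensities)
pareto = pareto_scale(intensities)
glogged = [glog(v, lam=1.0) for v in intensities]
```

For large intensities the glog behaves like ln(2x), while near zero it stays finite and roughly linear, which is why it can damp noise-dominated bins that an ordinary logarithm would inflate.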
Magnetic resonance imaging-based measures predictive of short-term surgical outcome in patients with Chiari malformation Type I: a pilot study.

PubMed

Alperin, Noam; Loftus, James Ryan; Bagci, Ahmet M; Lee, Sang H; Oliu, Carlos J; Shah, Ashish H; Green, Barth A

2017-01-01

OBJECTIVE This study identifies quantitative imaging-based measures in patients with Chiari malformation Type I (CM-I) that are associated with positive outcomes after suboccipital decompression with duraplasty. METHODS Fifteen patients in whom CM-I was newly diagnosed underwent MRI preoperatively and 3 months postoperatively. More than 20 previously described morphological and physiological parameters were derived to assess quantitatively the impact of surgery. Postsurgical clinical outcomes were assessed in 2 ways, based on resolution of the patient's chief complaint and using a modified Chicago Chiari Outcome Scale (CCOS). Statistical analyses were performed to identify measures that were different between the unfavorable- and favorable-outcome cohorts. Multivariate analysis was used to identify the strongest predictors of outcome. RESULTS The strongest physiological parameter predictive of outcome was the preoperative maximal cord displacement in the upper cervical region during the cardiac cycle, which was significantly larger in the favorable-outcome subcohorts for both outcome types (p < 0.05). Several hydrodynamic measures revealed significantly larger preoperative-to-postoperative changes in the favorable-outcome subcohort. Predictor sets for the chief-complaint classification included the cord displacement, percent venous drainage through the jugular veins, and normalized cerebral blood flow with 93.3% accuracy. Maximal cord displacement combined with intracranial volume change predicted outcome based on the modified CCOS classification with similar accuracy. CONCLUSIONS Tested physiological measures were stronger predictors of outcome than the morphological measures in patients with CM-I. Maximal cord displacement and intracranial volume change during the cardiac cycle together with a measure that reflects the cerebral venous drainage pathway emerged as likely predictors of decompression outcome in patients with CM-I.
