Ajayi, Alex A; Syed, Moin
2014-10-01
This study used a person-oriented analytic approach to identify meaningful patterns of barriers-focused racial socialization and perceived racial discrimination experiences in a sample of 295 late adolescents. Using cluster analysis, three distinct groups were identified: Low Barrier Socialization-Low Discrimination, High Barrier Socialization-Low Discrimination, and High Barrier Socialization-High Discrimination clusters. These groups were substantively unique in terms of the frequency of racial socialization messages about bias preparation and out-group mistrust their members received and their actual perceived discrimination experiences. Further, individuals in the High Barrier Socialization-High Discrimination cluster reported significantly higher depressive symptoms than those in the Low Barrier Socialization-Low Discrimination and High Barrier Socialization-Low Discrimination clusters. However, no differences in adjustment were observed between the Low Barrier Socialization-Low Discrimination and High Barrier Socialization-Low Discrimination clusters. Overall, the findings highlight important individual differences in how young people of color experience their race and how these differences have significant implications for psychological adjustment. Copyright © 2014 The Foundation for Professionals in Services for Adolescents. Published by Elsevier Ltd. All rights reserved.
Keshtkaran, Mohammad Reza; Yang, Zhi
2017-06-01
Spike sorting is a fundamental preprocessing step for many neuroscience studies which rely on the analysis of spike trains. Most of the feature extraction and dimensionality reduction techniques that have been used for spike sorting give a projection subspace which is not necessarily the most discriminative one. Therefore, the clusters which appear inherently separable in some discriminative subspace may overlap if projected using conventional feature extraction approaches, leading to poor sorting accuracy, especially when the noise level is high. In this paper, we propose a noise-robust and unsupervised spike sorting algorithm based on learning discriminative spike features for clustering. The proposed algorithm uses discriminative subspace learning to extract low-dimensional and most discriminative features from the spike waveforms and perform clustering with automatic detection of the number of clusters. The core part of the algorithm involves iterative subspace selection using linear discriminant analysis and clustering using a Gaussian mixture model with outlier detection. A statistical test in the discriminative subspace is proposed to automatically detect the number of clusters. Comparative results on publicly available simulated and real in vivo datasets demonstrate that our algorithm achieves substantially improved cluster distinction, leading to higher sorting accuracy and more reliable detection of clusters which are highly overlapping and not detectable using conventional feature extraction techniques such as principal component analysis or wavelets. By providing more accurate information about the activity of a greater number of individual neurons with high robustness to neural noise and outliers, the proposed unsupervised spike sorting algorithm facilitates more detailed and accurate analysis of single- and multi-unit activities in neuroscience and brain-machine interface studies.
NASA Astrophysics Data System (ADS)
Keshtkaran, Mohammad Reza; Yang, Zhi
2017-06-01
Objective. Spike sorting is a fundamental preprocessing step for many neuroscience studies which rely on the analysis of spike trains. Most of the feature extraction and dimensionality reduction techniques that have been used for spike sorting give a projection subspace which is not necessarily the most discriminative one. Therefore, the clusters which appear inherently separable in some discriminative subspace may overlap if projected using conventional feature extraction approaches, leading to poor sorting accuracy, especially when the noise level is high. In this paper, we propose a noise-robust and unsupervised spike sorting algorithm based on learning discriminative spike features for clustering. Approach. The proposed algorithm uses discriminative subspace learning to extract low-dimensional and most discriminative features from the spike waveforms and perform clustering with automatic detection of the number of clusters. The core part of the algorithm involves iterative subspace selection using linear discriminant analysis and clustering using a Gaussian mixture model with outlier detection. A statistical test in the discriminative subspace is proposed to automatically detect the number of clusters. Main results. Comparative results on publicly available simulated and real in vivo datasets demonstrate that our algorithm achieves substantially improved cluster distinction, leading to higher sorting accuracy and more reliable detection of clusters which are highly overlapping and not detectable using conventional feature extraction techniques such as principal component analysis or wavelets. Significance. By providing more accurate information about the activity of a greater number of individual neurons with high robustness to neural noise and outliers, the proposed unsupervised spike sorting algorithm facilitates more detailed and accurate analysis of single- and multi-unit activities in neuroscience and brain-machine interface studies.
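The alternation between subspace selection and clustering described in the abstract can be illustrated with a heavily simplified sketch. It assumes exactly two clusters, uses a two-class Fisher (LDA) direction, and substitutes a 1-D 2-means threshold for the Gaussian mixture model with outlier detection; there is no automatic detection of the number of clusters. All function names and the synthetic data are hypothetical and are not taken from the paper.

```python
import numpy as np

def fisher_direction(X, labels):
    """Two-class LDA direction w = Sw^-1 (m1 - m0), scaled to unit length."""
    X0, X1 = X[labels == 0], X[labels == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Within-class scatter, lightly regularised so it is always invertible.
    Sw = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
    Sw += 1e-6 * np.trace(Sw) / X.shape[1] * np.eye(X.shape[1])
    w = np.linalg.solve(Sw, m1 - m0)
    return w / np.linalg.norm(w)

def two_means_1d(z, n_iter=50):
    """1-D 2-means: move the threshold to the midpoint of the two means."""
    thr = np.median(z)
    for _ in range(n_iter):
        labels = (z > thr).astype(int)
        new_thr = 0.5 * (z[labels == 0].mean() + z[labels == 1].mean())
        if np.isclose(new_thr, thr):
            break
        thr = new_thr
    return (z > thr).astype(int)

def discriminative_clustering(X, n_iter=20):
    """Alternate between learning a 1-D subspace and clustering in it."""
    Xc = X - X.mean(axis=0)
    # Initial projection: first principal component (assumed starting point).
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    labels = two_means_1d(Xc @ Vt[0])
    for _ in range(n_iter):
        w = fisher_direction(Xc, labels)
        new_labels = two_means_1d(Xc @ w)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels, w

# Synthetic stand-in for spike features: two units, anisotropic noise.
rng = np.random.default_rng(0)
n = 400
noise_sd = np.array([2.0, 1.0, 1.0, 1.0, 1.0])
offset = np.array([3.0, 3.0, 0.0, 0.0, 0.0])
truth = np.repeat([0, 1], n)
X = rng.normal(size=(2 * n, 5)) * noise_sd + np.where(truth[:, None] == 1, offset, -offset)

labels, w = discriminative_clustering(X)
agreement = np.mean(labels == truth)
accuracy = max(agreement, 1.0 - agreement)  # cluster labels are arbitrary
print(f"clustering accuracy: {accuracy:.3f}")
```

Like any alternating scheme of this kind, the sketch can settle on a poor split when the initial projection is uninformative, so the initialisation matters.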
NASA Technical Reports Server (NTRS)
Ballew, G.
1977-01-01
The ability of Landsat multispectral digital data to differentiate among 62 combinations of rock and alteration types at the Goldfield mining district of Western Nevada was investigated by using statistical techniques of cluster and discriminant analysis. Multivariate discriminant analysis was not effective in classifying each of the 62 groups, with classification results essentially the same whether data of four channels alone or combined with six ratios of channels were used. Bivariate plots of group means revealed a cluster of three groups including mill tailings, basalt and all other rock and alteration types. Automatic hierarchical clustering based on the four-dimensional Mahalanobis distance between group means of 30 groups having five or more samples was performed. The results of the cluster analysis revealed hierarchies of mill tailings vs. natural materials, basalt vs. non-basalt, highly reflectant rocks vs. other rocks and exclusively unaltered rocks vs. predominantly altered rocks. The hierarchies were used to determine the order in which sets of multiple discriminant analyses were to be performed, and the resulting discriminant functions were used to produce a map of geology and alteration which has an overall accuracy of 70 percent for discriminating exclusively altered rocks from predominantly altered rocks.
NASA Technical Reports Server (NTRS)
Ballew, G.
1977-01-01
The ability of Landsat multispectral digital data to differentiate among 62 combinations of rock and alteration types at the Goldfield mining district of Western Nevada was investigated by using statistical techniques of cluster and discriminant analysis. Multivariate discriminant analysis was not effective in classifying each of the 62 groups, with classification results essentially the same whether data of four channels alone or combined with six ratios of channels were used. Bivariate plots of group means revealed a cluster of three groups including mill tailings, basalt and all other rock and alteration types. Automatic hierarchical clustering based on the four-dimensional Mahalanobis distance between group means of 30 groups having five or more samples was performed using Johnson's HICLUS program. The results of the cluster analysis revealed hierarchies of mill tailings vs. natural materials, basalt vs. non-basalt, highly reflectant rocks vs. other rocks and exclusively unaltered rocks vs. predominantly altered rocks. The hierarchies were used to determine the order in which sets of multiple discriminant analyses were to be performed, and the resulting discriminant functions were used to produce a map of geology and alteration which has an overall accuracy of 70 percent for discriminating exclusively altered rocks from predominantly altered rocks.
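As a rough illustration of clustering groups by the Mahalanobis distance between their means, the sketch below computes that distance from a pooled within-group covariance and then merges groups with a naive single-linkage rule. The linkage rule, the pooled covariance, the function names and the synthetic four-channel data are all assumptions made for illustration; the abstract does not state these details, and this is not the HICLUS program.

```python
import numpy as np

def pooled_covariance(groups):
    """Pooled within-group covariance of a list of (n_i, d) arrays."""
    d = groups[0].shape[1]
    scatter = np.zeros((d, d))
    dof = 0
    for g in groups:
        centred = g - g.mean(axis=0)
        scatter += centred.T @ centred
        dof += len(g) - 1
    return scatter / dof

def mahalanobis_between_means(groups):
    """Matrix of Mahalanobis distances between all pairs of group means."""
    means = np.array([g.mean(axis=0) for g in groups])
    inv_cov = np.linalg.inv(pooled_covariance(groups))
    diff = means[:, None, :] - means[None, :, :]
    d2 = np.einsum("ijk,kl,ijl->ij", diff, inv_cov, diff)
    return np.sqrt(np.maximum(d2, 0.0))

def single_linkage_merges(dist):
    """Naive agglomeration: repeatedly merge the two closest clusters."""
    clusters = [[i] for i in range(len(dist))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                gap = min(dist[i, j] for i in clusters[a] for j in clusters[b])
                if best is None or gap < best[0]:
                    best = (gap, a, b)
        gap, a, b = best
        merges.append((sorted(clusters[a]), sorted(clusters[b]), float(gap)))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return merges

# Synthetic stand-in: four groups of 50 samples in four spectral channels.
rng = np.random.default_rng(1)
centres = [0.0, 1.0, 5.0, 12.0]
groups = [rng.normal(loc=c, scale=1.0, size=(50, 4)) for c in centres]

dist = mahalanobis_between_means(groups)
merges = single_linkage_merges(dist)
for left, right, gap in merges:
    print(left, right, round(gap, 2))
```

The order of the merges gives the hierarchy: groups that join early are spectrally similar, and the last merge separates the most distinct group from everything else.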
Cluster analysis and prediction of treatment outcomes for chronic rhinosinusitis.
Soler, Zachary M; Hyer, J Madison; Rudmik, Luke; Ramakrishnan, Viswanathan; Smith, Timothy L; Schlosser, Rodney J
2016-04-01
Current clinical classifications of chronic rhinosinusitis (CRS) have weak prognostic utility regarding treatment outcomes. Simplified discriminant analysis based on unsupervised clustering has identified novel phenotypic subgroups of CRS, but prognostic utility is unknown. We sought to determine whether discriminant analysis allows prognostication in patients choosing surgery versus continued medical management. A multi-institutional prospective study of patients with CRS in whom initial medical therapy failed who then self-selected continued medical management or surgical treatment was used to separate patients into 5 clusters based on a previously described discriminant analysis using total Sino-Nasal Outcome Test-22 (SNOT-22) score, age, and missed productivity. Patients completed the SNOT-22 at baseline and for 18 months of follow-up. Baseline demographic and objective measures included olfactory testing, computed tomography, and endoscopy scoring. SNOT-22 outcomes for surgical versus continued medical treatment were compared across clusters. Data were available on 690 patients. Baseline differences in demographics, comorbidities, objective disease measures, and patient-reported outcomes were similar to previous clustering reports. Three of 5 clusters identified by means of discriminant analysis had improved SNOT-22 outcomes with surgical intervention when compared with continued medical management (surgery was a mean of 21.2 points better across these 3 clusters at 6 months, P < .05). These differences were sustained at 18 months of follow-up. Two of 5 clusters had similar outcomes when comparing surgery with continued medical management. A simplified discriminant analysis based on 3 common clinical variables is able to cluster patients and provide prognostic information regarding surgical treatment versus continued medical management in patients with CRS. Copyright © 2015 American Academy of Allergy, Asthma & Immunology. Published by Elsevier Inc. All rights reserved.
Principal Component Clustering Approach to Teaching Quality Discriminant Analysis
ERIC Educational Resources Information Center
Xian, Sidong; Xia, Haibo; Yin, Yubo; Zhai, Zhansheng; Shang, Yan
2016-01-01
Teaching quality is the lifeline of higher education. Many universities have made effective progress in evaluating teaching quality. In this paper, we establish a students' evaluation of teaching (SET) discriminant analysis model and algorithm based on principal component clustering analysis. Additionally, we classify the SET…
Byrd, Christy M; Carter Andrews, Dorinda J
2016-08-01
Although there exists a healthy body of literature related to discrimination in schools, this research has primarily focused on racial or ethnic discrimination as perceived and experienced by students of color. Few studies examine students' perceptions of discrimination from a variety of sources, such as adults and peers, their descriptions of the discrimination, or the frequency of discrimination in the learning environment. Middle and high school students in a Midwestern school district (N=1468) completed surveys identifying whether they experienced discrimination from seven sources (e.g., peers, teachers, administrators), for seven reasons (e.g., gender, race/ethnicity, religion), and in eight forms (e.g., punished more frequently, called names, excluded from social groups). The sample was 52% White, 15% Black/African American, 14% Multiracial, and 17% Other. Latent class analysis was used to cluster individuals based on reported sources of, reasons for, and forms of discrimination. Four clusters were found, and ANOVAs were used to test for differences between clusters on perceptions of school climate, relationships with teachers, perceptions that the school was a "good school," and engagement. The Low Discrimination cluster experienced the best outcomes, whereas an intersectional cluster experienced the most discrimination and the worst outcomes. The results confirm existing research on the negative effects of discrimination. Additionally, the paper adds to the literature by highlighting the importance of an intersectional approach to examining students' perceptions of in-school discrimination. Copyright © 2016 Society for the Study of School Psychology. Published by Elsevier Ltd. All rights reserved.
Unsupervised spike sorting based on discriminative subspace learning.
Keshtkaran, Mohammad Reza; Yang, Zhi
2014-01-01
Spike sorting is a fundamental preprocessing step for many neuroscience studies which rely on the analysis of spike trains. In this paper, we present two unsupervised spike sorting algorithms based on discriminative subspace learning. The first algorithm simultaneously learns the discriminative feature subspace and performs clustering. It uses histogram of features in the most discriminative projection to detect the number of neurons. The second algorithm performs hierarchical divisive clustering that learns a discriminative 1-dimensional subspace for clustering in each level of the hierarchy until achieving almost unimodal distribution in the subspace. The algorithms are tested on synthetic and in-vivo data, and are compared against two widely used spike sorting methods. The comparative results demonstrate that our spike sorting methods can achieve substantially higher accuracy in lower dimensional feature space, and they are highly robust to noise. Moreover, they provide significantly better cluster separability in the learned subspace than in the subspace obtained by principal component analysis or wavelet transform.
Groundwater quality assessment of urban Bengaluru using multivariate statistical techniques
NASA Astrophysics Data System (ADS)
Gulgundi, Mohammad Shahid; Shetty, Amba
2018-03-01
Groundwater quality deterioration due to anthropogenic activities has become a subject of prime concern. The objective of the study was to assess the spatial and temporal variations in groundwater quality and to identify the sources in the western half of Bengaluru city using multivariate statistical techniques. Water quality index rating was calculated for pre- and post-monsoon seasons to quantify overall water quality for human consumption. The post-monsoon samples show signs of poorer quality for drinking purposes compared to the pre-monsoon samples. Cluster analysis (CA), principal component analysis (PCA) and discriminant analysis (DA) were applied to the groundwater quality data measured on 14 parameters from 67 sites distributed across the city. Hierarchical CA grouped the 67 sampling stations into two groups, cluster 1 having high pollution and cluster 2 having lesser pollution. DA was applied to delineate the most meaningful parameters accounting for temporal and spatial variations in groundwater quality of the study area. Temporal DA identified pH as the most important parameter, which discriminates between water quality in the pre-monsoon and post-monsoon seasons and accounts for 72% seasonal assignation of cases. Spatial DA identified Mg, Cl and NO3 as the three most important parameters discriminating between two clusters and accounting for 89% spatial assignation of cases. PCA was applied to the dataset obtained from the two clusters, which evolved three factors in each cluster, explaining 85.4 and 84% of the total variance, respectively. Varifactors obtained from PCA showed that groundwater quality variation is mainly explained by dissolution of minerals from rock-water interactions in the aquifer, the effect of anthropogenic activities, and ion exchange processes in water.
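Statements such as "three factors explaining 85.4% of the total variance" come from the eigenvalue spectrum of the standardised data. The following minimal sketch shows how such a cumulative percentage can be computed. The data are synthetic (only the 67 sites and 14 parameters are borrowed from the abstract), the three latent factors and noise level are assumptions, and the function name is hypothetical.

```python
import numpy as np

def explained_variance_ratio(X):
    """Share of total variance carried by each principal component."""
    # Standardise so that every parameter contributes unit variance.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # Squared singular values are proportional to the component variances.
    s = np.linalg.svd(Z, compute_uv=False)
    var = s ** 2
    return var / var.sum()

# Synthetic stand-in: 67 sites, 14 parameters driven by 3 latent factors.
rng = np.random.default_rng(3)
n_sites, n_params, n_factors = 67, 14, 3
factors = rng.normal(size=(n_sites, n_factors))
loadings = rng.normal(size=(n_factors, n_params))
X = factors @ loadings + 0.3 * rng.normal(size=(n_sites, n_params))

ratio = explained_variance_ratio(X)
cumulative = np.cumsum(ratio)
print(f"first three components explain {100 * cumulative[2]:.1f}% of the variance")
```

Varifactors as reported in such studies are usually obtained by rotating the retained components (for example with varimax), a step this sketch leaves out.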
Mapping Informative Clusters in a Hierarchial Framework of fMRI Multivariate Analysis
Xu, Rui; Zhen, Zonglei; Liu, Jia
2010-01-01
Pattern recognition methods, which are powerful in discriminating between multi-voxel patterns of brain activity associated with different mental states, have become increasingly popular in fMRI data analysis. However, when they are used in functional brain mapping, the location of discriminative voxels varies significantly, raising difficulties in interpreting the locus of the effect. Here we proposed a hierarchical multivariate framework that maps informative clusters rather than voxels to achieve reliable functional brain mapping without compromising the discriminative power. In particular, we first searched for local homogeneous clusters that consisted of voxels with similar response profiles. Then, a multi-voxel classifier was built for each cluster to extract discriminative information from the multi-voxel patterns. Finally, through multivariate ranking, outputs from the classifiers served as a multi-cluster pattern to identify informative clusters by examining interactions among clusters. Results from both simulated and real fMRI data demonstrated that this hierarchical approach showed better performance in the robustness of functional brain mapping than traditional voxel-based multivariate methods. In addition, the mapped clusters overlapped highly for two perceptually equivalent object categories, further confirming the validity of our approach. In short, the hierarchical multivariate framework is suitable for both pattern classification and brain mapping in fMRI studies. PMID:21152081
Richardson, Bridget L; Macon, Tamarie A; Mustafaa, Faheemah N; Bogan, Erin D; Cole-Lewis, Yasmin; Chavous, Tabbye M
2015-06-01
Research links racial identity to important developmental outcomes among African American adolescents, but less is known about the contextual experiences that shape youths' racial identity. In a sample of 491 African American adolescents (48% female), associations of youth-reported experiences of racial discrimination and parental messages about preparation for racial bias with adolescents' later racial identity were examined. Cluster analysis resulted in four profiles of adolescents varying in reported frequency of racial discrimination from teachers and peers at school and frequency of parental racial discrimination coping messages during adolescents' 8th grade year. Boys were disproportionately over-represented in the cluster of youth experiencing more frequent discrimination but receiving fewer parental discrimination coping messages, relative to the overall sample. Also examined were clusters of adolescents' 11th grade racial identity attitudes about the importance of race (centrality), personal group affect (private regard), and perceptions of societal beliefs about African Americans (public regard). Girls and boys did not differ in their representation in racial identity clusters, but 8th grade discrimination/parent messages clusters were associated with 11th grade racial identity cluster membership, and these associations varied across gender groups. Boys experiencing more frequent discrimination but fewer parental coping messages were over-represented in the racial identity cluster characterized by low centrality, low private regard, and average public regard. The findings suggest that adolescents who experience racial discrimination but receive fewer parental supports for negotiating and coping with discrimination may be at heightened risk for internalizing stigmatizing experiences. Also, the findings suggest the need to consider the context of gender in adolescents' racial discrimination and parental racial socialization.
Phung, Dung; Huang, Cunrui; Rutherford, Shannon; Dwirahmadi, Febi; Chu, Cordia; Wang, Xiaoming; Nguyen, Minh; Nguyen, Nga Huy; Do, Cuong Manh; Nguyen, Trung Hieu; Dinh, Tuan Anh Diep
2015-05-01
The present study is an evaluation of temporal/spatial variations of surface water quality using multivariate statistical techniques, comprising cluster analysis (CA), principal component analysis (PCA), factor analysis (FA) and discriminant analysis (DA). Eleven water quality parameters were monitored at 38 different sites in Can Tho City, a Mekong Delta area of Vietnam from 2008 to 2012. Hierarchical cluster analysis grouped the 38 sampling sites into three clusters, representing mixed urban-rural areas, agricultural areas and industrial zone. FA/PCA resulted in three latent factors for the entire research location, three for cluster 1, four for cluster 2, and four for cluster 3 explaining 60, 60.2, 80.9, and 70% of the total variance in the respective water quality. The varifactors from FA indicated that the parameters responsible for water quality variations are related to erosion from disturbed land or inflow of effluent from sewage plants and industry, discharges from wastewater treatment plants and domestic wastewater, agricultural activities and industrial effluents, and contamination by sewage waste with faecal coliform bacteria through sewer and septic systems. Discriminant analysis (DA) revealed that nephelometric turbidity units (NTU), chemical oxygen demand (COD) and NH₃ are the discriminating parameters in space, affording 67% correct assignation in spatial analysis; pH and NO₂ are the discriminating parameters according to season, assigning approximately 60% of cases correctly. The findings suggest a possible revised sampling strategy that can reduce the number of sampling sites and the indicator parameters responsible for large variations in water quality. This study demonstrates the usefulness of multivariate statistical techniques for evaluation of temporal/spatial variations in water quality assessment and management.
NASA Astrophysics Data System (ADS)
Crawford, I.; Ruske, S.; Topping, D. O.; Gallagher, M. W.
2015-07-01
In this paper we present improved methods for discriminating and quantifying Primary Biological Aerosol Particles (PBAP) by applying hierarchical agglomerative cluster analysis to multi-parameter ultra violet-light induced fluorescence (UV-LIF) spectrometer data. The methods employed in this study can be applied to data sets in excess of 1×10⁶ points on a desktop computer, allowing for each fluorescent particle in a dataset to be explicitly clustered. This reduces the potential for misattribution found in subsampling and comparative attribution methods used in previous approaches, improving our capacity to discriminate and quantify PBAP meta-classes. We evaluate the performance of several hierarchical agglomerative cluster analysis linkages and data normalisation methods using laboratory samples of known particle types and an ambient dataset. Fluorescent and non-fluorescent polystyrene latex spheres were sampled with a Wideband Integrated Bioaerosol Spectrometer (WIBS-4) where the optical size, asymmetry factor and fluorescent measurements were used as inputs to the analysis package. It was found that the Ward linkage with z-score or range normalisation performed best, correctly attributing 98 and 98.1 % of the data points respectively. The best performing methods were applied to the BEACHON-RoMBAS ambient dataset where it was found that the z-score and range normalisation methods yield similar results with each method producing clusters representative of fungal spores and bacterial aerosol, consistent with previous results. The z-score result was compared to clusters generated with previous approaches (WIBS AnalysiS Program, WASP) where we observe that the subsampling and comparative attribution method employed by WASP results in the overestimation of the fungal spore concentration by a factor of 1.5 and the underestimation of bacterial aerosol concentration by a factor of 5.
We suggest that this is likely due to errors arising from misattribution due to poor centroid definition and failure to assign particles to a cluster as a result of the subsampling and comparative attribution method employed by WASP. The methods used here allow for the entire fluorescent population of particles to be analysed, yielding an explicit cluster attribution for each particle, improving cluster centroid definition and our capacity to discriminate and quantify PBAP meta-classes compared to previous approaches.
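The two normalisations and the Ward linkage named in the abstract can be sketched as follows. This is a naive implementation intended only for small illustrations: it scales roughly with the cube of the number of points, so it is not how a data set of a million particles would be handled. The function names and the two-feature synthetic particles are hypothetical and do not represent WIBS-4 data.

```python
import numpy as np

def zscore(X):
    """Centre each feature on zero and scale it to unit standard deviation."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

def range_normalise(X):
    """Rescale each feature to the interval [0, 1]."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    return (X - lo) / (hi - lo)

def ward_clusters(X, n_clusters):
    """Naive Ward agglomeration; suitable for small demonstrations only."""
    members = [[i] for i in range(len(X))]
    centroids = [X[i].astype(float) for i in range(len(X))]
    while len(members) > n_clusters:
        best = None
        for a in range(len(members)):
            for b in range(a + 1, len(members)):
                na, nb = len(members[a]), len(members[b])
                # Increase in within-cluster sum of squares if a and b merge.
                cost = na * nb / (na + nb) * np.sum((centroids[a] - centroids[b]) ** 2)
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        na, nb = len(members[a]), len(members[b])
        centroids[a] = (na * centroids[a] + nb * centroids[b]) / (na + nb)
        members[a] += members[b]
        del members[b], centroids[b]
    labels = np.empty(len(X), dtype=int)
    for k, idx in enumerate(members):
        labels[idx] = k
    return labels

# Synthetic stand-in: two particle types measured on very different scales.
rng = np.random.default_rng(2)
n = 20
truth = np.repeat([0, 1], n)
size_um = np.where(truth == 0, 1.0, 3.0) + rng.normal(scale=0.1, size=2 * n)
fluorescence = np.where(truth == 0, 100.0, 2000.0) + rng.normal(scale=20.0, size=2 * n)
X = np.column_stack([size_um, fluorescence])

labels_z = ward_clusters(zscore(X), 2)
labels_r = ward_clusters(range_normalise(X), 2)
print("z-score labels:", labels_z)
print("range labels:  ", labels_r)
```

Without normalisation the fluorescence column would dominate every squared distance simply because of its units, which is why the choice of normalisation is evaluated alongside the linkage.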
Partially supervised speaker clustering.
Tang, Hao; Chu, Stephen Mingyu; Hasegawa-Johnson, Mark; Huang, Thomas S
2012-05-01
Content-based multimedia indexing, retrieval, and processing as well as multimedia databases demand the structuring of the media content (image, audio, video, text, etc.), one significant goal being to associate the identity of the content to the individual segments of the signals. In this paper, we specifically address the problem of speaker clustering, the task of assigning every speech utterance in an audio stream to its speaker. We offer a complete treatment to the idea of partially supervised speaker clustering, which refers to the use of our prior knowledge of speakers in general to assist the unsupervised speaker clustering process. By means of an independent training data set, we encode the prior knowledge at the various stages of the speaker clustering pipeline via 1) learning a speaker-discriminative acoustic feature transformation, 2) learning a universal speaker prior model, and 3) learning a discriminative speaker subspace, or equivalently, a speaker-discriminative distance metric. We study the directional scattering property of the Gaussian mixture model (GMM) mean supervector representation of utterances in the high-dimensional space, and advocate exploiting this property by using the cosine distance metric instead of the euclidean distance metric for speaker clustering in the GMM mean supervector space. We propose to perform discriminant analysis based on the cosine distance metric, which leads to a novel distance metric learning algorithm—linear spherical discriminant analysis (LSDA). We show that the proposed LSDA formulation can be systematically solved within the elegant graph embedding general dimensionality reduction framework. 
Our speaker clustering experiments on the GALE database clearly indicate that 1) our speaker clustering methods based on the GMM mean supervector representation and vector-based distance metrics outperform traditional speaker clustering methods based on the “bag of acoustic features” representation and statistical model-based distance metrics, 2) our advocated use of the cosine distance metric yields consistent increases in the speaker clustering performance as compared to the commonly used euclidean distance metric, 3) our partially supervised speaker clustering concept and strategies significantly improve the speaker clustering performance over the baselines, and 4) our proposed LSDA algorithm further leads to state-of-the-art speaker clustering performance.
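The difference between the cosine and Euclidean metrics that the abstract relies on can be seen in a toy example. The vectors below are hypothetical stand-ins for GMM mean supervectors; they are chosen only to show that two vectors pointing in the same direction have zero cosine distance however far apart they are in Euclidean terms. This illustrates the metric choice only, not the LSDA algorithm.

```python
import numpy as np

def cosine_distance(u, v):
    """One minus the cosine of the angle between u and v."""
    return 1.0 - float(u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))

def euclidean_distance(u, v):
    """Ordinary straight-line distance between u and v."""
    return float(np.linalg.norm(u - v))

# Hypothetical supervectors: b has the same direction as a but a larger
# magnitude, while c has the same magnitude as a but a different direction.
a = np.array([1.0, 2.0, 3.0, 4.0])
b = 3.0 * a
c = np.array([4.0, 3.0, 2.0, 1.0])

print("cosine    a-b:", cosine_distance(a, b), " a-c:", cosine_distance(a, c))
print("euclidean a-b:", euclidean_distance(a, b), " a-c:", euclidean_distance(a, c))
```

Under the Euclidean metric c appears closer to a than b does, whereas under the cosine metric b is the closer one, which is the behaviour wanted when same-speaker utterances scatter along a direction.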
Multi-class ERP-based BCI data analysis using a discriminant space self-organizing map.
Onishi, Akinari; Natsume, Kiyohisa
2014-01-01
Emotional and non-emotional image stimuli have recently been applied to event-related potential (ERP) based brain computer interfaces (BCI). Though the classification performance is over 80% in a single trial, the discrimination between those ERPs has not been considered. In this research we tried to clarify the discriminability of four-class ERP-based BCI target data elicited by desk, seal, and spider images and letter intensifications. A conventional self-organizing map (SOM) and a newly proposed discriminant space SOM (ds-SOM) were applied, and the discriminabilities were visualized. We also classified all pairs of those ERPs by stepwise linear discriminant analysis (SWLDA) and verified the visualization of discriminabilities. As a result, the ds-SOM showed understandable visualization of the data with a shorter computational time than the traditional SOM. We also confirmed the clear boundary between the letter cluster and the other clusters. The result was consistent with the classification performances by SWLDA. The method might be helpful not only for developing a new BCI paradigm, but also for big data analysis.
Application of multivariable statistical techniques in plant-wide WWTP control strategies analysis.
Flores, X; Comas, J; Roda, I R; Jiménez, L; Gernaey, K V
2007-01-01
The main objective of this paper is to present the application of selected multivariable statistical techniques in plant-wide wastewater treatment plant (WWTP) control strategies analysis. In this study, cluster analysis (CA), principal component analysis/factor analysis (PCA/FA) and discriminant analysis (DA) are applied to the evaluation matrix data set obtained by simulation of several control strategies applied to the plant-wide IWA Benchmark Simulation Model No 2 (BSM2). These techniques make it possible i) to determine natural groups or clusters of control strategies with a similar behaviour, ii) to find and interpret hidden, complex and causal relation features in the data set and iii) to identify important discriminant variables within the groups found by the cluster analysis. This study illustrates the usefulness of multivariable statistical techniques for both analysis and interpretation of the complex multicriteria data sets and allows an improved use of information for effective evaluation of control strategies.
NASA Technical Reports Server (NTRS)
Wolf, S. F.; Lipschutz, M. E.
1993-01-01
Multivariate statistical analysis techniques (linear discriminant analysis and logistic regression) can provide powerful discrimination tools which are generally unfamiliar to the planetary science community. Fall parameters were used to identify a group of 17 H chondrites (Cluster 1) that were part of a coorbital stream which intersected Earth's orbit in May, from 1855 to 1895, and can be distinguished from all other H chondrite falls. Using multivariate statistical techniques, it was demonstrated that, by a totally different criterion, labile trace element contents (and hence thermal histories) of 13 Cluster 1 meteorites are distinguishable from those of 45 non-Cluster 1 H chondrites. Here, we focus upon the principles of multivariate statistical techniques and illustrate their application using non-meteoritic and meteoritic examples.
NASA Astrophysics Data System (ADS)
Crawford, I.; Ruske, S.; Topping, D. O.; Gallagher, M. W.
2015-11-01
In this paper we present improved methods for discriminating and quantifying primary biological aerosol particles (PBAPs) by applying hierarchical agglomerative cluster analysis to multi-parameter ultraviolet-light-induced fluorescence (UV-LIF) spectrometer data. The methods employed in this study can be applied to data sets in excess of 1 × 10⁶ points on a desktop computer, allowing for each fluorescent particle in a data set to be explicitly clustered. This reduces the potential for misattribution found in subsampling and comparative attribution methods used in previous approaches, improving our capacity to discriminate and quantify PBAP meta-classes. We evaluate the performance of several hierarchical agglomerative cluster analysis linkages and data normalisation methods using laboratory samples of known particle types and an ambient data set. Fluorescent and non-fluorescent polystyrene latex spheres were sampled with a Wideband Integrated Bioaerosol Spectrometer (WIBS-4) where the optical size, asymmetry factor and fluorescent measurements were used as inputs to the analysis package. It was found that the Ward linkage with z-score or range normalisation performed best, correctly attributing 98 and 98.1 % of the data points respectively. The best-performing methods were applied to the BEACHON-RoMBAS (Bio-hydro-atmosphere interactions of Energy, Aerosols, Carbon, H2O, Organics and Nitrogen-Rocky Mountain Biogenic Aerosol Study) ambient data set, where it was found that the z-score and range normalisation methods yield similar results, with each method producing clusters representative of fungal spores and bacterial aerosol, consistent with previous results.
The z-score result was compared to clusters generated with previous approaches (WIBS AnalysiS Program, WASP) where we observe that the subsampling and comparative attribution method employed by WASP results in the overestimation of the fungal spore concentration by a factor of 1.5 and the underestimation of bacterial aerosol concentration by a factor of 5. We suggest that this is likely due to errors arising from misattribution due to poor centroid definition and failure to assign particles to a cluster as a result of the subsampling and comparative attribution method employed by WASP. The methods used here allow for the entire fluorescent population of particles to be analysed, yielding an explicit cluster attribution for each particle and improving cluster centroid definition and our capacity to discriminate and quantify PBAP meta-classes compared to previous approaches.
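As a rough illustration of the approach described in the abstract above (z-score normalisation followed by hierarchical agglomerative clustering with the Ward linkage), the sketch below uses only the Python standard library. It is a naive implementation meant for a handful of points, not the analysis package used in the study; the function names and the six example "particles" (optical size, asymmetry factor, fluorescence) are hypothetical.

```python
from statistics import mean, pstdev


def zscore_columns(rows):
    """Scale every column to zero mean and unit (population) standard deviation."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(v - m) / s for v, m, s in zip(r, mus, sds)] for r in rows]


def ward_clusters(rows, k):
    """Naive agglomerative clustering with Ward's criterion.

    Repeatedly merges the pair of clusters whose union gives the smallest
    increase in within-cluster sum of squares, until k clusters remain.
    Returns a list of k sorted lists of row indices.
    """
    clusters = [([i], list(r)) for i, r in enumerate(rows)]  # (members, centroid)
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                (ma, ca), (mb, cb) = clusters[a], clusters[b]
                na, nb = len(ma), len(mb)
                d2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
                cost = na * nb / (na + nb) * d2  # Ward merge cost
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        (ma, ca), (mb, cb) = clusters[a], clusters[b]
        na, nb = len(ma), len(mb)
        centroid = [(na * x + nb * y) / (na + nb) for x, y in zip(ca, cb)]
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)]
        clusters.append((ma + mb, centroid))
    return [sorted(members) for members, _ in clusters]


# Hypothetical particles: [optical size, asymmetry factor, fluorescence]
particles = [
    [1.0, 10.0, 5.0], [1.1, 12.0, 6.0], [0.9, 11.0, 4.0],
    [3.0, 30.0, 200.0], [3.2, 28.0, 210.0], [2.9, 31.0, 190.0],
]
groups = ward_clusters(zscore_columns(particles), k=2)
print(sorted(groups))
```

The pairwise search makes this sketch cubic in the number of points, so it would not scale to the data set sizes mentioned in the abstract; it only shows the order of operations (normalise first, then cluster every point explicitly).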
ERIC Educational Resources Information Center
Banks, Kira Hudson; Kohn-Wood, Laura P.
2007-01-01
This study examined the association between racial identity profiles, discrimination, and mental health outcomes. African American college students (N = 194) completed measures of racial discrimination, racial identity, college hassles, and depressive symptoms. Four meaningful profiles emerged through a cluster analysis of seven dimensions of…
Song, Weiran; Wang, Hui; Maguire, Paul; Nibouche, Omar
2018-06-07
Partial Least Squares Discriminant Analysis (PLS-DA) is one of the most effective multivariate analysis methods for spectral data analysis, which extracts latent variables and uses them to predict responses. In particular, it is an effective method for handling high-dimensional and collinear spectral data. However, PLS-DA does not explicitly address data multimodality, i.e., within-class multimodal distribution of data. In this paper, we present a novel method termed nearest clusters based PLS-DA (NCPLS-DA) for addressing the multimodality and nonlinearity issues explicitly and improving the performance of PLS-DA on spectral data classification. The new method applies hierarchical clustering to divide samples into clusters and calculates the corresponding centre of every cluster. For a given query point, only clusters whose centres are nearest to such a query point are used for PLS-DA. Such a method can provide a simple and effective tool for separating multimodal and nonlinear classes into clusters which are locally linear and unimodal. Experimental results on 17 datasets, including 12 UCI and 5 spectral datasets, show that NCPLS-DA can outperform 4 baseline methods, namely, PLS-DA, kernel PLS-DA, local PLS-DA and k-NN, achieving the highest classification accuracy most of the time. Copyright © 2018 Elsevier B.V. All rights reserved.
On-Line Pattern Analysis and Recognition System. OLPARS VI. Software Reference Manual,
1982-06-18
Discriminant Analysis Data Transformation, Feature Extraction, Feature Evaluation Cluster Analysis, Classification Computer Software 20Z. ABSTRACT... cluster /scatter cut-off value, (2) change the one-space bin factor, (3) change from long prompts to short prompts or vice versa, (4) change the...value, a cluster plot is displayed, otherwise a scatter plot is shown. if option 1 is selected, the program requests that a new value be input
Data Mining of University Philanthropic Giving: Cluster-Discriminant Analysis and Pareto Effects
ERIC Educational Resources Information Center
Le Blanc, Louis A.; Rucks, Conway T.
2009-01-01
A large sample of 33,000 university alumni records were cluster-analyzed to generate six groups relatively unique in their respective attribute values. The attributes used to cluster the former students included average gift to the university's foundation and to the alumni association for the same institution. Cluster detection is useful in this…
NASA Astrophysics Data System (ADS)
Giniyatullin, K. G.; Valeeva, A. A.; Smirnova, E. V.
2017-08-01
Particle-size distribution in soddy-podzolic and light gray forest soils of the Botanical Garden of Kazan Federal University has been studied. The cluster analysis of data on the samples from genetic soil horizons attests to the lithological heterogeneity of the profiles of all the studied soils. It is probable that they are developed from the two-layered sediments with the upper colluvial layer underlain by the alluvial layer. According to the discriminant analysis, the major contribution to the discrimination of colluvial and alluvial layers is that of the fraction >0.25 mm. The results of canonical analysis show that there is only one significant discriminant function that separates alluvial and colluvial sediments on the investigated territory. The discriminant function correlates with the contents of fractions 0.05-0.01, 0.25-0.05, and >0.25 mm. Classification functions making it possible to distinguish between alluvial and colluvial sediments have been calculated. Statistical assessment of particle-size distribution data obtained for the plow horizons on ten plowed fields within the garden indicates that this horizon is formed from colluvial sediments. We conclude that the contents of separate fractions and their ratios cannot be used as a universal criterion of the lithological heterogeneity. However, adequate combination of the cluster and discriminant analyses makes it possible to give a comprehensive assessment of the lithology of soil samples from data on the contents of sand and silt fractions, which considerably increases the information value and reliability of the results.
The differentiation of camel breeds based on meat measurements using discriminant analysis.
Al-Atiyat, Raed Mahmoud; Suliman, Gamal; AlSuhaibani, Entissar; El-Waziry, Ahmad; Al-Owaimer, Abdullah; Basmaeil, Saeid
2016-06-01
The meat productivity of camels in the tropics is still under investigation to identify a better meat breed or type. Therefore, four one-humped Saudi Arabian (SA) camel breeds, Majaheem, Maghateer, Hamrah, and Safrah, were studied in order to differentiate them from each other based on meat measurements. The measurements were biometrical meat traits measured on six intact males from each breed. The results showed higher values for the Majaheem breed than those obtained for the other breeds, except in a few cases such as dressing percentage and rib-eye area. In differentiation analysis, the most discriminating meat variables were myofibrillar protein index, meat color components (L*, a*, and b*), and cooking loss. Consequently, the Safrah and the Majaheem breeds presented the largest dissimilarity as evidenced by their multivariate means. The canonical discriminant analysis allowed an additional understanding of the differentiation between breeds. Furthermore, two large clusters emerged: one was formed by Hamrah and Maghateer grouped along with Safrah. This classification may assign each of these breeds to one cluster on the grounds that they are better meat producers. The Majaheem was placed alone in another cluster, which might be a result of its being a better milk producer. Nevertheless, the productivity type of the camel breeds of SA needs further morphological and genetic description.
Kwon, Yong-Kook; Ahn, Myung Suk; Park, Jong Suk; Liu, Jang Ryol; In, Dong Su; Min, Byung Whan; Kim, Suk Weon
2013-01-01
To determine whether Fourier transform (FT)-IR spectral analysis combined with multivariate analysis of whole-cell extracts from ginseng leaves can be applied as a high-throughput discrimination system of cultivation ages and cultivars, a total of 480 leaf samples belonging to 12 categories corresponding to four different cultivars (Yunpung, Kumpung, Chunpung, and an open-pollinated variety) and three different cultivation ages (1 yr, 2 yr, and 3 yr) were subjected to FT-IR. The spectral data were analyzed by principal component analysis and partial least squares-discriminant analysis. A dendrogram based on hierarchical clustering analysis of the FT-IR spectral data on ginseng leaves showed that leaf samples were initially segregated into three groups in a cultivation age-dependent manner. Then, within the same cultivation age group, leaf samples were clustered into four subgroups in a cultivar-dependent manner. The overall prediction accuracy for discrimination of cultivars and cultivation ages was 94.8% in a cross-validation test. These results clearly show that the FT-IR spectra combined with multivariate analysis from ginseng leaves can be applied as an alternative tool for discriminating ginseng cultivars and cultivation ages. Therefore, we suggest that this result could be used as a rapid and reliable F1 hybrid seed-screening tool for accelerating the conventional breeding of ginseng. PMID:24558311
NASA Astrophysics Data System (ADS)
Luo, Congpei; He, Tao; Chun, Ze
2013-04-01
Dendrobium is a commonly used and precious herb in Traditional Chinese Medicine. The high biodiversity of Dendrobium and the therapeutic needs require tools for the correct and fast discrimination of different Dendrobium species. This study investigates Fourier transform infrared spectroscopy followed by cluster analysis for discrimination and chemical phylogenetic study of seven Dendrobium species. Despite the general pattern of the IR spectra, different intensities, shapes, and peak positions were found in the IR spectra of these samples, especially in the range of 1800-800 cm^-1. The second derivative transformation and alcoholic extracting procedure obviously enlarged the tiny spectral differences among these samples. The results indicated each Dendrobium species had a characteristic IR spectral profile, which could be used to discriminate them. The similarity coefficients among the samples were analyzed based on their second derivative IR spectra, which ranged from 0.7632 to 0.9700 among the seven Dendrobium species, and from 0.5163 to 0.9615 among the ethanol extracts. A dendrogram was constructed based on cluster analysis of the IR spectra for studying the chemical phylogenetic relationships among the samples. The results indicated that D. denneanum and D. crepidatum could be the alternative resources to substitute D. chrysotoxum, D. officinale and D. nobile, which were officially recorded in Chinese Pharmacopoeia. In conclusion, with the advantages of high resolution, speediness and convenience, the experimental approach can successfully discriminate and construct the chemical phylogenetic relationships of the seven Dendrobium species.
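The abstract above compares samples through similarity coefficients computed on second derivative spectra. The sketch below shows one plausible way to do this, assuming evenly sampled spectra, a finite-difference second derivative, and the Pearson correlation as the similarity coefficient; the abstract does not state which coefficient was used, so that choice and the toy spectra are assumptions.

```python
import math


def second_derivative(y):
    """Second finite difference of an evenly sampled spectrum."""
    return [y[i - 1] - 2 * y[i] + y[i + 1] for i in range(1, len(y) - 1)]


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def spectral_similarity(s1, s2):
    """Similarity of two spectra computed on their second derivatives."""
    return pearson(second_derivative(s1), second_derivative(s2))


# Hypothetical absorbance values on a common wavenumber grid
spectrum = [0.0, 1.0, 4.0, 2.0, 0.0, 3.0, 1.0]
# Same spectrum with a linear baseline added (offset 2, slope 0.5 per point)
with_baseline = [v + 2.0 + 0.5 * i for i, v in enumerate(spectrum)]
other = [0.0, 2.0, 1.0, 3.0, 4.0, 1.0, 0.0]

sim_same = spectral_similarity(spectrum, with_baseline)
sim_other = spectral_similarity(spectrum, other)
print(sim_same, sim_other)
```

Because the second difference of a straight line is zero, the baseline-shifted copy scores a similarity of essentially 1, which illustrates why derivative spectra are useful for bringing out small differences between otherwise similar samples.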
Leung, S C; Fung, W K; Wong, K H
1999-01-01
The relative bit density variation graphs of 207 specimen credit cards processed by 12 encoding machines were examined first visually, and then classified by means of hierarchical cluster analysis. Twenty-nine credit cards being treated as 'questioned' samples were tested by way of cluster analysis against 'controls' derived from known encoders. It was found that hierarchical cluster analysis provided a high accuracy of identification with all 29 'questioned' samples classified correctly. On the other hand, although visual comparison of jitter graphs was less discriminating, it was nevertheless capable of giving a reasonably accurate result.
Ten-year performance of ponderosa pine provenances in the Great Plains of North America
Ralph A. Read
1983-01-01
A cluster and discriminant analysis based on nine of the best plantations, partitioned the seed provenance populations into six geographic clusters according to their consistency of performance in the plantations. The Northcentral Nebraska cluster of three provenances performed consistently well above average in all plantations. These easternmost...
Whole brain white matter connectivity analysis using machine learning: An application to autism.
Zhang, Fan; Savadjiev, Peter; Cai, Weidong; Song, Yang; Rathi, Yogesh; Tunç, Birkan; Parker, Drew; Kapur, Tina; Schultz, Robert T; Makris, Nikos; Verma, Ragini; O'Donnell, Lauren J
2018-05-15
In this paper, we propose an automated white matter connectivity analysis method for machine learning classification and characterization of white matter abnormality via identification of discriminative fiber tracts. The proposed method uses diffusion MRI tractography and a data-driven approach to find fiber clusters corresponding to subdivisions of the white matter anatomy. Features extracted from each fiber cluster describe its diffusion properties and are used for machine learning. The method is demonstrated by application to a pediatric neuroimaging dataset from 149 individuals, including 70 children with autism spectrum disorder (ASD) and 79 typically developing controls (TDC). A classification accuracy of 78.33% is achieved in this cross-validation study. We investigate the discriminative diffusion features based on a two-tensor fiber tracking model. We observe that the mean fractional anisotropy from the second tensor (associated with crossing fibers) is most affected in ASD. We also find that local along-tract (central cores and endpoint regions) differences between ASD and TDC are helpful in differentiating the two groups. These altered diffusion properties in ASD are associated with multiple robustly discriminative fiber clusters, which belong to several major white matter tracts including the corpus callosum, arcuate fasciculus, uncinate fasciculus and aslant tract; and the white matter structures related to the cerebellum, brain stem, and ventral diencephalon. These discriminative fiber clusters, a small part of the whole brain tractography, represent the white matter connections that could be most affected in ASD. Our results indicate the potential of a machine learning pipeline based on white matter fiber clustering. Copyright © 2017 Elsevier Inc. All rights reserved.
Prediction models for clustered data: comparison of a random intercept and standard regression model
2013-01-01
Background When study data are clustered, standard regression analysis is considered inappropriate and analytical techniques for clustered data need to be used. For prediction research in which the interest of predictor effects is on the patient level, random effect regression models are probably preferred over standard regression analysis. It is well known that the random effect parameter estimates and the standard logistic regression parameter estimates are different. Here, we compared random effect and standard logistic regression models for their ability to provide accurate predictions. Methods Using an empirical study on 1642 surgical patients at risk of postoperative nausea and vomiting, who were treated by one of 19 anesthesiologists (clusters), we developed prognostic models either with standard or random intercept logistic regression. External validity of these models was assessed in new patients from other anesthesiologists. We supported our results with simulation studies using intra-class correlation coefficients (ICC) of 5%, 15%, or 30%. Standard performance measures and measures adapted for the clustered data structure were estimated. Results The model developed with random effect analysis showed better discrimination than the standard approach, if the cluster effects were used for risk prediction (standard c-index of 0.69 versus 0.66). In the external validation set, both models showed similar discrimination (standard c-index 0.68 versus 0.67). The simulation study confirmed these results. For datasets with a high ICC (≥15%), model calibration was only adequate in external subjects, if the used performance measure assumed the same data structure as the model development method: standard calibration measures showed good calibration for the standard developed model, calibration measures adapting the clustered data structure showed good calibration for the prediction model with random intercept. 
Conclusion The models with random intercept discriminate better than the standard model only if the cluster effect is used for predictions. The prediction model with random intercept had good calibration within clusters. PMID:23414436
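The abstract above reports discrimination as a standard c-index. For a binary outcome this is the proportion of (event, non-event) patient pairs in which the patient with the event received the higher predicted risk, with ties counted as one half. A minimal sketch follows; the predicted risks and outcomes are invented for illustration, and the cluster-adapted measures mentioned in the abstract are not implemented here.

```python
def c_index(risks, outcomes):
    """Standard concordance (c) index for a binary outcome.

    risks: predicted probabilities; outcomes: 1 for event, 0 for no event.
    """
    events = [r for r, y in zip(risks, outcomes) if y == 1]
    non_events = [r for r, y in zip(risks, outcomes) if y == 0]
    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0  # event patient ranked higher: concordant pair
            elif e == n:
                concordant += 0.5  # tie counts as half
    return concordant / (len(events) * len(non_events))


# Hypothetical predicted risks and observed outcomes
risks = [0.9, 0.4, 0.35, 0.8, 0.2]
outcomes = [1, 0, 1, 1, 0]
c = c_index(risks, outcomes)
print(round(c, 3))
```

In this toy example five of the six event/non-event pairs are ordered correctly, so the index is 5/6. A value of 0.5 corresponds to no discrimination and 1.0 to perfect ranking.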
Bouwmeester, Walter; Twisk, Jos W R; Kappen, Teus H; van Klei, Wilton A; Moons, Karel G M; Vergouwe, Yvonne
2013-02-15
When study data are clustered, standard regression analysis is considered inappropriate and analytical techniques for clustered data need to be used. For prediction research in which the interest of predictor effects is on the patient level, random effect regression models are probably preferred over standard regression analysis. It is well known that the random effect parameter estimates and the standard logistic regression parameter estimates are different. Here, we compared random effect and standard logistic regression models for their ability to provide accurate predictions. Using an empirical study on 1642 surgical patients at risk of postoperative nausea and vomiting, who were treated by one of 19 anesthesiologists (clusters), we developed prognostic models either with standard or random intercept logistic regression. External validity of these models was assessed in new patients from other anesthesiologists. We supported our results with simulation studies using intra-class correlation coefficients (ICC) of 5%, 15%, or 30%. Standard performance measures and measures adapted for the clustered data structure were estimated. The model developed with random effect analysis showed better discrimination than the standard approach, if the cluster effects were used for risk prediction (standard c-index of 0.69 versus 0.66). In the external validation set, both models showed similar discrimination (standard c-index 0.68 versus 0.67). The simulation study confirmed these results. For datasets with a high ICC (≥15%), model calibration was only adequate in external subjects, if the used performance measure assumed the same data structure as the model development method: standard calibration measures showed good calibration for the standard developed model, calibration measures adapting the clustered data structure showed good calibration for the prediction model with random intercept. 
The models with random intercept discriminate better than the standard model only if the cluster effect is used for predictions. The prediction model with random intercept had good calibration within clusters.
Panagopoulos, G P; Angelopoulou, D; Tzirtzilakis, E E; Giannoulopoulos, P
2016-10-01
This paper presents an innovative method for the discrimination of groundwater samples into common groups representing the hydrogeological units from which they have been pumped. This method proved very efficient even in areas with complex hydrogeological regimes. The proposed method requires chemical analyses of water samples only for major ions, meaning that it is applicable to most cases worldwide. Another benefit of the method is that it gives further insight into the aquifer hydrogeochemistry, as it identifies the ions that are responsible for the discrimination of the groups. The procedure begins with cluster analysis of the dataset in order to classify the samples into the corresponding hydrogeological units. The feasibility of the method is shown by the fact that the samples of volcanic origin were separated into two different clusters, namely the lava units and the pyroclastic-ignimbritic aquifer. The second step is the discriminant analysis of the data, which provides the functions that distinguish the groups from each other and the most significant variables that define the hydrochemical composition of the aquifer. The whole procedure was highly successful, as 94.7 % of the samples were classified to the correct aquifer system. Finally, the resulting functions can be safely used to categorize samples of either unknown or doubtful origin, thus improving the quality and the size of existing hydrochemical databases.
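The second step described above, deriving a discriminant function from groups found by cluster analysis and using it to classify new samples, can be sketched for the simplest case of two groups and two variables. This is Fisher's linear discriminant with a midpoint threshold, written with the closed-form 2x2 matrix inverse; the group labels, the two "ion concentration" variables, and the data are hypothetical and not taken from the paper.

```python
def mean_vec(rows):
    """Column-wise mean of a list of (x, y) samples."""
    return [sum(c) / len(rows) for c in zip(*rows)]


def fisher_lda_2d(group_a, group_b):
    """Return (w, threshold) for a two-variable, two-group linear discriminant."""
    ma, mb = mean_vec(group_a), mean_vec(group_b)
    # Pooled within-group scatter matrix [[sxx, sxy], [sxy, syy]]
    sxx = sxy = syy = 0.0
    for rows, m in ((group_a, ma), (group_b, mb)):
        for x, y in rows:
            dx, dy = x - m[0], y - m[1]
            sxx += dx * dx
            sxy += dx * dy
            syy += dy * dy
    det = sxx * syy - sxy * sxy
    dmx, dmy = ma[0] - mb[0], ma[1] - mb[1]
    # w = S_w^{-1} (m_a - m_b), using the closed-form inverse of a 2x2 matrix
    w = [(syy * dmx - sxy * dmy) / det, (-sxy * dmx + sxx * dmy) / det]
    # Threshold at the projection of the midpoint between the two group means
    threshold = 0.5 * (w[0] * (ma[0] + mb[0]) + w[1] * (ma[1] + mb[1]))
    return w, threshold


def classify(sample, w, threshold):
    """Assign a sample to group 'A' or 'B' from its discriminant score."""
    score = w[0] * sample[0] + w[1] * sample[1]
    return "A" if score > threshold else "B"


# Hypothetical concentrations of two major ions for samples already grouped by clustering
group_a = [(1.0, 2.0), (1.5, 1.8), (1.2, 2.4), (0.8, 2.1)]
group_b = [(4.0, 0.5), (4.5, 0.9), (3.8, 0.4), (4.2, 0.7)]
w, threshold = fisher_lda_2d(group_a, group_b)

predicted = [classify(s, w, threshold) for s in group_a + group_b]
actual = ["A"] * len(group_a) + ["B"] * len(group_b)
accuracy = sum(p == a for p, a in zip(predicted, actual)) / len(actual)
print(w, threshold, accuracy)
```

The relative magnitudes of the components of `w` indicate which variable contributes most to separating the groups, which corresponds to the abstract's point that the discriminant functions reveal the ions responsible for the discrimination. Resubstitution accuracy, as computed here, is optimistic compared with cross-validation.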
Noninvasive analysis of the sputum transcriptome discriminates clinical phenotypes of asthma.
Yan, Xiting; Chu, Jen-Hwa; Gomez, Jose; Koenigs, Maria; Holm, Carole; He, Xiaoxuan; Perez, Mario F; Zhao, Hongyu; Mane, Shrikant; Martinez, Fernando D; Ober, Carole; Nicolae, Dan L; Barnes, Kathleen C; London, Stephanie J; Gilliland, Frank; Weiss, Scott T; Raby, Benjamin A; Cohn, Lauren; Chupp, Geoffrey L
2015-05-15
The airway transcriptome includes genes that contribute to the pathophysiologic heterogeneity seen in individuals with asthma. We analyzed sputum gene expression for transcriptomic endotypes of asthma (TEA), gene signatures that discriminate phenotypes of disease. Gene expression in the sputum and blood of patients with asthma was measured using Affymetrix microarrays. Unsupervised clustering analysis based on pathways from the Kyoto Encyclopedia of Genes and Genomes was used to identify TEA clusters. Logistic regression analysis of matched blood samples defined an expression profile in the circulation to determine the TEA cluster assignment in a cohort of children with asthma to replicate clinical phenotypes. Three TEA clusters were identified. TEA cluster 1 had the most subjects with a history of intubation (P = 0.05), a lower prebronchodilator FEV1 (P = 0.006), a higher bronchodilator response (P = 0.03), and higher exhaled nitric oxide levels (P = 0.04) compared with the other TEA clusters. TEA cluster 2, the smallest cluster, had the most subjects that were hospitalized for asthma (P = 0.04). TEA cluster 3, the largest cluster, had normal lung function, low exhaled nitric oxide levels, and lower inhaled steroid requirements. Evaluation of TEA clusters in children confirmed that TEA clusters 1 and 2 are associated with a history of intubation (P = 5.58 × 10^-6) and hospitalization (P = 0.01), respectively. There are common patterns of gene expression in the sputum and blood of children and adults that are associated with near-fatal, severe, and milder asthma.
Noninvasive Analysis of the Sputum Transcriptome Discriminates Clinical Phenotypes of Asthma
Yan, Xiting; Chu, Jen-Hwa; Gomez, Jose; Koenigs, Maria; Holm, Carole; He, Xiaoxuan; Perez, Mario F.; Zhao, Hongyu; Mane, Shrikant; Martinez, Fernando D.; Ober, Carole; Nicolae, Dan L.; Barnes, Kathleen C.; London, Stephanie J.; Gilliland, Frank; Weiss, Scott T.; Raby, Benjamin A.; Cohn, Lauren
2015-01-01
Rationale: The airway transcriptome includes genes that contribute to the pathophysiologic heterogeneity seen in individuals with asthma. Objectives: We analyzed sputum gene expression for transcriptomic endotypes of asthma (TEA), gene signatures that discriminate phenotypes of disease. Methods: Gene expression in the sputum and blood of patients with asthma was measured using Affymetrix microarrays. Unsupervised clustering analysis based on pathways from the Kyoto Encyclopedia of Genes and Genomes was used to identify TEA clusters. Logistic regression analysis of matched blood samples defined an expression profile in the circulation to determine the TEA cluster assignment in a cohort of children with asthma to replicate clinical phenotypes. Measurements and Main Results: Three TEA clusters were identified. TEA cluster 1 had the most subjects with a history of intubation (P = 0.05), a lower prebronchodilator FEV1 (P = 0.006), a higher bronchodilator response (P = 0.03), and higher exhaled nitric oxide levels (P = 0.04) compared with the other TEA clusters. TEA cluster 2, the smallest cluster, had the most subjects that were hospitalized for asthma (P = 0.04). TEA cluster 3, the largest cluster, had normal lung function, low exhaled nitric oxide levels, and lower inhaled steroid requirements. Evaluation of TEA clusters in children confirmed that TEA clusters 1 and 2 are associated with a history of intubation (P = 5.58 × 10^-6) and hospitalization (P = 0.01), respectively. Conclusions: There are common patterns of gene expression in the sputum and blood of children and adults that are associated with near-fatal, severe, and milder asthma. PMID:25763605
Nemmi, Federico; Saint-Aubert, Laure; Adel, Djilali; Salabert, Anne-Sophie; Pariente, Jérémie; Barbeau, Emmanuel; Payoux, Pierre; Péran, Patrice
2014-01-01
Purpose AV-45 amyloid biomarker is known to show uptake in white matter in patients with Alzheimer’s disease (AD) but also in the healthy population. This binding, thought to be of a non-specific lipophilic nature, has not yet been investigated. The aim of this study was to determine the differential pattern of AV-45 binding in white matter in healthy and pathological populations. Methods We recruited 24 patients presenting with AD at an early stage and 17 matched, healthy subjects. We used an optimized PET-MRI registration method and an approach based on intensity histograms using several indexes. We compared the results of the intensity histogram analyses with a more canonical approach based on target-to-cerebellum Standard Uptake Value (SUVr) in white and grey matter using MANOVA and discriminant analyses. A cluster analysis on white and grey matter histograms was also performed. Results White matter histogram analysis revealed significant differences between AD and healthy subjects, which were not revealed by SUVr analysis. However, white matter histograms were not decisive in discriminating groups, and indexes based on grey matter only showed better discriminative power than SUVr. The cluster analysis divided our sample into two clusters, showing different uptakes in grey but also in white matter. Conclusion These results demonstrate that AV-45 binding in white matter conveys subtle information not detectable using the SUVr approach. Although it is not better than standard SUVr at discriminating AD patients from healthy subjects, this information could reveal white matter modifications. PMID:24573658
Hu, Boran; Yue, Yaqing; Zhu, Yong; Wen, Wen; Zhang, Fengmin; Hardie, Jim W
2015-01-01
Proton nuclear magnetic resonance spectroscopy coupled multivariate analysis (1H NMR-PCA/PLS-DA) is an important tool for the discrimination of wine products. Although 1H NMR has been shown to discriminate wines of different cultivars, a grape genetic component of the discrimination has been inferred only from discrimination of cultivars of undefined genetic homology and in the presence of many confounding environmental factors. We aimed to confirm the influence of grape genotypes in the absence of those factors. We applied 1H NMR-PCA/PLS-DA and hierarchical cluster analysis (HCA) to wines from five, variously genetically-related grapevine (V. vinifera) cultivars; all grown similarly on the same site and vinified similarly. We also compared the semi-quantitative profiles of the discriminant metabolites of each cultivar with previously reported chemical analyses. The cultivars were clearly distinguishable and there was a general correlation between their grouping and their genetic homology as revealed by recent genomic studies. Between cultivars, the relative amounts of several of the cultivar-related discriminant metabolites conformed closely with reported chemical analyses. Differences in grape-derived metabolites associated with genetic differences alone are a major source of 1H NMR-based discrimination of wines and 1H NMR has the capacity to discriminate between very closely related cultivars. The study confirms that genetic variation among grape cultivars alone can account for the discrimination of wine by 1H NMR-PCA/PLS and indicates that 1H NMR spectra of wine of single grape cultivars may in future be used in tandem with hierarchical cluster analysis to elucidate genetic lineages and metabolomic relations of grapevine cultivars. In the absence of genetic information, for example, where predecessor varieties are no longer extant, this may be a particularly useful approach.
Latino/a depression and smoking: an analysis through the lenses of culture, gender, and ethnicity.
Lorenzo-Blanco, Elma I; Cortina, Lilia M
2013-06-01
Rates of major depressive disorder (MDD) and cigarette smoking increase with Latino/a acculturation, but this varies by gender and ethnic subgroup. We investigated how lived experiences (i.e., discrimination, family conflict, family cohesion, familismo) clustered together in the everyday lives of Latina/os. We further examined associations of cluster profile and Latino/a subgroup with MDD and smoking, and tested whether gender moderated these associations. Data came from the National Latino Asian American Study, which included 2,554 Latino/as (48 % female; mean age = 38.02 years). K-means cluster analysis revealed six profiles of experience, which varied by gender and socio-cultural characteristics. Proportionately more women than men were in groups with problematic family lives. Acculturated Latino/as were disproportionately represented in profiles reporting frequent discrimination, family conflict, and a lack of shared family values and cohesion. Profiles characterized by high discrimination and family problems also predicted elevated risk for MDD and smoking. Findings suggest that Latino/a acculturation comes jointly with increased discrimination, increased family conflict, and reduced family cohesion and shared family values, exacerbating risk for MDD and smoking. This research on pathways to depression and smoking can inform the development of targeted assessment, prevention, and intervention strategies, tailored to the needs of Latino/as.
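The abstract above groups respondents into profiles with K-means cluster analysis. As a reminder of how the algorithm works, the sketch below alternates between assigning each point to its nearest centre and recomputing each centre as the mean of its members. It assumes a fixed number of iterations and a deterministic initialisation from the first k points (real analyses typically use several random starts); the two-variable scores are invented for illustration.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain K-means; returns (labels, centres).

    Initial centres are the first k points, an assumption made here
    only to keep the example deterministic.
    """
    centres = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centre by Euclidean distance
        labels = [min(range(k), key=lambda j: math.dist(p, centres[j]))
                  for p in points]
        # Update step: move each centre to the mean of its members
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centres[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centres


# Hypothetical (discrimination frequency, family conflict) scores per respondent
scores = [(1.0, 1.0), (8.0, 9.0), (1.5, 2.0), (9.0, 8.0), (1.0, 0.5), (8.5, 9.5)]
labels, centres = kmeans(scores, k=2)
print(labels, centres)
```

Each resulting centre can be read as a profile (for example, low versus high discrimination and family conflict), and the labels can then be related to outcomes such as those examined in the abstract.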
Spectral reflectance of surface soils - A statistical analysis
NASA Technical Reports Server (NTRS)
Crouse, K. R.; Henninger, D. L.; Thompson, D. R.
1983-01-01
The relationship of the physical and chemical properties of soils to their spectral reflectance as measured at six wavebands of Thematic Mapper (TM) aboard NASA's Landsat-4 satellite was examined. The results of performing regressions of over 20 soil properties on the six TM bands indicated that organic matter, water, clay, cation exchange capacity, and calcium were the properties most readily predicted from TM data. The middle infrared bands, bands 5 and 7, were the best bands for predicting soil properties, and the near infrared band, band 4, was nearly as good. Clustering 234 soil samples on the TM bands and characterizing the clusters on the basis of soil properties revealed several clear relationships between properties and reflectance. Discriminant analysis found organic matter, fine sand, base saturation, sand, extractable acidity, and water to be significant in discriminating among clusters.
ASTM clustering for improving coal analysis by near-infrared spectroscopy.
Andrés, J M; Bona, M T
2006-11-15
Multivariate analysis techniques have been applied to near-infrared (NIR) spectra of coals to investigate the relationship between nine coal properties (moisture (%), ash (%), volatile matter (%), fixed carbon (%), heating value (kcal/kg), carbon (%), hydrogen (%), nitrogen (%) and sulphur (%)) and the corresponding predictor variables. In this work, a whole set of coal samples was grouped into six more homogeneous clusters following the ASTM reference method for classification prior to the application of calibration methods to each coal set. The results obtained showed a considerable improvement in the determination error compared with the calibration for the whole sample set. For some groups, the established calibrations approached the quality required by the ASTM/ISO norms for laboratory analysis. To predict property values for a new coal sample, that sample must first be assigned to its respective group. Thus, the ability to discriminate and classify coal samples by Diffuse Reflectance Infrared Fourier Transform Spectroscopy (DRIFTS) in the NIR range was also studied by applying Soft Independent Modelling of Class Analogy (SIMCA) and Linear Discriminant Analysis (LDA) techniques. Modelling of the groups by SIMCA led to overlapping models that cannot discriminate for unique classification. On the other hand, the application of Linear Discriminant Analysis improved the classification of the samples but not enough to be satisfactory for every group considered.
Using sperm morphometry and multivariate analysis to differentiate species of gray Mazama
Duarte, José Maurício Barbanti
2016-01-01
There is genetic evidence that the two species of Brazilian gray Mazama, Mazama gouazoubira and Mazama nemorivaga, belong to different genera. This study identified significant differences that separated them into distinct groups, based on characteristics of the spermatozoa and ejaculate of both species. The characteristics that most clearly differentiated the species were ejaculate colour, white for M. gouazoubira and reddish for M. nemorivaga, and sperm head dimensions. Multivariate analysis of sperm head dimension and format data accurately discriminated three groups for species, with a total misclassification percentage of 0.71. The individual analysis, by animal, and the multivariate analysis also correctly discriminated all five animals (total misclassification percentage of 13.95%), and the canonical plot showed three different clusters: Cluster 1, including individuals of M. nemorivaga; Cluster 2, including two individuals of M. gouazoubira; and Cluster 3, including a single individual of M. gouazoubira. The results obtained in this work corroborate the hypothesis of the formation of new genera and species for gray Mazama. Moreover, the easily applied method described herein can be used as an auxiliary tool to identify sibling species of other taxonomic groups. PMID:28018612
NASA Astrophysics Data System (ADS)
Bocsi, Jozsef; Mittag, Anja; Pierzchalski, Arkadiusz; Osmancik, Pavel; Dähnert, Ingo; Tárnok, Attila
2011-02-01
Introduction: Methylprednisolone (MP) is frequently administered preoperatively in children undergoing open heart surgery. The aim of this medication is to inhibit overshooting immune responses. Earlier studies demonstrated cellular and humoral immunological changes in pediatric patients undergoing heart surgeries with and without MP administration. Here, in a retrospective study, we investigated the modulation of the cellular immune response by MP. The aim was to identify suitable parameters characterizing MP effects by cluster analysis. Methods: Blood samples were analysed from two age-matched groups with surgical correction of septum defects. The group without MP treatment consisted of 10 patients; MP was administered to 21 patients (median dose: 11 mg/kg) before cardiopulmonary bypass (CPB). EDTA-anticoagulated blood was obtained 24 h preoperatively, after anesthesia, at CPB begin and end (CPB2), 4 h, 24 h and 48 h after surgery, at discharge and at out-patient follow-up (8.2; 3.3-12.2 months after surgery; median and IQR). Flow cytometry showed the biggest MP-relevant changes at CPB2 and 4 h postoperatively. They were used for clustering analysis. Classification was made by discriminant analysis and cluster analysis by means of Genes@work software. Results & conclusion: 146 parameters were obtained from analysis. Cross-validation revealed several parameters able to discriminate between MP groups and to identify immune modulation. MP administration resulted in a delayed activation of monocytes, an increased ratio of neutrophils and reduced T-lymphocyte counts. Cluster analysis demonstrated that classification of patients is possible based on the identified cytomics parameters. Further investigation of these parameters might help to understand the MP effects in pediatric open heart surgery.
Dynamic clustering detection through multi-valued descriptors of dermoscopic images.
Cozza, Valentina; Guarracino, Maria Rosario; Maddalena, Lucia; Baroni, Adone
2011-09-10
This paper introduces a dynamic clustering methodology based on multi-valued descriptors of dermoscopic images. The main idea is to support medical diagnosis in deciding whether pigmented skin lesions belonging to an uncertain set are nearer to malignant melanoma or to benign nevi. Melanoma is the most deadly skin cancer, and early diagnosis is a current challenge for clinicians. Most data analysis algorithms for skin lesion discrimination focus on segmentation and extraction of features of categorical or numerical type. As an alternative approach, this paper introduces two new concepts: first, it considers multi-valued data, described not only by scalar variables but also by interval or histogram variables; second, it introduces a dynamic clustering method based on the Wasserstein distance to compare multi-valued data. The overall strategy of analysis can be summarized in the following steps: first, a segmentation of dermoscopic images allows a set of multi-valued descriptors to be identified; second, we performed a discriminant analysis on a set of images for which there is an a priori classification, so that it is possible to detect which features discriminate between benign and malignant lesions; and third, we applied the proposed dynamic clustering method to the uncertain cases, which need to be associated with one of the two previously mentioned groups. Results based on clinical data show that the grading of specific descriptors associated with dermoscopic characteristics provides a novel way to characterize uncertain lesions that can help the dermatologist's diagnosis. Copyright © 2011 John Wiley & Sons, Ltd.
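To make the idea of comparing histogram-valued descriptors concrete, the following is a minimal sketch, not the authors' dynamic clustering method. It assumes both histograms are defined on the same equal-width bins, computes the one-dimensional 1-Wasserstein distance as the area between the two cumulative distributions, and assigns an uncertain case to the nearest of two prototype histograms. The function names, the prototype labels and the example counts are hypothetical.

```python
from itertools import accumulate


def wasserstein_1d(hist_a, hist_b, bin_width=1.0):
    """1-Wasserstein distance between two histograms on the same equal-width bins."""
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must share the same bins")
    total_a, total_b = sum(hist_a), sum(hist_b)
    # Normalise raw counts to probability masses.
    p = [x / total_a for x in hist_a]
    q = [x / total_b for x in hist_b]
    # In one dimension, W1 equals the area between the cumulative distributions.
    return bin_width * sum(abs(a - b) for a, b in zip(accumulate(p), accumulate(q)))


def assign_to_nearest(hist, prototypes):
    """Return the label of the prototype histogram closest to `hist`."""
    return min(prototypes, key=lambda label: wasserstein_1d(hist, prototypes[label]))


# Hypothetical prototype histograms of one descriptor (illustrative counts only).
prototypes = {"benign": [4, 1, 0], "malignant": [0, 1, 4]}
uncertain_case = [0, 1, 3]
nearest = assign_to_nearest(uncertain_case, prototypes)
```

A full dynamic clustering procedure would also update the prototypes iteratively; this sketch only shows the distance and a single assignment step.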
Wang, Wanping; Liu, Mingyue; Wang, Jing; Tian, Rui; Dong, Junqiang; Liu, Qi; Zhao, Xianping; Wang, Yuanfang
2014-01-01
Screening indexes of tumor serum markers for benign and malignant solitary pulmonary nodules (SPNs) were analyzed to find the optimum method for diagnosis. Enzyme-linked immunosorbent assays, an automatic immune analyzer and radioimmunoassay methods were used to examine the levels of 8 serum markers in 164 SPN patients, and the sensitivity for differential diagnosis of malignant or benign SPN was compared for detection using a single plasma marker or a combination of markers. The results for serological indicators that closely relate to benign and malignant SPNs were screened using the Fisher discriminant analysis and a non-conditional logistic regression analysis method, respectively. The results were then verified by the k-means clustering analysis method. The sensitivity when using a combination of serum markers to detect SPN was higher than that using a single marker. By Fisher discriminant analysis, cytokeratin 19 fragments (CYFRA21-1), carbohydrate antigen 125 (CA125), squamous cell carcinoma antigen (SCC) and breast cancer antigen (CA153), which relate to the benign and malignant SPNs, were screened. Through non-conditional logistic regression analysis, CYFRA21-1, SCC and CA153 were obtained. Using the k-means clustering analysis, the cophenetic correlation coefficient (0.940) obtained by the Fisher discriminant analysis was higher than that obtained with logistic regression analysis (0.875). This study indicated that the Fisher discriminant analysis functioned better in screening out serum markers to recognize the benign and malignant SPN. The combined detection of CYFRA21-1, CA125, SCC and CA153 is an effective way to distinguish benign and malignant SPN, and will find an important clinical application in the early diagnosis of SPN. © 2014 S. Karger GmbH, Freiburg.
Han, Bangxing; Peng, Huasheng; Yan, Hui
2016-01-01
Mugua is a common Chinese herbal medicine. There are three main places of origin for medicinal Mugua in China (Xuancheng City, Anhui Province; Qijiang District, Chongqing City; and Yichang City, Hubei Province), and a place of origin for Mugua suitable for food (Linyi City, Shandong Province). The aim was to construct a qualitative analytical method to identify the origin of medicinal Mugua by near infrared spectroscopy (NIRS). A partial least squares discriminant analysis (PLSDA) model was established after preprocessing the original spectra of Mugua derived from five different origins. Moreover, hierarchical cluster analysis was performed. The result showed that the PLSDA model was established. According to the relationship between the origin-related important score and wavenumber, and K-means cluster analysis, the Mugua samples derived from different origins were effectively identified. NIRS technology can quickly and accurately identify the origin of Mugua and provides a new method and technology for the identification of Chinese medicinal materials. After preprocessing by D1+autoscale, more peaks appeared in the near infrared spectrum of Mugua. Five latent variable scores could reflect the information related to the place of origin of Mugua. Origins of Mugua were well distinguished according to K-means clustering analysis. Abbreviations used: TCM: Traditional Chinese Medicine, NIRS: Near infrared spectroscopy, SG: Savitzky-Golay smoothness, D1: First derivative, D2: Second derivative, SNV: Standard normal variable transformation, MSC: Multiplicative scatter correction, PLSDA: Partial least squares discriminant analysis, LV: Latent variable, VIP scores: Important score.
Neuro- and social-cognitive clustering highlights distinct profiles in adults with anorexia nervosa.
Renwick, Beth; Musiat, Peter; Lose, Anna; DeJong, Hannah; Broadbent, Hannah; Kenyon, Martha; Loomes, Rachel; Watson, Charlotte; Ghelani, Shreena; Serpell, Lucy; Richards, Lorna; Johnson-Sabine, Eric; Boughton, Nicky; Treasure, Janet; Schmidt, Ulrike
2015-01-01
This study aimed to explore the neuro- and social-cognitive profile of a consecutive series of adult outpatients with anorexia nervosa (AN) when compared with widely available age and gender matched historical control data. The relationship between performance profiles, clinical characteristics, service utilization, and treatment adherence was also investigated. Consecutively recruited outpatients with a broad diagnosis of AN (restricting subtype AN-R: n = 44, binge-purge subtype AN-BP: n = 33 or Eating Disorder Not Otherwise Specified-AN subtype EDNOS-AN: n = 23) completed a comprehensive set of neurocognitive (set-shifting, central coherence) and social-cognitive measures (Emotional Theory of Mind). Data were subjected to hierarchical cluster analysis and a discriminant function analysis. Three separate, meaningful clusters emerged. Cluster 1 (n = 45) showed overall average to high average neuro- and social- cognitive performance, Cluster 2 (n = 38) showed mixed performance characterized by distinct strengths and weaknesses, and Cluster 3 (n = 17) showed poor overall performance (Autism Spectrum disorder (ASD) like cluster). The three clusters did not differ in terms of eating disorder symptoms, comorbid features or service utilization and treatment adherence. A discriminant function analysis confirmed that the clusters were best characterized by performance in perseveration and set-shifting measures. The findings suggest that considerable neuro- and social-cognitive heterogeneity exists in patients with AN, with a subset showing ASD-like features. The value of this method of profiling in predicting longer term patient outcomes and in guiding development of etiologically targeted treatments remains to be seen. © 2014 Wiley Periodicals, Inc.
Classification of different types of beer according to their colour characteristics
NASA Astrophysics Data System (ADS)
Nikolova, Kr T.; Gabrova, R.; Boyadzhiev, D.; Pisanova, E. S.; Ruseva, J.; Yanakiev, D.
2017-01-01
Twenty-two samples of different beers have been investigated in two colour systems, XYZ and CIELab, and have been characterised according to their colour parameters. The goals of the current study were to conduct correlation and discriminant analysis and to find the inner relation between the studied indices. K-means clustering has been used to compare and group the tested types of beer based on their similarity. K-means cluster analysis requires the number of clusters to be determined in advance. The variant K = 4 was worked out. The first cluster unified all bright beers, the second one contained samples with fruits, the third one contained samples with addition of lemon, and the fourth unified the samples of dark beers. Discriminant analysis can help in establishing the type of a beer. The proposed model correctly describes the types of beer on the Bulgarian market and can be used to determine the affiliation of a beer that was not used in obtaining the model. One sample has been chosen from each cluster and its digital image has been obtained. The image confirms the colour parameters in the XYZ and CIELab colour systems. These facts can be used to develop an express method for estimating beer by colour.
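As an illustration of clustering with the number of clusters fixed in advance (K = 4), here is a minimal K-means sketch in plain Python. It is not the authors' procedure: the farthest-first initialisation is an assumption chosen to keep the example deterministic, and the two-dimensional points are hypothetical stand-ins for colour parameters, not measured beer data.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def farthest_first(points, k):
    """Deterministic initialisation: repeatedly pick the point farthest from the chosen centroids."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    return centroids


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm; returns one cluster index per point and the final centroids."""
    centroids = farthest_first(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


# Hypothetical colour coordinates forming four well-separated groups.
samples = [(0, 0), (1, 0), (10, 0), (11, 0), (0, 10), (1, 10), (10, 10), (11, 10)]
labels, centroids = kmeans(samples, k=4)
```

Because K must be supplied by the analyst, a different choice of K would partition the same samples differently; the sketch makes that dependence explicit through the `k` argument.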
Medvedovici, Andrei; Albu, Florin; Naşcu-Briciu, Rodica Domnica; Sârbu, Costel
2014-02-01
The discrimination power of UV-Vis and (±) electrospray ionization/mass spectrometric (ESI-MS) techniques, considered individually or coupled as detectors to reversed phase liquid chromatography (RPLC), was evaluated in the characterization of Ginkgo biloba standardized extracts used in herbal medicines and/or dietary supplements, with the help of fuzzy hierarchical clustering (FHC). Seventeen batches of commercially available Ginkgo biloba standardized extracts from seven manufacturers were measured during the experiments. All extracts were within the criteria of the official monograph dedicated to dried, refined and quantified Ginkgo extracts in the European Pharmacopoeia. UV-Vis and (±) ESI-MS spectra of the bulk standardized extracts in methanol were acquired. Additionally, an RPLC separation based on a simple gradient elution profile was applied to the standardized extracts. Detection was made by monitoring UV absorption at 220 nm or the total ion current (TIC) produced through (±) ESI-MS analysis. FHC was applied to raw, centered and scaled data sets to evaluate the discrimination power of the method with respect to the origins of the extracts and to the batch-to-batch variability. The discrimination power increases with the increase of the intrinsic selectivity of the spectral technique being used: UV-Vis
Yang, Mingxing; Li, Xiumin; Li, Zhibin; Ou, Zhimin; Liu, Ming; Liu, Suhuan; Li, Xuejun; Yang, Shuyu
2013-01-01
DNA microarray analysis is characterized by obtaining a large number of gene variables from a small number of observations. Cluster analysis is widely used to analyze DNA microarray data to make classification and diagnosis of disease. Because there are so many irrelevant and insignificant genes in a dataset, a feature selection approach must be employed in data analysis. The performance of cluster analysis of this high-throughput data depends on whether the feature selection approach chooses the most relevant genes associated with disease classes. Here we proposed a new method using multiple Orthogonal Partial Least Squares-Discriminant Analysis (mOPLS-DA) models and S-plots to select the most relevant genes to conduct three-class disease classification and prediction. We tested our method using Golub's leukemia microarray data. For three classes with subtypes, we proposed hierarchical orthogonal partial least squares-discriminant analysis (OPLS-DA) models and S-plots to select features for two main classes and their subtypes. For three classes in parallel, we employed three OPLS-DA models and S-plots to choose marker genes for each class. The power of feature selection to classify and predict three-class disease was evaluated using cluster analysis. Further, the general performance of our method was tested using four public datasets and compared with those of four other feature selection methods. The results revealed that our method effectively selected the most relevant features for disease classification and prediction, and its performance was better than that of the other methods.
Feng, Sujuan; Qian, Xiaosong; Li, Han; Zhang, Xiaodong
2017-12-01
The aim of the present study was to investigate the effectiveness of the miR-17-92 cluster as a disease progression marker in prostate cancer (PCa). Reverse transcription-quantitative polymerase chain reaction analysis was used to detect the microRNA (miR)-17-92 cluster expression levels in tissues from patients with PCa or benign prostatic hyperplasia (BPH), as well as in PCa and BPH cell lines. Spearman correlation was used for comparison and estimation of correlations between miRNA expression levels and clinicopathological characteristics such as the Gleason score and prostate-specific antigen (PSA). Receiver operating characteristic (ROC) curve analysis was performed to evaluate the specificity and sensitivity of miR-17-92 cluster expression levels for discriminating patients with PCa from patients with BPH. Kaplan-Meier analysis was performed to investigate the predictive potential of the miR-17-92 cluster for PCa biochemical recurrence. Expression of the majority of miRNAs in the miR-17-92 cluster was identified to be significantly increased in PCa tissues and cell lines. Bivariate correlation analysis indicated that the high expression of upregulated miRNAs was positively correlated with Gleason grade, but had no significant association with PSA. ROC curves demonstrated that high expression of the miR-17-92 cluster predicted a higher diagnostic accuracy compared with PSA. Improved discriminating quotients were observed when combinations of upregulated miRNAs with PSA were used. Survival analysis confirmed that a high combined miRNA score of the miR-17-92 cluster was associated with a shorter biochemical recurrence interval. The miR-17-92 cluster could be a potential diagnostic and prognostic biomarker for PCa, and the combination of the miR-17-92 cluster and serum PSA may enhance the accuracy for diagnosis of PCa.
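The ROC analysis mentioned above can be illustrated with a small sketch. It assumes a binary outcome coded 1 (e.g. PCa) and 0 (e.g. BPH) and a single continuous marker. It computes the area under the ROC curve as the proportion of positive/negative pairs ranked correctly, with ties counted as one half, plus sensitivity and specificity at one cut-off. The function names and the marker values are hypothetical and are not data from the study.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(scores, labels, threshold):
    """Sensitivity and specificity when scores >= threshold are called positive."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical marker levels: the first two cases are positives, the last two negatives.
marker = [0.9, 0.3, 0.8, 0.4]
status = [1, 1, 0, 0]
auc = roc_auc(marker, status)
sens, spec = sensitivity_specificity(marker, status, threshold=0.5)
```

Comparing two markers, or a marker against a combined score, then amounts to computing `roc_auc` for each on the same set of patients.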
[Men who have sex with men and human immunodeficiency virus testing in dental practice].
Elizondo, Jesús Eduardo; Treviño, Ana Cecilia; Violant, Deborah; Rivas-Estilla, Ana María; Álvarez, Mario Moisés
To explore the attitudes of men who have sex with men (MSM) towards the implementation of rapid HIV-1/2 testing in the dental practice, and to evaluate MSM's perceptions of stigma and discrimination related to sexual orientation by dental care professionals. Cross-sectional study using a self-administered, anonymous, structured analytical questionnaire answered by 185 MSM in Mexico. The survey included sociodemographic variables, MSM's perceptions towards public and private dental providers, and dental services, as well as their perception towards rapid HIV-1/2 testing in the dental practice. In addition, the perception of stigma and discrimination associated with their sexual orientation was explored by designing a psychometric Likert-type scale. The statistical analysis included factor analysis and non-hierarchical cluster analysis. 86.5% of the respondents expressed their willingness to take a rapid HIV-1/2 screening test during their dental visit. Nevertheless, 91.9% of them considered it important that dental professionals must be well-trained before administering any rapid HIV-1/2 tests. Factor analysis revealed two factors: experiences of sexual orientation stigma and discrimination in dental settings, and feelings of concern about the attitude of the dentist and dental staff towards their sexual orientation. Based on these factors and cluster analysis, three user profiles were identified: users who have not experienced stigma and discrimination (90.3%); users who have not experienced stigma and discrimination, but feel a slight concern (8.1%), and users who have experienced some form of discrimination and feel concern (1.6%). The dental practice may represent a potential location for rapid HIV-1/2 testing contributing to early HIV infection diagnosis. Copyright © 2017 SESPAS. Publicado por Elsevier España, S.L.U. All rights reserved.
NASA Astrophysics Data System (ADS)
Ghebremedhin, Meron; Yesupriya, Shubha; Luka, Janos; Crane, Nicole J.
2015-03-01
Recent studies have demonstrated the potential advantages of the use of Raman spectroscopy in the biomedical field due to its rapidity and noninvasive nature. In this study, Raman spectroscopy is applied as a method for differentiating between bacterial isolates for Gram status and Genus species. We created models for identifying 28 bacterial isolates using spectra collected with a 785 nm laser excitation Raman spectroscopic system. In order to investigate the groupings of these samples, partial least squares discriminant analysis (PLSDA) and hierarchical cluster analysis (HCA) were implemented. In addition, cluster analyses of the isolates were performed using various data types consisting of biochemical tests, gene sequence alignment, high resolution melt (HRM) analysis and antimicrobial susceptibility tests of minimum inhibitory concentration (MIC) and degree of antimicrobial resistance (SIR). In order to evaluate the ability of these models to correctly classify bacterial isolates using solely Raman spectroscopic data, a set of 14 validation samples was tested using the PLSDA models and consequently the HCA models. External cluster evaluation criteria of purity and Rand index were calculated at different taxonomic levels to compare the performance of clustering using Raman spectra as well as the other datasets. Results showed that Raman spectra performed comparably to, and in some cases better than, the other data types, with Rand index and purity values up to 0.933 and 0.947, respectively. This study clearly demonstrates that the discrimination of bacterial species using Raman spectroscopic data and hierarchical cluster analysis is possible and has the potential to be a powerful point-of-care tool in clinical settings.
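The two external evaluation criteria named above, purity and the Rand index, can be sketched as follows. This is a generic illustration using the standard definitions, not the authors' code; the reference labels and cluster assignments are hypothetical.

```python
from collections import Counter
from itertools import combinations


def purity(true_labels, cluster_labels):
    """Fraction of items that belong to the majority reference class of their cluster."""
    clusters = {}
    for t, c in zip(true_labels, cluster_labels):
        clusters.setdefault(c, Counter())[t] += 1
    return sum(max(counts.values()) for counts in clusters.values()) / len(true_labels)


def rand_index(true_labels, cluster_labels):
    """Fraction of item pairs on which the clustering and the reference agree."""
    pairs = list(combinations(range(len(true_labels)), 2))
    agree = sum(
        (true_labels[i] == true_labels[j]) == (cluster_labels[i] == cluster_labels[j])
        for i, j in pairs
    )
    return agree / len(pairs)


# Hypothetical example: six isolates from two genera, one of them misplaced.
reference = ["A", "A", "A", "B", "B", "B"]
clusters = [0, 0, 1, 1, 1, 1]
p = purity(reference, clusters)
r = rand_index(reference, clusters)
```

Both measures equal 1 for a perfect match. Purity can be inflated by using many small clusters, which is one reason to report it alongside the Rand index.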
ERIC Educational Resources Information Center
Rhee, Eunjeong; Lee, Bo Hyun; Kim, Boyoung; Ha, Gyuyoung; Lee, Sang Min
2016-01-01
The current study investigated how the five components of planned happenstance skills are related to vocational identity statuses. For determination of relationships, cluster and discriminant analyses were conducted sequentially on a sample of 515 university students in South Korea. Cluster analysis revealed vocational identity statuses to be…
Zhu, Yong; Wen, Wen; Zhang, Fengmin; Hardie, Jim W.
2015-01-01
Background and Aims Proton nuclear magnetic resonance spectroscopy coupled with multivariate analysis (1H NMR-PCA/PLS-DA) is an important tool for the discrimination of wine products. Although 1H NMR has been shown to discriminate wines of different cultivars, a grape genetic component of the discrimination has been inferred only from discrimination of cultivars of undefined genetic homology and in the presence of many confounding environmental factors. We aimed to confirm the influence of grape genotypes in the absence of those factors. Methods and Results We applied 1H NMR-PCA/PLS-DA and hierarchical cluster analysis (HCA) to wines from five variously genetically related grapevine (V. vinifera) cultivars, all grown similarly on the same site and vinified similarly. We also compared the semi-quantitative profiles of the discriminant metabolites of each cultivar with previously reported chemical analyses. The cultivars were clearly distinguishable and there was a general correlation between their grouping and their genetic homology as revealed by recent genomic studies. Between cultivars, the relative amounts of several of the cultivar-related discriminant metabolites conformed closely with reported chemical analyses. Conclusions Differences in grape-derived metabolites associated with genetic differences alone are a major source of 1H NMR-based discrimination of wines, and 1H NMR has the capacity to discriminate between very closely related cultivars. Significance of the Study The study confirms that genetic variation among grape cultivars alone can account for the discrimination of wine by 1H NMR-PCA/PLS and indicates that 1H NMR spectra of wine of single grape cultivars may in future be used in tandem with hierarchical cluster analysis to elucidate genetic lineages and metabolomic relations of grapevine cultivars. In the absence of genetic information, for example, where predecessor varieties are no longer extant, this may be a particularly useful approach.
PMID:26658757
Jaiswara, Ranjana; Nandi, Diptarup; Balakrishnan, Rohini
2013-01-01
Traditional taxonomy based on morphology has often failed in accurate species identification owing to the occurrence of cryptic species, which are reproductively isolated but morphologically identical. Molecular data have thus been used to complement morphology in species identification. The sexual advertisement calls in several groups of acoustically communicating animals are species-specific and can thus complement molecular data as non-invasive tools for identification. Several statistical tools and automated identifier algorithms have been used to investigate the efficiency of acoustic signals in species identification. Despite a plethora of such methods, there is a general lack of knowledge regarding the appropriate usage of these methods in specific taxa. In this study, we investigated the performance of two commonly used statistical methods, discriminant function analysis (DFA) and cluster analysis, in identification and classification based on acoustic signals of field cricket species belonging to the subfamily Gryllinae. Using a comparative approach we evaluated the optimal number of species and calling song characteristics for both the methods that lead to most accurate classification and identification. The accuracy of classification using DFA was high and was not affected by the number of taxa used. However, a constraint in using discriminant function analysis is the need for a priori classification of songs. Accuracy of classification using cluster analysis, which does not require a priori knowledge, was maximum for 6–7 taxa and decreased significantly when more than ten taxa were analysed together. We also investigated the efficacy of two novel derived acoustic features in improving the accuracy of identification. Our results show that DFA is a reliable statistical tool for species identification using acoustic signals. 
Our results also show that cluster analysis of acoustic signals in crickets works effectively for species classification and identification. PMID:24086666
DBSCAN-based ROI extracted from SAR images and the discrimination of multi-feature ROI
NASA Astrophysics Data System (ADS)
He, Xin Yi; Zhao, Bo; Tan, Shu Run; Zhou, Xiao Yang; Jiang, Zhong Jin; Cui, Tie Jun
2009-10-01
The purpose of the paper is to extract the region of interest (ROI) from coarsely detected synthetic aperture radar (SAR) images and to discriminate whether the ROI contains a target or not, so as to eliminate false alarms and prepare for target recognition. Automatic target clustering is one of the most difficult tasks in a SAR-image automatic target recognition system. The density-based spatial clustering of applications with noise (DBSCAN) relies on a density-based notion of clusters, which is designed to discover clusters of arbitrary shape. DBSCAN is used here for the first time in SAR image processing, where it offers several advantages: only two insensitive parameters (radius of neighborhood and minimum number of points) are needed; clusters of arbitrary shape, which fit the coarsely detected SAR images, can be discovered; and the calculation time and memory can be reduced. In the multi-feature ROI discrimination scheme, we extract several target features, including geometric features such as the area discriminator and the Radon-transform-based target profile discriminator, distribution characteristics such as the EFF discriminator, and EM scattering properties such as the PPR discriminator. The synthesized judgment effectively eliminates the false alarms.
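The role of the two DBSCAN parameters (neighbourhood radius and minimum number of points) can be seen in a compact sketch of the textbook algorithm. This is not the authors' implementation: it uses a brute-force neighbour search, and the pixel coordinates standing in for coarse detections are hypothetical.

```python
NOISE = -1


def dbscan(points, eps, min_pts):
    """Textbook DBSCAN; returns a cluster index per point, or NOISE (-1)."""
    labels = [None] * len(points)  # None marks a point not yet visited

    def neighbours(i):
        # Brute-force search: all points within distance eps of point i (itself included).
        return [j for j, q in enumerate(points)
                if sum((a - b) ** 2 for a, b in zip(points[i], q)) <= eps ** 2]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins the cluster but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = neighbours(j)
            if len(nb) >= min_pts:
                queue.extend(nb)  # core point: keep growing the cluster from it
        cluster += 1
    return labels


# Hypothetical pixel coordinates: two dense groups and one isolated detection.
pixels = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (50, 50)]
roi_labels = dbscan(pixels, eps=1.5, min_pts=3)
```

In this toy case the isolated detection is labelled as noise, which mirrors the false-alarm rejection described in the abstract, and each remaining cluster would delimit one candidate ROI.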
Stewart, C M; Newlands, S D; Perachio, A A
2004-12-01
Rapid and accurate discrimination of single units from extracellular recordings is a fundamental process for the analysis and interpretation of electrophysiological recordings. We present an algorithm that performs detection, characterization, discrimination, and analysis of action potentials from extracellular recording sessions. The program was entirely written in LabVIEW (National Instruments), and requires no external hardware devices or a priori information about action potential shapes. Waveform events are detected by scanning the digital record for voltages that exceed a user-adjustable trigger. Detected events are characterized to determine nine different time and voltage levels for each event. Various algebraic combinations of these waveform features are used as axis choices for 2-D Cartesian plots of events. The user selects axis choices that generate distinct clusters. Multiple clusters may be defined as action potentials by manually generating boundaries of arbitrary shape. Events defined as action potentials are validated by visual inspection of overlain waveforms. Stimulus-response relationships may be identified by selecting any recorded channel for comparison to continuous and average cycle histograms of binned unit data. The algorithm includes novel aspects of feature analysis and acquisition, including higher acquisition rates for electrophysiological data compared to other channels. The program confirms that electrophysiological data may be discriminated with high-speed and efficiency using algebraic combinations of waveform features derived from high-speed digital records.
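The detection and characterization steps described above can be sketched in a few lines. The original program was written in LabVIEW; the Python below only illustrates the general idea of trigger-based event detection followed by feature extraction. The fixed-length capture window and the four features shown are assumptions, not the nine time and voltage levels used by the authors.

```python
def detect_events(samples, trigger, window):
    """Return (start_index, waveform) for each upward crossing of the trigger level."""
    events = []
    i = 1
    while i < len(samples):
        if samples[i - 1] < trigger <= samples[i]:
            events.append((i, samples[i:i + window]))
            i += window  # skip the captured waveform so that it is not detected twice
        else:
            i += 1
    return events


def features(waveform):
    """A few illustrative time and voltage features of one detected event."""
    peak, trough = max(waveform), min(waveform)
    return {
        "peak": peak,
        "trough": trough,
        "peak_to_trough": peak - trough,
        "time_to_peak": waveform.index(peak),  # in samples from the trigger crossing
    }


# Hypothetical digitised record containing two events.
record = [0, 0, 5, 9, -4, 0, 0, 0, 6, 8, -3, 0]
events = detect_events(record, trigger=4, window=3)
event_features = [features(w) for _, w in events]
```

Pairs of such features would then serve as the axes of the 2-D plots in which the user draws cluster boundaries.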
Zou, Ying-Min; Ni, Ke; Yang, Zhuo-Ya; Li, Ying; Cai, Xin-Lu; Xie, Dong-Jie; Zhang, Rui-Ting; Zhou, Fu-Chun; Li, Wen-Xiu; Lui, Simon S Y; Shum, David H K; Cheung, Eric F C; Chan, Raymond C K
2018-05-01
Emotion deficits may be the basis of negative symptoms in schizophrenia patients and they are prevalent in these patients. However, inconsistent findings about emotion deficits in schizophrenia suggest that there may be subtypes. The present study aimed to examine and profile experiential pleasure, emotional regulation and expression in patients with schizophrenia. A set of checklists specifically capturing experiential pleasure, emotional regulation, emotion expression, depressive symptoms and anhedonia were administered to 146 in-patients with schizophrenia and 73 demographically-matched healthy controls. Psychiatric symptoms and negative symptoms were also evaluated by a trained psychiatrist for patients with schizophrenia. Two-stage cluster analysis and discriminant function analysis were used to analyze the profile of these measures in patients with schizophrenia. We found a three-cluster solution. Cluster 1 (n=41) was characterized by a deficit in experiential pleasure and emotional regulation, Cluster 2 (n=47) was characterized by a general deficit in experiential pleasure, emotional regulation and emotion expression, and Cluster 3 (n=57) was characterized by a deficit in emotion expression. Results of a discriminant function analysis indicated that the three groups were reasonably discrete. The present findings suggest that schizophrenia patients can be classified into three subtypes based on experiential pleasure, emotional regulation and emotion expression, which are characterized by distinct clinical representations. Copyright © 2017 Elsevier B.V. All rights reserved.
Samsir, Sri A'jilah; Bunawan, Hamidun; Yen, Choong Chee; Noor, Normah Mohd
2016-09-01
In this dataset, we distinguish 15 accessions of Garcinia mangostana from Peninsular Malaysia using Fourier transform-infrared spectroscopy coupled with chemometric analysis. We found that the position and intensity of characteristic peaks at 3600-3100 cm⁻¹ in IR spectra allowed discrimination of G. mangostana from different locations. Further principal component analysis (PCA) of all the accessions suggests that two main clusters were formed: samples from Johor, Melaka, and Negeri Sembilan (South) clustered together in one group, while samples from Perak, Kedah, Penang, Selangor, Kelantan, and Terengganu (North and East Coast) formed another.
De Luca, Michele; Restuccia, Donatella; Clodoveo, Maria Lisa; Puoci, Francesco; Ragno, Gaetano
2016-07-01
Chemometric discrimination of extra virgin olive oils (EVOO) from whole and stoned olive pastes was carried out by using Fourier transform infrared (FTIR) data and partial least squares-discriminant analysis (PLS1-DA) approach. Four Italian commercial EVOO brands, all in both whole and stoned version, were considered in this study. The adopted chemometric methodologies were able to describe the different chemical features in phenolic and volatile compounds contained in the two types of oil by using unspecific IR spectral information. Principal component analysis (PCA) was employed in cluster analysis to capture data patterns and to highlight differences between technological processes and EVOO brands. The PLS1-DA algorithm was used as supervised discriminant analysis to identify the different oil extraction procedures. Discriminant analysis was extended to the evaluation of possible adulteration by addition of aliquots of oil from whole paste to the most valuable oil from stoned olives. The statistical parameters from external validation of all the PLS models were very satisfactory, with low root mean square error of prediction (RMSEP) and relative error (RE%). Copyright © 2016 Elsevier Ltd. All rights reserved.
Pfeiffenberger, Erik; Chaleil, Raphael A.G.; Moal, Iain H.
2017-01-01
Reliable identification of near‐native poses of docked protein–protein complexes is still an unsolved problem. The intrinsic heterogeneity of protein–protein interactions is challenging for traditional biophysical or knowledge based potentials and the identification of many false positive binding sites is not unusual. Often, ranking protocols are based on initial clustering of docked poses followed by the application of an energy function to rank each cluster according to its lowest energy member. Here, we present an approach of cluster ranking based not only on one molecular descriptor (e.g., an energy function) but also employing a large number of descriptors that are integrated in a machine learning model, whereby an extremely randomized tree classifier based on 109 molecular descriptors is trained. The protocol is based on first locally enriching clusters with additional poses; the clusters are then characterized using features describing the distribution of molecular descriptors within the cluster, which are combined into a pairwise cluster comparison model to discriminate near‐native from incorrect clusters. The results show that our approach is able to identify clusters containing near‐native protein–protein complexes. In addition, we present an analysis of the descriptors with respect to their power to discriminate near native from incorrect clusters and how data transformations and recursive feature elimination can improve the ranking performance. Proteins 2017; 85:528–543. © 2016 Wiley Periodicals, Inc. PMID:27935158
Raman spectroscopy of normal oral buccal mucosa tissues: study on intact and incised biopsies
NASA Astrophysics Data System (ADS)
Deshmukh, Atul; Singh, S. P.; Chaturvedi, Pankaj; Krishna, C. Murali
2011-12-01
Oral squamous cell carcinoma is among the top 10 malignancies. Optical spectroscopy, including Raman, is being actively pursued as an alternative/adjunct for cancer diagnosis. Earlier studies have demonstrated the feasibility of classifying normal, premalignant, and malignant oral ex vivo tissues. Spectral features showed predominance of lipids and proteins in normal and cancer conditions, respectively, which were attributed to membrane lipids and surface proteins. In view of recent developments in deep tissue Raman spectroscopy, we have recorded Raman spectra from superior and inferior surfaces of 10 normal oral tissues on intact, as well as incised, biopsies after separation of epithelium from connective tissue. Spectral variations and similarities among different groups were explored by unsupervised (principal component analysis) and supervised (linear discriminant analysis, factorial discriminant analysis) methodologies. Clusters of spectra from superior and inferior surfaces of intact tissues show a high overlap, whereas spectra from separated epithelium and connective tissue sections yielded clear clusters, though they also overlap with clusters of intact tissues. Spectra of all four groups of normal tissues gave exclusive clusters when tested against malignant spectra. Thus, this study demonstrates that spectra recorded from the superior surface of an intact tissue may have contributions from deeper layers, but that this has no bearing on the classification of malignant tissues.
Voss, Andreas; Fischer, Claudia; Schroeder, Rico; Figulla, Hans R; Goernig, Matthias
2012-07-01
The objectives of this study were to introduce a new type of heart-rate variability analysis improving risk stratification in patients with idiopathic dilated cardiomyopathy (DCM) and to provide additional information about impaired heart beat generation in these patients. Beat-to-beat intervals (BBI) of 30-min ECGs recorded from 91 DCM patients and 21 healthy subjects were analyzed applying the lagged segmented Poincaré plot analysis (LSPPA) method. LSPPA includes the Poincaré plot reconstruction with lags of 1-100, rotating the cloud of points, its normalized segmentation adapted to their standard deviations, and finally, a frequency-dependent clustering. The lags were combined into eight different clusters representing specific frequency bands within 0.012-1.153 Hz. Statistical differences between low- and high-risk DCM could be found within the clusters II-VIII (e.g., cluster IV: 0.033-0.038 Hz; p = 0.0002; sensitivity = 85.7 %; specificity = 71.4 %). The multivariate statistics led to a sensitivity of 92.9 %, specificity of 85.7 % and an area under the curve of 92.1 % discriminating these patient groups. We introduced the LSPPA method to investigate time correlations in BBI time series. We found that LSPPA contributes considerably to risk stratification in DCM and yields the highest discriminant power in the low and very low-frequency bands.
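The first two LSPPA steps named above (Poincaré plot reconstruction with a lag, then rotation of the point cloud) can be illustrated with a small sketch. It assumes the conventional 45-degree rotation and the usual SD1/SD2 descriptors, which the abstract does not spell out, and it leaves out the segmentation and frequency-dependent clustering steps entirely. The function names and the beat-to-beat values are hypothetical.

```python
from math import sqrt
from statistics import pstdev


def lagged_poincare(bbi, lag):
    """Pairs (BBI_n, BBI_{n+lag}) that form the lagged Poincaré cloud (lag >= 1)."""
    return list(zip(bbi[:-lag], bbi[lag:]))


def rotate_45(points):
    """Rotate the cloud so one axis runs along the identity line, the other across it."""
    return [((x + y) / sqrt(2), (y - x) / sqrt(2)) for x, y in points]


def sd1_sd2(bbi, lag=1):
    """Spread across (SD1) and along (SD2) the identity line, assumed definitions."""
    rotated = rotate_45(lagged_poincare(bbi, lag))
    along = [a for a, _ in rotated]
    across = [c for _, c in rotated]
    return pstdev(across), pstdev(along)


# Hypothetical beat-to-beat intervals in milliseconds.
bbi_ms = [800, 810, 790, 805, 795, 800]
for lag in (1, 2):
    sd1, sd2 = sd1_sd2(bbi_ms, lag)
    print(lag, round(sd1, 2), round(sd2, 2))
```

Repeating the computation over lags 1 to 100, as the abstract describes, would amount to widening the loop range.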
Zautner, Andreas Erich; Masanta, Wycliffe Omurwa; Tareen, Abdul Malik; Weig, Michael; Lugert, Raimond; Groß, Uwe; Bader, Oliver
2013-11-07
Campylobacter jejuni, the most common bacterial pathogen causing gastroenteritis, shows a wide genetic diversity. Previously, we demonstrated by the combination of multi locus sequence typing (MLST)-based UPGMA-clustering and analysis of 16 genetic markers that twelve different C. jejuni subgroups can be distinguished. Among these are two prominent subgroups. The first subgroup contains the majority of hyperinvasive strains and is characterized by a dimeric form of the chemotaxis-receptor Tlp7(m+c). The second has an extended amino acid metabolism and is characterized by the presence of a periplasmic asparaginase (ansB) and gamma-glutamyl-transpeptidase (ggt). Phyloproteomic principal component analysis (PCA) hierarchical clustering of MALDI-TOF based intact cell mass spectrometry (ICMS) spectra was able to group particular C. jejuni subgroups of phylogenetic related isolates in distinct clusters. Especially the aforementioned Tlp7(m+c)(+) and ansB+/ ggt+ subgroups could be discriminated by PCA. Overlay of ICMS spectra of all isolates led to the identification of characteristic biomarker ions for these specific C. jejuni subgroups. Thus, mass peak shifts can be used to identify the C. jejuni subgroup with an extended amino acid metabolism. Although the PCA hierarchical clustering of ICMS-spectra groups the tested isolates into a different order as compared to MLST-based UPGMA-clustering, the isolates of the indicator-groups form predominantly coherent clusters. These clusters reflect phenotypic aspects better than phylogenetic clustering, indicating that the genes corresponding to the biomarker ions are phylogenetically coupled to the tested marker genes. Thus, PCA clustering could be an additional tool for analyzing the relatedness of bacterial isolates.
Meltzer, H Y; Matsubara, S; Lee, J C
1989-10-01
The pKi values of 13 reference typical and 7 reference atypical antipsychotic drugs (APDs) for rat striatal dopamine D-1 and D-2 receptor binding sites and cortical serotonin (5-HT2) receptor binding sites were determined. The atypical antipsychotics had significantly lower pKi values for the D-2 but not 5-HT2 binding sites. There was a trend for a lower pKi value for the D-1 binding site for the atypical APD. The 5-HT2 and D-1 pKi values were correlated for the typical APD whereas the 5-HT2 and D-2 pKi values were correlated for the atypical APD. A stepwise discriminant function analysis to determine the independent contribution of each pKi value for a given binding site to the classification as a typical or atypical APD entered the D-2 pKi value first, followed by the 5-HT2 pKi value. The D-1 pKi value was not entered. A discriminant function analysis correctly classified 19 of 20 of these compounds plus 14 of 17 additional test compounds as typical or atypical APD for an overall correct classification rate of 89.2%. The major contributors to the discriminant function were the D-2 and 5-HT2 pKi values. A cluster analysis based only on the 5-HT2/D2 ratio grouped 15 of 17 atypical + one typical APD in one cluster and 19 of 20 typical + two atypical APDs in a second cluster, for an overall correct classification rate of 91.9%. When the stepwise discriminant function was repeated for all 37 compounds, only the D-2 and 5-HT2 pKi values were entered into the discriminant function.(ABSTRACT TRUNCATED AT 250 WORDS)
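The two overall classification rates quoted above follow directly from the counts given in the abstract. The short check below only reproduces that arithmetic; the helper name `rate` is hypothetical.

```python
def rate(correct, total):
    """Percentage of correctly classified compounds, rounded to one decimal."""
    return round(100.0 * correct / total, 1)


# Discriminant function: 19 of 20 reference compounds plus 14 of 17 test compounds.
discriminant_rate = rate(19 + 14, 20 + 17)

# Cluster analysis on the 5-HT2/D2 ratio: 15 of 17 atypical plus 19 of 20 typical.
cluster_rate = rate(15 + 19, 17 + 20)

print(discriminant_rate, cluster_rate)
```

Both rates are computed over the same 37 compounds, so the difference between them corresponds to a single compound (33 versus 34 correct).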
Differential gene expression profiles of peripheral blood mononuclear cells in childhood asthma.
Kong, Qian; Li, Wen-Jing; Huang, Hua-Rong; Zhong, Ying-Qiang; Fang, Jian-Pei
2015-05-01
Asthma is a common childhood disease with strong genetic components. This study compared whole-genome expression differences between asthmatic young children and healthy controls to identify gene signatures of childhood asthma. Total RNA extracted from peripheral blood mononuclear cells (PBMC) was subjected to microarray analysis. QRT-PCR was performed to verify the microarray results. Classification and functional characterization of differential genes were illustrated by hierarchical clustering and gene ontology analysis. Multiple logistic regression (MLR) analysis, receiver operating characteristic (ROC) curve analysis, and discriminating power were used to screen for asthma-specific diagnostic markers. For fold-change > 2 and p < 0.05, there were 758 named differential genes. The results of QRT-PCR successfully confirmed the array data. Hierarchical clustering divided 29 highly possible genes into seven categories, and the genes in the same cluster were likely to possess similar expression patterns or functions. Gene ontology analysis showed that differential genes were primarily enriched in immune response, response to stress or stimulus, and regulation of apoptosis among biological processes. MLR and ROC curve analysis revealed that the combination of ADAM33, Smad7, and LIGHT possessed excellent discriminating power. The combination of ADAM33, Smad7, and LIGHT would be a reliable and useful childhood asthma model for prediction and diagnosis.
Bowers, Andrew; Saltuklaroglu, Tim; Harkrider, Ashley; Cuellar, Megan
2013-01-01
Background Constructivist theories propose that articulatory hypotheses about incoming phonetic targets may function to enhance perception by limiting the possibilities for sensory analysis. To provide evidence for this proposal, it is necessary to map ongoing, high-temporal resolution changes in sensorimotor activity (i.e., the sensorimotor μ rhythm) to accurate speech and non-speech discrimination performance (i.e., correct trials). Methods Sixteen participants (15 female and 1 male) were asked to passively listen to or actively identify speech and tone-sweeps in a two-alternative forced-choice discrimination task while the electroencephalograph (EEG) was recorded from 32 channels. The stimuli were presented at signal-to-noise ratios (SNRs) in which discrimination accuracy was high (i.e., 80–100%) and low SNRs producing discrimination performance at chance. EEG data were decomposed using independent component analysis and clustered across participants using principal component methods in EEGLAB. Results ICA revealed left and right sensorimotor µ components for 14/16 and 13/16 participants respectively that were identified on the basis of scalp topography, spectral peaks, and localization to the precentral and postcentral gyri. Time-frequency analysis of left and right lateralized µ component clusters revealed significant (pFDR<.05) suppression in the traditional beta frequency range (13–30 Hz) prior to, during, and following syllable discrimination trials. No significant differences from baseline were found for passive tasks. Tone conditions produced right µ beta suppression following stimulus onset only. For the left µ, significant differences in the magnitude of beta suppression were found for correct speech discrimination trials relative to chance trials following stimulus offset.
Conclusions Findings are consistent with constructivist, internal model theories proposing that early forward motor models generate predictions about likely phonemic units that are then synthesized with incoming sensory cues during active as opposed to passive processing. Future directions and possible translational value for clinical populations in which sensorimotor integration may play a functional role are discussed. PMID:23991030
The MMPI-2 in sexual harassment and discrimination litigants.
Long, Barbara; Rouse, Steven V; Nelsen, R Owen; Butcher, James N
2004-06-01
In order to understand patterns of respondents on validity and clinical scales, this study analyzed archival Minnesota Multiphasic Personality Inventory 2s (MMPI-2s) produced by 192 women and 14 men who initiated legal claims of ongoing emotional harm related to workplace sexual harassment and discrimination. The MMPI-2s were administered as a part of a comprehensive psychiatric forensic evaluation of the claimants' current psychological condition. All validity and clinical scale scores were manually entered into the computer, and codetype and cluster analyses were obtained. Among the women, 28% produced a "normal limits" profile, providing no MMPI-2 support for their claims of ongoing emotional distress. Cluster analysis of the validity scales of the remaining profiles produced four distinctive clusters of profiles representing different approaches to the test items. Copyright 2004 Wiley Periodicals, Inc.
Varietal discrimination of hop pellets by near and mid infrared spectroscopy.
Machado, Julio C; Faria, Miguel A; Ferreira, Isabel M P L V O; Páscoa, Ricardo N M J; Lopes, João A
2018-04-01
Hop is one of the most important ingredients of beer production and several varieties are commercialized. Therefore, it is important to find an eco-friendly, real-time, low-cost technique to distinguish and discriminate hop varieties. This paper describes the development of a method based on vibrational spectroscopy techniques, namely near- and mid-infrared spectroscopy, for the discrimination of 33 commercial hop varieties. A total of 165 samples (five for each hop variety) were analysed by both techniques. Principal component analysis, hierarchical cluster analysis and partial least squares discriminant analysis were the chemometric tools used to discriminate the hop varieties. After optimizing the spectral regions and pre-processing methods, correct discrimination of hop varieties reached 94.2% and 96.6% for near- and mid-infrared spectroscopy, respectively. The results obtained demonstrate the suitability of these vibrational spectroscopy techniques to discriminate different hop varieties and consequently their potential to be used as an authenticity tool. Compared with the reference procedures normally used for hops variety discrimination, these techniques are quicker, cost-effective, non-destructive and eco-friendly. Copyright © 2017 Elsevier B.V. All rights reserved.
Detecting Outliers in Factor Analysis Using the Forward Search Algorithm
ERIC Educational Resources Information Center
Mavridis, Dimitris; Moustaki, Irini
2008-01-01
In this article we extend and implement the forward search algorithm for identifying atypical subjects/observations in factor analysis models. The forward search has been mainly developed for detecting aberrant observations in regression models (Atkinson, 1994) and in multivariate methods such as cluster and discriminant analysis (Atkinson, Riani,…
NASA Astrophysics Data System (ADS)
Liu, Wen; Zhang, Yuying; Yang, Si; Han, Donghai
2018-05-01
A new technique to identify the floral sources of honeys is needed. Terahertz time-domain attenuated total reflection spectroscopy combined with chemometric methods was applied to discriminate different categories of honey (Medlar honey, Vitex honey, and Acacia honey). Principal component analysis (PCA), cluster analysis (CA) and partial least squares-discriminant analysis (PLS-DA) were used to extract information on the botanical origins of the honeys. The spectral range was also examined to increase the precision of the PLS-DA model. An accuracy of 88.46% for the validation set was obtained using the PLS-DA model in the 0.5-1.5 THz range. This work indicates that terahertz time-domain attenuated total reflection spectroscopy is a viable approach for evaluating the quality of honey rapidly.
ERIC Educational Resources Information Center
Glover, Robert H.; Mills, Michael R.
A research design, decision support system, and results of a comparative analysis of enrollment and financial strength (of private institutions granting masters and doctoral degrees) are presented. Cluster analysis, discriminant analysis, multiple regression, and an interactive decision support system are used to compare the enrollment and…
Taxonomic discrimination of higher plants by pyrolysis mass spectrometry.
Kim, S W; Ban, S H; Chung, H J; Choi, D W; Choi, P S; Yoo, O J; Liu, J R
2004-02-01
Pyrolysis mass spectrometry (PyMS) is a rapid, simple, high-resolution analytical method based on thermal degradation of complex material in a vacuum and has been widely applied to the discrimination of closely related microbial strains. Leaf samples of six species and one variety of higher plants (Rosa multiflora, R. multiflora var. platyphylla, Sedum kamtschaticum, S. takesimense, S. sarmentosum, Hepatica insularis, and H. asiatica) were subjected to PyMS for spectral fingerprinting. Principal component analysis of PyMS data was not able to discriminate these plants in discrete clusters. However, canonical variate analysis of PyMS data separated these plants from one another. A hierarchical dendrogram based on canonical variate analysis was in agreement with the known taxonomy of the plants at the variety level. These results indicate that PyMS is able to discriminate higher plants based on taxonomic classification at the family, genus, species, and variety level.
NASA Astrophysics Data System (ADS)
Kumar, Raj; Sharma, Vishal
2017-03-01
The present research is focused on the analysis of writing inks using destructive UV-Vis spectroscopy (dissolution of ink by the solvent) and non-destructive diffuse reflectance UV-Vis-NIR spectroscopy along with chemometrics. Fifty-seven samples of blue ballpoint pen inks were analyzed under optimum conditions to determine the differences in spectral features of inks among the same and different manufacturers. Normalization was performed on the spectroscopic data before chemometric analysis. Principal Component Analysis (PCA) and K-means cluster analysis were used on the data to ascertain whether the blue ballpoint pen inks could be differentiated by their UV-Vis/UV-Vis-NIR spectra. The discriminating power is calculated by qualitative analysis, through visual comparison of the spectra (absorbance peaks) produced by the destructive and non-destructive methods. In the latter two methods, the pairwise comparison is made by incorporating the clustering method. It is found that the chemometric method provides better discriminating power (98.72% and 99.46% for the destructive and non-destructive methods, respectively) in comparison to the qualitative analysis (69.67%).
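Discriminating power in pairwise comparisons is commonly computed as the share of sample pairs that can be told apart. The sketch below assumes that definition, which the abstract does not state explicitly, and applies it to a hypothetical set of cluster labels. The only figure taken from the abstract is the sample count of 57.

```python
from itertools import combinations
from math import comb


def discriminating_power(labels):
    """Percentage of sample pairs whose cluster labels differ (assumed definition)."""
    pairs = list(combinations(range(len(labels)), 2))
    separated = sum(1 for i, j in pairs if labels[i] != labels[j])
    return 100.0 * separated / len(pairs)


# Hypothetical cluster assignments for six ink samples.
labels = [0, 0, 1, 2, 2, 3]
dp = discriminating_power(labels)
print(round(dp, 2))

# Number of pairwise comparisons available for the 57 inks in the study.
total_pairs = comb(57, 2)
print(total_pairs)
```

With 57 samples there are 1596 possible pairs, so each non-discriminated pair lowers the discriminating power by roughly 0.06 percentage points.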
The use of multicomponent statistical analysis in hydrogeological environmental research.
Lambrakis, Nicolaos; Antonakos, Andreas; Panagopoulos, George
2004-04-01
The present article examines the possibilities of investigating NO₃⁻ spread in aquifers by applying multicomponent statistical methods (factor, cluster and discriminant analysis) on hydrogeological, hydrochemical, and environmental parameters. A 4-R-Mode factor model determined from the analysis showed its useful role in investigating hydrogeological parameters affecting NO₃⁻ concentration, such as its dilution by upcoming groundwater of the recharge areas. The relationship between NO₃⁻ concentration and agricultural activities can be determined sufficiently by the first factor, which relies on NO₃⁻ and SO₄²⁻ of the same origin, that of agricultural fertilizers. The other three factors of R-Mode analysis are not connected directly to the NO₃⁻ problem. They do, however, by extracting the role of the unsaturated zone, show an interesting relationship between organic matter content, thickness and saturated hydraulic conductivity. The application of Hierarchical Cluster Analysis, based on all possible combinations of classification method, showed two main groups of samples. The first group comprises samples from the edges and the second from the central part of the study area. By the application of Discriminant Analysis it was shown that NO₃⁻ and SO₄²⁻ ions are the most significant variables in the discriminant function. Therefore, the first group is considered to comprise all samples from areas not influenced by fertilizers, lying on the edges of contaminating activities such as crop cultivation, while the second comprises all the other samples.
Sensitivity and specificity of univariate MRI analysis of experimentally degraded cartilage
Lin, Ping-Chang; Reiter, David A.; Spencer, Richard G.
2010-01-01
MRI is increasingly used to evaluate cartilage in tissue constructs, explants, and animal and patient studies. However, while mean values of MR parameters, including T1, T2, magnetization transfer rate km, apparent diffusion coefficient ADC, and the dGEMRIC-derived fixed charge density, correlate with tissue status, the ability to classify tissue according to these parameters has not been explored. Therefore, the sensitivity and specificity with which each of these parameters was able to distinguish between normal and trypsin-degraded, and between normal and collagenase-degraded, cartilage explants were determined. Initial analysis was performed using a training set to determine simple group means to which parameters obtained from a validation set were compared. T1 and ADC showed the greatest ability to discriminate between normal and degraded cartilage. Further analysis with k-means clustering, which eliminates the need for a priori identification of sample status, generally performed comparably. Use of fuzzy c-means (FCM) clustering to define centroids likewise did not result in improvement in discrimination. Finally, a FCM clustering approach in which validation samples were assigned in a probabilistic fashion to control and degraded groups was implemented, reflecting the range of tissue characteristics seen with cartilage degradation. PMID:19705467
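The initial analysis described above (group means from a training set, then comparison of validation samples against those means) can be sketched as a nearest-mean rule on a single parameter. This is an assumed reading of the procedure, not the authors' code. All values below are hypothetical and are not taken from the study.

```python
def nearest_mean_classifier(train_normal, train_degraded):
    """Build a classifier that assigns a value to the closer training-group mean."""
    mean_normal = sum(train_normal) / len(train_normal)
    mean_degraded = sum(train_degraded) / len(train_degraded)

    def classify(x):
        if abs(x - mean_degraded) < abs(x - mean_normal):
            return "degraded"
        return "normal"

    return classify


def sensitivity_specificity(classify, val_normal, val_degraded):
    """Sensitivity: degraded samples detected. Specificity: normal samples kept."""
    true_pos = sum(classify(x) == "degraded" for x in val_degraded)
    true_neg = sum(classify(x) == "normal" for x in val_normal)
    return true_pos / len(val_degraded), true_neg / len(val_normal)


# Hypothetical single-parameter measurements (arbitrary units).
classify = nearest_mean_classifier([1.00, 1.05, 0.95], [1.30, 1.35, 1.25])
sens, spec = sensitivity_specificity(
    classify,
    val_normal=[0.98, 1.10, 1.20],
    val_degraded=[1.28, 1.12, 1.40, 1.33],
)
print(round(sens, 2), round(spec, 2))
```

Replacing the training-set means with centroids found by k-means or fuzzy c-means, as the abstract goes on to describe, would change only how the two reference values are obtained.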
Delineating Scholarly Types of College and University Faculty Members
ERIC Educational Resources Information Center
Park, Toby J.; Braxton, John M.
2013-01-01
This study was conducted using cluster analysis as well as discriminant analysis to empirically identify types of faculty based on their patterns of performance of scholarship reflective of one or more of Boyer's four domains of scholarship. (Contains 5 tables and 1 figure.)
USDA-ARS's Scientific Manuscript database
The combination of gas chromatography and pattern recognition (GC/PR) analysis is a powerful tool for investigating complicated biological problems. Clustering, mapping, discriminant development, etc. are necessary to analyze realistically large chromatographic data sets and to seek meaningful relat...
Identification and DUS Testing of Rice Varieties through Microsatellite Markers
Pourabed, Ehsan; Jazayeri Noushabadi, Mohammad Reza; Jamali, Seyed Hossein; Moheb Alipour, Naser; Zareyan, Abbas; Sadeghi, Leila
2015-01-01
Identification and registration of new rice varieties should be free from environmental effects, which makes the use of more reliable molecular markers important. The objectives of this study were, first, the identification and distinction of 40 rice varieties consisting of local varieties of Iran, improved varieties, and IRRI varieties using PIC and discriminating power; second, cluster analysis based on the Dice similarity coefficient and the UPGMA algorithm; and, third, determining the ability of microsatellite markers to separate varieties utilizing the best combination of markers. For this research, 12 microsatellite markers were used. In total, 83 polymorphic alleles (6.91 alleles per locus) were found. In addition, PIC values ranged from 0.52 to 0.9. The results of cluster analysis showed the complete discrimination of varieties from each other except for IR58025A and IR58025B. Moreover, cluster analysis could distinguish most of the improved varieties from local varieties. Based on the analysis of the best combination of markers, five primer pairs together gave the same results as all markers for distinguishing among all varieties. Considering the results of this research, we can propose that microsatellite markers can be used as a complementary tool for morphological characteristics in DUS tests. PMID:25755666
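Two of the quantities mentioned above, the Dice similarity coefficient and PIC, can be illustrated briefly. The sketch assumes the standard set-based Dice formula and the commonly used Botstein-style PIC definition. The abstract does not say which PIC formula was applied, and the allele sets and frequencies below are hypothetical.

```python
def dice(alleles_a, alleles_b):
    """Dice similarity: twice the shared alleles over the total allele count."""
    a, b = set(alleles_a), set(alleles_b)
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2 (assumed definition)."""
    homozygosity = sum(p * p for p in freqs)
    cross = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return 1 - homozygosity - cross


# Hypothetical allele calls for two varieties across a few markers.
variety_x = {"m1_a", "m2_b", "m3_c"}
variety_y = {"m2_b", "m3_c", "m4_d"}
print(round(dice(variety_x, variety_y), 3))

# Hypothetical marker with two equally frequent alleles.
print(pic([0.5, 0.5]))

# Mean alleles per locus from the counts given in the abstract.
print(83 / 12)
```

Two varieties with identical allele calls across all markers have a Dice similarity of 1.0, which is how a pair that clustering cannot separate, such as IR58025A and IR58025B above, would appear.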
Zhang, Shaoliang; Lorenzo, Alberto; Gómez, Miguel-Angel; Mateus, Nuno; Gonçalves, Bruno; Sampaio, Jaime
2018-04-20
The aims of this study were: (i) to group basketball players into similar clusters based on a combination of anthropometric characteristics and playing experience; and (ii) to explore the distribution of players (including starters and non-starters) from different levels of teams within the obtained clusters. The game-related statistics from 699 regular season balanced games were analyzed using a two-step cluster model and a discriminant analysis. The clustering process identified five different player profiles: Top height and weight (HW) with low experience, TopHW-LowE; Middle HW with middle experience, MiddleHW-MiddleE; Middle HW with top experience, MiddleHW-TopE; Low HW with low experience, LowHW-LowE; Low HW with middle experience, LowHW-MiddleE. Discriminant analysis showed that the TopHW-LowE group was distinguished by two-point field goals made and missed, offensive and defensive rebounds, blocks, and personal fouls, whereas the LowHW-LowE group made the fewest passes and touches. The players from weaker teams were mostly distributed in the LowHW-LowE group, whereas players from stronger teams were mainly grouped in the LowHW-MiddleE group, and players that participated in the finals were allocated to the MiddleHW-MiddleE group. These results provide alternative references for basketball staff concerning the process of evaluating performance.
Tahir, Haroon Elrasheid; Xiaobo, Zou; Xiaowei, Huang; Jiyong, Shi; Mariod, Abdalbasit Adam
2016-09-01
Aroma profiles of six honey varieties of different botanical origins were investigated using colorimetric sensor array, gas chromatography-mass spectrometry (GC-MS) and descriptive sensory analysis. Fifty-eight aroma compounds were identified, including 2 norisoprenoids, 5 hydrocarbons, 4 terpenes, 6 phenols, 7 ketones, 9 acids, 12 aldehydes and 13 alcohols. Twenty abundant or active compounds were chosen as key compounds to characterize honey aroma. Discrimination of the honeys was subsequently implemented using multivariate analysis, including hierarchical clustering analysis (HCA) and principal component analysis (PCA). Honeys of the same botanical origin were grouped together in the PCA score plot and HCA dendrogram. SPME-GC/MS and colorimetric sensor array were able to discriminate the honeys effectively with the advantages of being rapid, simple and low-cost. Moreover, partial least squares regression (PLSR) was applied to indicate the relationship between sensory descriptors and aroma compounds. Copyright © 2016 Elsevier Ltd. All rights reserved.
An Intelligent Decision Support System for Leukaemia Diagnosis using Microscopic Blood Images.
Chin Neoh, Siew; Srisukkham, Worawut; Zhang, Li; Todryk, Stephen; Greystoke, Brigit; Peng Lim, Chee; Alamgir Hossain, Mohammed; Aslam, Nauman
2015-10-09
This research proposes an intelligent decision support system for acute lymphoblastic leukaemia diagnosis from microscopic blood images. A novel clustering algorithm with stimulating discriminant measures (SDM) of both within- and between-cluster scatter variances is proposed to produce robust segmentation of nucleus and cytoplasm of lymphocytes/lymphoblasts. Specifically, the proposed between-cluster evaluation is formulated based on the trade-off of several between-cluster measures of well-known feature extraction methods. The SDM measures are used in conjunction with Genetic Algorithm for clustering nucleus, cytoplasm, and background regions. Subsequently, a total of eighty features consisting of shape, texture, and colour information of the nucleus and cytoplasm sub-images are extracted. A number of classifiers (multi-layer perceptron, Support Vector Machine (SVM) and Dempster-Shafer ensemble) are employed for lymphocyte/lymphoblast classification. Evaluated with the ALL-IDB2 database, the proposed SDM-based clustering overcomes the shortcomings of Fuzzy C-means which focuses purely on within-cluster scatter variance. It also outperforms Linear Discriminant Analysis and Fuzzy Compactness and Separation for nucleus-cytoplasm separation. The overall system achieves superior recognition rates of 96.72% and 96.67% accuracies using bootstrapping and 10-fold cross validation with Dempster-Shafer and SVM, respectively. The results also compare favourably with those reported in the literature, indicating the usefulness of the proposed SDM-based clustering method.
An Intelligent Decision Support System for Leukaemia Diagnosis using Microscopic Blood Images
Chin Neoh, Siew; Srisukkham, Worawut; Zhang, Li; Todryk, Stephen; Greystoke, Brigit; Peng Lim, Chee; Alamgir Hossain, Mohammed; Aslam, Nauman
2015-01-01
This research proposes an intelligent decision support system for acute lymphoblastic leukaemia diagnosis from microscopic blood images. A novel clustering algorithm with stimulating discriminant measures (SDM) of both within- and between-cluster scatter variances is proposed to produce robust segmentation of nucleus and cytoplasm of lymphocytes/lymphoblasts. Specifically, the proposed between-cluster evaluation is formulated based on the trade-off of several between-cluster measures of well-known feature extraction methods. The SDM measures are used in conjunction with Genetic Algorithm for clustering nucleus, cytoplasm, and background regions. Subsequently, a total of eighty features consisting of shape, texture, and colour information of the nucleus and cytoplasm sub-images are extracted. A number of classifiers (multi-layer perceptron, Support Vector Machine (SVM) and Dempster-Shafer ensemble) are employed for lymphocyte/lymphoblast classification. Evaluated with the ALL-IDB2 database, the proposed SDM-based clustering overcomes the shortcomings of Fuzzy C-means which focuses purely on within-cluster scatter variance. It also outperforms Linear Discriminant Analysis and Fuzzy Compactness and Separation for nucleus-cytoplasm separation. The overall system achieves superior recognition rates of 96.72% and 96.67% accuracies using bootstrapping and 10-fold cross validation with Dempster-Shafer and SVM, respectively. The results also compare favourably with those reported in the literature, indicating the usefulness of the proposed SDM-based clustering method. PMID:26450665
Mocz, G.
1995-01-01
Fuzzy cluster analysis has been applied to the 20 amino acids by using 65 physicochemical properties as a basis for classification. The clustering products, the fuzzy sets (i.e., classical sets with associated membership functions), have provided a new measure of amino acid similarities for use in protein folding studies. This work demonstrates that fuzzy sets of simple molecular attributes, when assigned to amino acid residues in a protein's sequence, can predict the secondary structure of the sequence with reasonable accuracy. An approach is presented for discriminating standard folding states, using near-optimum information splitting in half-overlapping segments of the sequence of assigned membership functions. The method is applied to a nonredundant set of 252 proteins and yields approximately 73% matching for correctly predicted and correctly rejected residues with approximately 60% overall success rate for the correctly recognized ones in three folding states: alpha-helix, beta-strand, and coil. The most useful attributes for discriminating these states appear to be related to size, polarity, and thermodynamic factors. Van der Waals volume, apparent average thickness of surrounding molecular free volume, and a measure of dimensionless surface electron density can explain approximately 95% of prediction results. Hydrogen bonding and hydrophobicity indices do not yet enable clear clustering and prediction. PMID:7549882
Slaus, Mario; Tomicić, Zeljko; Uglesić, Ante; Jurić, Radomir
2004-08-01
To determine the ethnic composition of the early medieval Croats, the location from which they migrated to the east coast of the Adriatic, and to separate early medieval Croats from Bijelo brdo culture members, using principal components analysis and discriminant function analysis of craniometric data from Central and South-East European medieval archaeological sites. Mean male values for 8 cranial measurements from 39 European and 5 Iranian sites were analyzed by principal components analysis. Raw data for 17 cranial measurements for 103 female and 112 male skulls were used to develop discriminant functions. The scatter-plot of the analyzed sites on the first 2 principal components showed a pattern of intergroup relationships consistent with geographical and archaeological information not included in the data set. The first 2 principal components separated the sites into 4 distinct clusters: Avaroslav sites west of the Danube, Avaroslav sites east of the Danube, Bijelo brdo sites, and Polish sites. All early medieval Croat sites were located in the cluster of Polish sites. Two discriminant functions successfully differentiated between early medieval Croats and Bijelo brdo members. Overall accuracies were high -- 89.3% for males, and 97.1% for females. Early medieval Croats seem to be of Slavic ancestry, and at one time shared a common homeland with medieval Poles. Application of unstandardized discriminant function coefficients to unclassified crania from 18 sites showed an expansion of early medieval Croats into continental Croatia during the 10th to 13th century.
Tian, Huaixiang; Li, Fenghua; Qin, Lan; Yu, Haiyan; Ma, Xia
2014-11-01
This study examines the feasibility of electronic nose as a method to discriminate chicken and beef seasonings and to predict sensory attributes. Sensory evaluation showed that 8 chicken seasonings and 4 beef seasonings could be well discriminated and classified based on 8 sensory attributes. The sensory attributes including chicken/beef, gamey, garlic, spicy, onion, soy sauce, retention, and overall aroma intensity were generated by a trained evaluation panel. Principal component analysis (PCA), discriminant factor analysis (DFA), and cluster analysis (CA) combined with electronic nose were used to discriminate seasoning samples based on the difference of the sensor response signals of chicken and beef seasonings. The correlation between sensory attributes and electronic nose sensors signal was established using partial least squares regression (PLSR) method. The results showed that the seasoning samples were all correctly classified by the electronic nose combined with PCA, DFA, and CA. The electronic nose gave good prediction results for all the sensory attributes with correlation coefficient (r) higher than 0.8. The work indicated that electronic nose is an effective method for discriminating different seasonings and predicting sensory attributes. © 2014 Institute of Food Technologists®
Liu, Wei; Wang, Dongmei; Liu, Jianjun; Li, Dengwu; Yin, Dongxue
2016-01-01
The present study was performed to assess the quality of Potentilla fruticosa L. sampled from distinct regions of China using high performance liquid chromatography (HPLC) fingerprinting coupled with a suite of chemometric methods. For this quantitative analysis, the main active phytochemical compositions and the antioxidant activity in P. fruticosa were also investigated. Considering the high percentages and antioxidant activities of phytochemicals, P. fruticosa samples from Kangding, Sichuan were selected as the most valuable raw materials. Similarity analysis (SA) of HPLC fingerprints, hierarchical cluster analysis (HCA), principal component analysis (PCA), and discriminant analysis (DA) were further employed to provide accurate classification and quality estimates of P. fruticosa. Two principal components (PCs) were collected by PCA. PC1 separated samples from Kangding, Sichuan, capturing 57.64% of the variance, whereas PC2 contributed to further separation, capturing 18.97% of the variance. Two kinds of discriminant functions with a 100% discrimination ratio were constructed. The results strongly supported the conclusion that the eight samples from different regions were clustered into three major groups, corresponding with their morphological classification, for which HPLC analysis confirmed the considerable variation in phytochemical compositions and that P. fruticosa samples from Kangding, Sichuan were of high quality. The results of SA, HCA, PCA, and DA were in agreement and performed well for the quality assessment of P. fruticosa. Consequently, HPLC fingerprinting coupled with chemometric techniques provides a highly flexible and reliable method for the quality evaluation of traditional Chinese medicines.
Liu, Wei; Wang, Dongmei; Liu, Jianjun; Li, Dengwu; Yin, Dongxue
2016-01-01
The present study was performed to assess the quality of Potentilla fruticosa L. sampled from distinct regions of China using high performance liquid chromatography (HPLC) fingerprinting coupled with a suite of chemometric methods. For this quantitative analysis, the main active phytochemical compositions and the antioxidant activity in P. fruticosa were also investigated. Considering the high percentages and antioxidant activities of phytochemicals, P. fruticosa samples from Kangding, Sichuan were selected as the most valuable raw materials. Similarity analysis (SA) of HPLC fingerprints, hierarchical cluster analysis (HCA), principal component analysis (PCA), and discriminant analysis (DA) were further employed to provide accurate classification and quality estimates of P. fruticosa. Two principal components (PCs) were collected by PCA. PC1 separated samples from Kangding, Sichuan, capturing 57.64% of the variance, whereas PC2 contributed to further separation, capturing 18.97% of the variance. Two kinds of discriminant functions with a 100% discrimination ratio were constructed. The results strongly supported the conclusion that the eight samples from different regions were clustered into three major groups, corresponding with their morphological classification, for which HPLC analysis confirmed the considerable variation in phytochemical compositions and that P. fruticosa samples from Kangding, Sichuan were of high quality. The results of SA, HCA, PCA, and DA were in agreement and performed well for the quality assessment of P. fruticosa. Consequently, HPLC fingerprinting coupled with chemometric techniques provides a highly flexible and reliable method for the quality evaluation of traditional Chinese medicines. PMID:26890416
Jiang, Shun-Yuan; Sun, Hong-Bing; Sun, Hui; Ma, Yu-Ying; Chen, Hong-Yu; Zhu, Wen-Tao; Zhou, Yi
2016-03-01
This paper aims to explore a comprehensive assessment method that combines traditional Chinese medicinal material specifications with quantitative quality indicators. Seventy-six samples of Notopterygii Rhizoma et Radix were collected from markets and producing areas. Traditional commercial specifications were described and assigned, and 10 chemical components and volatile oils were determined for each sample. Cluster analysis, Fisher discriminant analysis and correspondence analysis were used to establish the relationship between the traditional qualitative commercial specifications and quantitative chemical indices, for comprehensive quality evaluation of the medicinal materials and for quantitative classification of commercial grade and quality grade. A herb quality index (HQI), including traditional commercial specifications and chemical components, was established for quantitative grade classification, and the corresponding discriminant functions were derived for precise determination of the quality grade and sub-grade of Notopterygii Rhizoma et Radix. The results showed that notopterol, isoimperatorin and volatile oil were the major components for determining chemical quality, and their dividing values were specified for every grade and sub-grade of the commercial materials of Notopterygii Rhizoma et Radix. According to the results, the essential relationship between traditional medicinal indicators, qualitative commercial specifications, and quantitative chemical composition indicators can be examined by K-means clustering, Fisher discriminant analysis and correspondence analysis, which provides a new method for comprehensive quantitative evaluation of traditional Chinese medicine quality that integrates traditional commodity specifications with modern quantitative chemical indices. Copyright© by the Chinese Pharmaceutical Association.
Kumar, Raj; Sharma, Vishal
2017-03-15
The present research is focused on the analysis of writing inks using destructive UV-Vis spectroscopy (dissolution of ink by the solvent) and non-destructive diffuse reflectance UV-Vis-NIR spectroscopy along with chemometrics. Fifty-seven samples of blue ballpoint pen inks were analyzed under optimum conditions to determine the differences in spectral features of inks from the same and from different manufacturers. Normalization was performed on the spectroscopic data before chemometric analysis. Principal Component Analysis (PCA) and K-means cluster analysis were applied to the data to ascertain whether the blue ballpoint pen inks could be differentiated by their UV-Vis/UV-Vis-NIR spectra. The discriminating power is calculated by qualitative analysis, through visual comparison of the spectra (absorbance peaks) produced by the destructive and non-destructive methods. In the latter two methods, the pairwise comparison is made by incorporating the clustering method. It is found that the chemometric method provides better discriminating power (98.72% and 99.46% for the destructive and non-destructive methods, respectively) than the qualitative analysis (69.67%). Copyright © 2016 Elsevier B.V. All rights reserved.
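The abstract above reports discriminating power from pairwise comparisons but does not give its formula. A common forensic convention, assumed here, is the fraction of all sample pairs that the method tells apart; the sketch below applies that convention to cluster labels, treating two samples as discriminated when they fall in different clusters. The labels in the example are hypothetical.

```python
from itertools import combinations


def discriminating_power(labels):
    """Fraction of sample pairs separated by a clustering.

    Assumed convention: DP = discriminated pairs / total pairs, where a pair
    counts as discriminated when its two samples carry different labels.
    """
    pairs = list(combinations(labels, 2))
    if not pairs:
        raise ValueError("need at least two samples")
    separated = sum(1 for a, b in pairs if a != b)
    return separated / len(pairs)


# Hypothetical labels for four ink samples: two share a cluster.
print(discriminating_power([0, 0, 1, 2]))
```

With n samples there are n(n-1)/2 pairs, so every additional pair of indistinguishable inks lowers the score by the same fixed amount.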
Davis, Philip A.; Grolier, Maurice J.
1984-01-01
Landsat multispectral scanner (MSS) band and band-ratio databases of two scenes covering the Midyan region of northwestern Saudi Arabia were examined quantitatively and qualitatively to determine which databases best discriminate the geologic units of this semi-arid and arid region. Unsupervised, linear-discriminant cluster-analysis was performed on these two band-ratio combinations and on the MSS bands for both scenes. The results for granitoid-rock discrimination indicated that the classification images using the MSS bands are superior to the band-ratio classification images for two reasons, discussed in the paper. Yet, the effects of topography and material type (including desert varnish) on the MSS-band data produced ambiguities in the MSS-band classification results. However, these ambiguities were clarified by using a simulated natural-color image in conjunction with the MSS-band classification image.
Random whole metagenomic sequencing for forensic discrimination of soils.
Khodakova, Anastasia S; Smith, Renee J; Burgoyne, Leigh; Abarno, Damien; Linacre, Adrian
2014-01-01
Here we assess the ability of random whole metagenomic sequencing approaches to discriminate between similar soils from two geographically distinct urban sites for application in forensic science. Repeat samples from two parklands in residential areas separated by approximately 3 km were collected and the DNA was extracted. Shotgun, whole genome amplification (WGA) and single arbitrarily primed DNA amplification (AP-PCR) based sequencing techniques were then used to generate soil metagenomic profiles. Full and subsampled metagenomic datasets were then annotated against M5NR/M5RNA (taxonomic classification) and SEED Subsystems (metabolic classification) databases. Further comparative analyses were performed using a number of statistical tools including: hierarchical agglomerative clustering (CLUSTER); similarity profile analysis (SIMPROF); non-metric multidimensional scaling (NMDS); and canonical analysis of principal coordinates (CAP) at all major levels of taxonomic and metabolic classification. Our data showed that shotgun and WGA-based approaches generated highly similar metagenomic profiles for the soil samples such that the soil samples could not be distinguished accurately. An AP-PCR based approach was shown to be successful at obtaining reproducible site-specific metagenomic DNA profiles, which in turn were employed for successful discrimination of visually similar soil samples collected from two different locations.
Applications of Some Artificial Intelligence Methods to Satellite Soundings
NASA Technical Reports Server (NTRS)
Munteanu, M. J.; Jakubowicz, O.
1985-01-01
Hard clustering of temperature profiles and regression temperature retrievals were used to refine the method using the probabilities of membership of each pattern vector in each of the clusters derived with discriminant analysis. In hard clustering, the maximum probability is taken and the corresponding cluster is considered the correct cluster, discarding the rest of the probabilities. In fuzzy partitioned clustering, these probabilities are kept and the final regression retrieval is a weighted regression retrieval of several clusters. This method was used in the clustering of brightness temperatures where the purpose was to predict tropopause height. A further refinement is the division of temperature profiles into three major regions for classification purposes. The results are summarized in tables in which total r.m.s. errors are displayed. An approach based on fuzzy logic, which is intimately related to artificial intelligence methods, is recommended.
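The difference between the hard and fuzzy retrievals described above can be shown in a few lines of Python. This is only a sketch of the idea: it assumes each cluster already has its own regression prediction for a given pattern vector, and that the fuzzy result is a membership-weighted average of those predictions. The function names and numbers are hypothetical, not taken from the report.

```python
def hard_retrieval(probs, cluster_preds):
    """Use only the prediction of the most probable cluster."""
    best = max(range(len(probs)), key=probs.__getitem__)
    return cluster_preds[best]


def fuzzy_retrieval(probs, cluster_preds):
    """Weight every cluster's prediction by its membership probability."""
    total = sum(probs)
    if total <= 0:
        raise ValueError("membership probabilities must sum to a positive value")
    return sum(p * y for p, y in zip(probs, cluster_preds)) / total


# Hypothetical memberships and per-cluster regression outputs.
memberships = [0.6, 0.3, 0.1]
predictions = [10.0, 12.0, 20.0]
print(hard_retrieval(memberships, predictions))
print(fuzzy_retrieval(memberships, predictions))
```

The two functions agree only when one membership probability dominates; otherwise the fuzzy retrieval is pulled toward the predictions of the less probable clusters.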
[Discrimination of varieties of brake fluid using visual-near infrared spectra].
Jiang, Lu-lu; Tan, Li-hong; Qiu, Zheng-jun; Lu, Jiang-feng; He, Yong
2008-06-01
A new method was developed to rapidly discriminate brands of brake fluid by means of visible-near infrared spectroscopy. Five different brands of brake fluid were analyzed using a handheld near infrared spectrograph, manufactured by ASD Company, and 60 samples were obtained from each brand of brake fluid. The sample data were pretreated using average smoothing and the standard normal variate method, and then analyzed using principal component analysis (PCA). A 2-dimensional plot was drawn based on the first and the second principal components, and the plot indicated that the different brake fluids cluster distinctly. The first 6 principal components were taken as input variables, and the brand of brake fluid as the output variable, to build the discriminant model by the stepwise discriminant analysis method. Two hundred twenty-five randomly selected samples were used to create the model, and the remaining 75 samples to verify the model. The result showed that the distinguishing rate was 94.67%, indicating that the method proposed in this paper has good performance in classification and discrimination. It provides a new way to rapidly discriminate different brands of brake fluid.
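The two pretreatment steps named above, average smoothing followed by standard normal variate (SNV) scaling, can be sketched in plain Python. The abstract does not give the window size or implementation details, so the three-point window and the use of the sample standard deviation are assumptions made for illustration, and the spectrum values are invented.

```python
from statistics import mean, stdev


def moving_average(spectrum, window=3):
    """Average smoothing; returns only fully covered positions (no padding)."""
    if window < 1 or window > len(spectrum):
        raise ValueError("window must be between 1 and len(spectrum)")
    return [sum(spectrum[i:i + window]) / window
            for i in range(len(spectrum) - window + 1)]


def snv(spectrum):
    """Standard normal variate: centre and scale each spectrum on its own.

    Uses the sample standard deviation (n - 1 denominator), an assumption here.
    """
    m = mean(spectrum)
    s = stdev(spectrum)
    if s == 0:
        raise ValueError("constant spectrum cannot be SNV-scaled")
    return [(x - m) / s for x in spectrum]


# Hypothetical reflectance values for one sample.
raw = [0.42, 0.45, 0.47, 0.52, 0.50, 0.49, 0.55]
pretreated = snv(moving_average(raw, window=3))
print([round(v, 3) for v in pretreated])
```

Because SNV is applied per spectrum, each pretreated sample has zero mean and unit sample standard deviation, which removes much of the baseline and scatter variation before PCA.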
Thin-layer chromatographic identification of Chinese propolis using chemometric fingerprinting.
Tang, Tie-xin; Guo, Wei-yan; Xu, Ye; Zhang, Si-ming; Xu, Xin-jun; Wang, Dong-mei; Zhao, Zhi-min; Zhu, Long-ping; Yang, De-po
2014-01-01
Poplar tree gum has a similar chemical composition and appearance to Chinese propolis (bee glue) and has been widely used as a counterfeit propolis because Chinese propolis is typically the poplar-type propolis, the chemical composition of which is determined mainly by the resin of poplar trees. The discrimination of Chinese propolis from poplar tree gum is a challenging task. To develop a rapid thin-layer chromatographic (TLC) identification method using chemometric fingerprinting to discriminate Chinese propolis from poplar tree gum. A new TLC method using a combination of ammonia and hydrogen peroxide vapours as the visualisation reagent was developed to characterise the chemical profile of Chinese propolis. Three separate people performed TLC on eight Chinese propolis samples and three poplar tree gum samples of varying origins. Five chemometric methods, including similarity analysis, hierarchical clustering, k-means clustering, neural network and support vector machine, were compared for use in classifying the samples based on their densitograms obtained from the TLC chromatograms via image analysis. Hierarchical clustering, neural network and support vector machine analyses achieved a correct classification rate of 100% in classifying the samples. A strategy for TLC identification of Chinese propolis using chemometric fingerprinting was proposed and it provided accurate sample classification. The study has shown that the TLC identification method using chemometric fingerprinting is a rapid, low-cost method for the discrimination of Chinese propolis from poplar tree gum and may be used for the quality control of Chinese propolis. Copyright © 2014 John Wiley & Sons, Ltd.
[A survey on AIDS discrimination among medical college students].
Liu, Jia-hong; Jiang, Hong-ying; Chen, Hong; Liao, Qing-hua; Fu, Jun; Lu, Fei-bao; Liu, Wei-xin; Li, Yue
2009-11-01
To understand HIV/AIDS-related knowledge and discriminatory attitudes toward HIV/AIDS among medical college students, and to provide scientific evidence for further HIV/AIDS anti-discrimination interventions. Using stratified cluster sampling by class, 2844 undergraduate students were randomly selected from medical colleges. A self-designed, self-administered questionnaire survey was conducted, and SPSS 13.0 software was used for data analysis. A total of 2501 valid questionnaires were collected. The overall HIV/AIDS knowledge rate among the respondents was 73.1% (1828/2501). The HIV/AIDS discrimination rates varied across questions: the discrimination rates toward people infected with AIDS through improper sexual behaviour and through needle sharing were 83.1% (2078/2501) and 77.7% (1943/2501), respectively, and the discrimination rates regarding contact with HIV patients and their daily necessities, sharing desks, and personal social interaction all exceeded 40%. The medical students held seriously discriminatory attitudes toward HIV-infected persons and patients; it is necessary to strengthen anti-discrimination education about HIV/AIDS among medical students.
Shahdoust, Maryam; Hajizadeh, Ebrahim; Mozdarani, Hossein; Chehrei, Ali
2013-01-01
Cigarette smoking is the major risk factor for development of lung cancer. Identification of effects of tobacco on airway gene expression may provide insight into the causes. This research aimed to compare gene expression of large airway epithelium cells in normal smokers (n=13) and non-smokers (n=9) in order to find genes which discriminate the two groups and assess cigarette smoking effects on large airway epithelium cells. Genes discriminating smokers from non-smokers were identified by applying a neural network clustering method, growing self-organizing maps (GSOM), to microarray data according to class discrimination scores. An index was computed based on differentiation between each mean of gene expression in the two groups. This clustering approach provided the possibility of comparing thousands of genes simultaneously. The applied approach compared the mean expression of 7,129 genes in smokers and non-smokers simultaneously and classified the genes of large airway epithelium cells which were differentially expressed in smokers compared with non-smokers. Seven genes were identified which had the largest expression differences in smokers compared with the non-smokers group: NQO1, H19, ALDH3A1, AKR1C1, ABHD2, GPX2 and ADH7. Most (NQO1, ALDH3A1, AKR1C1, H19 and GPX2) are known to be clinically notable in lung cancer studies. Furthermore, statistical discriminant analysis showed that these genes could classify samples as smokers or non-smokers correctly with 100% accuracy. With the resulting GSOM map, other nodes with high average discrimination scores included genes with alterations strongly related to lung cancer such as AKR1C3, CYP1B1, UCHL1 and AKR1B10. This clustering, by comparing expression of thousands of genes at the same time, revealed alterations in normal smokers. Most of the identified genes were strongly relevant to lung cancer in the existing literature. The genes may be utilized to identify smokers with increased risk for lung cancer.
A large sample study is now recommended to determine relations between the genes ABHD2 and ADH7 and smoking.
Electrofacies analysis for coal lithotype profiling based on high-resolution wireline log data
NASA Astrophysics Data System (ADS)
Roslin, A.; Esterle, J. S.
2016-06-01
The traditional approach to coal lithotype analysis is based on a visual characterisation of coal in core, mine or outcrop exposures. As not all wells are fully cored, the petroleum and coal mining industries increasingly use geophysical wireline logs for lithology interpretation. This study demonstrates a method for interpreting coal lithotypes from geophysical wireline logs, and in particular discriminating between bright or banded, and dull coal at similar densities to a decimetre level. The study explores the optimum combination of geophysical log suites for training the coal electrofacies interpretation, using a neural network concept, and then propagating the results to wells with fewer wireline data. This approach is objective and has a recordable reproducibility and rule set. In addition to conventional gamma ray and density logs, laterolog resistivity, microresistivity and PEF data were used in the study. Array resistivity data from a compact micro imager (CMI tool) were processed into a single microresistivity curve and integrated with the conventional resistivity data in the cluster analysis. Microresistivity data were included in the analysis to test the hypothesis that the improved vertical resolution of the microresistivity curve can enhance the accuracy of the clustering analysis. The addition of the PEF log allowed discrimination between low density bright to banded coal electrofacies and low density inertinite-rich dull electrofacies. The results of the clustering analysis were validated statistically, and the electrofacies results were compared to manually derived coal lithotype logs.
Cluster Analysis to Identify Possible Subgroups in Tinnitus Patients.
van den Berge, Minke J C; Free, Rolien H; Arnold, Rosemarie; de Kleine, Emile; Hofman, Rutger; van Dijk, J Marc C; van Dijk, Pim
2017-01-01
In tinnitus treatment, there is a tendency to shift from a "one size fits all" to a more individual, patient-tailored approach. Insight into the heterogeneity of the tinnitus spectrum might improve the management of tinnitus patients in terms of choice of treatment and identification of patients with severe mental distress. The goal of this study was to identify subgroups in a large group of tinnitus patients. Data were collected from patients with severe tinnitus complaints visiting our tertiary referral tinnitus care group at the University Medical Center Groningen. Patient-reported and physician-reported variables were collected during their visit to our clinic. Cluster analyses were used to characterize subgroups. For the selection of the right variables to enter in the cluster analysis, two approaches were used: (1) variable reduction with principal component analysis and (2) variable selection based on expert opinion. Various variables of 1,783 tinnitus patients were included in the analyses. Cluster analysis (1) included 976 patients and resulted in a four-cluster solution. The effect of external influences was the most discriminative between the groups, or clusters, of patients. The "silhouette measure" of the cluster outcome was low (0.2), indicating "no substantial" cluster structure. Cluster analysis (2) included 761 patients and resulted in a three-cluster solution, comparable to the first analysis. Again, "no substantial" cluster structure was found (0.2). Two cluster analyses on a large database of tinnitus patients revealed that clusters of patients are mostly formed by a different response of external influences on their disease. However, both cluster outcomes based on this dataset showed poor stability, suggesting that our tinnitus population comprises a continuum rather than a number of clearly defined subgroups.
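The "silhouette measure" quoted above summarizes how well each point sits in its own cluster relative to the nearest other cluster. The sketch below computes the standard mean silhouette coefficient with Euclidean distances; the study's software and distance measure are not stated in the abstract, so this is an assumed, generic formulation, and the points are invented.

```python
from math import dist


def silhouette_score(points, labels):
    """Mean silhouette coefficient s = (b - a) / max(a, b) over all points.

    a: mean distance to the other members of the point's own cluster.
    b: smallest mean distance to the members of any other cluster.
    Points in singleton clusters score 0 by convention.
    """
    clusters = {}
    for p, label in zip(points, labels):
        clusters.setdefault(label, []).append(p)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")

    scores = []
    for p, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)
            continue
        # dist(p, p) is 0, so summing over the whole cluster is safe.
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(dist(p, q) for q in other) / len(other)
                for key, other in clusters.items() if key != label)
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical 2-D points: two tight, well-separated groups.
pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
print(silhouette_score(pts, [0, 0, 1, 1]))  # labels match the groups
print(silhouette_score(pts, [0, 1, 0, 1]))  # labels cut across the groups
```

Values near 1 indicate compact, well-separated clusters, while values near 0, such as the 0.2 reported in the abstract, indicate clusters that overlap heavily.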
Melo, Armindo; Pinto, Edgar; Aguiar, Ana; Mansilha, Catarina; Pinho, Olívia; Ferreira, Isabel M P L V O
2012-07-01
A monitoring program of nitrate, nitrite, potassium, sodium, and pesticides was carried out in water samples from an intensive horticulture area in a vulnerable zone from north of Portugal. Eight collecting points were selected and water-analyzed in five sampling campaigns, during 1 year. Chemometric techniques, such as cluster analysis, principal component analysis (PCA), and discriminant analysis, were used in order to understand the impact of intensive horticulture practices on dug and drilled wells groundwater and to study variations in the hydrochemistry of groundwater. PCA performed on pesticide data matrix yielded seven significant PCs explaining 77.67% of the data variance. Although PCA rendered considerable data reduction, it could not clearly group and distinguish the sample types. However, a visible differentiation between the water samples was obtained. Cluster and discriminant analysis grouped the eight collecting points into three clusters of similar characteristics pertaining to water contamination, indicating that it is necessary to improve the use of water, fertilizers, and pesticides. Inorganic fertilizers such as potassium nitrate were suspected to be the most important factors for nitrate contamination since highly significant Pearson correlation (r = 0.691, P < 0.01) was obtained between groundwater nitrate and potassium contents. Water from dug wells is especially prone to contamination from the grower and their closer neighbor's practices. Water from drilled wells is also contaminated from distant practices.
Liu, Zehua; Wang, Dongmei; Li, Dengwu; Zhang, Shuai
2017-01-01
Juniperus rigida (J. rigida) which is endemic to East Asia, has traditionally been used as an ethnomedicinal plant in China. This study was undertaken to evaluate the quality of J. rigida samples derived from 11 primary regions in China. Ten phenolic compounds were simultaneously quantified using reversed-phase high-performance liquid chromatography (RP-HPLC), and chlorogenic acid, catechin, podophyllotoxin, and amentoflavone were found to be the main compounds in J. rigida needles, with the highest contents detected for catechin and podophyllotoxin. J. rigida from Jilin (S9, S10) and Liaoning (S11) exhibited the highest contents of phenolic profiles (total phenolics, total flavonoids and 10 phenolic compounds) and the strongest antioxidant and antibacterial activities, followed by Shaanxi (S2, S3). A similarity analysis (SA) demonstrated substantial similarities in fingerprint chromatograms, from which 14 common peaks were selected. The similarity values varied from 0.85 to 0.98. Chemometrics techniques, including hierarchical cluster analysis (HCA), principal component analysis (PCA), and discriminant analysis (DA), were further applied to facilitate accurate classification and quantification of the J. rigida samples derived from the 11 regions. The results supported HPLC data showing that all J. rigida samples exhibit considerable variations in phenolic profiles, and the samples were further clustered into three major groups coincident with their geographical regions of origin. In addition, two discriminant functions with a 100% discrimination ratio were constructed to further distinguish and classify samples with unknown membership on the basis of eigenvalues to allow optimal discrimination among the groups. Our comprehensive findings on matching phenolic profiles and bioactivities along with data from fingerprint chromatograms with chemometrics provide an effective tool for screening and quality evaluation of J. rigida and related medicinal preparations. 
PMID:28469573
Dentistry and HIV/AIDS related stigma.
Elizondo, Jesus Eduardo; Treviño, Ana Cecilia; Violant, Deborah
2015-01-01
To analyze HIV/AIDS positive individual's perception and attitudes regarding dental services. One hundred and thirty-four subjects (30.0% of women and 70.0% of men) from Nuevo León, Mexico, took part in the study (2014). They filled out structured, analytical, self-administered, anonymous questionnaires. Besides the sociodemographic variables, the perception regarding public and private dental services and related professionals was evaluated, as well as the perceived stigma associated with HIV/AIDS, through a Likert-type scale. The statistical evaluation included a factorial and a non-hierarchical cluster analysis. Social inequalities were found regarding the search for public and private dental professionals and services. Most subjects reported omitting their HIV serodiagnosis and agreed that dentists must be trained and qualified to treat patients with HIV/AIDS. The factorial analysis revealed two elements: experiences of stigma and discrimination in dental appointments and feelings of concern regarding the attitudes of professionals or their teams concerning patients' HIV serodiagnosis. The cluster analysis identified three groups: users who have not experienced stigma or discrimination (85.0%); the ones who have not had those experiences, but feel somewhat concerned (12.7%); and the ones who underwent stigma and discrimination and feel concerned (2.3%). We observed a low percentage of stigma and discrimination in dental appointments; however, most HIV/AIDS patients do not reveal their serodiagnosis to dentists out of fear of being rejected. Such fact implies a workplace hazard to dental professionals, but especially to the very own health of HIV/AIDS patients, as dentists will not be able to provide them a proper clinical and pharmaceutical treatment.
Yudthavorasit, Soparat; Wongravee, Kanet; Leepipatpiboon, Natchanun
2014-09-01
Chromatographic fingerprints of gingers from five different ginger-producing countries (China, India, Malaysia, Thailand and Vietnam) were newly established to discriminate the origin of ginger. The pungent bioactive principles of ginger, gingerols and six other gingerol-related compounds were determined and identified. Variations in their HPLC profiles were used to establish the characteristic pattern of each origin by employing similarity analysis, hierarchical cluster analysis (HCA), principal component analysis (PCA) and linear discriminant analysis (LDA). As a result, the ginger profiles tended to be grouped and separated on the basis of the geographical closeness of the countries of origin. An effective mathematical model with high predictive ability was obtained and chemical markers for each origin were also identified as the characteristic active compounds to differentiate the ginger origin. The proposed method is useful for quality control of ginger in case of origin labelling and to assess food authenticity issues. Copyright © 2014 Elsevier Ltd. All rights reserved.
Identifying contextual influences of community reintegration among injured servicemembers.
Hawkins, Brent L; McGuire, Francis A; Britt, Thomas W; Linder, Sandra M
2015-01-01
Research suggests that community reintegration (CR) after injury and rehabilitation is difficult for many injured servicemembers. However, little is known about the influence of the contextual factors, both personal and environmental, that influence CR. Framed within the International Classification of Functioning, Disability and Health and Social Cognitive Theory, the quantitative portion of a larger mixed-methods study of 51 injured, community-dwelling servicemembers compared the relative contribution of contextual factors between groups of servicemembers with different levels of CR. Cluster analysis indicated three groups of servicemembers showing low, moderate, and high levels of CR. Statistical analyses identified contextual factors (e.g., personal and environmental factors) that significantly discriminated between CR clusters. Multivariate analysis of variance and discriminant analysis indicated significant contributions of general self-efficacy, services and assistance barriers, physical and structural barriers, attitudes and support barriers, perceived level of disability and/or handicap, work and school barriers, and policy barriers on CR scores. Overall, analyses indicated that injured servicemembers with lower CR scores had lower general self-efficacy scores, reported more difficulty with environmental barriers, and reported their injuries as more disabling.
Microclimate influence on mineral and metabolic profiles of grape berries.
Pereira, G E; Gaudillere, J-P; Pieri, P; Hilbert, G; Maucourt, M; Deborde, C; Moing, A; Rolin, D
2006-09-06
The grape berry microclimate is known to influence berry quality. The effects of the light exposure of grape berry clusters on the composition of berry tissues were studied on the "Merlot" variety grown in a vineyard in Bordeaux, France. The light exposure of the fruiting zone was modified using different intensities of leaf removal, cluster position relative to azimuth, and berry position in the cluster. Light exposures were identified and classified by in situ measurements of berry temperatures. Berries were sampled at maturity (>19 Brix) for determination of skin and/or pulp chemical and metabolic profiles based on (1) chemical and physicochemical measurement of minerals (N, P, K, Ca, Mg), (2) untargeted 1H NMR metabolic fingerprints, and HPLC targeted analyses of (3) amino acids and (4) phenolics. Each profile defined by partial least-square discriminant analysis allowed us to discriminate berries from different light exposure. Discriminant compounds between shaded and light-exposed berries were quercetin-3-glucoside, kaempferol-3-glucoside, myricetin-3-glucoside, and isorhamnetin-3-glucoside for the phenolics, histidine, valine, GABA, alanine, and arginine for the amino acids, and malate for the organic acids. Capacities of the different profiling techniques to discriminate berries were compared. Although the proportion of explained variance from the 1H NMR fingerprint was lower compared to that of chemical measurements, NMR spectroscopy allowed us to identify lit and shaded berries. Light exposure of berries increased the skin and pulp flavonols, histidine and valine contents, and reduced the organic acids, GABA, and alanine contents. All the targeted and nontargeted analytical data sets used made it possible to discriminate sun-exposed and shaded berries. The skin phenolics pattern was the most discriminating and allowed us to sort sun from shade berries. These metabolite classes can be used to qualify berries collected in an undetermined environment. 
The physiological significance of light and temperature effects on berry composition is discussed.
Characterization and Differentiation of Petroleum-Derived Products by E-Nose Fingerprints
Ferreiro-González, Marta; Palma, Miguel; Ayuso, Jesús; Álvarez, José A.; Barroso, Carmelo G.
2017-01-01
Characterization of petroleum-derived products is an area of continuing importance in environmental science, mainly related to fuel spills. In this study, a non-separative analytical method based on E-Nose (Electronic Nose) is presented as a rapid alternative for the characterization of several different petroleum-derived products including gasoline, diesel, aromatic solvents, and ethanol samples, which were poured onto different surfaces (wood, cork, and cotton). The working conditions for headspace generation were 145 °C and 10 min. Mass spectroscopic data (45–200 m/z) combined with chemometric tools such as hierarchical cluster analysis (HCA), later principal component analysis (PCA), and finally linear discriminant analysis (LDA) allowed for a full discrimination of the samples. A characteristic fingerprint for each product can be used for discrimination or identification. The E-Nose can be considered as a green technique, and it is rapid and easy to use in routine analysis, thus providing a good alternative to currently used methods. PMID:29113069
Theory of mind predicts severity level in autism.
Hoogenhout, Michelle; Malcolm-Smith, Susan
2017-02-01
We investigated whether theory of mind skills can indicate autism spectrum disorder severity. In all, 62 children with autism spectrum disorder completed a developmentally sensitive theory of mind battery. We used intelligence quotient, Diagnostic and Statistical Manual of Mental Disorders (4th ed.) diagnosis and level of support needed as indicators of severity level. Using hierarchical cluster analysis, we found three distinct clusters of theory of mind ability: early-developing theory of mind (Cluster 1), false-belief reasoning (Cluster 2) and sophisticated theory of mind understanding (Cluster 3). The clusters corresponded to severe, moderate and mild autism spectrum disorder. As an indicator of level of support needed, cluster grouping predicted the type of school children attended. All Cluster 1 children attended autism-specific schools; Cluster 2 was divided between autism-specific and special needs schools and nearly all Cluster 3 children attended general special needs and mainstream schools. Assessing theory of mind skills can reliably discriminate severity levels within autism spectrum disorder.
Broad phonetic class definition driven by phone confusions
NASA Astrophysics Data System (ADS)
Lopes, Carla; Perdigão, Fernando
2012-12-01
Intermediate representations between the speech signal and phones may be used to improve discrimination among phones that are often confused. These representations are usually found according to broad phonetic classes, which are defined by a phonetician. This article proposes an alternative data-driven method to generate these classes. Phone confusion information from the analysis of the output of a phone recognition system is used to find clusters at high risk of mutual confusion. A metric is defined to compute the distance between phones. The results, using TIMIT data, show that the proposed confusion-driven phone clustering method is an attractive alternative to the approaches based on human knowledge. A hierarchical classification structure to improve phone recognition is also proposed using a discriminative weight training method. Experiments show improvements in phone recognition on the TIMIT database compared to a baseline system.
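The idea of deriving a phone-to-phone distance from recognizer confusions can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not define the metric, so the symmetrized confusion rate used here, the function names, and the 3x3 confusion counts are all hypothetical and are not the authors' method.

```python
def confusion_distance(conf, i, j):
    """Distance between phones i and j from a confusion-count matrix.

    conf[a][b] is the number of times phone a was recognized as phone b.
    Assumed metric: 1 minus the average of the two directed confusion rates.
    """
    p_ij = conf[i][j] / sum(conf[i])
    p_ji = conf[j][i] / sum(conf[j])
    return 1.0 - 0.5 * (p_ij + p_ji)

def most_confusable_pair(conf):
    """Return the pair of phone indices with the smallest confusion distance."""
    n = len(conf)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    return min(pairs, key=lambda ij: confusion_distance(conf, ij[0], ij[1]))

# Hypothetical confusion counts for three phones (rows: spoken, columns: recognized).
conf = [
    [80, 15, 5],
    [20, 75, 5],
    [2, 3, 95],
]
print(most_confusable_pair(conf))  # (0, 1)
```

Repeatedly merging the closest pair in this way would give an agglomerative grouping of phones into broad classes, which is one plausible reading of the data-driven clustering described above.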
Temperature Gradient Effect on Gas Discrimination Power of a Metal-Oxide Thin-Film Sensor Microarray
Sysoev, Victor V.; Kiselev, Ilya; Frietsch, Markus; Goschnick, Joachim
2004-01-01
The paper presents results concerning the effect of a spatially inhomogeneous operating temperature on the gas discrimination power of a gas-sensor microarray, with the latter based on a thin SnO2 film employed in the KAMINA electronic nose. Three different temperature distributions over the substrate are discussed: a nearly homogeneous one and two temperature gradients, equal to approx. 3.3 °C/mm and 6.7 °C/mm, applied across the sensor elements (segments) of the array. The gas discrimination power of the microarray is judged by using the Mahalanobis distance in the LDA (Linear Discriminant Analysis) coordinate system between the data clusters obtained by the response of the microarray to four target vapors: ethanol, acetone, propanol and ammonia. It is shown that the application of a temperature gradient increases the gas discrimination power of the microarray by up to 35 %.
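A Mahalanobis distance between two data clusters, as used above to judge discrimination power, can be sketched for two-dimensional coordinates. The assumptions here are mine: the points are taken to be already projected onto two LDA axes, the covariance is pooled across both clusters, and the function names and sample coordinates are hypothetical.

```python
import math

def mean2(points):
    """Centroid of a list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def scatter2(points, m):
    """Sums of squared deviations and cross-products about the centroid m."""
    sxx = sum((p[0] - m[0]) ** 2 for p in points)
    sxy = sum((p[0] - m[0]) * (p[1] - m[1]) for p in points)
    syy = sum((p[1] - m[1]) ** 2 for p in points)
    return sxx, sxy, syy

def mahalanobis_between(cluster_a, cluster_b):
    """Mahalanobis distance between two cluster centroids, pooled covariance."""
    ma, mb = mean2(cluster_a), mean2(cluster_b)
    sa, sb = scatter2(cluster_a, ma), scatter2(cluster_b, mb)
    dof = len(cluster_a) + len(cluster_b) - 2
    sxx, sxy, syy = ((u + v) / dof for u, v in zip(sa, sb))
    det = sxx * syy - sxy * sxy
    dx, dy = ma[0] - mb[0], ma[1] - mb[1]
    # d^T S^-1 d, with the 2x2 inverse written out explicitly.
    d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(d2)

# Hypothetical LDA coordinates for the responses to two vapors.
vapor_1 = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
vapor_2 = [(6.0, 0.0), (8.0, 0.0), (6.0, 2.0), (8.0, 2.0)]
print(mahalanobis_between(vapor_1, vapor_2))
```

A larger distance means the two clusters are further apart relative to their spread, which is why an increase in this quantity can be read as improved gas discrimination.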
Automated flow cytometric analysis across large numbers of samples and cell types.
Chen, Xiaoyi; Hasan, Milena; Libri, Valentina; Urrutia, Alejandra; Beitz, Benoît; Rouilly, Vincent; Duffy, Darragh; Patin, Étienne; Chalmond, Bernard; Rogge, Lars; Quintana-Murci, Lluis; Albert, Matthew L; Schwikowski, Benno
2015-04-01
Multi-parametric flow cytometry is a key technology for characterization of immune cell phenotypes. However, robust high-dimensional post-analytic strategies for automated data analysis in large numbers of donors are still lacking. Here, we report a computational pipeline, called FlowGM, which minimizes operator input, is insensitive to compensation settings, and can be adapted to different analytic panels. A Gaussian Mixture Model (GMM)-based approach was utilized for initial clustering, with the number of clusters determined using Bayesian Information Criterion. Meta-clustering in a reference donor permitted automated identification of 24 cell types across four panels. Cluster labels were integrated into FCS files, thus permitting comparisons to manual gating. Cell numbers and coefficient of variation (CV) were similar between FlowGM and conventional gating for lymphocyte populations, but notably FlowGM provided improved discrimination of "hard-to-gate" monocyte and dendritic cell (DC) subsets. FlowGM thus provides rapid high-dimensional analysis of cell phenotypes and is amenable to cohort studies. Copyright © 2015. Published by Elsevier Inc.
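Choosing the number of mixture components with the Bayesian Information Criterion, as described above, amounts to fitting models for several candidate values and keeping the one with the lowest BIC. The sketch below shows only that selection step. It assumes full covariance matrices when counting parameters (the abstract does not state the covariance structure), and the log-likelihood values and function names are hypothetical rather than FlowGM output.

```python
import math

def bic(log_likelihood, n_params, n_samples):
    """Bayesian Information Criterion; lower values indicate a preferred model."""
    return n_params * math.log(n_samples) - 2.0 * log_likelihood

def gmm_param_count(k, d):
    """Free parameters of a k-component, d-dimensional GMM with full covariances."""
    weights = k - 1
    means = k * d
    covariances = k * d * (d + 1) // 2
    return weights + means + covariances

def select_k(log_likelihoods, d, n_samples):
    """Pick the component count whose fitted model has the lowest BIC.

    log_likelihoods maps each candidate k to its maximized log-likelihood.
    """
    return min(
        log_likelihoods,
        key=lambda k: bic(log_likelihoods[k], gmm_param_count(k, d), n_samples),
    )

# Hypothetical maximized log-likelihoods for k = 1, 2, 3 on 500 two-dimensional events.
fits = {1: -1000.0, 2: -800.0, 3: -795.0}
print(select_k(fits, d=2, n_samples=500))  # 2
```

In this toy case the third component improves the likelihood only slightly, so the penalty term outweighs the gain and two components are retained.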
de Medeiros, Anna Cecília Queiroz; Yamamoto, Maria Emilia; Pedrosa, Lucia Fatima Campos; Hutz, Claudio Simon
2017-03-01
This study aimed to evaluate the psychometric properties and scoring pattern of the Brazilian version of the Three-Factor Eating Questionnaire-R21 (TFEQ-R21). Data were collected from 410 undergraduate students. Confirmatory factor analysis was conducted to examine the factor structure of the TFEQ-R21. Convergent and discriminant validity were also assessed. Cluster analysis was performed to investigate scoring patterns. In assessing goodness of fit, the model was considered satisfactory (χ²/df = 2.24, CFI = 0.97, TLI = 0.96, RMSEA = 0.05). The instrument was also considered appropriate in relation to the discriminant and convergent validity. There was a positive correlation between body mass index and the dimensions of cognitive restraint (r_s = 0.449, p < 0.001) and emotional eating (r_s = 0.112, p = 0.023). Using cluster analysis, three respondent profiles were identified. Profile "A" was associated with appropriate weight, profile "B" was characterized by high scores in the cognitive restraint dimension, and profile "C" comprised individuals who had higher scores on the uncontrolled eating and emotional eating dimensions. The Brazilian version of the TFEQ-R21 has adequate psychometric properties, and the identified response profiles offer a promising prospect for its use in clinical practice, in weight loss interventions.
NASA Astrophysics Data System (ADS)
Zainudin, Ramlah; Sazali, Siti Nurlydia
A study on morphometrical variations of the Malaysian Hylarana signata group was conducted to reveal the morphological relationships within the species group. Twenty-seven morphological characters from 18 individuals of H. signata and H. picturata were measured and recorded. The numerical data were analysed using Discriminant Function Analysis in SPSS version 16.0 and UPGMA Cluster Analysis in Minitab version 14.0. The results show complex clustering among the examined species that might be due to ancient polymorphism of the lineages or cryptic species within the group. Hence, further study should include more representatives in order to fully elucidate the morphological relationships of the H. signata group.
Shawky, Eman; Abou El Kheir, Rasha M
2018-02-11
Species of Apiaceae are used in folk medicine as spices and in officinal medicinal preparations of drugs. They are an excellent source of phenolics exhibiting antioxidant activity, which are of great benefit to human health. Discrimination among Apiaceae medicinal herbs remains an intricate challenge due to their morphological similarity. In this study, a combined "untargeted" and "targeted" approach to investigate different Apiaceae plants species was proposed by using the merging of high-performance thin layer chromatography (HPTLC)-image analysis and pattern recognition methods which were used for fingerprinting and classification of 42 different Apiaceae samples collected from Egypt. Software for image processing was applied for fingerprinting and data acquisition. HPTLC fingerprint assisted by principal component analysis (PCA) and hierarchical cluster analysis (HCA)-heat maps resulted in a reliable untargeted approach for discrimination and classification of different samples. The "targeted" approach was performed by developing and validating an HPTLC method allowing the quantification of eight flavonoids. The combination of quantitative data with PCA and HCA-heat-maps allowed the different samples to be discriminated from each other. The use of chemometrics tools for evaluation of fingerprints reduced expense and analysis time. The proposed method can be adopted for routine discrimination and evaluation of the phytochemical variability in different Apiaceae species extracts. Copyright © 2018 John Wiley & Sons, Ltd.
Wan, B; Yarbrough, J W; Schultz, T W
2008-01-01
This study was undertaken to test the hypothesis that structurally similar PAHs induce similar gene expression profiles. THP-1 cells were exposed to a series of 12 selected PAHs at 50 microM for 24 hours and gene expression profiles were analyzed using both unsupervised and supervised methods. Clustering analysis of gene expression profiles revealed that the 12 tested chemicals were grouped into five clusters. Within each cluster, the gene expression profiles are more similar to each other than to the ones outside the cluster. 1-Methylanthracene and 1-methylfluorene were found to have the most similar profiles; dibenzothiophene and dibenzofuran were found to share common profiles with fluorene. As expression pattern comparisons were expanded, similarity in genomic fingerprint dropped off dramatically. Prediction analysis of microarrays (PAM) based on the clustering pattern generated 49 predictor genes that can be used for sample discrimination. Moreover, significance analysis of microarrays (SAM) identified 598 genes modulated by the tested chemicals, with a variety of biological processes (such as cell cycle, metabolism, and protein binding) and KEGG pathways significantly (p < 0.05) affected. It is feasible to distinguish structurally different PAHs based on their genomic fingerprints, which are mechanism based.
[Use of multiple locus variable number tandem repeats analysis for the Brucella systematization].
Kulakov, Iu K; Kovalev, D A; Misetova, E N; Golovneva, S I; Liapustina, L V; Zheludkov, M M
2012-01-01
Methods of molecular-genetic differentiation to the strain level are acquiring increasing significance in the current system of brucellosis control. MLVA (multiple locus variable number tandem repeats analysis) was selected for molecular-genetic differentiation to the strain level and simultaneous establishment of the genetic relationships of the investigated Brucella strains. The goal of this work was MLVA typing of strains of three pathogenic Brucella species, with analysis of the stability of the chosen loci, discrimination power, and concordance with conventional phenotypic methods of Brucella differentiation, for use in the systematization of brucellosis-causing agents. Twenty-six Brucella strains representing reference (n = 15), vaccine (n = 2) and field strains of three pathogenic Brucella species were tested: B. melitensis (n = 3), B. abortus (n = 2), B. suis (n = 2), and isolates (n = 2) with unidentified taxonomic position, using MLVA with 9 primer pairs on known variable loci of the Brucella genome. The stability of the chosen loci, the discrimination power by the Hunter-Gaston discrimination index (HGDI), and the consistency with phenotypic methods of identification were analyzed. MLVA confirmed the results of phenotypic methods of identification and the stability of the chosen loci in the majority of reference and vaccine strains, with a high index of variability (HGDI 0.9969) for all loci. A dendrogram plotted on the basis of the MLVA data distributed the Brucella strains into related clusters according to their taxonomic species and biovar positions and yielded 25 genotypes. B. melitensis strains formed a cluster related to the reference strain B. melitensis 63/9 biovar 2. The Australian isolates Brucella 83-4 and Brucella 83-6, isolated from rodents, formed a cluster distant from other Brucella strains. MLVA is a promising method for differentiation of Brucella strains with known and unresolved taxonomic status, for their systematization, and for the creation of an MLVA genotype catalogue that will promote qualitative improvement of the brucellosis surveillance system in Russia.
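The Hunter-Gaston discrimination index mentioned above is one minus the probability that two strains drawn without replacement share a genotype. A minimal sketch follows; the function name `hgdi` and the integer genotype labels are illustrative. With 26 strains falling into 25 genotypes, exactly one genotype must contain two strains, and that distribution appears consistent with the reported index.

```python
from collections import Counter

def hgdi(genotypes):
    """Hunter-Gaston discrimination index for a list of genotype labels."""
    n = len(genotypes)
    counts = Counter(genotypes).values()
    # Probability that two strains sampled without replacement share a genotype.
    same = sum(c * (c - 1) for c in counts) / (n * (n - 1))
    return 1.0 - same

# 26 strains, 25 genotypes: labels 0..24, with genotype 0 occurring twice.
strains = list(range(25)) + [0]
print(round(hgdi(strains), 4))  # 0.9969
```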
Collected Notes on the Workshop for Pattern Discovery in Large Databases
NASA Technical Reports Server (NTRS)
Buntine, Wray (Editor); Delalto, Martha (Editor)
1991-01-01
These collected notes are a record of material presented at the Workshop. The notes address core data analysis tasks that have traditionally required statistical or pattern recognition techniques. Some of the core tasks include classification, discrimination, clustering, supervised and unsupervised learning, discovery and diagnosis, i.e., general pattern discovery.
Dentistry and HIV/AIDS related stigma
Elizondo, Jesus Eduardo; Treviño, Ana Cecilia; Violant, Deborah
2015-01-01
OBJECTIVE To analyze HIV/AIDS positive individual’s perception and attitudes regarding dental services. METHODS One hundred and thirty-four subjects (30.0% of women and 70.0% of men) from Nuevo León, Mexico, took part in the study (2014). They filled out structured, analytical, self-administered, anonymous questionnaires. Besides the sociodemographic variables, the perception regarding public and private dental services and related professionals was evaluated, as well as the perceived stigma associated with HIV/AIDS, through a Likert-type scale. The statistical evaluation included a factorial and a non-hierarchical cluster analysis. RESULTS Social inequalities were found regarding the search for public and private dental professionals and services. Most subjects reported omitting their HIV serodiagnosis and agreed that dentists must be trained and qualified to treat patients with HIV/AIDS. The factorial analysis revealed two elements: experiences of stigma and discrimination in dental appointments and feelings of concern regarding the attitudes of professionals or their teams concerning patients’ HIV serodiagnosis. The cluster analysis identified three groups: users who have not experienced stigma or discrimination (85.0%); the ones who have not had those experiences, but feel somewhat concerned (12.7%); and the ones who underwent stigma and discrimination and feel concerned (2.3%). CONCLUSIONS We observed a low percentage of stigma and discrimination in dental appointments; however, most HIV/AIDS patients do not reveal their serodiagnosis to dentists out of fear of being rejected. Such fact implies a workplace hazard to dental professionals, but especially to the very own health of HIV/AIDS patients, as dentists will not be able to provide them a proper clinical and pharmaceutical treatment. PMID:26538100
Determination of Ignitable Liquids in Fire Debris: Direct Analysis by Electronic Nose
Ferreiro-González, Marta; Barbero, Gerardo F.; Palma, Miguel; Ayuso, Jesús; Álvarez, José A.; Barroso, Carmelo G.
2016-01-01
Arsonists usually use an accelerant in order to start or accelerate a fire. The most widely used analytical method to determine the presence of such accelerants consists of a pre-concentration step of the ignitable liquid residues followed by chromatographic analysis. A rapid analytical method based on headspace-mass spectrometry electronic nose (E-Nose) has been developed for the analysis of Ignitable Liquid Residues (ILRs). The working conditions for the E-Nose analytical procedure were optimized by studying different fire debris samples. The optimized experimental variables were related to headspace generation, specifically, incubation temperature and incubation time. The optimal conditions were 115 °C and 10 min for these two parameters. Chemometric tools such as hierarchical cluster analysis (HCA) and linear discriminant analysis (LDA) were applied to the MS data (45–200 m/z) to establish the most suitable spectroscopic signals for the discrimination of several ignitable liquids. The optimized method was applied to a set of fire debris samples. In order to simulate post-burn samples several ignitable liquids (gasoline, diesel, citronella, kerosene, paraffin) were used to ignite different substrates (wood, cotton, cork, paper and paperboard). A full discrimination was obtained on using discriminant analysis. This method reported here can be considered as a green technique for fire debris analyses. PMID:27187407
Liu, Xiang; Guo, Ling-Peng; Zhang, Fei-Yun; Ma, Jie; Mu, Shu-Yong; Zhao, Xin; Li, Lan-Hai
2015-02-01
Eight physical and chemical indicators related to water quality were monitored at nineteen sampling sites along the Kunes River at the end of the snowmelt season in spring. To investigate the spatial distribution characteristics of the water's physical and chemical properties, cluster analysis (CA), discriminant analysis (DA) and principal component analysis (PCA) were employed. The result of cluster analysis showed that the Kunes River could be divided into three reaches according to the similarities of water physical and chemical properties among sampling sites, representing the upstream, midstream and downstream of the river, respectively. The result of discriminant analysis demonstrated that the reliability of such a classification was high, and DO, Cl− and BOD5 were the significant indexes leading to this classification. Three principal components were extracted on the basis of the principal component analysis, in which the accumulative variance contribution reached 86.90%. The result of principal component analysis also indicated that water physical and chemical properties were mostly affected by EC, ORP, NO3−-N, NH4+-N, Cl− and BOD5. The sorted principal component scores at each sampling site showed that the water quality was mainly influenced by DO in the upstream reach, by pH in the midstream reach, and by the rest of the indicators in the downstream reach. The order of comprehensive scores for principal components revealed that the water quality degraded from upstream to downstream, i.e., the upstream reach had the best water quality, followed by the midstream reach, while the water quality downstream was the worst. This result corresponded exactly to the three reaches classified using cluster analysis. Anthropogenic activity and the accumulation of pollutants along the river were probably the main reasons leading to this spatial difference.
A 6-gene signature identifies four molecular subgroups of neuroblastoma
2011-01-01
Background There are currently three postulated genomic subtypes of the childhood tumour neuroblastoma (NB): Type 1, Type 2A, and Type 2B. The most aggressive forms of NB are characterized by amplification of the oncogene MYCN (MNA) and low expression of the favourable marker NTRK1. Recently, mutations or high expression of the familial predisposition gene Anaplastic Lymphoma Kinase (ALK) were associated with unfavourable biology of sporadic NB. Also, various other genes have been linked to NB pathogenesis. Results The present study explores subgroup discrimination by gene expression profiling using three published microarray studies on NB (47 samples). Four distinct clusters were identified by Principal Components Analysis (PCA) in two separate data sets, which could be verified by an unsupervised hierarchical clustering in a third independent data set (101 NB samples) using a set of 74 discriminative genes. The expression signature of six NB-associated genes ALK, BIRC5, CCND1, MYCN, NTRK1, and PHOX2B, significantly discriminated the four clusters (p < 0.05, one-way ANOVA test). PCA clusters p1, p2, and p3 were found to correspond well to the postulated subtypes 1, 2A, and 2B, respectively. Remarkably, a fourth novel cluster was detected in all three independent data sets. This cluster comprised mainly 11q-deleted MNA-negative tumours with low expression of ALK, BIRC5, and PHOX2B, and was significantly associated with higher tumour stage, poor outcome and poor survival compared to the Type 1-corresponding favourable group (INSS stage 4 and/or dead of disease, p < 0.05, Fisher's exact test). Conclusions Based on expression profiling we have identified four molecular subgroups of neuroblastoma, which can be distinguished by a 6-gene signature. The fourth subgroup has not been described elsewhere, and efforts are currently being made to further investigate this group's specific characteristics. PMID:21492432
Rezzonico, Fabio; Braun-Kiewnick, Andrea; Mann, Rachel A; Rodoni, Brendan; Goesmann, Alexander; Duffy, Brion; Smits, Theo H M
2012-10-01
Comparative genomic analysis revealed differences in the lipopolysaccharide (LPS) biosynthesis gene cluster between the Rubus-infecting strain ATCC BAA-2158 and the Spiraeoideae-infecting strain CFBP 1430 of Erwinia amylovora. These differences corroborate rpoB-based phylogenetic clustering of E. amylovora into four different groups and enable the discrimination of Spiraeoideae- and Rubus-infecting strains. The structure of the differences between the two groups supports the hypothesis that adaptation to Rubus spp. took place after species separation of E. amylovora and E. pyrifoliae that contrasts with a recently proposed scenario, based on CRISPR data, in which the shift to domesticated apple would have caused an evolutionary bottleneck in the Spiraeoideae-infecting strains of E. amylovora which would be a much earlier event. In the core region of the LPS biosynthetic gene cluster, Spiraeoideae-infecting strains encode three glycosyltransferases and an LPS ligase (Spiraeoideae-type waaL), whereas Rubus-infecting strains encode two glycosyltransferases and a different LPS ligase (Rubus-type waaL). These coding domains share little to no homology at the amino acid level between Rubus- and Spiraeoideae-infecting strains, and this genotypic difference was confirmed by polymerase chain reaction analysis of the associated DNA region in 31 Rubus- and Spiraeoideae-infecting strains. The LPS biosynthesis gene cluster may thus be used as a molecular marker to distinguish between Rubus- and Spiraeoideae-infecting strains of E. amylovora using primers designed in this study. © 2012 THE AUTHORS. MOLECULAR PLANT PATHOLOGY © 2012 BSPP AND BLACKWELL PUBLISHING LTD.
Bayesian multivariate hierarchical transformation models for ROC analysis.
O'Malley, A James; Zou, Kelly H
2006-02-15
A Bayesian multivariate hierarchical transformation model (BMHTM) is developed for receiver operating characteristic (ROC) curve analysis based on clustered continuous diagnostic outcome data with covariates. Two special features of this model are that it incorporates non-linear monotone transformations of the outcomes and that multiple correlated outcomes may be analysed. The mean, variance, and transformation components are all modelled parametrically, enabling a wide range of inferences. The general framework is illustrated by focusing on two problems: (1) analysis of the diagnostic accuracy of a covariate-dependent univariate test outcome requiring a Box-Cox transformation within each cluster to map the test outcomes to a common family of distributions; (2) development of an optimal composite diagnostic test using multivariate clustered outcome data. In the second problem, the composite test is estimated using discriminant function analysis and compared to the test derived from logistic regression analysis where the gold standard is a binary outcome. The proposed methodology is illustrated on prostate cancer biopsy data from a multi-centre clinical trial.
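The two ingredients named in this abstract, a Box-Cox transformation of the test outcome and ROC-based accuracy, can be illustrated with a minimal sketch. It is not the Bayesian hierarchical model itself: the scores, the value of `lam`, and the function names `box_cox` and `empirical_auc` are all hypothetical, and the AUC is the simple nonparametric (Mann-Whitney) estimate.

```python
import math


def box_cox(y, lam):
    """Box-Cox transform of a strictly positive value y with parameter lam."""
    if y <= 0:
        raise ValueError("Box-Cox requires y > 0")
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam


def empirical_auc(diseased, healthy):
    """Fraction of (diseased, healthy) pairs ranked correctly; ties count 1/2."""
    wins = 0.0
    for d in diseased:
        for h in healthy:
            if d > h:
                wins += 1.0
            elif d == h:
                wins += 0.5
    return wins / (len(diseased) * len(healthy))


# Hypothetical test outcomes, for illustration only.
diseased = [4.0, 9.0, 16.0]
healthy = [1.0, 4.0, 2.0]
lam = 0.5  # assumed transformation parameter

auc_raw = empirical_auc(diseased, healthy)
auc_transformed = empirical_auc(
    [box_cox(v, lam) for v in diseased],
    [box_cox(v, lam) for v in healthy],
)
print(auc_raw, auc_transformed)
```

Because a single Box-Cox transformation with positive `lam` is monotone increasing, the rank-based AUC is the same before and after it. The transformation matters when the outcomes are modelled parametrically, or when different clusters receive different parameters, as the abstract describes.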
Bayesian multivariate hierarchical transformation models for ROC analysis
O'Malley, A. James; Zou, Kelly H.
2006-01-01
SUMMARY A Bayesian multivariate hierarchical transformation model (BMHTM) is developed for receiver operating characteristic (ROC) curve analysis based on clustered continuous diagnostic outcome data with covariates. Two special features of this model are that it incorporates non-linear monotone transformations of the outcomes and that multiple correlated outcomes may be analysed. The mean, variance, and transformation components are all modelled parametrically, enabling a wide range of inferences. The general framework is illustrated by focusing on two problems: (1) analysis of the diagnostic accuracy of a covariate-dependent univariate test outcome requiring a Box–Cox transformation within each cluster to map the test outcomes to a common family of distributions; (2) development of an optimal composite diagnostic test using multivariate clustered outcome data. In the second problem, the composite test is estimated using discriminant function analysis and compared to the test derived from logistic regression analysis where the gold standard is a binary outcome. The proposed methodology is illustrated on prostate cancer biopsy data from a multi-centre clinical trial. PMID:16217836
Characterization of spatial and temporal variability in hydrochemistry of Johor Straits, Malaysia.
Abdullah, Pauzi; Abdullah, Sharifah Mastura Syed; Jaafar, Othman; Mahmud, Mastura; Khalik, Wan Mohd Afiq Wan Mohd
2015-12-15
Characterization of hydrochemistry changes in Johor Straits within 5 years of monitoring works was successfully carried out. Water quality data sets (27 stations and 19 parameters) collected in this area were interpreted using multivariate statistical analysis. Cluster analysis grouped all the stations into four clusters ((Dlink/Dmax) × 100<90) and two clusters ((Dlink/Dmax) × 100<80) for site and period similarities. Principal component analysis rendered six significant components (eigenvalue>1) that explained 82.6% of the total variance of the data set. Classification matrix of discriminant analysis assigned 88.9-92.6% and 83.3-100% correctness in spatial and temporal variability, respectively. Time series analysis then confirmed that only four parameters did not change significantly over time. Therefore, it is imperative that the environmental impact of reclamation and dredging works, municipal or industrial discharge, marine aquaculture and shipping activities in this area be effectively controlled and managed. Copyright © 2015 Elsevier Ltd. All rights reserved.
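The cutoff notation (Dlink/Dmax) × 100 rescales each linkage distance in a dendrogram by the largest one. A minimal sketch of how such a cutoff translates into a number of clusters is given below. It assumes a monotone linkage (merge distances never decrease), and the merge distances and the function name `clusters_at_cutoff` are hypothetical rather than taken from the study.

```python
def clusters_at_cutoff(n_items, merge_distances, cutoff_percent):
    """Count clusters left when a dendrogram is cut at a rescaled distance.

    merge_distances holds the n_items - 1 linkage distances (Dlink) of an
    agglomerative clustering; each is rescaled as (Dlink / Dmax) * 100.
    Every merge below the cutoff is kept, and each kept merge reduces the
    number of clusters by one.
    """
    if len(merge_distances) != n_items - 1:
        raise ValueError("expected n_items - 1 merge distances")
    d_max = max(merge_distances)
    merges_kept = sum(
        1 for d in merge_distances if (d / d_max) * 100.0 < cutoff_percent
    )
    return n_items - merges_kept


# Hypothetical dendrogram for six sampling stations.
distances = [1.0, 2.0, 3.0, 8.5, 10.0]
print(clusters_at_cutoff(6, distances, 90))  # cut at (Dlink/Dmax) x 100 < 90
print(clusters_at_cutoff(6, distances, 80))  # cut at (Dlink/Dmax) x 100 < 80
```

A lower cutoff keeps fewer merges and therefore yields more, smaller clusters.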
Rinaldi, Maurizio; Gindro, Roberto; Barbeni, Massimo; Allegrone, Gianna
2009-01-01
Orange (Citrus sinensis L.) juice comprises a complex mixture of volatile components that are difficult to identify and quantify. Classification and discrimination of the varieties on the basis of the volatile composition could help to guarantee the quality of a juice and to detect possible adulteration of the product. The objectives were to provide information on the amounts of volatile constituents in fresh-squeezed juices from four orange cultivars and to establish suitable discrimination rules to differentiate orange juices using new chemometric approaches. Fresh juices of four orange cultivars were analysed by headspace solid-phase microextraction (HS-SPME) coupled with GC-MS. Principal component analysis, linear discriminant analysis and heuristic methods, such as neural networks, allowed clustering of the data from HS-SPME analysis while genetic algorithms addressed the problem of data reduction. To check the quality of the results the chemometric techniques were also evaluated on a sample. Thirty volatile compounds were identified by HS-SPME and GC-MS analyses and their relative amounts calculated. Differences in composition of orange juice volatile components were observed. The chosen orange cultivars could be discriminated using neural networks, genetic relocation algorithms and linear discriminant analysis. Genetic algorithms applied to the data were also able to detect the most significant compounds. SPME is a useful technique to investigate orange juice volatile composition and a flexible chemometric approach is able to correctly separate the juices.
A Quantitative Analysis of Pulsed Signals Emitted by Wild Bottlenose Dolphins.
Luís, Ana Rita; Couchinho, Miguel N; Dos Santos, Manuel E
2016-01-01
Common bottlenose dolphins (Tursiops truncatus) produce a wide variety of vocal emissions for communication and echolocation, of which the pulsed repertoire has been the most difficult to categorize. Packets of high repetition, broadband pulses are still largely reported under a general designation of burst-pulses, and traditional attempts to classify these emissions rely mainly on their aural characteristics and on graphical aspects of spectrograms. Here, we present a quantitative analysis of pulsed signals emitted by wild bottlenose dolphins, in the Sado estuary, Portugal (2011-2014), and test the reliability of a traditional classification approach. Acoustic parameters (minimum frequency, maximum frequency, peak frequency, duration, repetition rate and inter-click-interval) were extracted from 930 pulsed signals, previously categorized using a traditional approach. Discriminant function analysis revealed a high reliability of the traditional classification approach (93.5% of pulsed signals were consistently assigned to their aurally based categories). According to the discriminant function analysis (Wilks' Λ = 0.11, F3, 2.41 = 282.75, P < 0.001), repetition rate is the feature that best enables the discrimination of different pulsed signals (structure coefficient = 0.98). Classification using hierarchical cluster analysis led to a similar categorization pattern: two main signal types with distinct magnitudes of repetition rate were clustered into five groups. The pulsed signals described here present significant differences in their time-frequency features, especially repetition rate (P < 0.001), inter-click-interval (P < 0.001) and duration (P < 0.001). We document the occurrence of a distinct signal type, short burst-pulses, and highlight the existence of a diverse repertoire of pulsed vocalizations emitted in graded sequences.
The use of quantitative analysis of pulsed signals is essential to improve classifications and to better assess the contexts of emission, geographic variation and the functional significance of pulsed signals.
NASA Astrophysics Data System (ADS)
Shahrajabian, Maryam; Hormozi-Nezhad, M. Reza
2016-08-01
Array-based sensing is an interesting approach that offers an alternative to expensive analytical methods. In this work, we introduce a novel, simple, and sensitive nanoparticle-based chemiluminescence (CL) sensor array for discrimination of biothiols (e.g., cysteine, glutathione and glutathione disulfide). The proposed CL sensor array is based on the CL efficiencies of four types of enhanced nanoparticle-based CL systems. The intensity of CL was altered to varying degrees upon interaction with biothiols, producing unique CL response patterns. These distinct CL response patterns were collected as “fingerprints” and were then identified through chemometric methods, including linear discriminant analysis (LDA) and hierarchical cluster analysis (HCA). The developed array was able to successfully differentiate between cysteine, glutathione and glutathione disulfide in a wide concentration range. Moreover, it was applied to distinguish among the above analytes in human plasma.
Kumar, Raj G; Rubin, Jonathan E; Berger, Rachel P; Kochanek, Patrick M; Wagner, Amy K
2016-03-01
Studies have characterized absolute levels of multiple inflammatory markers as significant risk factors for poor outcomes after traumatic brain injury (TBI). However, inflammatory marker concentrations are highly inter-related, and production of one may result in the production or regulation of another. Therefore, a more comprehensive characterization of the inflammatory response post-TBI should consider relative levels of markers in the inflammatory pathway. We used principal component analysis (PCA) as a dimension-reduction technique to characterize the sets of markers that contribute independently to variability in cerebrospinal (CSF) inflammatory profiles after TBI. Using PCA results, we defined groups (or clusters) of individuals (n=111) with similar patterns of acute CSF inflammation that were then evaluated in the context of outcome and other relevant CSF and serum biomarkers collected days 0-3 and 4-5 post-injury. We identified four significant principal components (PC1-PC4) for CSF inflammation from days 0-3, and PC1 accounted for the greatest (31%) percentage of variance. PC1 was characterized by relatively higher CSF sICAM-1, sFAS, IL-10, IL-6, sVCAM-1, IL-5, and IL-8 levels. Cluster analysis then defined two distinct clusters, such that individuals in cluster 1 had highly positive PC1 scores and relatively higher levels of CSF cortisol, progesterone, estradiol, testosterone, brain derived neurotrophic factor (BDNF), and S100b; this group also had higher serum cortisol and lower serum BDNF. Multinomial logistic regression analyses showed that individuals in cluster 1 had a 10.9 times increased likelihood of GOS scores of 2/3 vs. 4/5 at 6 months compared to cluster 2, after controlling for covariates. Cluster group did not discriminate between mortality compared to GOS scores of 4/5 after controlling for age and other covariates. Cluster groupings also did not discriminate mortality or 12 month outcomes in multivariate models. 
PCA and cluster analysis establish that a subset of CSF inflammatory markers measured in days 0-3 post-TBI may distinguish individuals with poor 6-month outcome, and future studies should prospectively validate these findings. PCA of inflammatory mediators after TBI could aid in prognostication and in identifying patient subgroups for therapeutic interventions. Copyright © 2015 Elsevier Inc. All rights reserved.
Zianni, Michael R; Nikbakhtzadeh, Mahmood R; Jackson, Bryan T; Panescu, Jenny; Foster, Woodbridge A
2013-04-01
There is a need for more cost-effective options to more accurately discriminate among members of the Anopheles gambiae complex, particularly An. gambiae and Anopheles arabiensis. These species are morphologically indistinguishable in the adult stage, have overlapping distributions, but are behaviorally and ecologically different, yet both are efficient vectors of malaria in equatorial Africa. The method described here, High-Resolution Melt (HRM) analysis, takes advantage of minute differences in DNA melting characteristics, depending on the number of incongruent single nucleotide polymorphisms in an intragenic spacer region of the X-chromosome-based ribosomal DNA. The two species in question differ by an average of 13 single-nucleotide polymorphisms giving widely divergent melting curves. A real-time PCR system, Bio-Rad CFX96, was used in combination with a dsDNA-specific dye, EvaGreen, to detect and measure the melting properties of the amplicon generated from leg-extracted DNA of selected mosquitoes. Results with seven individuals from pure colonies of known species, as well as 10 field-captured individuals unambiguously identified by DNA sequencing, demonstrated that the method provided a high level of accuracy. The method was used to identify 86 field mosquitoes through the assignment of each to the two common clusters with a high degree of certainty. Each cluster was defined by individuals from pure colonies. HRM analysis is simpler to use than most other methods and provides comparable or more accurate discrimination between the two sibling species but requires a specialized melt-analysis instrument and software.
Zianni, Michael R.; Nikbakhtzadeh, Mahmood R.; Jackson, Bryan T.; Panescu, Jenny; Foster, Woodbridge A.
2013-01-01
There is a need for more cost-effective options to more accurately discriminate among members of the Anopheles gambiae complex, particularly An. gambiae and Anopheles arabiensis. These species are morphologically indistinguishable in the adult stage, have overlapping distributions, but are behaviorally and ecologically different, yet both are efficient vectors of malaria in equatorial Africa. The method described here, High-Resolution Melt (HRM) analysis, takes advantage of minute differences in DNA melting characteristics, depending on the number of incongruent single nucleotide polymorphisms in an intragenic spacer region of the X-chromosome-based ribosomal DNA. The two species in question differ by an average of 13 single-nucleotide polymorphisms giving widely divergent melting curves. A real-time PCR system, Bio-Rad CFX96, was used in combination with a dsDNA-specific dye, EvaGreen, to detect and measure the melting properties of the amplicon generated from leg-extracted DNA of selected mosquitoes. Results with seven individuals from pure colonies of known species, as well as 10 field-captured individuals unambiguously identified by DNA sequencing, demonstrated that the method provided a high level of accuracy. The method was used to identify 86 field mosquitoes through the assignment of each to the two common clusters with a high degree of certainty. Each cluster was defined by individuals from pure colonies. HRM analysis is simpler to use than most other methods and provides comparable or more accurate discrimination between the two sibling species but requires a specialized melt-analysis instrument and software. PMID:23543777
Improving Fraud and Abuse Detection in General Physician Claims: A Data Mining Study
Joudaki, Hossein; Rashidian, Arash; Minaei-Bidgoli, Behrouz; Mahmoodi, Mahmood; Geraili, Bijan; Nasiri, Mahdi; Arab, Mohammad
2016-01-01
Background: We aimed to identify the indicators of healthcare fraud and abuse in general physicians’ drug prescription claims, and to identify a subset of general physicians that were more likely to have committed fraud and abuse. Methods: We applied a data mining approach to a major health insurance organization dataset of private sector general physicians’ prescription claims. It involved 5 steps: clarifying the nature of the problem and objectives, data preparation, indicator identification and selection, cluster analysis to identify suspect physicians, and discriminant analysis to assess the validity of the clustering approach. Results: Thirteen indicators were developed in total. Over half of the general physicians (54%) were ‘suspects’ of conducting abusive behavior. The results also identified 2% of physicians as suspects of fraud. Discriminant analysis suggested that the indicators demonstrated adequate performance in the detection of physicians who were suspected of perpetrating fraud (98%) and abuse (85%) in a new sample of data. Conclusion: Our data mining approach will help health insurance organizations in low- and middle-income countries (LMICs) in streamlining auditing approaches towards the suspect groups rather than routine auditing of all physicians. PMID:26927587
Improving Fraud and Abuse Detection in General Physician Claims: A Data Mining Study.
Joudaki, Hossein; Rashidian, Arash; Minaei-Bidgoli, Behrouz; Mahmoodi, Mahmood; Geraili, Bijan; Nasiri, Mahdi; Arab, Mohammad
2015-11-10
We aimed to identify the indicators of healthcare fraud and abuse in general physicians' drug prescription claims, and to identify a subset of general physicians that were more likely to have committed fraud and abuse. We applied a data mining approach to a major health insurance organization dataset of private sector general physicians' prescription claims. It involved 5 steps: clarifying the nature of the problem and objectives, data preparation, indicator identification and selection, cluster analysis to identify suspect physicians, and discriminant analysis to assess the validity of the clustering approach. Thirteen indicators were developed in total. Over half of the general physicians (54%) were 'suspects' of conducting abusive behavior. The results also identified 2% of physicians as suspects of fraud. Discriminant analysis suggested that the indicators demonstrated adequate performance in the detection of physicians who were suspected of perpetrating fraud (98%) and abuse (85%) in a new sample of data. Our data mining approach will help health insurance organizations in low- and middle-income countries (LMICs) in streamlining auditing approaches towards the suspect groups rather than routine auditing of all physicians. © 2016 by Kerman University of Medical Sciences.
ERIC Educational Resources Information Center
Carson, Andrew D.; Bizot, Elizabeth B.; Hendershot, Peggy E.; Barton, Margaret G.; Garvin, Mary K.; Kraemer, Barbara
1999-01-01
Career recommendations were made based on aptitude scores of 335 high school freshmen. Artificial neural networks were used to map recommendations to 12 occupational clusters. Overall accuracy of neural networks (.80) approached that of discriminant function analysis (.84). The two methods had different strengths and weaknesses. (SK)
Patterns of Parenting Behavior in Young Mothers.
ERIC Educational Resources Information Center
Whiteside-Mansell, Leanne; And Others
1996-01-01
Assessed the parenting behaviors of 193 white and African American mothers 15-24 years of age when their children were 12 and 36 months old. Cluster analysis of three dimensions of parenting was used to identify five types of parenting patterns. The strongest discriminating factor--maternal IQ--was associated with more positive parenting behavior…
NASA Astrophysics Data System (ADS)
Farics, Éva; Farics, Dávid; Kovács, József; Haas, János
2017-10-01
The main aim of this paper is to determine the depositional environments of an Upper-Eocene coarse-grained clastic succession in the Buda Hills, Hungary. First, we measured some commonly used parameters of samples (size, amount, roundness and sphericity) in a more objective and faster way than with traditional measurement approaches, using the newly developed Rock Analyst application. For the multivariate data obtained, we applied Combined Cluster and Discriminant Analysis (CCDA) in order to determine homogeneous groups of the sampling locations based on the quantitative composition of the conglomerate as well as the shape parameters (roundness and sphericity). The result is the spatial pattern of these groups, which assists with the interpretation of the depositional processes. According to our concept, sampling sites that belong to the same homogeneous group were likely formed under similar geological circumstances and by similar geological processes. In the Buda Hills, we were able to distinguish various sedimentological environments within the area based on the results: fan, intermittent stream or marine.
Young swimmers' classification based on kinematics, hydrodynamics, and anthropometrics.
Barbosa, Tiago M; Morais, Jorge E; Costa, Mário J; Goncalves, José; Marinho, Daniel A; Silva, António J
2014-04-01
The aim of this article was to classify swimmers based on kinematics, hydrodynamics, and anthropometrics. Sixty-seven young swimmers made a maximal 25 m front-crawl to measure with a speedometer the swimming velocity (v), speed-fluctuation (dv) and dv normalized to v (dv/v). Another two 25 m bouts with and without carrying a perturbation device were made to estimate active drag coefficient (CDa). Trunk transverse surface area (S) was measured with photogrammetric technique on land and in the hydrodynamic position. Cluster 1 was related to swimmers with a high speed fluctuation (ie, dv and dv/v), cluster 2 with anthropometrics (ie, S) and cluster 3 with a high hydrodynamic profile (ie, CDa). The variable that best discriminated the clusters was dv/v (F=53.680; P<.001), followed by dv (F=28.506; P<.001), CDa (F=21.025; P<.001), S (F=6.297; P<.01) and v (F=5.375; P=.01). Stepwise discriminant analysis extracted 2 functions: Function 1 was mainly defined by dv/v and S (74.3% of variance), whereas function 2 was mainly defined by CDa (25.7% of variance). It can be concluded that kinematics, hydrodynamics and anthropometrics are determinant domains in which to classify and characterize young swimmers' profiles.
Eye-gaze determination of user intent at the computer interface
DOE Office of Scientific and Technical Information (OSTI.GOV)
Goldberg, J.H.; Schryver, J.C.
1993-12-31
Determination of user intent at the computer interface through eye-gaze monitoring can significantly aid applications for the disabled, as well as telerobotics and process control interfaces. Whereas current eye-gaze control applications are limited to object selection and x/y gazepoint tracking, a methodology was developed here to discriminate a more abstract interface operation: zooming-in or out. This methodology first collects samples of eye-gaze location looking at controlled stimuli, at 30 Hz, just prior to a user's decision to zoom. The sample is broken into data frames, or temporal snapshots. Within a data frame, all spatial samples are connected into a minimum spanning tree, then clustered, according to user defined parameters. Each cluster is mapped to one in the prior data frame, and statistics are computed from each cluster. These characteristics include cluster size, position, and pupil size. A multiple discriminant analysis uses these statistics both within and between data frames to formulate optimal rules for assigning the observations into zooming, zoom-out, or no zoom conditions. The statistical procedure effectively generates heuristics for future assignments, based upon these variables. Future work will enhance the accuracy and precision of the modeling technique, and will empirically test users in controlled experiments.
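The spatial clustering step described here (connect the gaze samples of one data frame into a minimum spanning tree, then split it into clusters) might be sketched as follows. The abstract does not state the splitting rule, so cutting every tree edge longer than `max_edge` is an assumption, as are the sample coordinates and the function names `mst_edges`, `mst_clusters` and `cluster_summary`.

```python
import math


def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph.

    Returns the n - 1 tree edges as (length, i, j) tuples.
    """
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known connection to the tree
    parent = [-1] * n
    if n:
        best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges


def mst_clusters(points, max_edge):
    """Keep tree edges no longer than max_edge; components are the clusters."""
    root = list(range(len(points)))  # union-find forest

    def find(i):
        while root[i] != i:
            root[i] = root[root[i]]
            i = root[i]
        return i

    for length, i, j in mst_edges(points):
        if length <= max_edge:
            root[find(i)] = find(j)
    groups = {}
    for idx in range(len(points)):
        groups.setdefault(find(idx), []).append(idx)
    return sorted(groups.values())


def cluster_summary(points, cluster):
    """Per-cluster statistics of the kind listed above: size and position."""
    n = len(cluster)
    cx = sum(points[i][0] for i in cluster) / n
    cy = sum(points[i][1] for i in cluster) / n
    return {"size": n, "centroid": (cx, cy)}


# Hypothetical gaze samples (screen coordinates) from one data frame.
gaze = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0), (11.0, 10.0)]
clusters = mst_clusters(gaze, max_edge=2.0)
print(clusters)
print([cluster_summary(gaze, c) for c in clusters])
```

Statistics such as these, computed per cluster and per data frame, are the kind of input the abstract says is passed to the discriminant analysis; pupil size is omitted because the sketch carries only coordinates.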
Xue, Gang; Song, Wen-qi; Li, Shu-chao
2015-01-01
In order to achieve the rapid identification of different brands of fire resistive coating for steel structure in circulation, a new method for the fast discrimination of varieties of fire resistive coating for steel structure by means of near infrared spectroscopy was proposed. A raster scanning near infrared spectroscopy instrument and near infrared diffuse reflectance spectroscopy were applied to collect the spectral curves of different brands of fire resistive coating for steel structure, and the spectral data were preprocessed with standard normal variate transformation (SNV) and Norris second derivative. Principal component analysis (PCA) was applied to the near infrared spectra for cluster analysis. The analysis results showed that the cumulative reliabilities of PC1 to PC5 were 99.791%. The 3-dimensional plot was drawn with the scores of PC1, PC2 and PC3 × 10, which appeared to provide the best clustering of the varieties of fire resistive coating for steel structure. A total of 150 fire resistive coating samples were divided randomly into a calibration set and a validation set; the calibration set had 125 samples with 25 samples of each variety, and the validation set had 25 samples with 5 samples of each variety. According to the principal component scores of unknown samples, Mahalanobis distance values between each variety and unknown samples were calculated to realize the discrimination of different varieties. The qualitative analysis model for external verification of unknown samples had a recognition ratio of 10%. The results demonstrated that this identification method can be used as a rapid, accurate method to identify the classification of fire resistive coating for steel structure and provide technical reference for market regulation.
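The final classification step, assigning an unknown sample to the variety with the smallest Mahalanobis distance in principal-component score space, can be sketched as below. To stay dependency-free the sketch uses only two score dimensions with a hand-inverted 2 × 2 covariance matrix, whereas the study used more components. The score values, the brand labels and the function names are hypothetical.

```python
def mean_and_cov(scores):
    """Mean and sample covariance (sxx, sxy, syy) of 2-D score pairs."""
    n = len(scores)
    mx = sum(x for x, _ in scores) / n
    my = sum(y for _, y in scores) / n
    sxx = sum((x - mx) ** 2 for x, _ in scores) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in scores) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in scores) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def mahalanobis_sq(point, mean, cov):
    """Squared Mahalanobis distance using the closed-form 2 x 2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    if det <= 0:
        raise ValueError("covariance matrix is not positive definite")
    dx = point[0] - mean[0]
    dy = point[1] - mean[1]
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det


def classify(point, calibration):
    """Return the class label whose calibration scores are nearest."""
    distances = {
        label: mahalanobis_sq(point, *mean_and_cov(scores))
        for label, scores in calibration.items()
    }
    return min(distances, key=distances.get)


# Hypothetical (PC1, PC2) scores for two coating varieties.
calibration = {
    "brand_A": [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)],
    "brand_B": [(5.0, 5.0), (6.0, 5.0), (5.0, 6.0), (6.0, 6.0)],
}
print(classify((0.8, 0.4), calibration))
print(classify((5.2, 5.9), calibration))
```

Unlike plain Euclidean distance, the Mahalanobis distance scales each direction by the spread of the calibration scores, so a tightly clustered variety rejects samples that a loosely clustered one would accept.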
Shared and Distinct Rupture Discriminants of Small and Large Intracranial Aneurysms.
Varble, Nicole; Tutino, Vincent M; Yu, Jihnhee; Sonig, Ashish; Siddiqui, Adnan H; Davies, Jason M; Meng, Hui
2018-04-01
Many ruptured intracranial aneurysms (IAs) are small. Clinical presentations suggest that small and large IAs could have different phenotypes. It is unknown if small and large IAs have different characteristics that discriminate rupture. We analyzed morphological, hemodynamic, and clinical parameters of 413 retrospectively collected IAs (training cohort; 102 ruptured IAs). Hierarchical cluster analysis was performed to determine a size cutoff to dichotomize the IA population into small and large IAs. We applied multivariate logistic regression to build rupture discrimination models for small IAs, large IAs, and an aggregation of all IAs. We validated the ability of these 3 models to predict rupture status in a second, independently collected cohort of 129 IAs (testing cohort; 14 ruptured IAs). Hierarchical cluster analysis in the training cohort confirmed that small and large IAs are best separated at 5 mm based on morphological and hemodynamic features (area under the curve=0.81). For small IAs (<5 mm), the resulting rupture discrimination model included undulation index, oscillatory shear index, previous subarachnoid hemorrhage, and absence of multiple IAs (area under the curve=0.84; 95% confidence interval, 0.78-0.88), whereas for large IAs (≥5 mm), the model included undulation index, low wall shear stress, previous subarachnoid hemorrhage, and IA location (area under the curve=0.87; 95% confidence interval, 0.82-0.93). The model for the aggregated training cohort retained all the parameters in the size-dichotomized models. Results in the testing cohort showed that the size-dichotomized rupture discrimination model had higher sensitivity (64% versus 29%) and accuracy (77% versus 74%), marginally higher area under the curve (0.75; 95% confidence interval, 0.61-0.88 versus 0.67; 95% confidence interval, 0.52-0.82), and similar specificity (78% versus 80%) compared with the aggregate-based model.
Small (<5 mm) and large (≥5 mm) IAs have different hemodynamic and clinical, but not morphological, rupture discriminants. Size-dichotomized rupture discrimination models performed better than the aggregate model. © 2018 American Heart Association, Inc.
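The sensitivity, specificity and accuracy figures compared in this abstract all derive from a two-by-two confusion table, as the following sketch shows. Only the cohort sizes (129 IAs, 14 ruptured, hence 115 unruptured) come from the abstract. The cell counts are hypothetical, chosen so that they sum to those sizes and round to the reported percentages, and the function name `rates` is likewise an assumption.

```python
def rates(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy from confusion-table counts."""
    sensitivity = tp / (tp + fn)            # ruptured IAs correctly flagged
    specificity = tn / (tn + fp)            # unruptured IAs correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts: 14 ruptured (tp + fn) and 115 unruptured (tn + fp).
size_dichotomized = rates(tp=9, fn=5, tn=90, fp=25)
aggregate = rates(tp=4, fn=10, tn=92, fp=23)

for name, result in [("size-dichotomized", size_dichotomized),
                     ("aggregate", aggregate)]:
    sens, spec, acc = (round(100 * r) for r in result)
    print(f"{name}: sensitivity {sens}%, specificity {spec}%, accuracy {acc}%")
```

With only 14 ruptured aneurysms in the testing cohort, each additional correct call moves sensitivity by about 7 percentage points, while accuracy is dominated by the 115 unruptured cases. This is why a large gap in sensitivity can coexist with a small gap in accuracy.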
Shen, Shi; Wang, Jingbo; Zhuo, Qin; Chen, Xi; Liu, Tingting; Zhang, Shuang-Qing
2018-05-08
Phenolics and flavonoids in honey are considered as the main phytonutrients which not only act as natural antioxidants, but can also be used as floral markers for honey identification. In this study, the chemical profiles of phenolics and flavonoids, antioxidant competences including total phenolic content, DPPH and ABTS assays and discrimination using chemometric analysis of various Chinese monofloral honeys from six botanical origins (acacia, Vitex, linden, rapeseed, Astragalus and Codonopsis) were examined. A reproducible and sensitive ultra-performance liquid chromatography-tandem mass spectrometry (UPLC-MS/MS) method was optimized and validated for the simultaneous determination of 38 phenolics, flavonoids and abscisic acid in honey. Formononetin, ononin, calycosin and calycosin-7-O-β-d-glucoside were identified and quantified in honeys for the first time. Principal component analysis (PCA) showed obvious differences among the honey samples in three-dimensional space accounting for 72.63% of the total variance. Hierarchical cluster analysis (HCA) also revealed that the botanical origins of honey samples correlated with their phenolic and flavonoid contents. Partial least squares-discriminant analysis (PLS-DA) classification was performed to derive a model with high prediction ability. Orthogonal partial least squares-discriminant analysis (OPLS-DA) model was employed to identify markers specific to a particular honey type. The results indicated that Chinese honeys contained various and discriminative phenolics and flavonoids, as well as antioxidant competence from different botanical origins, which was an alternative approach to honey identification and nutritional evaluation.
Micro-Raman spectroscopy of natural and synthetic indigo samples.
Vandenabeele, Peter; Moens, Luc
2003-02-01
In this work indigo samples from three different sources are studied by using Raman spectroscopy: the synthetic pigment and pigments from the woad (Isatis tinctoria) and the indigo plant (Indigofera tinctoria). 21 samples were obtained from 8 suppliers; for each sample 5 Raman spectra were recorded and used for further chemometrical analysis. Principal components analysis (PCA) was performed as data reduction method before applying hierarchical cluster analysis. Linear discriminant analysis (LDA) was implemented as a non-hierarchical supervised pattern recognition method to build a classification model. In order to avoid broad-shaped interferences from the fluorescence background, the influence of 1st and 2nd derivatives on the classification was studied by using cross-validation. Although chemically identical, it is shown that Raman spectroscopy in combination with suitable chemometric methods has the potential to discriminate between synthetic and natural indigo samples.
Zhou, Fei; Zhao, Yajing; Peng, Jiyu; Jiang, Yirong; Li, Maiquan; Jiang, Yuan; Lu, Baiyi
2017-07-01
Osmanthus fragrans flowers are used as folk medicine and additives for teas, beverages and foods. The metabolites of O. fragrans flowers from different geographical origins are inconsistent to some extent. Chromatography and mass spectrometry combined with multivariable analysis methods provide an approach for discriminating the origin of O. fragrans flowers. The aim was to discriminate Osmanthus fragrans var. thunbergii flowers from different origins using the identified metabolites. GC-MS and UPLC-PDA were conducted to analyse the metabolites in O. fragrans var. thunbergii flowers (in total 150 samples). Principal component analysis (PCA), soft independent modelling of class analogy analysis (SIMCA) and random forest (RF) analysis were applied to group the GC-MS and UPLC-PDA data. GC-MS identified 32 compounds common to all samples while UPLC-PDA/QTOF-MS identified 16 common compounds. PCA of the UPLC-PDA data generated a better clustering than PCA of the GC-MS data. Ten metabolites (six from GC-MS and four from UPLC-PDA) were selected as effective compounds for discrimination by PCA loadings. SIMCA and RF analysis were used to build classification models, and the RF model, based on the four effective compounds (caffeic acid derivative, acteoside, ligustroside and compound 15), yielded better results with the classification rate of 100% in the calibration set and 97.8% in the prediction set. GC-MS and UPLC-PDA combined with multivariable analysis methods can discriminate the origin of Osmanthus fragrans var. thunbergii flowers. Copyright © 2017 John Wiley & Sons, Ltd.
Assessment of Depression in a Rodent Model of Spinal Cord Injury
Luedtke, Kelsey; Bouchard, Sioui Maldonado; Woller, Sarah A.; Funk, Mary Katherine; Aceves, Miriam
2014-01-01
Despite an increased incidence of depression in patients after spinal cord injury (SCI), there is no animal model of depression after SCI. To address this, we used a battery of established tests to assess depression after a rodent contusion injury. Subjects were acclimated to the tasks, and baseline scores were collected before SCI. Testing was conducted on days 9–10 (acute) and 19–20 (chronic) postinjury. To categorize depression, subjects' scores on each behavioral measure were averaged across the acute and chronic stages of injury and subjected to a principal component analysis. This analysis revealed a two-component structure, which explained 72.2% of between-subjects variance. The data were then analyzed with a hierarchical cluster analysis, identifying two clusters that differed significantly on the sucrose preference, open field, social exploration, and burrowing tasks. One cluster (9 of 26 subjects) displayed characteristics of depression. Using these data, a discriminant function analysis was conducted to derive an equation that could classify subjects as “depressed” on days 9–10. The discriminant function was used in a second experiment examining whether the depression-like symptoms could be reversed with the antidepressant, fluoxetine. Fluoxetine significantly decreased immobility in the forced swim test (FST) in depressed subjects identified with the equation. Subjects that were depressed and treated with saline displayed significantly increased immobility on the FST, relative to not depressed, saline-treated controls. These initial experiments validate our tests of depression, generating a powerful model system for further understanding the relationships between molecular changes induced by SCI and the development of depression. PMID:24564232
Masiol, Mauro; Centanni, Elena; Squizzato, Stefania; Hofer, Angelika; Pecorari, Eliana; Rampazzo, Giancarlo; Pavoni, Bruno
2012-09-01
This study presents a procedure to differentiate the local and remote sources of particulate-bound polycyclic aromatic hydrocarbons (PAHs). Data were collected during an extended PM(2.5) sampling campaign (2009-2010) carried out for 1 year in Venice-Mestre, Italy, at three stations with different emissive scenarios: urban, industrial, and semirural background. Diagnostic ratios and factor analysis were initially applied to point out the most probable sources. In a second step, the areal distribution of the identified sources was studied by applying discriminant analysis to the factor scores. Third, samples collected on days with similar atmospheric circulation patterns were grouped using a cluster analysis on wind data. Local contributions to PM(2.5) and PAHs were then assessed by interpreting cluster results with chemical data. Results showed that significantly lower levels of PM(2.5) and PAHs were found when faster winds changed air masses, whereas in the presence of scarce ventilation, locally emitted pollutants were trapped and concentrations increased. In this way, an estimation of pollutant loads due to local sources can be derived from data collected on days with similar wind patterns. Long-range contributions were detected by a cluster analysis on the air mass back-trajectories. Results revealed that PM(2.5) concentrations were relatively high when air masses had passed over the Po Valley. However, external sources do not significantly contribute to the PAH load. The proposed procedure can be applied to other environments with minor modifications, and the obtained information can be useful to design local and national air pollution control strategies.
Pang, Yuanjie; Peng, Roger D; Jones, Miranda R; Francesconi, Kevin A; Goessler, Walter; Howard, Barbara V; Umans, Jason G; Best, Lyle G; Guallar, Eliseo; Post, Wendy S; Kaufman, Joel D; Vaidya, Dhananjay; Navas-Acien, Ana
2016-05-01
Natural and anthropogenic sources of metal exposure differ for urban and rural residents. We sought to identify patterns of metal mixtures that could suggest common environmental sources and/or metabolic pathways of different urinary metals, and compared metal mixtures in two population-based studies from urban/sub-urban and rural/town areas in the US: the Multi-Ethnic Study of Atherosclerosis (MESA) and the Strong Heart Study (SHS). We studied a random sample of 308 White, Black, Chinese-American, and Hispanic participants in MESA (2000-2002) and 277 American Indian participants in SHS (1998-2003). We used principal component analysis (PCA), cluster analysis (CA), and linear discriminant analysis (LDA) to evaluate nine urinary metals (antimony [Sb], arsenic [As], cadmium [Cd], lead [Pb], molybdenum [Mo], selenium [Se], tungsten [W], uranium [U] and zinc [Zn]). For arsenic, we used the sum of inorganic and methylated species (∑As). All nine urinary metals were higher in SHS compared to MESA participants. PCA and CA revealed the same patterns in SHS, suggesting 4 distinct principal components (PC) or clusters (∑As-U-W, Pb-Sb, Cd-Zn, Mo-Se). In MESA, CA showed 2 large clusters (∑As-Mo-Sb-U-W, Cd-Pb-Se-Zn), while PCA showed 4 PCs (Sb-U-W, Pb-Se-Zn, Cd-Mo, ∑As). LDA indicated that ∑As, U, W, and Zn were the most discriminant variables distinguishing MESA and SHS participants. In SHS, the ∑As-U-W cluster and PC might reflect groundwater contamination in rural areas, and the Cd-Zn cluster and PC could reflect common sources from meat products or metabolic interactions. Among the metals assayed, ∑As, U, W and Zn differed the most between MESA and SHS, possibly reflecting disproportionate exposure from drinking water and perhaps food in rural Native communities compared to urban communities around the US. Copyright © 2016 Elsevier Inc. All rights reserved.
Pattern Activity Clustering and Evaluation (PACE)
NASA Astrophysics Data System (ADS)
Blasch, Erik; Banas, Christopher; Paul, Michael; Bussjager, Becky; Seetharaman, Guna
2012-06-01
With the vast amount of network information available on activities of people (i.e. motions, transportation routes, and site visits) there is a need to explore the salient properties of data that detect and discriminate the behavior of individuals. Recent machine learning approaches include methods of data mining, statistical analysis, clustering, and estimation that support activity-based intelligence. We seek to explore contemporary methods in activity analysis using machine learning techniques that discover and characterize behaviors that enable grouping, anomaly detection, and adversarial intent prediction. To evaluate these methods, we describe the mathematics and potential information theory metrics to characterize behavior. A scenario is presented to demonstrate the concept and metrics that could be useful for layered sensing behavior pattern learning and analysis. We leverage work on group tracking, learning and clustering approaches; as well as utilize information theoretical metrics for classification, behavioral and event pattern recognition, and activity and entity analysis. The performance evaluation of activity analysis supports high-level information fusion of user alerts, data queries and sensor management for data extraction, relations discovery, and situation analysis of existing data.
Soybean varieties discrimination using non-imaging hyperspectral sensor
NASA Astrophysics Data System (ADS)
da Silva Junior, Carlos Antonio; Nanni, Marcos Rafael; Shakir, Muhammad; Teodoro, Paulo Eduardo; de Oliveira-Júnior, José Francisco; Cezar, Everson; de Gois, Givanildo; Lima, Mendelson; Wojciechowski, Julio Cesar; Shiratsuchi, Luciano Shozo
2018-03-01
The infrared region of the electromagnetic spectrum has remarkable applications in crop studies. Infrared along with the red band has been used to develop certain vegetation indices. Indices such as NDVI and EVI provide important information on crop physiological stages. The main objective of this research was to discriminate 4 different soybean varieties (BMX Potência, NA5909, FT Campo Mourão and Don Mario) using a non-imaging hyperspectral sensor. The study was conducted in four agricultural areas in the municipality of Deodápolis (MS), Brazil. For spectral analysis, 2400 field samples were taken from soybean leaves by means of a FieldSpec 3 JR spectroradiometer in the range from 350 to 2500 nm. The data were evaluated through multivariate analysis with the whole set of spectral curves isolated by blue, green, red and near infrared wavelengths, along with the addition of vegetation indices (Enhanced Vegetation Index - EVI, Normalized Difference Vegetation Index - NDVI, Green Normalized Difference Vegetation Index - GNDVI, Soil-adjusted Vegetation Index - SAVI, Transformed Vegetation Index - TVI and Optimized Soil-Adjusted Vegetation Index - OSAVI). The analyses performed were discriminant analysis (60 and 80% of the data), simulated discriminant analysis (40 and 20% of the data), principal component (PC) analysis and cluster analysis (CA). The discriminant and simulated discriminant analyses presented satisfactory results, with average global hit rates of 99.28 and 98.77%, respectively. The results obtained by PC and CA revealed considerable associations between the evaluated variables and the varieties, which indicated that each variety has a variable that discriminates it more effectively in relation to the others. There was great variation in the sample size (number of leaves) needed for estimating the mean of the variables. However, it was possible to observe that 200 leaves yield a maximum error of 2% in relation to the mean.
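The vegetation indices named above are simple band ratios, which a short sketch can make concrete. It uses the commonly cited definitions of NDVI, GNDVI, SAVI (soil factor L = 0.5), OSAVI (constant 0.16) and EVI; that the study used exactly these forms and constants is an assumption, since the abstract gives no formulas, and the reflectance values are hypothetical. TVI is omitted.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index."""
    return (nir - red) / (nir + red)

def gndvi(nir, green):
    """Green NDVI: the green band replaces the red band."""
    return (nir - green) / (nir + green)

def savi(nir, red, soil_factor=0.5):
    """Soil-Adjusted Vegetation Index with the customary L = 0.5."""
    return (1.0 + soil_factor) * (nir - red) / (nir + red + soil_factor)

def osavi(nir, red):
    """Optimized SAVI with the customary 0.16 constant
    (some authors also multiply by 1.16; not done here)."""
    return (nir - red) / (nir + red + 0.16)

def evi(nir, red, blue):
    """Enhanced Vegetation Index with the usual coefficients (2.5, 6, 7.5, 1)."""
    return 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)

# Hypothetical leaf reflectances (fractions of 1), not data from the study.
bands = {"blue": 0.04, "green": 0.10, "red": 0.05, "nir": 0.45}

indices = {
    "NDVI": ndvi(bands["nir"], bands["red"]),
    "GNDVI": gndvi(bands["nir"], bands["green"]),
    "SAVI": savi(bands["nir"], bands["red"]),
    "OSAVI": osavi(bands["nir"], bands["red"]),
    "EVI": evi(bands["nir"], bands["red"], bands["blue"]),
}
for name, value in indices.items():
    print(f"{name}: {value:.3f}")
```

In a workflow like the one described, each leaf spectrum would contribute one such set of index values as additional variables for the multivariate analyses.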
Potential of SNP markers for the characterization of Brazilian cassava germplasm.
de Oliveira, Eder Jorge; Ferreira, Cláudia Fortes; da Silva Santos, Vanderlei; de Jesus, Onildo Nunes; Oliveira, Gilmara Alvarenga Fachardo; da Silva, Maiane Suzarte
2014-06-01
High-throughput markers, such as SNPs, along with different methodologies were used to evaluate the applicability of the Bayesian approach and multivariate analysis in structuring the genetic diversity in cassava. The objective of the present work was to evaluate the diversity and genetic structure of the largest cassava germplasm bank in Brazil. Complementary methodological approaches such as discriminant analysis of principal components (DAPC), Bayesian analysis and analysis of molecular variance (AMOVA) were used to understand the structure and diversity of 1,280 accessions genotyped using 402 single nucleotide polymorphism markers. The genetic diversity (0.327) and the average observed heterozygosity (0.322) were high considering the bi-allelic markers. In terms of population, the presence of a complex genetic structure was observed, indicating the formation of 30 clusters by DAPC and 34 clusters by Bayesian analysis. Both methodologies presented difficulties and controversies in terms of the allocation of some accessions to specific clusters. However, the clusters suggested by the DAPC analysis seemed to be more consistent, presenting a higher probability of allocation of the accessions within the clusters. Prior information related to breeding patterns and geographic origins of the accessions was not sufficient to provide clear differentiation between the clusters according to the AMOVA analysis. In contrast, the FST was maximized when considering the clusters suggested by the Bayesian and DAPC analyses. The high frequency of germplasm exchange between producers and the subsequent alteration of the name of the same material may be one of the causes of the low association between genetic diversity and geographic origin. The results of this study may benefit cassava germplasm conservation programs, and contribute to the maximization of genetic gains in breeding programs.
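Why values near 0.33 count as high for bi-allelic markers follows from the ceiling on expected heterozygosity at a two-allele locus. The sketch below assumes the reported genetic diversity is the usual expected heterozygosity (gene diversity), which the abstract does not state; the allele frequencies are hypothetical.

```python
def expected_heterozygosity(p):
    """Gene diversity 1 - p^2 - q^2 (= 2pq) for a bi-allelic locus,
    where p is the frequency of one allele and q = 1 - p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    q = 1.0 - p
    return 1.0 - p * p - q * q

def mean_gene_diversity(freqs):
    """Average expected heterozygosity over loci."""
    return sum(expected_heterozygosity(p) for p in freqs) / len(freqs)

# The ceiling for any bi-allelic marker is reached at p = 0.5.
ceiling = expected_heterozygosity(0.5)

# Hypothetical allele frequencies at four SNP loci (not study data).
example_freqs = [0.5, 0.3, 0.1, 0.2]
example_mean = mean_gene_diversity(example_freqs)

print(f"maximum per-locus value: {ceiling:.3f}")
print(f"mean over example loci:  {example_mean:.3f}")
```

Because no bi-allelic locus can exceed 0.5, an average of 0.327 across 402 markers is roughly two thirds of the attainable maximum.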
Rabey, Martin; Slater, Helen; OʼSullivan, Peter; Beales, Darren; Smith, Anne
2015-10-01
The objectives of this study were to explore the existence of subgroups in a cohort with chronic low back pain (n = 294) based on the results of multimodal sensory testing and profile subgroups on demographic, psychological, lifestyle, and general health factors. Bedside (2-point discrimination, brush, vibration and pinprick perception, temporal summation on repeated monofilament stimulation) and laboratory (mechanical detection threshold, pressure, heat and cold pain thresholds, conditioned pain modulation) sensory testing were examined at wrist and lumbar sites. Data were entered into principal component analysis, and 5 component scores were entered into latent class analysis. Three clusters, with different sensory characteristics, were derived. Cluster 1 (31.9%) was characterised by average to high temperature and pressure pain sensitivity. Cluster 2 (52.0%) was characterised by average to high pressure pain sensitivity. Cluster 3 (16.0%) was characterised by low temperature and pressure pain sensitivity. Temporal summation occurred significantly more frequently in cluster 1. Subgroups were profiled on pain intensity, disability, depression, anxiety, stress, life events, fear avoidance, catastrophizing, perception of the low back region, comorbidities, body mass index, multiple pain sites, sleep, and activity levels. Clusters 1 and 2 had a significantly greater proportion of female participants and higher depression and sleep disturbance scores than cluster 3. The proportion of participants undertaking <300 minutes per week of moderate activity was significantly greater in cluster 1 than in clusters 2 and 3. Low back pain, therefore, does not appear to be homogeneous. Pain mechanisms relating to presentations of each subgroup were postulated. Future research may investigate prognoses and interventions tailored towards these subgroups.
Serrano, M G; Camargo, E P; Teixeira, M M
1999-01-01
The random amplification of polymorphic DNA was used for easy, quick and sensitive assessment of genetic polymorphism within Phytomonas to discriminate isolates and determine genetic relationships within the genus. We examined 48 Phytomonas spp., 31 isolates from plants and 17 from insects, from different geographic regions. Topology of the dendrogram based on randomly amplified polymorphic DNA fingerprints segregated the Phytomonas spp. into 5 main clusters, despite the high genetic variability within this genus. Similar clustering could also be obtained by both visual and cross-hybridization analysis of randomly amplified synapomorphic DNA fragments. There was some concordance between the genetic relationship of isolates and their plant tissue tropism. Moreover, Phytomonas spp. from plants and insects were grouped according to geographic origin, thus revealing a complex structure of this taxon comprising several clusters of very closely related organisms.
Antoniewicz, Franziska; Brand, Ralf
2016-01-01
The aim of this study was to examine how automatic evaluations of exercising (AEE) varied according to adherence to an exercise program. Eighty-eight participants (24.98 years ± 6.88; 51.1% female) completed a Brief-Implicit Association Task assessing their AEE, positive and negative associations to exercising at the beginning of a 3-month exercise program. Attendance data were collected for all participants and used in a cluster analysis of adherence patterns. Three different adherence patterns (52 maintainers, 16 early dropouts, 20 late dropouts; 40.91% overall dropouts) were detected using cluster analyses. Participants from these three clusters differed significantly with regard to their positive and negative associations to exercising before the first course meeting (ηp2 = 0.07). Discriminant function analyses revealed that positive associations to exercising was a particularly good discriminating factor. This is the first study to provide evidence of the differential impact of positive and negative associations on exercise behavior over the medium term. The findings contribute to theoretical understanding of evaluative processes from a dual-process perspective and may provide a basis for targeted interventions. PMID:27313559
Oberle, Michael; Wohlwend, Nadia; Jonas, Daniel; Maurer, Florian P; Jost, Geraldine; Tschudin-Sutter, Sarah; Vranckx, Katleen; Egli, Adrian
2016-01-01
The technical, biological, and inter-center reproducibility of matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI TOF MS) typing data has not yet been explored. The aim of this study is to compare typing data from multiple centers employing bioinformatics using bacterial strains from two past outbreaks and non-related strains. Participants received twelve extended spectrum betalactamase-producing E. coli isolates and followed the same standard operating procedure (SOP) including a full-protein extraction protocol. All laboratories provided visually read spectra via flexAnalysis (Bruker, Germany). Raw data from each laboratory allowed calculating the technical and biological reproducibility between centers using BioNumerics (Applied Maths NV, Belgium). Technical and biological reproducibility ranged between 96.8-99.4% and 47.6-94.4%, respectively. The inter-center reproducibility showed a comparable clustering among identical isolates. Principal component analysis indicated a higher tendency to cluster within the same center. Therefore, we used a discriminant analysis, which completely separated the clusters. Next, we defined a reference center and performed a statistical analysis to identify specific peaks to identify the outbreak clusters. Finally, we used a classifier algorithm and a linear support vector machine on the determined peaks as classifier. A validation showed that within the set of the reference center, the identification of the cluster was 100% correct with a large contrast between the score with the correct cluster and the next best scoring cluster. Based on the sufficient technical and biological reproducibility of MALDI-TOF MS based spectra, detection of specific clusters is possible from spectra obtained from different centers. However, we believe that a shared SOP and a bioinformatics approach are required to make the analysis robust and reliable.
Laursen, Jens; Milman, Nils; Pind, Niels; Pedersen, Henrik; Mulvad, Gert
2014-01-01
Meta-analysis of previous studies evaluating associations between content of elements sulphur (S), chlorine (Cl), potassium (K), iron (Fe), copper (Cu), zinc (Zn) and bromine (Br) in normal and cirrhotic autopsy liver tissue samples. Normal liver samples from 45 Greenlandic Inuit, median age 60 years and from 71 Danes, median age 61 years. Cirrhotic liver samples from 27 Danes, median age 71 years. Element content was measured using X-ray fluorescence spectrometry. Dual hierarchical clustering analysis, creating a dual dendrogram, one clustering element contents according to calculated similarities, one clustering elements according to correlation coefficients between the element contents, both using Euclidean distance and the Ward procedure. One dendrogram separated subjects in 7 clusters showing no differences in ethnicity, gender or age. The analysis discriminated between elements in normal and cirrhotic livers. The other dendrogram clustered elements in four clusters: sulphur and chlorine; copper and bromine; potassium and zinc; iron. There were significant correlations between the elements in normal liver samples: S was associated with Cl, K, Br and Zn; Cl with S and Br; K with S, Br and Zn; Cu with Br; Zn with S and K; Br with S, Cl, K and Cu. Fe did not show significant associations with any other element. In contrast to simple statistical methods, which analyse the content of elements separately one by one, dual hierarchical clustering analysis incorporates all elements at the same time and can be used to examine the linkage and interplay between multiple elements in tissue samples. Copyright © 2013 Elsevier GmbH. All rights reserved.
Ramli, Saifullah; Ismail, Noryati; Alkarkhi, Abbas Fadhl Mubarek; Easa, Azhar Mat
2010-08-01
Banana peel flour (BPF) prepared from green or ripe Cavendish and Dream banana fruits were assessed for their total starch (TS), digestible starch (DS), resistant starch (RS), total dietary fibre (TDF), soluble dietary fibre (SDF) and insoluble dietary fibre (IDF). Principal component analysis (PCA) identified that only 1 component was responsible for 93.74% of the total variance in the starch and dietary fibre components that differentiated ripe and green banana flours. Cluster analysis (CA) applied to similar data obtained two statistically significant clusters (green and ripe bananas) to indicate difference in behaviours according to the stages of ripeness based on starch and dietary fibre components. We concluded that the starch and dietary fibre components could be used to discriminate between flours prepared from peels obtained from fruits of different ripeness. The results were also suggestive of the potential of green and ripe BPF as functional ingredients in food.
Ramli, Saifullah; Ismail, Noryati; Alkarkhi, Abbas Fadhl Mubarek; Easa, Azhar Mat
2010-01-01
Banana peel flour (BPF) prepared from green or ripe Cavendish and Dream banana fruits were assessed for their total starch (TS), digestible starch (DS), resistant starch (RS), total dietary fibre (TDF), soluble dietary fibre (SDF) and insoluble dietary fibre (IDF). Principal component analysis (PCA) identified that only 1 component was responsible for 93.74% of the total variance in the starch and dietary fibre components that differentiated ripe and green banana flours. Cluster analysis (CA) applied to similar data obtained two statistically significant clusters (green and ripe bananas) to indicate difference in behaviours according to the stages of ripeness based on starch and dietary fibre components. We concluded that the starch and dietary fibre components could be used to discriminate between flours prepared from peels obtained from fruits of different ripeness. The results were also suggestive of the potential of green and ripe BPF as functional ingredients in food. PMID:24575193
Autonomic specificity of basic emotions: evidence from pattern classification and cluster analysis.
Stephens, Chad L; Christie, Israel C; Friedman, Bruce H
2010-07-01
Autonomic nervous system (ANS) specificity of emotion remains controversial in contemporary emotion research, and has received mixed support over decades of investigation. This study was designed to replicate and extend psychophysiological research, which has used multivariate pattern classification analysis (PCA) in support of ANS specificity. Forty-nine undergraduates (27 women) listened to emotion-inducing music and viewed affective films while a montage of ANS variables, including heart rate variability indices, peripheral vascular activity, systolic time intervals, and electrodermal activity, were recorded. Evidence for ANS discrimination of emotion was found via PCA with 44.6% of overall observations correctly classified into the predicted emotion conditions, using ANS variables (z=16.05, p<.001). Cluster analysis of these data indicated a lack of distinct clusters, which suggests that ANS responses to the stimuli were nomothetic and stimulus-specific rather than idiosyncratic and individual-specific. Collectively these results further confirm and extend support for the notion that basic emotions have distinct ANS signatures. Copyright © 2010 Elsevier B.V. All rights reserved.
Discriminative clustering on manifold for adaptive transductive classification.
Zhang, Zhao; Jia, Lei; Zhang, Min; Li, Bing; Zhang, Li; Li, Fanzhang
2017-10-01
In this paper, we mainly propose a novel adaptive transductive label propagation approach by joint discriminative clustering on manifolds for representing and classifying high-dimensional data. Our framework seamlessly combines the unsupervised manifold learning, discriminative clustering and adaptive classification into a unified model. Also, our method incorporates the adaptive graph weight construction with label propagation. Specifically, our method is capable of propagating label information using adaptive weights over low-dimensional manifold features, which is different from most existing studies that usually predict the labels and construct the weights in the original Euclidean space. For transductive classification by our formulation, we first perform the joint discriminative K-means clustering and manifold learning to capture the low-dimensional nonlinear manifolds. Then, we construct the adaptive weights over the learnt manifold features, where the adaptive weights are calculated through performing the joint minimization of the reconstruction errors over features and soft labels so that the graph weights can be joint-optimal for data representation and classification. Using the adaptive weights, we can easily estimate the unknown labels of samples. After that, our method returns the updated weights for further updating the manifold features. Extensive simulations on image classification and segmentation show that our proposed algorithm can deliver the state-of-the-art performance on several public datasets. Copyright © 2017 Elsevier Ltd. All rights reserved.
A novel method for qualitative analysis of edible oil oxidation using an electronic nose.
Xu, Lirong; Yu, Xiuzhu; Liu, Lei; Zhang, Rui
2016-07-01
An electronic nose (E-nose) was used for rapid assessment of the degree of oxidation in edible oils. Peroxide and acid values of edible oil samples were analyzed using data obtained by the American Oil Chemists' Society (AOCS) Official Method for reference. Qualitative discrimination between non-oxidized and oxidized oils was conducted using the E-nose technique developed in combination with cluster analysis (CA), principal component analysis (PCA), and linear discriminant analysis (LDA). The results from CA, PCA and LDA indicated that the E-nose technique could be used for differentiation of non-oxidized and oxidized oils. LDA produced slightly better results than CA and PCA. The proposed approach can be used as an alternative to AOCS Official Method as an innovative tool for rapid detection of edible oil oxidation. Copyright © 2016 Elsevier Ltd. All rights reserved.
Yu, Chunhao; Wang, Chong-Zhi; Zhou, Chun-Jie; Wang, Bin; Han, Lide; Zhang, Chun-Feng; Wu, Xiao-Hui; Yuan, Chun-Su
2014-01-01
American ginseng (Panax quinquefolius) is originally grown in North America. Due to price difference and supply shortage, American ginseng recently has been cultivated in northern China. Further, in the market, some Asian ginsengs are labeled as American ginseng. In this study, forty-three American ginseng samples cultivated in the USA, Canada or China were collected and 14 ginseng saponins were determined using HPLC. HPLC coupled with hierarchical cluster analysis and principal component analysis was developed to identify the species. Subsequently, an HPLC-linear discriminant analysis was established to discriminate cultivation regions of American ginseng. This method was successfully applied to identify the sources of 6 commercial American ginseng samples. Two of them were identified as Asian ginseng, while 4 others were identified as American ginseng, which were cultivated in the USA (3) and China (1). Our newly developed method can be used to identify American ginseng with different cultivation regions. PMID:25044150
Comparison between cachaça and rum using pattern recognition methods.
Cardoso, Daniel R; Andrade-Sobrinho, Luiz G; Leite-Neto, Alexandre F; Reche, Roni V; Isique, William D; Ferreira, Marcia M C; Lima-Neto, Benedito S; Franco, Douglas W
2004-06-02
The differentiation between cachaça and rum using analytical data on alcohols (methanol, propanol, isobutanol, and isopentanol), acetaldehyde, ethyl acetate, organic acids (octanoic acid, decanoic acid, and dodecanoic acid), metals (Al, Ca, Co, Cu, Cr, Fe, Mg, Mn, Ni, Na, and Zn), and polyphenols (protocatechuic acid, sinapaldehyde, syringaldehyde, ellagic acid, syringic acid, gallic acid, (-)-epicatechin, vanillic acid, vanillin, p-coumaric acid, coniferaldehyde, coniferyl alcohol, kaempferol, and quercetin) is described. The organic and metal analyte contents were determined in 18 cachaça and 21 rum samples using chromatographic methods (GC-MS, GC-FID, and HPLC-UV-vis) and inductively coupled plasma atomic emission spectrometry, respectively. The analytical data for the above compounds, when treated by principal component analysis, hierarchical cluster analysis, discriminant analysis, and K-nearest neighbor analysis, provide very good discrimination between the two classes of beverages.
Waldram, Alison; Dolan, Gayle; Ashton, Philip M; Jenkins, Claire; Dallman, Timothy J
2018-05-01
The unprecedented level of bacterial strain discrimination provided by whole genome sequencing (WGS) presents new challenges with respect to the utility and interpretation of the data. Whole genome sequences from 1445 isolates of Salmonella belonging to the most commonly identified serotypes in England and Wales isolated between April and August 2014 were analysed. Single linkage single nucleotide polymorphism thresholds at the 10, 5 and 0 level were explored for evidence of epidemiological links between clustered cases. Analysis of the WGS data organised 566 of the 1445 isolates into 32 clusters of five or more. A statistically significant epidemiological link was identified for 17 clusters. The clusters were associated with foreign travel (n = 8), consumption of Chinese takeaways (n = 4), chicken eaten at home (n = 2), and one each of the following: eating out, contact with another case in the home, and contact with reptiles. In the same time frame, one cluster was detected using traditional outbreak detection methods. WGS can be used for the highly specific and highly sensitive detection of biologically related isolates when epidemiological links are obscured. Improvements in the collection of detailed, standardised exposure information would enhance cluster investigations. Copyright © 2017 Elsevier Ltd. All rights reserved.
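Single linkage clustering at a SNP threshold, as mentioned above, can be sketched with a union-find structure: two isolates fall in the same cluster whenever a chain of pairwise distances at or below the threshold connects them. The function name, isolate labels and distances below are hypothetical, and this is an illustration of the general technique rather than the pipeline used in the study.

```python
def single_linkage_clusters(names, distance, threshold):
    """Group isolates so that any pair within `threshold` SNPs is linked,
    directly or through a chain of intermediate isolates."""
    parent = {n: n for n in names}

    def find(n):
        # Follow parent pointers to the cluster root, compressing the path.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if distance[frozenset((a, b))] <= threshold:
                parent[find(a)] = find(b)

    clusters = {}
    for n in names:
        clusters.setdefault(find(n), []).append(n)
    return sorted(sorted(members) for members in clusters.values())

# Hypothetical pairwise SNP distances between five isolates.
isolates = ["A", "B", "C", "D", "E"]
raw = {("A", "B"): 0, ("A", "C"): 4, ("B", "C"): 4, ("C", "D"): 9,
       ("A", "D"): 12, ("B", "D"): 12, ("A", "E"): 35, ("B", "E"): 35,
       ("C", "E"): 33, ("D", "E"): 30}
snp_distance = {frozenset(pair): d for pair, d in raw.items()}

for level in (10, 5, 0):
    print(level, single_linkage_clusters(isolates, snp_distance, level))
```

At the 10-SNP level isolate D joins A and B through C even though it differs from each of them by 12 SNPs; this chaining is the defining property of single linkage, and tightening the threshold to 5 and then 0 splits the cluster progressively.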
Romano, Federica; Meoni, Gaia; Manavella, Valeria; Baima, Giacomo; Tenori, Leonardo; Cacciatore, Stefano; Aimetti, Mario
2018-06-07
Recent findings about the differential gene expression signature of periodontal lesions have raised the hypothesis of distinctive biological phenotypes expressed by generalized chronic periodontitis (GCP) and generalized aggressive periodontitis (GAgP) patients. Therefore, this cross-sectional investigation was planned, primarily, to determine the ability of nuclear magnetic resonance (NMR) spectroscopic analysis of unstimulated whole saliva to discriminate GCP and GAgP disease-specific metabolomic fingerprints and, secondarily, to assess potential metabolites discriminating periodontitis patients from periodontally healthy individuals (HI). NMR-metabolomics spectra were acquired from salivary samples of patients with a clinical diagnosis of GCP (n = 33) or GAgP (n = 28) and from HI (n = 39). The clustering of HI, GCP and GAgP patients was achieved by using a combination of Principal Component Analysis and Canonical Correlation Analysis on the NMR profiles. These analyses revealed a significant predictive accuracy discriminating HI from GCP, and discriminating HI from GAgP patients (both 81%). In contrast, the GAgP and GCP saliva samples seem to belong to the same metabolic space (60% predictive accuracy). Significantly lower levels (P < 0.05) of pyruvate, N-acetyl groups and lactate and higher levels (P < 0.05) of proline, phenylalanine, and tyrosine were found in GCP and GAgP patients compared with HI. Within the limitations of this study, GCP and GAgP metabolomic profiles were not unequivocally discriminated through an NMR-based spectroscopic analysis of saliva. This article is protected by copyright. All rights reserved.
NASA Astrophysics Data System (ADS)
Fletcher, John S.; Henderson, Alexander; Jarvis, Roger M.; Lockyer, Nicholas P.; Vickerman, John C.; Goodacre, Royston
2006-07-01
Advances in time of flight secondary ion mass spectrometry (ToF-SIMS) have enabled this technique to become a powerful tool for the analysis of biological samples. Such samples are often very complex and as a result full interpretation of the acquired data can be extremely difficult. To simplify the interpretation of these information rich data, the use of chemometric techniques is becoming widespread in the ToF-SIMS community. Here we discuss the application of principal components-discriminant function analysis (PC-DFA) to the separation and classification of a number of bacterial samples that are known to be major causal agents of urinary tract infection. A large data set has been generated using three biological replicates of each isolate and three machine replicates were acquired from each biological replicate. Ordination plots generated using the PC-DFA are presented demonstrating strain level discrimination of the bacteria. The results are discussed in terms of biological differences between certain species and with reference to FT-IR, Raman spectroscopy and pyrolysis mass spectrometric studies of similar samples.
Eye-gaze control of the computer interface: Discrimination of zoom intent
DOE Office of Scientific and Technical Information (OSTI.GOV)
Goldberg, J.H.; Schryver, J.C.
1993-10-01
An analysis methodology and associated experiment were developed to assess whether definable and repeatable signatures of eye-gaze characteristics are evident, preceding a decision to zoom-in, zoom-out, or not to zoom at a computer interface. This user intent discrimination procedure can have broad application in disability aids and telerobotic control. Eye-gaze was collected from 10 subjects in a controlled experiment, requiring zoom decisions. The eye-gaze data were clustered, then fed into a multiple discriminant analysis (MDA) for optimal definition of heuristics separating the zoom-in, zoom-out, and no-zoom conditions. Confusion matrix analyses showed that a number of variable combinations classified at a statistically significant level, but practical significance was more difficult to establish. Composite contour plots demonstrated the regions in parameter space consistently assigned by the MDA to unique zoom conditions. Peak classification occurred at about 1200-1600 msec. Improvements in the methodology to achieve practical real-time zoom control are considered.
NASA Astrophysics Data System (ADS)
Koma, Zsófia; Deák, Márton; Kovács, József; Székely, Balázs; Kelemen, Kristóf; Standovár, Tibor
2016-04-01
Airborne Laser Scanning (ALS) is a widely used technology for forestry classification applications. However, single tree detection and species classification from a low density ALS point cloud is limited in a dense forest region. In this study we investigate the division of a forest into homogeneous groups at stand level. The study area is located in the Aggtelek karst region (Northeast Hungary) with a complex relief topography. The ALS dataset contained only 4 discrete echoes (at 2-4 pt/m² density) from the study area during leaf-on season. Ground-truth measurements of canopy closure and proportion of tree species cover are available every 70 meters in 500 square meter circular plots. In the first step, ALS data were processed and geometrical and intensity based features were calculated on a 5×5 meter raster grid. The derived features comprised basic statistics of relative height, canopy RMS, echo ratio, openness, pulse penetration ratio, and basic statistics of the radiometric feature. In the second step the data were investigated using Combined Cluster and Discriminant Analysis (CCDA, Kovács et al., 2014). The CCDA method first determines a basic grouping for the multiple circle shaped sampling locations using hierarchical clustering, and then, for the resulting grouping possibilities, a core cycle is executed comparing the goodness of the investigated groupings with random ones. These comparisons produce difference values, which yield information about the optimal grouping among the investigated ones. If sub-groups are then further investigated, one might even find homogeneous groups. We found that classification of low density ALS data into homogeneous groups is highly dependent on canopy closure and the proportion of the dominant tree species. The presented results show high potential for using CCDA to determine homogeneous separable groups in LiDAR based tree species classification.
Aggtelek Karst/Slovakian Karst Caves" (HUSK/1101/221/0180, Aggtelek NP), data evaluation: 'Multipurpose assessment serving forest biodiversity conservation in the Carpathian region of Hungary', Swiss-Hungarian Cooperation Programme (SH/4/13 Project). BS contributed as an Alexander von Humboldt Research Fellow. J. Kovács, S. Kovács, N. Magyar, P. Tanos, I. G. Hatvani, and A. Anda (2014), Classification into homogeneous groups using combined cluster and discriminant analysis, Environmental Modelling & Software, 57, 52-59.
Identification of chronic rhinosinusitis phenotypes using cluster analysis.
Soler, Zachary M; Hyer, J Madison; Ramakrishnan, Viswanathan; Smith, Timothy L; Mace, Jess; Rudmik, Luke; Schlosser, Rodney J
2015-05-01
Current clinical classifications of chronic rhinosinusitis (CRS) have been largely defined based upon preconceived notions of factors thought to be important, such as polyp or eosinophil status. Unfortunately, these classification systems have little correlation with symptom severity or treatment outcomes. Unsupervised clustering can be used to identify phenotypic subgroups of CRS patients, describe clinical differences in these clusters and define simple algorithms for classification. A multi-institutional, prospective study of 382 patients with CRS who had failed initial medical therapy completed the Sino-Nasal Outcome Test (SNOT-22), Rhinosinusitis Disability Index (RSDI), Medical Outcomes Study Short Form-12 (SF-12), Pittsburgh Sleep Quality Index (PSQI), and Patient Health Questionnaire (PHQ-2). Objective measures of CRS severity included Brief Smell Identification Test (B-SIT), CT, and endoscopy scoring. All variables were reduced and unsupervised hierarchical clustering was performed. After clusters were defined, variations in medication usage were analyzed. Discriminant analysis was performed to develop a simplified, clinically useful algorithm for clustering. Clustering was largely determined by age, severity of patient reported outcome measures, depression, and fibromyalgia. CT and endoscopy varied somewhat among clusters. Traditional clinical measures, including polyp/atopic status, prior surgery, B-SIT and asthma, did not vary among clusters. A simplified algorithm based upon productivity loss, SNOT-22 score, and age predicted clustering with 89% accuracy. Medication usage among clusters did vary significantly. A simplified algorithm based upon hierarchical clustering is able to classify CRS patients and predict medication usage. Further studies are warranted to determine if such clustering predicts treatment outcomes. © 2015 ARS-AAOA, LLC.
Discriminating Characteristics of Tectonic and Human-Induced Seismicity
NASA Astrophysics Data System (ADS)
Zaliapin, I. V.; Ben-Zion, Y.
2015-12-01
We analyze statistical features of background and clustered subpopulations of earthquakes in different regions in an effort to distinguish between human-induced and natural seismicity. Analysis of "end-member" areas known to be dominated by human-induced earthquakes (the Geyser geothermal field in northern California and TauTona gold mine in South Africa) and regular tectonic activity (the San Jacinto fault zone in southern California and Coso region excluding the Coso geothermal field in eastern central California) reveals several distinguishing characteristics. Induced seismicity is shown to have (i) higher rate of background events (both absolute and relative to the total rate), (ii) faster temporal offspring decay, (iii) higher intensity of repeating events, (iv) larger proportion of small clusters, and (v) larger spatial separation between parent and offspring, compared to regular tectonic activity. These differences also successfully discriminate seismicity within the Coso and Salton Sea geothermal fields in California before and after the expansion of geothermal production during the 1980s.
[Nondestructive discrimination of strawberry varieties by NIR and BP-ANN].
Niu, Xiao-ying; Shao, Li-min; Zhao, Zhi-lei; Zhang, Xiao-yu
2012-08-01
Strawberry variety is a main factor that can influence strawberry fruit quality. The use of near-infrared reflectance spectroscopy was explored to discriminate among samples of strawberries of different varieties. The significance of the differences among varieties was analyzed by comparing the chemical composition of samples of the different varieties. The performance of models established using back propagation-artificial neural networks (BP-ANN), least squares-support vector machine and discriminant analysis was evaluated on the spectral range of 4545-9090 cm⁻¹. The optimal model was obtained by BP-ANN with a topology of 12-18-3, which correctly classified 96.68% of the calibration set and 97.14% of the prediction set. Correct classification rates of 94.95%, 97% and 98.29% were obtained for "Tianbao" (n=99), "Fengxiang" (n=100) and "Mingxing" (n=117), respectively. One-way analysis of variance was used to compare the mean values of soluble solids content (SSC), titratable acid (TA), pH value and SSC-TA ratio, and statistically significant differences were found. Principal component analysis was performed on the four chemical composition measures, and obvious clustering tendencies for the different varieties were found. These results showed that NIR combined with BP-ANN can discriminate strawberries of different varieties effectively, and that the differences in chemical composition among varieties might provide a chemical validation of the NIR results.
Moran, James J; Ehrhardt, Christopher J; Wahl, Jon H; Kreuzer, Helen W; Wahl, Karen L
2013-11-15
We analyzed 21 neat acetone samples from 15 different suppliers to demonstrate the utility of a coupled stable isotope and trace contaminant strategy for distinguishing forensically-relevant samples. By combining these two pieces of orthogonal data we could discriminate all of the acetones that were produced by the 15 different suppliers. Using stable isotope ratios alone, we were able to distinguish 8 acetone samples, while the remaining 13 fell into four clusters with highly similar signatures. Adding trace chemical contaminant information enhanced discrimination to 13 individual acetones with three residual clusters. The acetones within each cluster shared a common manufacturer and might, therefore, not be expected to be resolved. The data presented here demonstrates the power of combining orthogonal data sets to enhance sample fingerprinting and highlights the role disparate data could play in future forensic investigations. © 2013 The Authors. Published by Elsevier B.V. All rights reserved.
Gap Shape Classification using Landscape Indices and Multivariate Statistics
Wu, Chih-Da; Cheng, Chi-Chuan; Chang, Che-Chang; Lin, Chinsu; Chang, Kun-Cheng; Chuang, Yung-Chung
2016-01-01
This study proposed a novel methodology to classify the shape of gaps using landscape indices and multivariate statistics. Patch-level indices were used to collect the qualified shape and spatial configuration characteristics for canopy gaps in the Lienhuachih Experimental Forest in Taiwan in 1998 and 2002. Non-hierarchical cluster analysis was used to assess the optimal number of gap clusters and canonical discriminant analysis was used to generate the discriminant functions for canopy gap classification. The gaps for the two periods were optimally classified into three categories. In general, gap type 1 had a more complex shape, gap type 2 was more elongated and gap type 3 had the largest gaps that were more regular in shape. The results were evaluated using Wilks’ lambda as satisfactory (p < 0.001). The agreement rate of confusion matrices exceeded 96%. Differences in gap characteristics between the classified gap types that were determined using a one-way ANOVA showed a statistical significance in all patch indices (p = 0.00), except for the Euclidean nearest neighbor distance (ENN) in 2002. Taken together, these results demonstrated the feasibility and applicability of the proposed methodology to classify the shape of a gap. PMID:27901127
Muller, E; Gargani, D; Banuls, A L; Tibayrenc, M; Dollet, M
1997-10-01
The genetic polymorphism of 30 isolates of plant trypanosomatids colloquially referred to as plant trypanosomes was assayed by means of RAPD. The principal objectives of this study were to assess the discriminative power of RAPD analysis for studying plant trypanosomes and to determine whether the results obtained were comparable with those from a previous isoenzyme (MLEE) study. The principal groups of plant trypanosomes identified previously by isoenzyme analysis (intraphloemic trypanosomes, intralaticiferous trypanosomes and trypanosomes isolated from fruits) were also clearly separated by the RAPD technique. Moreover, the results showed a fair parity between MLEE and RAPD data (coefficient of correlation = 0.84) and the two techniques have comparable discriminative ability. Most of the separation revealed by the two techniques between the clusters was associated with major biological properties. However, the RAPD technique gave a more coherent separation than MLEE because the intraphloemic isolates, which were biologically similar in terms of their specific localization in the sieve tubes of the plant, were found to be in closer groups by RAPD. For both techniques, the existence of the main clusters was correlated with the existence of synapomorphic characters, which could be used as powerful tools in taxonomy and epidemiology.
Gap Shape Classification using Landscape Indices and Multivariate Statistics.
Wu, Chih-Da; Cheng, Chi-Chuan; Chang, Che-Chang; Lin, Chinsu; Chang, Kun-Cheng; Chuang, Yung-Chung
2016-11-30
This study proposed a novel methodology to classify the shape of gaps using landscape indices and multivariate statistics. Patch-level indices were used to collect the qualified shape and spatial configuration characteristics for canopy gaps in the Lienhuachih Experimental Forest in Taiwan in 1998 and 2002. Non-hierarchical cluster analysis was used to assess the optimal number of gap clusters and canonical discriminant analysis was used to generate the discriminant functions for canopy gap classification. The gaps for the two periods were optimally classified into three categories. In general, gap type 1 had a more complex shape, gap type 2 was more elongated and gap type 3 had the largest gaps that were more regular in shape. The results were evaluated using Wilks' lambda as satisfactory (p < 0.001). The agreement rate of confusion matrices exceeded 96%. Differences in gap characteristics between the classified gap types that were determined using a one-way ANOVA showed a statistical significance in all patch indices (p = 0.00), except for the Euclidean nearest neighbor distance (ENN) in 2002. Taken together, these results demonstrated the feasibility and applicability of the proposed methodology to classify the shape of a gap.
Cebi, Nur; Yilmaz, Mustafa Tahsin; Sagdic, Osman
2017-08-15
Sibutramine may be illicitly included in herbal slimming foods and supplements marketed as "100% natural" to enhance weight loss. Considering public health and legal regulations, there is an urgent need for effective, rapid and reliable techniques to detect sibutramine in dietetic herbal foods, teas and dietary supplements. This research comprehensively explored, for the first time, detection of sibutramine in green tea, green coffee and mixed herbal tea using the ATR-FTIR spectroscopic technique combined with chemometrics. Hierarchical cluster analysis and principal component analysis (PCA) techniques were employed in the spectral range 2746-2656 cm⁻¹ for classification and discrimination through Euclidean distance and Ward's algorithm. Unadulterated and adulterated samples were classified and discriminated with respect to their sibutramine contents with perfect accuracy without any false prediction. The results suggest that the presence of the active substance could be successfully determined at levels in the range of 0.375-12 mg in a total of 1.75 g of green tea, green coffee and mixed herbal tea by using the FTIR-ATR technique combined with chemometrics. Copyright © 2017 Elsevier Ltd. All rights reserved.
Discrimination of complex mixtures by a colorimetric sensor array: coffee aromas.
Suslick, Benjamin A; Feng, Liang; Suslick, Kenneth S
2010-03-01
The analysis of complex mixtures presents a difficult challenge even for modern analytical techniques, and the ability to discriminate among closely similar such mixtures often remains problematic. Coffee provides a readily available archetype of such highly multicomponent systems. The use of a low-cost, sensitive colorimetric sensor array for the detection and identification of coffee aromas is reported. The color changes of the sensor array were used as a digital representation of the array response and analyzed with standard statistical methods, including principal component analysis (PCA) and hierarchical clustering analysis (HCA). PCA revealed that the sensor array has exceptionally high dimensionality with 18 dimensions required to define 90% of the total variance. In quintuplicate runs of 10 commercial coffees and controls, no confusions or errors in classification by HCA were observed in 55 trials. In addition, the effects of temperature and time in the roasting of green coffee beans were readily observed and distinguishable with a resolution better than 10 degrees C and 5 min, respectively. Colorimetric sensor arrays demonstrate excellent potential for complex systems analysis in real-world applications and provide a novel method for discrimination among closely similar complex mixtures.
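The statement that 18 dimensions are required to define 90% of the total variance corresponds to a cumulative explained-variance count over the PCA eigenvalues. A minimal sketch of that count follows; the function name and the eigenvalues are hypothetical and are not the sensor-array data.

```python
def components_for_variance(eigenvalues, target=0.90):
    """Return the smallest number of leading principal components whose
    eigenvalues sum to at least `target` of the total variance."""
    if not 0.0 < target <= 1.0:
        raise ValueError("target must be in (0, 1]")
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for k, value in enumerate(ordered, start=1):
        running += value
        if running >= target * total:
            return k
    return len(ordered)


# Hypothetical eigenvalues: cumulative shares are 0.60, 0.85, 0.95, 1.00.
toy_eigenvalues = [6.0, 2.5, 1.0, 0.5]
print(components_for_variance(toy_eigenvalues, 0.90))
```

A data set dominated by one or two components would return a small count here; a large count, as reported for the sensor array, indicates that variance is spread across many independent response dimensions.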
Discrimination of Complex Mixtures by a Colorimetric Sensor Array: Coffee Aromas
Suslick, Benjamin A.; Feng, Liang; Suslick, Kenneth S.
2010-01-01
The analysis of complex mixtures presents a difficult challenge even for modern analytical techniques, and the ability to discriminate among closely similar such mixtures often remains problematic. Coffee provides a readily available archetype of such highly multicomponent systems. The use of a low-cost, sensitive colorimetric sensor array for the detection and identification of coffee aromas is reported. The color changes of the sensor array were used as a digital representation of the array response and analyzed with standard statistical methods, including principal component analysis (PCA) and hierarchical clustering analysis (HCA). PCA revealed that the sensor array has exceptionally high dimensionality with 18 dimensions required to define 90% of the total variance. In quintuplicate runs of 10 commercial coffees and controls, no confusions or errors in classification by HCA were observed in 55 trials. In addition, the effects of temperature and time in the roasting of green coffee beans were readily observed and distinguishable with a resolution better than 10 °C and 5 min, respectively. Colorimetric sensor arrays demonstrate excellent potential for complex systems analysis in real-world applications and provide a novel method for discrimination among closely similar complex mixtures. PMID:20143838
Adnane, Choaib; Adouly, Taoufik; Khallouk, Amine; Rouadi, Sami; Abada, Redallah; Roubal, Mohamed; Mahtar, Mohamed
2017-02-01
The purpose of this study is to use unsupervised cluster methodology to identify phenotype and mucosal eosinophilia endotype subgroups of patients with medical refractory chronic rhinosinusitis (CRS), and evaluate the difference in quality of life (QOL) outcomes after endoscopic sinus surgery (ESS) between these clusters for better surgical case selection. A prospective cohort study included 131 patients with medical refractory CRS who elected ESS. The Sino-Nasal Outcome Test (SNOT-22) was used to evaluate QOL before and 12 months after surgery. Unsupervised two-step clustering method was performed. One hundred and thirteen subjects were retained in this study: 46 patients with CRS without nasal polyps and 67 patients with nasal polyps. Nasal polyps, gender, mucosal eosinophilia profile, and prior sinus surgery were the most discriminating factors in the generated clusters. Three clusters were identified. A significant clinical improvement was observed in all clusters 12 months after surgery with a reduction of SNOT-22 scores. There was a significant difference in QOL outcomes between clusters; cluster 1 had the worst QOL improvement after FESS in comparison with the other clusters 2 and 3. All patients in cluster 1 presented CRSwNP with the highest mucosal eosinophilia endotype. Clustering method is able to classify CRS phenotypes and endotypes with different associated surgical outcomes.
Study on fast discrimination of varieties of yogurt using Vis/NIR-spectroscopy
NASA Astrophysics Data System (ADS)
He, Yong; Feng, Shuijuan; Deng, Xunfei; Li, Xiaoli
2006-09-01
A new approach for discrimination of varieties of yogurt by means of Vis/NIR spectroscopy is presented in this paper. Firstly, through principal component analysis (PCA) of the spectral curves of 5 typical kinds of yogurt, the yogurt varieties were clustered. The analysis results showed that the cumulative reliability of PC1 and PC2 (the first two principal components) was more than 98.956%, and the cumulative reliability from PC1 to PC7 (the first seven principal components) was 99.97%. Secondly, a back-propagation artificial neural network (ANN-BP) discrimination model was set up. The first seven principal components of the samples were applied as ANN-BP inputs and the values of the yogurt type were applied as outputs, and a three-layer ANN-BP model was then built. In this model, each yogurt variety included 27 samples, for a total of 135 samples, and the remaining 25 samples were used as the prediction set. The results showed that the discrimination rate for the five yogurt varieties was 100%, indicating that the model was reliable and practicable. A new approach for the rapid and non-destructive discrimination of varieties of yogurt was thus put forward.
Using radar imagery for crop discrimination: a statistical and conditional probability study
Haralick, R.M.; Caspall, F.; Simonett, D.S.
1970-01-01
A number of the constraints with which remote sensing must contend in crop studies are outlined. They include sensor, identification accuracy, and congruencing constraints; the nature of the answers demanded of the sensor system; and the complex temporal variances of crops in large areas. Attention is then focused on several methods which may be used in the statistical analysis of multidimensional remote sensing data. Crop discrimination for radar K-band imagery is investigated by three methods. The first one uses a Bayes decision rule, the second a nearest-neighbor spatial conditional probability approach, and the third the standard statistical techniques of cluster analysis and principal axes representation. Results indicate that crop type and percent of cover significantly affect the strength of the radar return signal. Sugar beets, corn, and very bare ground are easily distinguishable; sorghum, alfalfa, and young wheat are harder to distinguish. Distinguishability will be improved if the imagery is examined in time sequence so that changes between times of planting, maturation, and harvest provide additional discriminant tools. A comparison between radar and photography indicates that radar performed surprisingly well in crop discrimination in western Kansas and warrants further study.
Sakhteman, Amirhossein; Faridi, Pouya; Daneshamouz, Saeid; Akbarizadeh, Amin Reza; Borhani-Haghighi, Afshin; Mohagheghzadeh, Abdolali
2017-01-01
Herbal oils have been widely used in Iran as medicinal compounds for thousands of years. Chamomile oil is a widely used example of a traditional oil. We remade chamomile oils and tried to modify them with current knowledge and facilities. Six types of oil (traditional and modified) were prepared. Microbial limit tests and physicochemical tests were performed on them. Also, principal component analysis, hierarchical cluster analysis, and partial least squares discriminant analysis were done on the attenuated total reflectance–infrared spectral data in order to obtain insight into the classification pattern of the samples. The results show that modified versions of the chamomile oils (modified Clevenger-type apparatus method and microwave method) can be used with the same content as the traditional ones and with less microbial contamination and better physicochemical properties. PMID:28585466
Zargaran, Arman; Sakhteman, Amirhossein; Faridi, Pouya; Daneshamouz, Saeid; Akbarizadeh, Amin Reza; Borhani-Haghighi, Afshin; Mohagheghzadeh, Abdolali
2017-10-01
Herbal oils have been widely used in Iran as medicinal compounds for thousands of years. Chamomile oil is a widely used example of a traditional oil. We remade chamomile oils and tried to modify them with current knowledge and facilities. Six types of oil (traditional and modified) were prepared. Microbial limit tests and physicochemical tests were performed on them. Also, principal component analysis, hierarchical cluster analysis, and partial least squares discriminant analysis were done on the attenuated total reflectance-infrared spectral data in order to obtain insight into the classification pattern of the samples. The results show that modified versions of the chamomile oils (modified Clevenger-type apparatus method and microwave method) can be used with the same content as the traditional ones and with less microbial contamination and better physicochemical properties.
Towards the identification of plant and animal binders on Australian stone knives.
Blee, Alisa J; Walshe, Keryn; Pring, Allan; Quinton, Jamie S; Lenehan, Claire E
2010-07-15
There is limited information regarding the nature of plant and animal residues used as adhesives, fixatives and pigments found on Australian Aboriginal artefacts. This paper reports the use of FTIR in combination with the chemometric tools principal component analysis (PCA) and hierarchical clustering (HC) for the analysis and identification of Australian plant and animal fixatives on Australian stone artefacts. Ten different plant and animal residues were able to be discriminated from each other at a species level by combining FTIR spectroscopy with the chemometric data analysis methods, principal component analysis (PCA) and hierarchical clustering (HC). Application of this method to residues from three broken stone knives from the collections of the South Australian Museum indicated that two of the handles of knives were likely to have contained beeswax as the fixative whilst Spinifex resin was the probable binder on the third. Copyright 2010 Elsevier B.V. All rights reserved.
Kuhlmey, J; Lautsch, E
1980-01-01
In our second report on the investigation of the need for cultural entertainment among residents of geriatric nursing homes, we tested the influence of the factors age, sex, kind of work and duration of stay in the nursing home singly and successively for each individual indicator of this complex need. In this third report, the influence of these four factors on the indicators was investigated with simultaneous consideration of their mutual dependency. The mutual dependency of the factors was represented by typification (cluster analysis). The cluster analysis yielded classes in which similarly disposed residents belong to the same class. The average level of the indicators in these classes was obtained, and differences were analysed by statistical methods (multidimensional analysis of variance and discriminant analysis).
NASA Astrophysics Data System (ADS)
Longo, S.; Roney, J. M.
2018-03-01
Pulse shape discrimination using CsI(Tl) scintillators to perform neutral hadron particle identification is explored with emphasis towards application at high energy electron-positron collider experiments. Through the analysis of the pulse shape differences between scintillation pulses from photon and hadronic energy deposits using neutron and proton data collected at TRIUMF, it is shown that the pulse shape variations observed for hadrons can be modelled using a third scintillation component for CsI(Tl), in addition to the standard fast and slow components. Techniques for computing the hadronic pulse amplitudes and shape variations are developed and it is shown that the intensity of the additional scintillation component can be computed from the ionization energy loss of the interacting particles. These pulse modelling and simulation methods are integrated with GEANT4 simulation libraries and the predicted pulse shape for CsI(Tl) crystals in a 5 × 5 array of 5 × 5 × 30 cm³ crystals is studied for hadronic showers from 0.5 and 1 GeV/c K_L^0 and neutron particles. Using a crystal level and cluster level approach for photon vs. hadron cluster separation we demonstrate proof-of-concept for neutral hadron detection using CsI(Tl) pulse shape discrimination in high energy electron-positron collider experiments.
Hafen, G M; Hurst, C; Yearwood, J; Smith, J; Dzalilov, Z; Robinson, P J
2008-10-05
Cystic fibrosis is the most common fatal genetic disorder in the Caucasian population. Scoring systems for assessment of cystic fibrosis disease severity have been used for almost 50 years, without being adapted to the milder phenotype of the disease in the 21st century. The aim of this current project is to develop a new scoring system using a database and employing various statistical tools. This study protocol reports the development of the statistical tools in order to create such a scoring system. The evaluation is based on the Cystic Fibrosis database from the cohort at the Royal Children's Hospital in Melbourne. Initially, unsupervised clustering of all the data records was performed using a range of clustering algorithms. In particular, incremental clustering algorithms were used. The clusters obtained were characterised using rules from decision trees and the results examined by clinicians. In order to obtain a clearer definition of classes, expert opinion of each individual's clinical severity was sought. After data preparation including expert opinion of an individual's clinical severity on a 3-point scale (mild, moderate and severe disease), two multivariate techniques were used throughout the analysis to establish a method that would have a better success in feature selection and model derivation: 'Canonical Analysis of Principal Coordinates' and 'Linear Discriminant Analysis'. A 3-step procedure was performed with (1) selection of features, (2) extracting 5 severity classes out of a 3 severity class as defined per expert opinion and (3) establishment of calibration datasets.
(1) Feature selection: CAP has a more effective "modelling" focus than DA. (2) Extraction of 5 severity classes: after variables were identified as important in discriminating contiguous CF severity groups on the 3-point scale as mild/moderate and moderate/severe, Discriminant Function (DF) was used to determine the new groups mild, intermediate moderate, moderate, intermediate severe and severe disease. (3) Generated confusion tables showed a misclassification rate of 19.1% for males and 16.5% for females, with a majority of misallocations into adjacent severity classes particularly for males. Our preliminary data show that using CAP for detection of selection features and Linear DA to derive the actual model in a CF database might be helpful in developing a scoring system. However, there are several limitations, particularly more data entry points are needed to finalize a score and the statistical tools have further to be refined and validated, with re-running the statistical methods in the larger dataset.
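The two quantities read from the confusion tables above, the misclassification rate and the share of errors that fall into an adjacent severity class, can be computed as in the following sketch. It assumes ordered classes with rows as expert-assigned classes and columns as predicted classes; the function name and the toy counts are hypothetical and are not the study's data.

```python
def misclassification_summary(confusion):
    """Return (misclassification rate, share of errors in adjacent classes)
    for a square confusion table whose classes are in severity order."""
    n = len(confusion)
    if any(len(row) != n for row in confusion):
        raise ValueError("confusion table must be square")
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n))
    errors = total - correct
    # Errors one step away from the diagonal are adjacent-class misallocations.
    adjacent = sum(
        confusion[i][j]
        for i in range(n)
        for j in range(n)
        if abs(i - j) == 1
    )
    rate = errors / total
    adjacent_share = adjacent / errors if errors else 0.0
    return rate, adjacent_share


# Toy 3-class table (mild, moderate, severe) with hypothetical counts.
toy_table = [
    [8, 1, 1],
    [1, 7, 2],
    [0, 1, 9],
]
rate, adjacent_share = misclassification_summary(toy_table)
print(f"misclassification rate: {rate:.3f}, adjacent share: {adjacent_share:.3f}")
```

A high adjacent share means most errors are near misses between neighbouring severity classes rather than confusions between mild and severe disease.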
Katayama, K; Sato, T; Arai, T; Amao, H; Ohta, Y; Ozawa, T; Kenyon, P R; Hickson, R E; Tazaki, H
2013-02-01
Simple liquid chromatography-mass spectrometry (LC-MS) was applied to non-targeted metabolic analyses to discover new metabolic markers in animal plasma. Principal component analysis (PCA) and partial least squares-discriminant analysis (PLS-DA) were used to analyse LC-MS multivariate data. PCA clearly generated two separate clusters for artificially induced diabetic mice and healthy control mice. PLS-DA of time-course changes in plasma metabolites of chicks after feeding generated three clusters (pre- and immediately after feeding, 0.5-3 h after feeding and 4 h after feeding). Two separate clusters were also generated for plasma metabolites of pregnant Angus heifers with differing live-weight change profiles (gaining or losing). The accompanying PLS-DA loading plot detailed the metabolites that contribute the most to the cluster separation. In each case, the same highly hydrophilic metabolite was strongly correlated to the group separation. The metabolite was identified as betaine by LC-MS/MS. This result indicates that betaine and its metabolic precursor, choline, may be useful biomarkers to evaluate the nutritional and metabolic status of animals. © 2011 Blackwell Verlag GmbH.
Real Time Intelligent Target Detection and Analysis with Machine Vision
NASA Technical Reports Server (NTRS)
Howard, Ayanna; Padgett, Curtis; Brown, Kenneth
2000-01-01
We present an algorithm for detecting a specified set of targets for an Automatic Target Recognition (ATR) application. ATR involves processing images for detecting, classifying, and tracking targets embedded in a background scene. We address the problem of discriminating between targets and nontarget objects in a scene by evaluating 40x40 image blocks belonging to an image. Each image block is first projected onto a set of templates specifically designed to separate images of targets embedded in a typical background scene from those background images without targets. These filters are found using directed principal component analysis which maximally separates the two groups. The projected images are then clustered into one of n classes based on a minimum distance to a set of n cluster prototypes. These cluster prototypes have previously been identified using a modified clustering algorithm based on prior sensed data. Each projected image pattern is then fed into the associated cluster's trained neural network for classification. A detailed description of our algorithm will be given in this paper. We outline our methodology for designing the templates, describe our modified clustering algorithm, and provide details on the neural network classifiers. Evaluation of the overall algorithm demonstrates that our detection rates approach 96% with a false positive rate of less than 0.03%.
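The routing step described above (assign each projected block to the nearest of n cluster prototypes, then hand it to that cluster's classifier) can be sketched as below. This is an illustration under stated assumptions: Euclidean distance is assumed, the function names are hypothetical, and simple threshold functions stand in for the trained neural networks.

```python
import math


def nearest_prototype(vector, prototypes):
    """Return the index of the prototype closest to vector (Euclidean distance)."""
    return min(range(len(prototypes)), key=lambda k: math.dist(vector, prototypes[k]))


def classify(vector, prototypes, classifiers):
    """Route vector to the classifier attached to its nearest prototype."""
    k = nearest_prototype(vector, prototypes)
    return k, classifiers[k](vector)


# Hypothetical 2-D projections and stand-in classifiers; a real system
# would use one trained network per cluster here.
prototypes = [(0.0, 0.0), (5.0, 5.0)]
classifiers = [
    lambda v: "target" if v[0] > 1.0 else "nontarget",
    lambda v: "target" if v[1] > 4.0 else "nontarget",
]
result = classify((4.0, 4.5), prototypes, classifiers)
```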
Motegi, Hiromi; Tsuboi, Yuuri; Saga, Ayako; Kagami, Tomoko; Inoue, Maki; Toki, Hideaki; Minowa, Osamu; Noda, Tetsuo; Kikuchi, Jun
2015-11-04
There is an increasing need to use multivariate statistical methods for understanding biological functions, identifying the mechanisms of diseases, and exploring biomarkers. In addition to classical analyses such as hierarchical cluster analysis, principal component analysis, and partial least squares discriminant analysis, various multivariate strategies, including independent component analysis, non-negative matrix factorization, and multivariate curve resolution, have recently been proposed. However, determining the number of components is problematic. Despite the proposal of several different methods, no satisfactory approach has yet been reported. To resolve this problem, we implemented a new idea: classifying a component as "reliable" or "unreliable" based on the reproducibility of its appearance, regardless of the number of components in the calculation. Using the clustering method for classification, we applied this idea to multivariate curve resolution-alternating least squares (MCR-ALS). Comparisons between conventional and modified methods applied to proton nuclear magnetic resonance ((1)H-NMR) spectral datasets derived from known standard mixtures and biological mixtures (urine and feces of mice) revealed that more plausible results are obtained by the modified method. In particular, clusters containing little information were detected with reliability. This strategy, named "cluster-aided MCR-ALS," will facilitate the attainment of more reliable results in the metabolomics datasets.
Cluster-based exposure variation analysis
2013-01-01
Background Static posture, repetitive movements and lack of physical variation are known risk factors for work-related musculoskeletal disorders, and thus need to be properly assessed in occupational studies. The aims of this study were (i) to investigate the effectiveness of a conventional exposure variation analysis (EVA) in discriminating exposure time lines and (ii) to compare it with a new cluster-based method for analysis of exposure variation. Methods For this purpose, we simulated a repeated cyclic exposure varying within each cycle between “low” and “high” exposure levels in a “near” or “far” range, and with “low” or “high” velocities (exposure change rates). The duration of each cycle was also manipulated by selecting a “small” or “large” standard deviation of the cycle time. These parameters reflected three dimensions of exposure variation, i.e. range, frequency and temporal similarity. Each simulation trace included two realizations of 100 concatenated cycles with either low (ρ = 0.1), medium (ρ = 0.5) or high (ρ = 0.9) correlation between the realizations. These traces were analyzed by conventional EVA, and a novel cluster-based EVA (C-EVA). Principal component analysis (PCA) was applied on the marginal distributions of 1) the EVA of each of the realizations (univariate approach), 2) a combination of the EVA of both realizations (multivariate approach) and 3) C-EVA. The least number of principal components describing more than 90% of variability in each case was selected and the projection of marginal distributions along the selected principal components was calculated. A linear classifier was then applied to these projections to discriminate between the simulated exposure patterns, and the accuracy of classified realizations was determined. 
Results C-EVA classified exposures more correctly than univariate and multivariate EVA approaches; classification accuracy was 49%, 47% and 52% for EVA (univariate and multivariate), and C-EVA, respectively (p < 0.001). All three methods performed poorly in discriminating exposure patterns differing with respect to the variability in cycle time duration. Conclusion While C-EVA had a higher accuracy than conventional EVA, both failed to detect differences in temporal similarity. The data-driven optimality of data reduction and the capability of handling multiple exposure time lines in a single analysis are the advantages of the C-EVA. PMID:23557439
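The component-selection rule in the Methods of this record (keep the least number of principal components describing more than 90% of variability) can be sketched as follows. The function name and the variance values are hypothetical; the per-component variances are assumed to be already available from a PCA.

```python
def components_needed(variances, threshold=0.90):
    """Smallest k such that the k largest variances explain more than threshold."""
    ordered = sorted(variances, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total > threshold:
            return k
    # Only reached when the threshold cannot be exceeded (e.g. threshold >= 1).
    return len(ordered)


# Hypothetical per-component variances (eigenvalues); not the study's data.
k = components_needed([5.0, 3.0, 1.5, 0.5])
```

Here the cumulative shares are 0.50, 0.80 and 0.95, so three components are retained.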
Classification of arrhythmia using hybrid networks.
Haseena, Hassan H; Joseph, Paul K; Mathew, Abraham T
2011-12-01
Reliable detection of arrhythmias based on digital processing of Electrocardiogram (ECG) signals is vital in providing suitable and timely treatment to a cardiac patient. Due to corruption of ECG signals with multiple frequency noise and the presence of multiple arrhythmic events in a cardiac rhythm, computerized interpretation of abnormal ECG rhythms is a challenging task. This paper focuses on a Fuzzy C-Means (FCM) clustered Probabilistic Neural Network (PNN) and Multi Layered Feed Forward Network (MLFFN) for the discrimination of eight types of ECG beats. Parameters such as fourth order Auto Regressive (AR) coefficients along with Spectral Entropy (SE) are extracted from each ECG beat and feature reduction has been carried out using FCM clustering. The cluster centers form the input of the neural network classifiers. The extensive analysis of the Massachusetts Institute of Technology-Beth Israel Hospital (MIT-BIH) arrhythmia database shows that the FCM clustered PNN is superior to the FCM clustered MLFFN in cardiac arrhythmia classification, with overall accuracies of 99.05% and 97.14%, respectively.
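For readers unfamiliar with FCM, the membership step of the standard fuzzy c-means formulation can be sketched as below. This is the textbook update rule, not necessarily the exact variant used in the paper; the fuzzifier m = 2 and the sample points are assumptions chosen for illustration.

```python
import math


def fcm_memberships(point, centers, m=2.0):
    """Standard fuzzy c-means membership of point in each cluster center."""
    distances = [math.dist(point, c) for c in centers]
    # A point lying exactly on one or more centers belongs fully to them.
    if any(d == 0.0 for d in distances):
        hits = [1.0 if d == 0.0 else 0.0 for d in distances]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_j) ** exponent for d_j in distances)
        for d_i in distances
    ]


# Hypothetical 2-D feature vector and two cluster centers.
u = fcm_memberships((1.0, 0.0), [(0.0, 0.0), (4.0, 0.0)])
```

The memberships sum to one, and the closer center receives the larger share.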
Neural net applied to anthropological material: a methodical study on the human nasal skeleton.
Prescher, Andreas; Meyers, Anne; Gerf von Keyserlingk, Diedrich
2005-07-01
A new information processing method, an artificial neural net, was applied to characterise the variability of anthropological features of the human nasal skeleton. The aim was to find different types of nasal skeletons. A neural net with 15*15 nodes was trained by 17 standard anthropological parameters taken from 184 skulls of the Aachen collection. The trained neural net delivers its classification in a two-dimensional map. Different types of noses were locally separated within the map. Rare and frequent types may be distinguished after one passage of the complete collection through the net. Statistical descriptive analysis, hierarchical cluster analysis, and discriminant analysis were applied to the same data set. These parallel applications allowed comparison of the new approach to the more traditional ones. In general the classification by the neural net is in correspondence with cluster analysis and discriminant analysis. However, it goes beyond these classifications because of the possibility of differentiating the types in multi-dimensional dependencies. Furthermore, places in the map are kept blank for intermediate forms, which may be theoretically expected, but were not included in the training set. In conclusion, the application of a neural network is a suitable method for investigating large collections of biological material. The gained classification may be helpful in anatomy and anthropology as well as in forensic medicine. It may be used to characterise the peculiarity of a whole set as well as to find particular cases within the set.
Comnea-Stancu, Ionela Raluca; Wieland, Karin; Ramer, Georg; Schwaighofer, Andreas
2016-01-01
This work was sparked by the reported identification of man-made cellulosic fibers (rayon/viscose) in the marine environment as a major fraction of plastic litter by Fourier transform infrared (FT-IR) transmission spectroscopy and library search. To assess the plausibility of such findings, both natural and man-made fibers were examined using FT-IR spectroscopy. Spectra acquired by transmission microscopy, attenuated total reflection (ATR) microscopy, and ATR spectroscopy were compared. Library search was employed and results show significant differences in the identification rate depending on the acquisition method of the spectra. Careful selection of search parameters and the choice of spectra acquisition method were found to be essential for optimization of the library search results. When using transmission spectra of fibers and ATR libraries it was not possible to differentiate between man-made and natural fibers. Successful differentiation of natural and man-made cellulosic fibers has been achieved for FT-IR spectra acquired by ATR microscopy and ATR spectroscopy, and application of ATR libraries. As an alternative, chemometric methods such as unsupervised hierarchical cluster analysis, principal component analysis, and partial least squares-discriminant analysis were employed to facilitate identification based on intrinsic relationships of sample spectra and successful discrimination of the fiber type could be achieved. Differences in the ATR spectra depending on the internal reflection element (Ge versus diamond) were observed as expected; however, these did not impair correct classification by chemometric analysis. Moreover, the effects of different levels of humidity on the IR spectra of natural and man-made fibers were investigated, too. 
It has been found that drying and re-humidification leads to intensity changes of absorption bands of the carbohydrate backbone, but does not impair the identification of the fiber type by library search or cluster analysis. PMID:27650982
Vásquez, Fernando; Soler, Carles; Camps, Patricia; Valverde, Anthony; García-Molina, Almudena
2016-01-01
This work evaluates sperm head morphometric characteristics in adolescents from 12 to 18 years of age, and the effect of varicocele. Volunteers between 150 and 224 months of age (mean 191, n = 87), who had reached oigarche by 12 years old, were recruited in the area of Barranquilla, Colombia. Morphometric analysis of sperm heads was performed with principal component (PC) and discriminant analysis. Combining seminal fluid and sperm parameters provided five PCs: two related to sperm morphometry, one to sperm motility, and two to seminal fluid components. Discriminant analysis on the morphometric results of varicocele and nonvaricocele groups did not provide a useful classification matrix. Of the semen-related PCs, the most explanatory (40%) was related to sperm motility. Two PCs, including sperm head elongation and size, were sufficient to evaluate sperm morphometric characteristics. Most of the morphometric variables were correlated with age, with an increase in size and decrease in the elongation of the sperm head. For head size, the entire sperm population could be divided into two morphometric subpopulations, SP1 and SP2, which did not change during adolescence. In general, for varicocele individuals, SP1 had larger and more elongated sperm heads than SP2, which had smaller and more elongated heads than in nonvaricocele men. In summary, sperm head morphometry assessed by CASA-Morph and multivariate cluster analysis provides a better comprehension of the ejaculate structure and possibly sperm function. Morphometric analysis provides much more information than data obtained from conventional semen analysis. PMID:27751986
Sahilah, A M; Laila, R A S; Sallehuddin, H Mohd; Osman, H; Aminah, A; Ahmad Azuhairi, A
2014-02-01
Genomic DNA of Vibrio parahaemolyticus was characterized by antibiotic resistance, enterobacterial repetitive intergenic consensus-polymerase chain reaction (ERIC-PCR) and random amplified polymorphic DNA-polymerase chain reaction (RAPD-PCR) analysis. These isolates originated from 3 distant locations of Selangor, Negeri Sembilan and Melaka (East coastal areas), Malaysia. A total of 44 tentative V. parahaemolyticus isolates (n = 44) were also examined for the presence of the toxR, tdh and trh genes. Of 44 isolates, 37 were positive for the toxR gene, while none were positive for the tdh and trh genes. Antibiotic resistance analysis showed the V. parahaemolyticus isolates were highly resistant to bacitracin (92%, 34/37) and penicillin (89%, 33/37) followed by resistance towards ampicillin (68%, 25/37), cefuroxime (38%, 14/37), amikacin (6%, 2/37) and ceftazidime (14%, 5/37). None of the V. parahaemolyticus isolates were resistant towards chloramphenicol, ciprofloxacin, ceftriaxone, enrofloxacin, norfloxacin, streptomycin and vancomycin. The antibiogram analysis yielded 9 patterns and was phenotypically less heterogeneous when compared to PCR-based techniques using ERIC- and RAPD-PCR. The results of the ERIC- and RAPD-PCR were analyzed using GelCompare software. ERIC-PCR with primers ERIC1R and ERIC2 discriminated the V. parahaemolyticus isolates into 6 clusters and 21 single isolates at a similarity level of 80%, while RAPD-PCR with primer Gen8 discriminated the V. parahaemolyticus isolates into 11 clusters and 10 single isolates and Gen9 into 8 clusters and 16 single isolates at the same similarity level. Results of the present study demonstrate that the combination of phenotypic and genotypic methods shows a wide heterogeneity among cockle isolates of V. parahaemolyticus.
Gatos, Ilias; Tsantis, Stavros; Spiliopoulos, Stavros; Karnabatidis, Dimitris; Theotokas, Ioannis; Zoumpoulis, Pavlos; Loupas, Thanasis; Hazle, John D; Kagadis, George C
2017-09-01
The purpose of the present study was to employ a computer-aided diagnosis system that classifies chronic liver disease (CLD) using ultrasound shear wave elastography (SWE) imaging, with a stiffness value-clustering and machine-learning algorithm. A clinical data set of 126 patients (56 healthy controls, 70 with CLD) was analyzed. First, an RGB-to-stiffness inverse mapping technique was employed. A five-cluster segmentation was then performed associating corresponding different-color regions with certain stiffness value ranges acquired from the SWE manufacturer-provided color bar. Subsequently, 35 features (7 for each cluster), indicative of physical characteristics existing within the SWE image, were extracted. A stepwise regression analysis toward feature reduction was used to derive a reduced feature subset that was fed into the support vector machine classification algorithm to classify CLD from healthy cases. The highest accuracy in classification of healthy to CLD subject discrimination from the support vector machine model was 87.3% with sensitivity and specificity values of 93.5% and 81.2%, respectively. Receiver operating characteristic curve analysis gave an area under the curve value of 0.87 (confidence interval: 0.77-0.92). A machine-learning algorithm that quantifies color information in terms of stiffness values from SWE images and discriminates CLD from healthy cases is introduced. New objective parameters and criteria for CLD diagnosis employing SWE images provided by the present study can be considered an important step toward color-based interpretation, and could assist radiologists' diagnostic performance on a daily basis after being installed in a PC and employed retrospectively, immediately after the examination. Copyright © 2017 World Federation for Ultrasound in Medicine & Biology. Published by Elsevier Inc. All rights reserved.
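The accuracy, sensitivity and specificity figures quoted above follow the usual definitions from a 2x2 confusion table, sketched below. The counts in the example are hypothetical round numbers chosen for illustration and are not taken from the study.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from 2x2 confusion counts."""
    sensitivity = tp / (tp + fn)   # diseased cases correctly flagged
    specificity = tn / (tn + fp)   # healthy cases correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts; not the study's data.
sensitivity, specificity, accuracy = diagnostic_metrics(tp=45, fn=5, tn=40, fp=10)
```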
NASA Technical Reports Server (NTRS)
Glick, B. J.
1985-01-01
Techniques for classifying objects into groups or classes go under many different names including, most commonly, cluster analysis. Mathematically, the general problem is to find a best mapping of objects into an index set consisting of class identifiers. When an a priori grouping of objects exists, the process of deriving the classification rules from samples of classified objects is known as discrimination. When such rules are applied to objects of unknown class, the process is denoted classification. The specific problem addressed involves the group classification of a set of objects that are each associated with a series of measurements (ratio, interval, ordinal, or nominal levels of measurement). Each measurement produces one variable in a multidimensional variable space. Cluster analysis techniques are reviewed and methods for including geographic location, distance measures, and spatial pattern (distribution) as parameters in clustering are examined. For the case of patterning, measures of spatial autocorrelation are discussed in terms of the kind of data (nominal, ordinal, or interval scaled) to which they may be applied.
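As one concrete instance of a spatial autocorrelation measure for interval-scaled data, Moran's I can be computed as sketched below. Whether the report discusses this particular statistic is an assumption; the four-site transect and its binary neighbour weights are hypothetical.

```python
def morans_i(values, weights):
    """Moran's I for values at n sites with an n x n spatial weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_total = sum(sum(row) for row in weights)
    # Cross-products of deviations, weighted by spatial proximity.
    numerator = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    denominator = sum(d * d for d in dev)
    return (n / w_total) * (numerator / denominator)


# Hypothetical transect of four sites; neighbours share weight 1.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
trend = morans_i([1.0, 2.0, 3.0, 4.0], chain)
alternating = morans_i([1.0, 4.0, 1.0, 4.0], chain)
```

A smooth trend gives a positive value (similar neighbours), while an alternating pattern gives a negative one.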
Molecular Reclassification of Crohn’s Disease: A Cautionary Note on Population Stratification
Maus, Bärbel; Jung, Camille; Mahachie John, Jestinah M.; Hugot, Jean-Pierre; Génin, Emmanuelle; Van Steen, Kristel
2013-01-01
Complex human diseases commonly differ in their phenotypic characteristics, e.g., Crohn’s disease (CD) patients are heterogeneous with regard to disease location and disease extent. The genetic susceptibility to Crohn’s disease is widely acknowledged and has been demonstrated by identification of over 100 CD associated genetic loci. However, relating CD subphenotypes to disease susceptible loci has proven to be a difficult task. In this paper we discuss the use of cluster analysis on genetic markers to identify genetic-based subgroups while taking into account possible confounding by population stratification. We show that it is highly relevant to consider the confounding nature of population stratification in order to avoid that detected clusters are strongly related to population groups instead of disease-specific groups. Therefore, we explain the use of principal components to correct for population stratification while clustering affected individuals into genetic-based subgroups. The principal components are obtained using 30 ancestry informative markers (AIM), and the first two PCs are determined to discriminate between continental origins of the affected individuals. Genotypes on 51 CD associated single nucleotide polymorphisms (SNPs) are used to perform latent class analysis, hierarchical and Partitioning Around Medoids (PAM) cluster analysis within a sample of affected individuals with and without the use of principal components to adjust for population stratification. It is seen that without correction for population stratification clusters seem to be influenced by population stratification while with correction clusters are unrelated to continental origin of individuals. PMID:24147066
Wang, Juan; Nishikawa, Robert M; Yang, Yongyi
2017-04-01
In computerized detection of clustered microcalcifications (MCs) from mammograms, the traditional approach is to apply a pattern detector to locate the presence of individual MCs, which are subsequently grouped into clusters. Such an approach is often susceptible to the occurrence of false positives (FPs) caused by local image patterns that resemble MCs. We investigate the feasibility of a direct detection approach to determining whether an image region contains clustered MCs or not. Toward this goal, we develop a deep convolutional neural network (CNN) as the classifier model to which the input consists of a large image window ([Formula: see text] in size). The multiple layers in the CNN classifier are trained to automatically extract image features relevant to MCs at different spatial scales. In the experiments, we demonstrated this approach on a dataset consisting of both screen-film mammograms and full-field digital mammograms. We evaluated the detection performance both on classifying image regions of clustered MCs using a receiver operating characteristic (ROC) analysis and on detecting clustered MCs from full mammograms by a free-response receiver operating characteristic analysis. For comparison, we also considered a recently developed MC detector with FP suppression. In classifying image regions of clustered MCs, the CNN classifier achieved 0.971 in the area under the ROC curve, compared to 0.944 for the MC detector. In detecting clustered MCs from full mammograms, at 90% sensitivity, the CNN classifier obtained an FP rate of 0.69 clusters/image, compared to 1.17 clusters/image by the MC detector. These results indicate that using global image features can be more effective in discriminating clustered MCs from FPs caused by various sources, such as linear structures, thereby providing a more accurate detection of clustered MCs on mammograms.
Validating clustering of molecular dynamics simulations using polymer models.
Phillips, Joshua L; Colvin, Michael E; Newsam, Shawn
2011-11-14
Molecular dynamics (MD) simulation is a powerful technique for sampling the meta-stable and transitional conformations of proteins and other biomolecules. Computational data clustering has emerged as a useful, automated technique for extracting conformational states from MD simulation data. Despite extensive application, relatively little work has been done to determine if the clustering algorithms are actually extracting useful information. A primary goal of this paper therefore is to provide such an understanding through a detailed analysis of data clustering applied to a series of increasingly complex biopolymer models. We develop a novel series of models using basic polymer theory that have intuitive, clearly-defined dynamics and exhibit the essential properties that we are seeking to identify in MD simulations of real biomolecules. We then apply spectral clustering, an algorithm particularly well-suited for clustering polymer structures, to our models and MD simulations of several intrinsically disordered proteins. Clustering results for the polymer models provide clear evidence that the meta-stable and transitional conformations are detected by the algorithm. The results for the polymer models also help guide the analysis of the disordered protein simulations by comparing and contrasting the statistical properties of the extracted clusters. We have developed a framework for validating the performance and utility of clustering algorithms for studying molecular biopolymer simulations that utilizes several analytic and dynamic polymer models which exhibit well-behaved dynamics including: meta-stable states, transition states, helical structures, and stochastic dynamics. We show that spectral clustering is robust to anomalies introduced by structural alignment and that different structural classes of intrinsically disordered proteins can be reliably discriminated from the clustering results. 
To our knowledge, our framework is the first to utilize model polymers to rigorously test the utility of clustering algorithms for studying biopolymers.
Validating clustering of molecular dynamics simulations using polymer models
2011-01-01
Background Molecular dynamics (MD) simulation is a powerful technique for sampling the meta-stable and transitional conformations of proteins and other biomolecules. Computational data clustering has emerged as a useful, automated technique for extracting conformational states from MD simulation data. Despite extensive application, relatively little work has been done to determine if the clustering algorithms are actually extracting useful information. A primary goal of this paper therefore is to provide such an understanding through a detailed analysis of data clustering applied to a series of increasingly complex biopolymer models. Results We develop a novel series of models using basic polymer theory that have intuitive, clearly-defined dynamics and exhibit the essential properties that we are seeking to identify in MD simulations of real biomolecules. We then apply spectral clustering, an algorithm particularly well-suited for clustering polymer structures, to our models and MD simulations of several intrinsically disordered proteins. Clustering results for the polymer models provide clear evidence that the meta-stable and transitional conformations are detected by the algorithm. The results for the polymer models also help guide the analysis of the disordered protein simulations by comparing and contrasting the statistical properties of the extracted clusters. Conclusions We have developed a framework for validating the performance and utility of clustering algorithms for studying molecular biopolymer simulations that utilizes several analytic and dynamic polymer models which exhibit well-behaved dynamics including: meta-stable states, transition states, helical structures, and stochastic dynamics. We show that spectral clustering is robust to anomalies introduced by structural alignment and that different structural classes of intrinsically disordered proteins can be reliably discriminated from the clustering results. 
To our knowledge, our framework is the first to utilize model polymers to rigorously test the utility of clustering algorithms for studying biopolymers. PMID:22082218
Psychometric properties of the WHOQOL-BREF in an Iranian adult sample.
Yousefy, A R; Usefy, A R; Ghassemi, Gh R; Sarrafzadegan, N; Mallik, S; Baghaei, A M; Rabiei, K
2010-04-01
To evaluate discriminant validity, reliability, internal consistency, and dimensional structure of the World Health Organization Quality of Life-BREF (WHOQOL-BREF) in a heterogeneous Iranian population. A clustered randomized sample of 2,956 healthy and 2,936 unhealthy rural and urban inhabitants aged 30 and above from two dissimilar Iranian provinces completed the Persian version of the WHOQOL-BREF during 2006. We performed descriptive and analytical analyses including Student's t-test, correlation matrix, Cronbach's alpha, and factor analysis with the principal components method and Varimax rotation using SPSS 15. The mean age of the participants was 42.2 +/- 12.1 years and the mean years of education was 9.3 +/- 3.8. The Iranian version of the WHOQOL-BREF domain scores demonstrated good internal consistency, criterion validity, and discriminant validity. The physical health domain contributed most to overall quality of life, while the environment domain made the least contribution. Factor analysis provided evidence of construct validity for the four-factor model of the instrument. The scores of all domains discriminated between healthy persons and the patients. The WHOQOL-BREF has adequate psychometric properties and is, therefore, an adequate measure for assessing quality of life at the domain level in an adult Iranian population.
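Cronbach's alpha, the internal-consistency statistic named above, can be computed from item-level scores as in the following sketch. The function name and the toy scores are hypothetical; sample variances (n - 1 denominator) are assumed.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha; item_scores holds one list of respondent scores per item."""
    k = len(item_scores)
    # Total score of each respondent across all items.
    totals = [sum(scores) for scores in zip(*item_scores)]
    item_variance_sum = sum(variance(scores) for scores in item_scores)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))


# Hypothetical scores: 3 items answered by 4 respondents, perfectly consistent.
items = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 3.0, 4.0, 5.0],
    [1.0, 2.0, 3.0, 4.0],
]
alpha = cronbach_alpha(items)
```

Because the three toy items rise and fall together exactly, alpha reaches its maximum of 1; real questionnaire data give lower values.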
Stone loaches of Choman River system, Kurdistan, Iran (Teleostei: Cypriniformes: Nemacheilidae).
Kamangar, Barzan Bahrami; Prokofiev, Artem M; Ghaderi, Edris; Nalbant, Theodore T
2014-01-20
For the first time, we present data on species composition and distributions of nemacheilid loaches in the Choman River basin of Kurdistan province, Iran. Two genera and four species are recorded from the area, of which three species are new to science: Oxynoemacheilus kurdistanicus, O. zagrosensis, O. chomanicus spp. nov., and Turcinoemacheilus kosswigi Băn. et Nalb. Detailed and illustrated morphological descriptions and univariate and multivariate analyses of morphometric and meristic features are provided for each of these species. Forty morphometric and eleven meristic characters were used in multivariate analysis to select characters that could discriminate between the four loach species. Discriminant Function Analysis revealed that sixteen morphometric measures and five meristic characters have the most variability between the loach species. The dendrograms based on cluster analysis of Mahalanobis distances of morphometrics and a combination of both characters confirmed two distinct groups: Oxynoemacheilus spp. and T. kosswigi. Within Oxynoemacheilus, O. zagrosensis and O. chomanicus are more similar to one another than either is to O. kurdistanicus.
Khanmohammadi, Mohammadreza; Bagheri Garmarudi, Amir; Samani, Simin; Ghasemi, Keyvan; Ashuri, Ahmad
2011-06-01
Attenuated Total Reflectance Fourier Transform Infrared (ATR-FTIR) microspectroscopy was applied for detection of colon cancer according to the spectral features of colon tissues. Supervised classification models can be trained to identify the tissue type based on the spectroscopic fingerprint. A total of 78 colon tissues were used in spectroscopy studies. Major spectral differences were observed in the 1,740-900 cm(-1) spectral region. Several chemometric methods such as analysis of variance (ANOVA), cluster analysis (CA) and linear discriminant analysis (LDA) were applied for classification of IR spectra. Utilizing the chemometric techniques, clear and reproducible differences were observed between the spectra of normal and cancer cases, suggesting that infrared microspectroscopy in conjunction with spectral data processing would be useful for diagnostic classification. Using the LDA technique, the spectra were classified into cancer and normal tissue classes with an accuracy of 95.8%. The sensitivity and specificity were 100 and 93.1%, respectively.
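The accuracy, sensitivity and specificity quoted above are all derived from the four cells of a binary confusion matrix. The sketch below shows the arithmetic. The counts are an assumption: the abstract does not report them, and they were chosen only because they reproduce the stated percentages.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) as fractions."""
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total       # all correct calls / all samples
    sensitivity = tp / (tp + fn)       # cancer spectra correctly flagged
    specificity = tn / (tn + fp)       # normal spectra correctly cleared
    return accuracy, sensitivity, specificity


# Hypothetical counts (not given in the abstract), picked to be consistent
# with the reported 95.8% / 100% / 93.1%.
acc, sens, spec = binary_metrics(tp=19, fn=0, tn=27, fp=2)
print(f"accuracy={acc:.1%} sensitivity={sens:.1%} specificity={spec:.1%}")
```

A sensitivity of 100% with lower specificity means every cancer spectrum was detected at the cost of a few normal spectra being misclassified as cancer.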
Song, Seung Yeob; Lee, Young Koung; Kim, In-Jung
2016-01-01
A high-throughput screening system for Citrus lines with higher sugar and acid contents was established using Fourier transform infrared (FT-IR) spectroscopy in combination with multivariate analysis. FT-IR spectra confirmed typical spectral differences between the frequency regions of 950-1100 cm(-1), 1300-1500 cm(-1), and 1500-1700 cm(-1). Principal component analysis (PCA) and subsequent partial least square-discriminant analysis (PLS-DA) were able to discriminate five Citrus lines into three separate clusters corresponding to their taxonomic relationships. The quantitative predictive modeling of sugar and acid contents from Citrus fruits was established using partial least square regression algorithms from FT-IR spectra. The regression coefficients (R(2)) between predicted values and estimated sugar and acid content values were 0.99. These results demonstrate that by using FT-IR spectra and applying quantitative prediction modeling to Citrus sugar and acid contents, excellent Citrus lines can be detected early with greater accuracy. Copyright © 2015 Elsevier Ltd. All rights reserved.
Oberle, Michael; Wohlwend, Nadia; Jonas, Daniel; Maurer, Florian P.; Jost, Geraldine; Tschudin-Sutter, Sarah; Vranckx, Katleen; Egli, Adrian
2016-01-01
Background The technical, biological, and inter-center reproducibility of matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI TOF MS) typing data has not yet been explored. The aim of this study is to compare typing data from multiple centers employing bioinformatics using bacterial strains from two past outbreaks and non-related strains. Material/Methods Participants received twelve extended spectrum betalactamase-producing E. coli isolates and followed the same standard operating procedure (SOP) including a full-protein extraction protocol. All laboratories provided visually read spectra via flexAnalysis (Bruker, Germany). Raw data from each laboratory allowed calculating the technical and biological reproducibility between centers using BioNumerics (Applied Maths NV, Belgium). Results Technical and biological reproducibility ranged between 96.8–99.4% and 47.6–94.4%, respectively. The inter-center reproducibility showed a comparable clustering among identical isolates. Principal component analysis indicated a higher tendency to cluster within the same center. Therefore, we used a discriminant analysis, which completely separated the clusters. Next, we defined a reference center and performed a statistical analysis to identify specific peaks to identify the outbreak clusters. Finally, we used a classifier algorithm and a linear support vector machine on the determined peaks as classifier. A validation showed that within the set of the reference center, the identification of the cluster was 100% correct with a large contrast between the score with the correct cluster and the next best scoring cluster. Conclusions Based on the sufficient technical and biological reproducibility of MALDI-TOF MS based spectra, detection of specific clusters is possible from spectra obtained from different centers. However, we believe that a shared SOP and a bioinformatics approach are required to make the analysis robust and reliable. PMID:27798637
Malkassian, Anthony; Nerini, David; van Dijk, Mark A; Thyssen, Melilotus; Mante, Claude; Gregori, Gerald
2011-04-01
Analytical flow cytometry (FCM) is well suited for the analysis of phytoplankton communities in fresh and sea waters. The measurement of light scatter and autofluorescence properties of particles by FCM provides optical fingerprints, which enable different phytoplankton groups to be separated. A submersible version of the CytoSense flow cytometer (the CytoSub) has been designed for in situ autonomous sampling and analysis, making it possible to monitor phytoplankton at a short temporal scale and obtain accurate information about its dynamics. For data analysis, a manual clustering is usually performed a posteriori: data are displayed on histograms and scatterplots, and group discrimination is made by drawing and combining regions (gating). The purpose of this study is to provide greater objectivity in the data analysis by applying a nonmanual and consistent method to automatically discriminate clusters of particles. In other words, we seek partitioning methods based on the optical fingerprints of each particle. As the CytoSense is able to record the full pulse shape for each variable, it quickly generates a large and complex dataset to analyze. The shape, length, and area of each curve were chosen as descriptors for the analysis. To test the developed method, numerical experiments were performed on simulated curves. Then, the method was applied and validated on phytoplankton culture data. Promising results have been obtained with a mixture of various species whose optical fingerprints overlapped considerably and could not be accurately separated using manual gating. Copyright © 2011 International Society for Advancement of Cytometry.
[Analysis of different forms Linderae Radix based on HPLC and NIRS fingerprints].
Du, Wei-Feng; Yue, Xian-Ke; Wu, Yao; Ge, Wei-Hong; Lu, Tu-Lin; Wang, Zhi-Min
2016-10-01
Three different forms of Linderae Radix were evaluated by HPLC combined with NIRS fingerprints. The Linderae Radix was divided into three forms: spindle root, straight root and old root. The HPLC fingerprints were developed, and then cluster analysis was performed using the SPSS software. The near-infrared spectra of Linderae Radix were collected, and a discriminant analysis model was then established. The similarity values of the spindle root and straight root were all above 0.990, while the similarity value of the old root was less than 0.850. The three forms of Linderae Radix were clearly divided into two parts by the NIRS model and cluster analysis. The results of HPLC and FT-NIR analysis showed that the quality of Linderae Radix old root was different from that of the spindle root and straight root. The combined use of the two methods could identify different forms of Linderae Radix quickly and accurately. Copyright© by the Chinese Pharmaceutical Association.
Cabral Soares, Fernanda; de Oliveira, Thaís Cristina Galdino; de Macedo, Liliane Dias e Dias; Tomás, Alessandra Mendonça; Picanço-Diniz, Domingos Luiz Wanderley; Bento-Torres, João; Bento-Torres, Natáli Valim Oliver; Picanço-Diniz, Cristovam Wanderley
2015-01-01
Objective The recognition of the limits between normal and pathological aging is essential to start preventive actions. The aim of this paper is to compare the Cambridge Neuropsychological Test Automated Battery (CANTAB) and language tests to distinguish subtle differences in cognitive performances in two different age groups, namely young adults and elderly cognitively normal subjects. Method We selected 29 young adults (29.9±1.06 years) and 31 older adults (74.1±1.15 years) matched by educational level (years of schooling). All subjects underwent a general assessment and a battery of neuropsychological tests, including the Mini Mental State Examination, visuospatial learning, and memory tasks from CANTAB and language tests. Cluster and discriminant analysis were applied to all neuropsychological test results to distinguish possible subgroups inside each age group. Results Significant differences in the performance of aged and young adults were detected in both language and visuospatial memory tests. Intragroup cluster and discriminant analysis revealed that CANTAB, as compared to language tests, was able to detect subtle but significant differences between the subjects. Conclusion Based on these findings, we concluded that, as compared to language tests, large-scale application of automated visuospatial tests to assess learning and memory might increase our ability to discern the limits between normal and pathological aging. PMID:25565785
Sequential analysis of hydrochemical data for watershed characterization.
Thyne, Geoffrey; Güler, Cüneyt; Poeter, Eileen
2004-01-01
A methodology for characterizing the hydrogeology of watersheds using hydrochemical data that combine statistical, geochemical, and spatial techniques is presented. Surface water and ground water base flow and spring runoff samples (180 total) from a single watershed are first classified using hierarchical cluster analysis. The statistical clusters are analyzed for spatial coherence confirming that the clusters have a geological basis corresponding to topographic flowpaths and showing that the fractured rock aquifer behaves as an equivalent porous medium on the watershed scale. Then principal component analysis (PCA) is used to determine the sources of variation between parameters. PCA analysis shows that the variations within the dataset are related to variations in calcium, magnesium, SO4, and HCO3, which are derived from natural weathering reactions, and pH, NO3, and chlorine, which indicate anthropogenic impact. PHREEQC modeling is used to quantitatively describe the natural hydrochemical evolution for the watershed and aid in discrimination of samples that have an anthropogenic component. Finally, the seasonal changes in the water chemistry of individual sites were analyzed to better characterize the spatial variability of vertical hydraulic conductivity. The integrated result provides a method to characterize the hydrogeology of the watershed that fully utilizes traditional data.
Liang, Wenyi; Chen, Wenjing; Wu, Lingfang; Li, Shi; Qi, Qi; Cui, Yaping; Liang, Linjin; Ye, Ting; Zhang, Lanzhen
2017-03-17
Danshen, the dried root of Salvia miltiorrhiza Bge., is a widely used commercially available herbal drug, and unstable quality of different samples is a current issue. This study focused on a comprehensive and systematic method combining fingerprints and chemical identification with chemometrics for discrimination and quality assessment of Danshen samples. Twenty-five samples were analyzed by HPLC-PAD and HPLC-MS^n. Forty-nine components were identified and characteristic fragmentation regularities were summarized for further interpretation of bioactive components. Chemometric analysis, including hierarchical cluster analysis, principal component analysis, and partial least squares discriminant analysis, was employed to differentiate samples and clarify the quality differences of Danshen. The results were consistent: the samples were divided into three categories, which reflected the difference in quality of Danshen samples. Analysis of the reasons for sample classification revealed that the processing method had a more obvious impact on sample classification than the geographical origin; it induced different contents of bioactive compounds and ultimately led to different qualities. Cryptotanshinone, trijuganone B, and 15,16-dihydrotanshinone I were screened out as markers to distinguish samples by different processing methods. The developed strategy could provide a reference for evaluation and discrimination of other traditional herbal medicines.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Berman, Benjamin P.; Pfeiffer, Barret D.; Laverty, Todd R.
2004-08-06
The identification of sequences that control transcription in metazoans is a major goal of genome analysis. In a previous study, we demonstrated that searching for clusters of predicted transcription factor binding sites could discover active regulatory sequences, and identified 37 regions of the Drosophila melanogaster genome with high densities of predicted binding sites for five transcription factors involved in anterior-posterior embryonic patterning. Nine of these clusters overlapped known enhancers. Here, we report the results of in vivo functional analysis of 27 remaining clusters. We generated transgenic flies carrying each cluster attached to a basal promoter and reporter gene, and assayed embryos for reporter gene expression. Six clusters are enhancers of adjacent genes: giant, fushi tarazu, odd-skipped, nubbin, squeeze and pdm2; three drive expression in patterns unrelated to those of neighboring genes; the remaining 18 do not appear to have enhancer activity. We used the Drosophila pseudoobscura genome to compare patterns of evolution in and around the 15 positive and 18 false-positive predictions. Although conservation of primary sequence cannot distinguish true from false positives, conservation of binding-site clustering accurately discriminates functional binding-site clusters from those with no function. We incorporated conservation of binding-site clustering into a new genome-wide enhancer screen, and predict several hundred new regulatory sequences, including 85 adjacent to genes with embryonic patterns. Measuring conservation of sequence features closely linked to function--such as binding-site clustering--makes better use of comparative sequence data than commonly used methods that examine only sequence identity.
NASA Astrophysics Data System (ADS)
Bibi, Humera; Alam, Khan; Bibi, Samina
2016-11-01
Discrimination of aerosol types is essential over the Indo-Gangetic plain (IGP) because several aerosol types originate from different sources having different atmospheric impacts. In this paper, we analyzed a seasonal discrimination of aerosol types by multiple clustering techniques using AERosol RObotic NETwork (AERONET) datasets for the period 2007-2013 over Karachi, Lahore, Jaipur and Kanpur. We discriminated the aerosols into three major types: dust, biomass burning and urban/industrial. The discrimination was carried out by analyzing different aerosol optical properties such as Aerosol Optical Depth (AOD), Angstrom Exponent (AE), Extinction Angstrom Exponent (EAE), Absorption Angstrom Exponent (AAE), Single Scattering Albedo (SSA) and Real Refractive Index (RRI) and their interrelationship to investigate the dominant aerosol types and to examine the variation in their seasonal distribution. The results revealed that during summer and pre-monsoon, dust aerosols were dominant, while during winter and post-monsoon the prevailing aerosols were biomass burning and urban/industrial, and mixed-type aerosols were present in all seasons. The aerosol types discriminated from AERONET were in good agreement with CALIPSO (the Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observation) measurements.
Klett, Hagen; Fuellgraf, Hannah; Levit-Zerdoun, Ella; Hussung, Saskia; Kowar, Silke; Küsters, Simon; Bronsert, Peter; Werner, Martin; Wittel, Uwe; Fritsch, Ralph; Busch, Hauke; Boerries, Melanie
2018-01-01
Late diagnosis and systemic dissemination essentially contribute to the invariably poor prognosis of pancreatic ductal adenocarcinoma (PDAC). Therefore, the development of diagnostic biomarkers for PDAC is urgently needed to improve patient stratification and outcome in the clinic. By studying the transcriptomes of independent PDAC patient cohorts of tumor and non-tumor tissues, we identified 81 robustly regulated genes through a novel, generally applicable meta-analysis. Consensus clustering on co-expression values revealed four distinct clusters with genes originating from exocrine/endocrine pancreas, stromal and tumor cells. Three clusters were strongly associated with survival of PDAC patients based on the TCGA database, underlining the prognostic potential of the identified genes. With the added information of impact on survival and the robustness within the meta-analysis, we extracted a 17-gene subset for further validation. We show that it not only discriminated PDAC from non-tumor tissue and stroma in fresh-frozen as well as formalin-fixed paraffin embedded samples, but also detected pancreatic precursor lesions and singled out pancreatitis samples. Moreover, the classifier discriminated PDAC from other cancers in the TCGA database. In addition, we experimentally validated the classifier in PDAC patients on transcript level using qPCR and exemplify the usage on protein level for three proteins (AHNAK2, LAMC2, TFF1) using immunohistochemistry and for two secreted proteins (TFF1, SERPINB5) using ELISA-based protein detection in blood-plasma. In conclusion, we present a novel robust diagnostic and prognostic gene signature for PDAC with future potential applicability in the clinic. PMID:29675033
Enhancement of plant metabolite fingerprinting by machine learning.
Scott, Ian M; Vermeer, Cornelia P; Liakata, Maria; Corol, Delia I; Ward, Jane L; Lin, Wanchang; Johnson, Helen E; Whitehead, Lynne; Kular, Baldeep; Baker, John M; Walsh, Sean; Dave, Anuja; Larson, Tony R; Graham, Ian A; Wang, Trevor L; King, Ross D; Draper, John; Beale, Michael H
2010-08-01
Metabolite fingerprinting of Arabidopsis (Arabidopsis thaliana) mutants with known or predicted metabolic lesions was performed by (1)H-nuclear magnetic resonance, Fourier transform infrared, and flow injection electrospray-mass spectrometry. Fingerprinting enabled processing of five times more plants than conventional chromatographic profiling and was competitive for discriminating mutants, other than those affected in only low-abundance metabolites. Despite their rapidity and complexity, fingerprints yielded metabolomic insights (e.g. that effects of single lesions were usually not confined to individual pathways). Among fingerprint techniques, (1)H-nuclear magnetic resonance discriminated the most mutant phenotypes from the wild type and Fourier transform infrared discriminated the fewest. To maximize information from fingerprints, data analysis was crucial. One-third of distinctive phenotypes might have been overlooked had data models been confined to principal component analysis score plots. Among several methods tested, machine learning (ML) algorithms, namely support vector machine or random forest (RF) classifiers, were unsurpassed for phenotype discrimination. Support vector machines were often the best performing classifiers, but RFs yielded some particularly informative measures. First, RFs estimated margins between mutant phenotypes, whose relations could then be visualized by Sammon mapping or hierarchical clustering. Second, RFs provided importance scores for the features within fingerprints that discriminated mutants. These scores correlated with analysis of variance F values (as did Kruskal-Wallis tests, true- and false-positive measures, mutual information, and the Relief feature selection algorithm). ML classifiers, as models trained on one data set to predict another, were ideal for focused metabolomic queries, such as the distinctiveness and consistency of mutant phenotypes. Accessible software for use of ML in plant physiology is highlighted.
Van Cann, Joannes; Virgilio, Massimiliano; Jordaens, Kurt; De Meyer, Marc
2015-01-01
Previous attempts to resolve the Ceratitis FAR complex (Ceratitis fasciventris, Ceratitis anonae, Ceratitis rosa, Diptera, Tephritidae) showed contrasting results and revealed the occurrence of five microsatellite genotypic clusters (A, F1, F2, R1, R2). In this paper we explore the potential of wing morphometrics for the diagnosis of FAR morphospecies and genotypic clusters. We considered a set of 227 specimens previously morphologically identified and genotyped at 16 microsatellite loci. Seventeen wing landmarks and 6 wing band areas were used for morphometric analyses. Permutational multivariate analysis of variance detected significant differences both across morphospecies and genotypic clusters (for both males and females). Unconstrained and constrained ordinations did not properly resolve groups corresponding to morphospecies or genotypic clusters. However, posterior group membership probabilities (PGMPs) of the Discriminant Analysis of Principal Components (DAPC) allowed the consistent identification of a relevant proportion of specimens (but with performances differing across morphospecies and genotypic clusters). This study suggests that wing morphometrics and PGMPs might represent a possible tool for the diagnosis of species within the FAR complex. Here, we propose a tentative diagnostic method and provide a first reference library of morphometric measures that might be used for the identification of additional and unidentified FAR specimens.
Yu, Ke-Qiang; Zhao, Yan-Ru; Liu, Fei; He, Yong
2016-01-01
The aim of this work was to analyze the variety of soil by laser-induced breakdown spectroscopy (LIBS) coupled with chemometrics methods. 6 certified reference materials (CRMs) of soil samples were selected and their LIBS spectra were captured. Characteristic emission lines of main elements were identified based on the LIBS curves and corresponding contents. From the identified emission lines, LIBS spectra in 7 lines with high signal-to-noise ratio (SNR) were chosen for further analysis. Principal component analysis (PCA) was carried out using the LIBS spectra at 7 selected lines and an obvious cluster of 6 soils was observed. Soft independent modeling of class analogy (SIMCA) and least-squares support vector machine (LS-SVM) were introduced to establish discriminant models for classifying the 6 types of soils, and they offered the correct discrimination rates of 90% and 100%, respectively. Receiver operating characteristic (ROC) curve was used to evaluate the performance of models and the results demonstrated that the LS-SVM model was promising. Lastly, 8 types of soils from different places were gathered to conduct the same experiments for verifying the selected 7 emission lines and LS-SVM model. The research revealed that LIBS technology coupled with chemometrics could conduct the variety discrimination of soil. PMID:27279284
Basati, Zahra; Jamshidi, Bahareh; Rasekh, Mansour; Abbaspour-Gilandeh, Yousef
2018-05-30
The presence of sunn pest-damaged grains in wheat mass reduces the quality of flour and bread produced from it. Therefore, it is essential to assess the quality of the samples in collecting and storage centers of wheat and flour mills. In this research, the capability of visible/near-infrared (Vis/NIR) spectroscopy combined with pattern recognition methods was investigated for discrimination of wheat samples with different percentages of sunn pest-damaged grains. To this end, various samples belonging to five classes (healthy and 5%, 10%, 15% and 20% unhealthy) were analyzed using Vis/NIR spectroscopy (wavelength range of 350-1000 nm) based on both supervised and unsupervised pattern recognition methods. Principal component analysis (PCA) and hierarchical cluster analysis (HCA) were used as the unsupervised techniques, and soft independent modeling of class analogies (SIMCA) and partial least squares-discriminant analysis (PLS-DA) as the supervised methods. The results showed that Vis/NIR spectra of healthy samples were correctly clustered using both PCA and HCA. Due to the high overlap between the four unhealthy classes (5%, 10%, 15% and 20%), it was not possible to discriminate all the unhealthy samples into individual classes. However, when considering only the two main categories of healthy and unhealthy, an acceptable degree of separation between the classes can be obtained after classification with the supervised pattern recognition methods of SIMCA and PLS-DA. SIMCA based on PCA modeling correctly classified samples into the two classes of healthy and unhealthy with a classification accuracy of 100%. Moreover, the wavelengths of 839 nm, 918 nm and 995 nm had more power than other wavelengths to discriminate the two classes of healthy and unhealthy. It was also concluded that PLS-DA provides excellent classification results for healthy and unhealthy samples (R^2 = 0.973 and RMSECV = 0.057). Therefore, Vis/NIR spectroscopy based on pattern recognition techniques can be useful for rapidly distinguishing healthy wheat samples from those damaged by sunn pest in maintenance and processing centers. Copyright © 2018 Elsevier B.V. All rights reserved.
Miconi, Diana; Altoè, Gianmarco; Salcuni, Silvia; Di Riso, Daniela; Schiff, Sami; Moscardino, Ughetta
2018-05-24
Although discrimination is a common stressor in the everyday life of immigrant youth, individuals are not equally susceptible to its adverse effects. This cross-sectional study aimed to examine whether cultural orientation preferences and impulse control (IC) moderate the association between perceived discrimination and externalizing problems among Moroccan- and Romanian-origin early adolescents in Italy. The sample included 126 Moroccan and 126 Romanian youths (46% girls, 42% first-generation) aged 11-13 years and their parents. Perceived discrimination and cultural orientations were assessed using self-report questionnaires, while IC was evaluated via a computerized version of the Iowa Gambling Task. Externalizing behaviors were assessed via parental report. Cluster analysis identified separated, assimilated, and integrated early adolescents. Regression analyses revealed that when facing discrimination, youths who endorsed separation and exhibited low levels of IC were more vulnerable to externalizing problems. In contrast, among assimilated adolescents the discrimination-externalizing difficulties link was significant at high levels of IC. Furthermore, low levels of IC were associated with more externalizing problems for Romanian, but not for Moroccan early adolescents. Findings underscore the need to consider both cultural orientation processes and early adolescents' ability to control their impulses when developing interventions aimed to reduce discrimination-related problem behaviors in immigrant youth. Implications for theory and practice are discussed. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
Zhang, Hong-Guang; Yang, Qin-Min; Lu, Jian-Gang
2014-04-01
In this paper, a novel discriminant methodology based on near infrared spectroscopic analysis and least square support vector machine was proposed for rapid and nondestructive discrimination of different types of Polyacrylamide. The diffuse reflectance spectra of samples of Non-ionic Polyacrylamide, Anionic Polyacrylamide and Cationic Polyacrylamide were measured. Then principal component analysis was applied to reduce the dimension of the spectral data and extract the principal components. The first three principal components were used for cluster analysis of the three different types of Polyacrylamide. Those principal components were also used as inputs of the least square support vector machine model. The optimization of the parameters and the number of principal components used as inputs of the least square support vector machine model was performed through cross validation based on grid search. Sixty samples of each type of Polyacrylamide were collected, for a total of 180 samples. Of these, 135 samples (45 for each type of Polyacrylamide) were randomly selected as a training set to build the calibration model, and the remaining 45 samples were used as a test set to evaluate the performance of the developed model. In addition, 5 Cationic Polyacrylamide samples and 5 Anionic Polyacrylamide samples adulterated with different proportions of Non-ionic Polyacrylamide were also prepared to show the feasibility of the proposed method to discriminate the adulterated Polyacrylamide samples. The prediction error threshold for each type of Polyacrylamide was determined by an F statistical significance test based on the prediction error of the training set of the corresponding type of Polyacrylamide in cross validation. The discrimination accuracy of the built model was 100% for prediction of the test set. The prediction of the model for the 10 mixed samples was also presented, and all mixed samples were accurately discriminated as adulterated samples. The overall results demonstrate that the discrimination method proposed in the present paper can rapidly and nondestructively discriminate the different types of Polyacrylamide and the adulterated Polyacrylamide samples, and offers a new approach to discriminate the types of Polyacrylamide.
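The sampling design in this abstract (60 samples per class, 45 per class for training, the remaining 15 per class for testing) is a stratified random split. The following standard-library sketch shows one way such a split could be drawn. The function name, the seed and the label strings are illustrative assumptions, and the abstract does not say how the authors performed the randomization.

```python
import random


def stratified_split(labels, n_train_per_class, seed=0):
    """Split sample indices so each class contributes the same number to training."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)

    train, test = [], []
    for indices in by_class.values():
        shuffled = indices[:]
        rng.shuffle(shuffled)
        train.extend(shuffled[:n_train_per_class])
        test.extend(shuffled[n_train_per_class:])
    return sorted(train), sorted(test)


# 60 samples of each of the three Polyacrylamide types, as in the abstract.
labels = ["non-ionic"] * 60 + ["anionic"] * 60 + ["cationic"] * 60
train_idx, test_idx = stratified_split(labels, n_train_per_class=45)
print(len(train_idx), len(test_idx))
```

Splitting per class rather than over the pooled 180 samples keeps the class proportions identical in the training and test sets.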
A new locally weighted K-means for cancer-aided microarray data analysis.
Iam-On, Natthakan; Boongoen, Tossapon
2012-11-01
Cancer has been identified as the leading cause of death. It is predicted that around 20-26 million people will be diagnosed with cancer by 2020. With this alarming rate, there is an urgent need for a more effective methodology to understand, prevent and cure cancer. Microarray technology provides a useful basis of achieving this goal, with cluster analysis of gene expression data leading to the discrimination of patients, identification of possible tumor subtypes and individualized treatment. Amongst clustering techniques, k-means is normally chosen for its simplicity and efficiency. However, it does not account for the different importance of data attributes. This paper presents a new locally weighted extension of k-means, which has proven more accurate across many published datasets than the original and other extensions found in the literature.
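The limitation noted above, that plain k-means treats every attribute as equally important, can be illustrated with a small sketch. This is not the locally weighted algorithm proposed in the paper, whose details the abstract does not give. It is ordinary k-means with a single, fixed, hand-chosen weight per attribute, and the data, weights and function names are all hypothetical.

```python
def weighted_sq_dist(a, b, weights):
    """Squared Euclidean distance with one fixed weight per attribute."""
    return sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights))


def kmeans(points, centroids, weights, n_iter=20):
    """Lloyd-style k-means using the weighted distance above."""
    assign = []
    for _ in range(n_iter):
        # Assignment step: nearest centroid under the weighted distance.
        assign = [
            min(range(len(centroids)),
                key=lambda k: weighted_sq_dist(p, centroids[k], weights))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assign) if a == k]
            if members:
                new_centroids.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centroids.append(list(centroids[k]))
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return assign, centroids


# Toy data: attribute 0 separates the two groups, attribute 1 is noise.
points = [[0.0, 5.0], [0.2, -4.0], [0.1, 3.0], [1.0, -5.0], [1.1, 4.0], [0.9, -3.0]]
start = [list(points[0]), list(points[3])]

equal_assign, _ = kmeans(points, start, weights=[1.0, 1.0])
weighted_assign, _ = kmeans(points, start, weights=[1.0, 0.0])
print(equal_assign)
print(weighted_assign)
```

With equal weights the noisy attribute drives the grouping. Down-weighting it recovers the split along the informative attribute, which is the motivation for learning attribute weights rather than fixing them by hand.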
Wettstein, Markus; Wahl, Hans-Werner; Shoval, Noam; Auslander, Gail; Oswald, Frank; Heinik, Jeremia
2015-12-01
Heterogeneity in older adults' mobility and its correlates have rarely been investigated based on objective mobility data and in samples including cognitively impaired individuals. We analyzed mobility profiles within a cognitively heterogeneous sample of N = 257 older adults from Israel and Germany based on GPS tracking technology. Participants were aged between 59 and 91 years (M = 72.9; SD = 6.4) and were either cognitively healthy (CH, n = 146), mildly cognitively impaired (MCI, n = 76), or diagnosed with an early-stage dementia of the Alzheimer's type (DAT, n = 35). Based on cluster analysis, we identified three mobility types ("Mobility restricted," "Outdoor oriented," "Walkers"), which could be predicted based on socio-demographic indicators, activity, health, and cognitive impairment status using discriminant analysis. Particularly demented individuals and persons with worse health exhibited restrictions in mobility. Our findings contribute to a better understanding of heterogeneity in mobility in old age. © The Author(s) 2013.
Uarrota, Virgílio Gavicho; Moresco, Rodolfo; Coelho, Bianca; Nunes, Eduardo da Costa; Peruch, Luiz Augusto Martins; Neubert, Enilto de Oliveira; Rocha, Miguel; Maraschin, Marcelo
2014-10-15
Cassava roots are an important source of dietary and industrial carbohydrates and suffer markedly from postharvest physiological deterioration (PPD). This paper deals with metabolomics combined with chemometric tools for screening the chemical and enzymatic composition in several genotypes of cassava roots during PPD. Metabolome analyses showed increases in carotenoids, flavonoids, anthocyanins, phenolics, reactive scavenging species, and enzymes (superoxide dismutase family, hydrogen peroxide, and catalase) until 3-5 days postharvest. PPD correlated negatively with phenolics and carotenoids and positively with anthocyanins and flavonoids. Chemometric tools such as principal component analysis, partial least squares discriminant analysis, and support vector machines discriminated cassava samples well and enabled a good prediction of samples. Hierarchical clustering analyses grouped samples according to their levels of PPD and chemical compositions. Copyright © 2014 Elsevier Ltd. All rights reserved.
Muhammad, Syahidah Akmal; Seow, Eng-Keng; Mohd Omar, A K; Rodhi, Ainolsyakira Mohd; Mat Hassan, Hasnuri; Lalung, Japareng; Lee, Sze-Chi; Ibrahim, Baharudin
2018-01-01
A total of 33 crude palm oil samples were randomly collected from different regions in Malaysia. Stable carbon isotopic composition (δ13C) was determined using a Flash 2000 elemental analyzer, while hydrogen and oxygen isotopic compositions (δ2H and δ18O) were analyzed by a Thermo Finnigan TC/EA; both instruments were coupled to an isotope ratio mass spectrometer. The bulk δ2H, δ18O and δ13C of the samples were analyzed by Hierarchical Cluster Analysis (HCA), Principal Component Analysis (PCA) and Orthogonal Partial Least Square-Discriminant Analysis (OPLS-DA). The unsupervised HCA and PCA methods demonstrated that crude palm oil samples were grouped into clusters according to their respective state. A predictive model was constructed by supervised OPLS-DA with good predictive power of 52.60%. Robustness of the predictive model was validated with overall accuracy of 71.43%. Blind test samples were correctly assigned to their respective cluster except for samples from the southern region. δ18O was proposed as the promising discriminatory marker for discerning crude palm oil samples obtained from different regions. The stable isotope profile proved useful for origin traceability of crude palm oil samples at a narrower geographical area, i.e. based on regions in Malaysia. Predictive power and accuracy of the predictive model are expected to improve with an increase in sample size. In conclusion, the results of this study have fulfilled the main objective of this work: the simple approach of combining stable isotope analysis with chemometrics can be used to discriminate crude palm oil samples obtained from different regions in Malaysia. Overall, this study shows the feasibility of using this approach as a traceability assessment of crude palm oils. Copyright © 2017 The Chartered Society of Forensic Sciences. Published by Elsevier B.V. All rights reserved.
Mirjankar, Nikhil S; Fraga, Carlos G; Carman, April J; Moran, James J
2016-02-02
Chemical attribution signatures (CAS) for chemical threat agents (CTAs), such as cyanides, are being investigated to provide an evidentiary link between CTAs and specific sources to support criminal investigations and prosecutions. Herein, stocks of KCN and NaCN were analyzed for trace anions by high performance ion chromatography (HPIC), carbon stable isotope ratio (δ(13)C) by isotope ratio mass spectrometry (IRMS), and trace elements by inductively coupled plasma optical emission spectroscopy (ICP-OES). The collected analytical data were evaluated using hierarchical cluster analysis (HCA), Fisher-ratio (F-ratio), interval partial least-squares (iPLS), genetic algorithm-based partial least-squares (GAPLS), partial least-squares discriminant analysis (PLSDA), K nearest neighbors (KNN), and support vector machines discriminant analysis (SVMDA). HCA of anion impurity profiles from multiple cyanide stocks from six reported countries of origin resulted in cyanide samples clustering into three groups, independent of the associated alkali metal (K or Na). The three groups were independently corroborated by HCA of cyanide elemental profiles and corresponded to countries each having one known solid cyanide factory: Czech Republic, Germany, and United States. Carbon stable isotope measurements resulted in two clusters: Germany and United States (the single Czech stock grouped with United States stocks). Classification errors for two validation studies using anion impurity profiles collected over five years on different instruments were as low as zero for KNN and SVMDA, demonstrating the excellent reliability associated with using anion impurities for matching a cyanide sample to its factory using our current cyanide stocks. Variable selection methods reduced errors for those classification methods having errors greater than zero; iPLS-forward selection and F-ratio typically provided the lowest errors. 
Finally, using anion profiles to classify cyanides to a specific stock or stock group for a subset of United States stocks resulted in cross-validation errors ranging from 0 to 5.3%.
Studying the Therapeutic Process by Observing Clinicians' In-Session Behaviour.
Montaño-Fidalgo, Montserrat; Ruiz, Elena M; Calero-Elvira, Ana; Froján-Parga, María Xesús
2015-01-01
This paper presents a further step in the use and validation of a systematic, functional-analytic method of describing psychologists' verbal behaviour during therapy. We observed recordings from 92 clinical sessions of 19 adults (14 women and 5 men of Caucasian origin, with ages ranging from 19 to 51 years) treated by nine cognitive-behavioural therapists (eight women and one man, Caucasian as well, with ages ranging from 25 to 48 years). The therapists' verbal behaviour was codified and then classified according to its possible functionality. A cluster analysis of the data, followed by a discriminant analysis, showed that the therapists' verbal behaviour tended to aggregate around four types of session differentiated by their clinical objective (assessment, explanation, treatment and consolidation). These results confirm the validity of our method and enable us to further describe clinical phenomena by distinguishing psychologists' classes of clinically relevant activities. Specific learning mechanisms may be responsible for clinical change within each class. These issues should be analysed more closely when explaining therapeutic phenomena and when developing more effective forms of clinical intervention. We described therapists' verbal behaviour in a focused fashion so as to develop new research methods that evaluate psychological work moment by moment. We performed a cluster analysis in order to evaluate how the therapists' verbal behaviour was distributed throughout the intervention. A discriminant analysis gave us further information about the statistical significance and possible nature of the clusters we observed. The therapists' verbal behaviour depended on current clinical objectives and could be classified into four classes of clinically relevant activities: evaluation, explanation, treatment and consolidation. Some of the therapist's verbalizations were more important than others when carrying out these clinically relevant activities. 
The distribution of the therapists' verbal behaviour across classes may provide us with clues regarding the functionality of their in-session verbal behaviour. Copyright © 2014 John Wiley & Sons, Ltd.
Rumore, Jillian Leigh; Tschetter, Lorelee; Nadon, Celine
2016-05-01
The lack of pattern diversity among pulsed-field gel electrophoresis (PFGE) profiles for Escherichia coli O157:H7 in Canada does not consistently provide optimal discrimination, and therefore, differentiating temporally and/or geographically associated sporadic cases from potential outbreak cases can at times impede investigations. To address this limitation, DNA sequence-based methods such as multilocus variable-number tandem-repeat analysis (MLVA) have been explored. To assess the performance of MLVA as a supplemental method to PFGE from the Canadian perspective, a retrospective analysis of all E. coli O157:H7 isolated in Canada from January 2008 to December 2012 (inclusive) was conducted. A total of 2285 E. coli O157:H7 isolates and 63 clusters of cases (by PFGE) were selected for the study. Based on the qualitative analysis, the addition of MLVA improved the categorization of cases for 60% of clusters and no change was observed for ∼40% of clusters investigated. In such situations, MLVA serves to confirm PFGE results, but may not add further information per se. The findings of this study demonstrate that MLVA data, when used in combination with PFGE-based analyses, provide additional resolution to the detection of clusters lacking PFGE diversity as well as demonstrate good epidemiological concordance. In addition, MLVA is able to identify cluster-associated isolates with variant PFGE pattern combinations that may have been previously missed by PFGE alone. Optimal laboratory surveillance in Canada is achieved with the application of PFGE and MLVA in tandem for routine surveillance, cluster detection, and outbreak response.
Evaluation of drinking quality of groundwater through multivariate techniques in urban area.
Das, Madhumita; Kumar, A; Mohapatra, M; Muduli, S D
2010-07-01
Groundwater is a major source of drinking water in urban areas. Because of the growing threat of debasing water quality due to urbanization and development, monitoring water quality is a prerequisite to ensure its suitability for use in drinking. However, analyzing a large number of properties and evaluating water quality parameter by parameter is not feasible at regular intervals. Multivariate techniques can reduce the data to a reasonably manageable set without much loss of information. In this study, using principal component analysis, 11 relevant properties of 58 water samples were grouped into three statistical factors. Discriminant analysis identified "pH influence" as the most distinguished factor and pH, Fe, and NO₃⁻ as the most discriminating variables, which could be treated as water quality indicators. These were utilized to classify the sampling sites into homogeneous clusters that reflect the location-wise importance of specific indicator/s for monitoring drinking water quality in the whole study area.
Hanseniaspora uvarum from Winemaking Environments Show Spatial and Temporal Genetic Clustering
Albertin, Warren; Setati, Mathabatha E.; Miot-Sertier, Cécile; Mostert, Talitha T.; Colonna-Ceccaldi, Benoit; Coulon, Joana; Girard, Patrick; Moine, Virginie; Pillet, Myriam; Salin, Franck; Bely, Marina; Divol, Benoit; Masneuf-Pomarede, Isabelle
2016-01-01
Hanseniaspora uvarum is one of the most abundant yeast species found on grapes and in grape must, at least before the onset of alcoholic fermentation (AF) which is usually performed by Saccharomyces species. The aim of this study was to characterize the genetic and phenotypic variability within the H. uvarum species. One hundred and fifteen strains isolated from winemaking environments in different geographical origins were analyzed using 11 microsatellite markers and a subset of 47 strains were analyzed by AFLP. H. uvarum isolates clustered mainly on the basis of their geographical localization as revealed by microsatellites. In addition, a strong clustering based on year of isolation was evidenced, indicating that the genetic diversity of H. uvarum isolates was related to both spatial and temporal variations. Conversely, clustering analysis based on AFLP data provided a different picture with groups showing no particular characteristics, but provided higher strain discrimination. This result indicated that AFLP approaches are inadequate to establish the genetic relationship between individuals, but allowed good strain discrimination. At the phenotypic level, several extracellular enzymatic activities of enological relevance (pectinase, chitinase, protease, β-glucosidase) were measured but showed low diversity. The impact of environmental factors of enological interest (temperature, anaerobia, and copper addition) on growth was also assessed and showed poor variation. Altogether, this work provided both new analytical tool (microsatellites) and new insights into the genetic and phenotypic diversity of H. uvarum, a yeast species that has previously been identified as a potential candidate for co-inoculation in grape must, but whose intraspecific variability had never been fully assessed. PMID:26834719
NASA Astrophysics Data System (ADS)
Teye, Ernest; Huang, Xingyi; Dai, Huang; Chen, Quansheng
2013-10-01
A quick, accurate and reliable technique for discrimination of cocoa beans according to geographical origin is essential for quality control and traceability management. This study presents the application of Near Infrared Spectroscopy and multivariate classification for the differentiation of Ghana cocoa beans. A total of 194 cocoa bean samples from seven cocoa growing regions were used. Principal component analysis (PCA) was used to extract relevant information from the spectral data and this gave visible cluster trends. The performance of four multivariate classification methods, Linear discriminant analysis (LDA), K-nearest neighbors (KNN), Back propagation artificial neural network (BPANN) and Support vector machine (SVM), was compared. The performances of the models were optimized by cross validation. The results revealed that the SVM model was superior to all the other methods, with a discrimination rate of 100% in both the training and prediction sets after preprocessing with Mean centering (MC). BPANN had a discrimination rate of 99.23% for the training set and 96.88% for the prediction set, while the LDA model had 96.15% and 90.63% for the training and prediction sets respectively. The KNN model had 75.01% for the training set and 72.31% for the prediction set. The non-linear classification methods used were superior to the linear ones. Generally, the results revealed that NIR Spectroscopy coupled with the SVM model could be used successfully to discriminate cocoa beans according to their geographical origins for effective quality assurance.
Crawford, Natalie D; Borrell, Luisa N; Galea, Sandro; Ford, Chandra; Latkin, Carl; Fuller, Crystal M
2013-04-01
Social discrimination may isolate drug users into higher risk relationships, particularly in disadvantaged neighborhood environments where drug trade occurs. We used negative binomial regression accounting for clustering of individuals within their recruitment neighborhood to investigate the relationship between high-risk drug ties with various forms of social discrimination, neighborhood minority composition, poverty and education. Results show that experiencing discrimination due to drug use is significantly associated with more drug ties in neighborhoods with fewer blacks. Future social network and discrimination research should assess the role of neighborhood social cohesion.
Morphology delimits more species than molecular genetic clusters of invasive Pilosella.
Moffat, Chandra E; Ensing, David J; Gaskin, John F; De Clerck-Floate, Rosemarie A; Pither, Jason
2015-07-01
• Accurate assessments of biodiversity are paramount for understanding ecosystem processes and adaptation to change. Invasive species often contribute substantially to local biodiversity; correctly identifying and distinguishing invaders is thus necessary to assess their potential impacts. We compared the reliability of morphology and molecular sequences to discriminate six putative species of invasive Pilosella hawkweeds (syn. Hieracium, Asteraceae), known for unreliable identifications and historical introgression. We asked (1) which morphological traits dependably discriminate putative species, (2) if genetic clusters supported morphological species, and (3) if novel hybridizations occur in the invaded range.• We assessed 33 morphometric characters for their discriminatory power using the randomForest classifier and, using AFLPs, evaluated genetic clustering with the program structure and subsequently with an AMOVA. The strength of the association between morphological and genotypic dissimilarity was assessed with a Mantel test.• Morphometric analyses delimited six species while genetic analyses defined only four clusters. Specifically, we found (1) eight morphological traits could reliably distinguish species, (2) structure suggested strong genetic differentiation but for only four putative species clusters, and (3) genetic data suggest both novel hybridizations and multiple introductions have occurred.• (1) Traditional floristic techniques may resolve more species than molecular analyses in taxonomic groups subject to introgression. (2) Even within complexes of closely related species, relatively few but highly discerning morphological characters can reliably discriminate species. (3) By clarifying patterns of morphological and genotypic variation of invasive Pilosella, we lay foundations for further ecological study and mitigation. © 2015 Botanical Society of America, Inc.
Shape characteristics of equilibrium and non-equilibrium fractal clusters.
Mansfield, Marc L; Douglas, Jack F
2013-07-28
It is often difficult in practice to discriminate between equilibrium and non-equilibrium nanoparticle or colloidal-particle clusters that form through aggregation in gas or solution phases. Scattering studies often permit the determination of an apparent fractal dimension, but both equilibrium and non-equilibrium clusters in three dimensions frequently have fractal dimensions near 2, so that it is often not possible to discriminate on the basis of this geometrical property. A survey of the anisotropy of a wide variety of polymeric structures (linear and ring random and self-avoiding random walks, percolation clusters, lattice animals, diffusion-limited aggregates, and Eden clusters) based on the principal components of both the radius of gyration and electric polarizability tensor indicates, perhaps counter-intuitively, that self-similar equilibrium clusters tend to be intrinsically anisotropic at all sizes, while non-equilibrium processes such as diffusion-limited aggregation or Eden growth tend to be isotropic in the large-mass limit, providing a potential means of discriminating these clusters experimentally if anisotropy could be determined along with the fractal dimension. Equilibrium polymer structures, such as flexible polymer chains, are normally self-similar due to the existence of only a single relevant length scale, and are thus anisotropic at all length scales, while non-equilibrium polymer structures that grow irreversibly in time eventually become isotropic if there is no difference in the average growth rates in different directions. There is apparently no proof of these general trends and little theoretical insight into what controls the universal anisotropy in equilibrium polymer structures of various kinds. This is an obvious topic of theoretical investigation, as well as a matter of practical interest. 
To address this general problem, we consider two experimentally accessible ratios, one between the hydrodynamic and gyration radii, the other between the viscosity and hydrodynamic radii, as potential measures of shape anisotropy. We also find a strong correlation between anisotropy and effective fractal dimension. These observations should provide new practical methods for quantifying the nature of particle clustering in diverse contexts.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Moran, James J.; Ehrhardt, Christopher J.; Wahl, Jon H.
We analyzed 21 neat acetone samples from 15 different suppliers to demonstrate the utility of a coupled stable isotope and trace contaminant strategy for distinguishing forensically-relevant samples. By combining these two pieces of orthogonal data we could discriminate all of the acetones that were produced by the 15 different suppliers. Using stable isotope ratios alone, we were able to distinguish 9 acetone samples, while the remaining 12 fell into four clusters with highly similar signatures. Adding trace chemical contaminant information enhanced discrimination to 13 individual acetones with three residual clusters. The acetones within each cluster shared a common manufacturer and might, therefore, not be expected to be resolved. The data presented here demonstrate the power of combining orthogonal data sets to enhance sample fingerprinting and highlight the role disparate data could play in future forensic investigations.
FT-IR microspectroscopy in rapid identification of bacteria in pure and mixed culture
NASA Astrophysics Data System (ADS)
Fontoura, Inglid; Belo, Ricardo; Sakane, Kumiko; Cardoso, Maria Angélica Gargione; Khouri, Sônia; Uehara, Mituo; Raniero, Leandro; Martin, Airton A.
2010-02-01
In recent years FT-IR microspectroscopy has been developed for microbiology analysis and applied successfully in pure cultures of microorganisms to rapidly identify strains of bacteria, yeasts and fungi. The investigation and characterization of mixed cultures of microorganisms is also of growing importance, especially in hospitals, where poly-microbial infections are common. In this work, the rapid identification of bacteria in pure and mixed cultures was studied. The bacteria were obtained from the Institute Oswaldo Cruz culture collection in Brazil. Escherichia coli ATCC 10799 and Staphylococcus aureus ATCC 14456 were analyzed; 3 inoculations were examined in triplicate: Escherichia coli, Staphylococcus aureus and a mixed culture of them. The inoculations were prepared according to McFarland 0.5, incubated at 37 °C for 6 hours, diluted in saline, placed on the CaF2 window and stored for one hour at 50 °C to obtain a thin film. The measurement was performed with Spectrum Spotlight 400 (Perkin-Elmer) equipment in the range of 4000-900 cm-1, with 32 scans, using a transmittance technique with point and image modes. The data were processed (baseline, normalization, calculation of the first derivative followed by 9-point smoothing using a Savitzky-Golay algorithm), a cluster analysis was done by Ward's algorithm, and an excellent discrimination between pure and mixed culture was obtained. Our preliminary results indicate that FT-IR microspectroscopy associated with cluster analysis can be used to discriminate between pure and mixed culture.
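The 9-point Savitzky-Golay smoothing step named in this abstract can be sketched in a few lines. This is a minimal illustration assuming the standard quadratic/cubic 9-point smoothing weights and leaving the four points at each edge untouched; the abstract does not state the polynomial order or edge handling the authors used, and the baseline, normalization and derivative steps are not covered here.

```python
# Standard 9-point quadratic/cubic Savitzky-Golay smoothing weights (they sum to 231)
SG9 = [-21, 14, 39, 54, 59, 54, 39, 14, -21]

def sg_smooth9(y):
    """Smooth a sequence with a 9-point Savitzky-Golay filter (edges left as-is)."""
    half = 4
    out = list(y)
    for i in range(half, len(y) - half):
        window = y[i - half:i + half + 1]
        out[i] = sum(c * v for c, v in zip(SG9, window)) / 231.0
    return out

# A single-point spike is strongly attenuated, while a smooth quadratic
# trend passes through the interior of the filter unchanged.
spike = [0.0] * 4 + [1.0] + [0.0] * 4
quadratic = [float(x * x) for x in range(20)]
smoothed_spike = sg_smooth9(spike)
smoothed_quadratic = sg_smooth9(quadratic)
print(round(smoothed_spike[4], 4))
```

The appeal of this filter for spectra is that it suppresses point-to-point noise without flattening band shapes that are locally well described by a low-order polynomial.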
Identification of pathogenic fungi with an optoelectronic nose
Zhang, Yinan; Askim, Jon R.; Zhong, Wenxuan; Orlean, Peter; Suslick, Kenneth S.
2014-01-01
Human fungal infections have gained recent notoriety following contamination of pharmaceuticals in the compounding process. Such invasive infections are a more serious global problem, especially for immunocompromised patients. While superficial fungal infections are common and generally curable, invasive fungal infections are often life-threatening and much harder to diagnose and treat. Despite the increasing awareness of the situation’s severity, currently available fungal diagnostic methods cannot always meet diagnostic needs, especially for invasive fungal infections. Volatile organic compounds produced by fungi provide an alternative diagnostic approach for identification of fungal strains. We report here an optoelectronic nose based on a disposable colorimetric sensor array capable of rapid differentiation and identification of pathogenic fungi based on their metabolic profiles of emitted volatiles. The sensor arrays were tested with 12 human pathogenic fungal strains grown on standard agar medium. Array responses were monitored with an ordinary flatbed scanner. All fungal strains gave unique composite responses within 3 hours and were correctly clustered using hierarchical cluster analysis. A standard jackknifed linear discriminant analysis gave a classification accuracy of 94% for 155 trials. Tensor discriminant analysis, which takes better advantage of the high dimensionality of the sensor array data, gave a classification accuracy of 98.1%. The sensor array is also able to observe metabolic changes in growth patterns upon the addition of fungicides, and this provides a facile screening tool for determining fungicide efficacy for various fungal strains in real time. PMID:24570999
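The jackknifed (leave-one-out) accuracy estimate mentioned in this abstract can be sketched as follows. To keep the example self-contained, a nearest-centroid classifier is swapped in for linear discriminant analysis, and the toy feature vectors and labels are invented for illustration; none of this reproduces the authors' data or classifier.

```python
def nearest_centroid_predict(train_X, train_y, x):
    """Predict the label whose class mean is closest to x (stand-in for LDA)."""
    sums, counts = {}, {}
    for row, label in zip(train_X, train_y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, v in enumerate(row):
            acc[j] += v
        counts[label] = counts.get(label, 0) + 1
    best_label, best_dist = None, float("inf")
    for label in sorted(sums):
        centroid = [s / counts[label] for s in sums[label]]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def leave_one_out_accuracy(X, y):
    """Hold out each trial in turn, train on the rest, and score the held-out trial."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if nearest_centroid_predict(train_X, train_y, X[i]) == y[i]:
            correct += 1
    return correct / len(X)

# Hypothetical two-feature responses for two well-separated strains
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
y = ["A", "A", "A", "B", "B", "B"]
accuracy = leave_one_out_accuracy(X, y)
print(accuracy)  # prints: 1.0
```

Because every trial is classified by a model that never saw it, the resulting accuracy is a less optimistic estimate than one computed on the training data itself.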
Exploring Game Performance in the National Basketball Association Using Player Tracking Data
Sampaio, Jaime; McGarry, Tim; Calleja-González, Julio; Jiménez Sáiz, Sergio; Schelling i del Alcázar, Xavi; Balciunas, Mindaugas
2015-01-01
Recent player tracking technology provides new information about basketball game performance. The aim of this study was to (i) compare the game performances of all-star and non all-star basketball players from the National Basketball Association (NBA), and (ii) describe the different basketball game performance profiles based on the different game roles. Archival data were obtained from all 2013-2014 regular season games (n = 1230). The variables analyzed included the points per game, minutes played and the game actions recorded by the player tracking system. To accomplish the first aim, the performance per minute of play was analyzed using a descriptive discriminant analysis to identify which variables best predict the all-star and non all-star playing categories. The all-star players showed slower velocities in defense and performed better in elbow touches, defensive rebounds, close touches, close points and pull-up points, possibly due to optimized attention processes that are key for perceiving the required appropriate environmental information. The second aim was addressed using a k-means cluster analysis, with the aim of creating maximally different performance profile groupings. Afterwards, a descriptive discriminant analysis identified which variables best predict the different playing clusters. The results identified different playing profiles of performers, particularly related to the game roles of scoring, passing, defensive and all-round game behavior. Coaching staffs may apply this information to different players, while accounting for individual differences and functional variability, to optimize practice planning and, consequently, the game performances of individuals and teams. PMID:26171606
Identification of Alfalfa Leaf Diseases Using Image Recognition Technology
Qin, Feng; Liu, Dongxia; Sun, Bingda; Ruan, Liu; Ma, Zhanhong; Wang, Haiguang
2016-01-01
Common leaf spot (caused by Pseudopeziza medicaginis), rust (caused by Uromyces striatus), Leptosphaerulina leaf spot (caused by Leptosphaerulina briosiana) and Cercospora leaf spot (caused by Cercospora medicaginis) are the four common types of alfalfa leaf diseases. Timely and accurate diagnoses of these diseases are critical for disease management, alfalfa quality control and the healthy development of the alfalfa industry. In this study, the identification and diagnosis of the four types of alfalfa leaf diseases were investigated using pattern recognition algorithms based on image-processing technology. A sub-image with one or multiple typical lesions was obtained by artificial cutting from each acquired digital disease image. Then the sub-images were segmented using twelve lesion segmentation methods integrated with clustering algorithms (including K_means clustering, fuzzy C-means clustering and K_median clustering) and supervised classification algorithms (including logistic regression analysis, Naive Bayes algorithm, classification and regression tree, and linear discriminant analysis). After a comprehensive comparison, the segmentation method integrating the K_median clustering algorithm and linear discriminant analysis was chosen to obtain lesion images. After the lesion segmentation using this method, a total of 129 texture, color and shape features were extracted from the lesion images. Based on the features selected using three methods (ReliefF, 1R and correlation-based feature selection), disease recognition models were built using three supervised learning methods, including the random forest, support vector machine (SVM) and K-nearest neighbor methods. A comparison of the recognition results of the models was conducted. The results showed that when the ReliefF method was used for feature selection, the SVM model built with the most important 45 features (selected from a total of 129 features) was the optimal model. 
For this SVM model, the recognition accuracies of the training set and the testing set were 97.64% and 94.74%, respectively. Semi-supervised models for disease recognition were built based on the 45 effective features that were used for building the optimal SVM model. For the optimal semi-supervised models built with three ratios of labeled to unlabeled samples in the training set, the recognition accuracies of the training set and the testing set were both approximately 80%. The results indicated that image recognition of the four alfalfa leaf diseases can be implemented with high accuracy. This study provides a feasible solution for lesion image segmentation and image recognition of alfalfa leaf disease. PMID:27977767
Identification of Alfalfa Leaf Diseases Using Image Recognition Technology.
Qin, Feng; Liu, Dongxia; Sun, Bingda; Ruan, Liu; Ma, Zhanhong; Wang, Haiguang
2016-01-01
Common leaf spot (caused by Pseudopeziza medicaginis), rust (caused by Uromyces striatus), Leptosphaerulina leaf spot (caused by Leptosphaerulina briosiana) and Cercospora leaf spot (caused by Cercospora medicaginis) are the four common types of alfalfa leaf diseases. Timely and accurate diagnoses of these diseases are critical for disease management, alfalfa quality control and the healthy development of the alfalfa industry. In this study, the identification and diagnosis of the four types of alfalfa leaf diseases were investigated using pattern recognition algorithms based on image-processing technology. A sub-image with one or multiple typical lesions was obtained by artificial cutting from each acquired digital disease image. Then the sub-images were segmented using twelve lesion segmentation methods integrated with clustering algorithms (including K_means clustering, fuzzy C-means clustering and K_median clustering) and supervised classification algorithms (including logistic regression analysis, Naive Bayes algorithm, classification and regression tree, and linear discriminant analysis). After a comprehensive comparison, the segmentation method integrating the K_median clustering algorithm and linear discriminant analysis was chosen to obtain lesion images. After the lesion segmentation using this method, a total of 129 texture, color and shape features were extracted from the lesion images. Based on the features selected using three methods (ReliefF, 1R and correlation-based feature selection), disease recognition models were built using three supervised learning methods, including the random forest, support vector machine (SVM) and K-nearest neighbor methods. A comparison of the recognition results of the models was conducted. The results showed that when the ReliefF method was used for feature selection, the SVM model built with the most important 45 features (selected from a total of 129 features) was the optimal model. 
For this SVM model, the recognition accuracies of the training set and the testing set were 97.64% and 94.74%, respectively. Semi-supervised models for disease recognition were built based on the 45 effective features that were used for building the optimal SVM model. For the optimal semi-supervised models built with three ratios of labeled to unlabeled samples in the training set, the recognition accuracies of the training set and the testing set were both approximately 80%. The results indicated that image recognition of the four alfalfa leaf diseases can be implemented with high accuracy. This study provides a feasible solution for lesion image segmentation and image recognition of alfalfa leaf disease.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Berman, Benjamin P.; Pfeiffer, Barret D.; Laverty, Todd R.
2004-08-06
Background: The identification of sequences that control transcription in metazoans is a major goal of genome analysis. In a previous study, we demonstrated that searching for clusters of predicted transcription factor binding sites could discover active regulatory sequences, and identified 37 regions of the Drosophila melanogaster genome with high densities of predicted binding sites for five transcription factors involved in anterior-posterior embryonic patterning. Nine of these clusters overlapped known enhancers. Here, we report the results of in vivo functional analysis of 27 remaining clusters. Results: We generated transgenic flies carrying each cluster attached to a basal promoter and reporter gene, and assayed embryos for reporter gene expression. Six clusters are enhancers of adjacent genes: giant, fushi tarazu, odd-skipped, nubbin, squeeze and pdm2; three drive expression in patterns unrelated to those of neighboring genes; the remaining 18 do not appear to have enhancer activity. We used the Drosophila pseudoobscura genome to compare patterns of evolution in and around the 15 positive and 18 false-positive predictions. Although conservation of primary sequence cannot distinguish true from false positives, conservation of binding-site clustering accurately discriminates functional binding-site clusters from those with no function. We incorporated conservation of binding-site clustering into a new genome-wide enhancer screen, and predict several hundred new regulatory sequences, including 85 adjacent to genes with embryonic patterns. Conclusions: Measuring conservation of sequence features closely linked to function - such as binding-site clustering - makes better use of comparative sequence data than commonly used methods that examine only sequence identity.
Chaouch, Melek; Fathallah-Mili, Akila; Driss, Mehdi; Lahmadi, Ramzi; Ayari, Chiraz; Guizani, Ikram; Ben Said, Moncef; Benabderrazak, Souha
2013-03-01
Discrimination of the Old World Leishmania parasites is important for diagnosis and epidemiological studies of leishmaniasis. We have developed PCR assays that discriminate between the Tunisian species Leishmania major, Leishmania tropica and Leishmania infantum. The identification was performed by a simple PCR targeting cysteine protease B (cpb) gene copies. These PCR assays can serve as routine molecular biology tools for discrimination of Leishmania spp. from different geographical origins and different clinical forms. Our assays can be an informative source for studying the cpb gene in drug, diagnostics and vaccine research. The PCR products of the cpb gene and the N-acetylglucosamine-1-phosphate transferase (nagt) Leishmania gene were sequenced and aligned. Phylogenetic trees of Leishmania based on cpb and nagt sequences are close in topology and present the classic distribution of Leishmania in the Old World. The phylogenetic analysis has enabled the characterization and identification of different strains, using both multicopy (cpb) and single copy (nagt) genes. Indeed, the cpb phylogenetic analysis allowed us to identify the Tunisian Leishmania killicki species, and a group that gathers the least evolved isolates of the Leishmania donovani complex, which originated from East Africa. This clustering confirms the African origin of the visceralizing species of the L. donovani complex. Copyright © 2012 Elsevier B.V. All rights reserved.
Buddhachat, Kittisak; Thitaram, Chatchote; Brown, Janine L.; Klinhom, Sarisa; Bansiddhi, Pakkanut; Penchart, Kitichaya; Ouitavon, Kanita; Sriaksorn, Khanittha; Pa-in, Chalermpol; Kanchanasaka, Budsabong; Somgird, Chaleamchat; Nganvongpanit, Korakot
2016-01-01
We describe the use of handheld X-ray fluorescence (XRF) for elephant tusk species identification. Asian (n = 72) and African (n = 85) elephant tusks were scanned, and we utilized the species differences in elemental composition to develop a functional model differentiating between species with high precision. Spatially, the majority of measured elements (n = 26) exhibited a homogeneous distribution in cross-section, but a more heterogeneous pattern in the longitudinal direction. Twenty-one of twenty-four elements differed between Asian and African samples. Data were subjected to hierarchical cluster analysis followed by a stepwise discriminant analysis, which identified elements for the functional equation. The best equation consisted of ratios of Si, S, Cl, Ti, Mn, Ag, Sb and W, with Zr as the denominator. Next, Bayesian binary regression model analysis was conducted to predict the probability that a tusk would be of African origin. A cut-off value was established to improve discrimination. This Bayesian hybrid classification model was then validated by scanning an additional 30 Asian and 41 African tusks, which showed high accuracy (94%) and precision (95%) rates. We conclude that handheld XRF is an accurate, non-invasive method to discriminate the origin of elephant tusks that provides rapid results applicable to use in the field. PMID:27097717
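As a reading aid for the accuracy and precision rates quoted above, the following is a minimal sketch of how such rates are conventionally computed from a binary confusion matrix. It assumes the standard textbook definitions, which may differ from the ones the authors used; the function name and the counts are hypothetical and are not taken from the study.

```python
def accuracy_and_precision(tp, fp, tn, fn):
    """Return (accuracy, precision) for a binary classifier.

    tp/fp/tn/fn are counts of true positives, false positives,
    true negatives and false negatives for the class treated as "positive".
    """
    accuracy = (tp + tn) / (tp + fp + tn + fn)   # share of all calls that are correct
    precision = tp / (tp + fp)                   # share of positive calls that are correct
    return accuracy, precision

# Hypothetical counts for illustration only (NOT the study's validation data).
acc, prec = accuracy_and_precision(tp=45, fp=5, tn=40, fn=10)
```

With these made-up counts, 85 of 100 calls are correct and 45 of 50 positive calls are correct.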
Willard, Melissa A Bodnar; McGuffin, Victoria L; Smith, Ruth Waddell
2012-01-01
Salvia divinorum is a hallucinogenic herb that is internationally regulated. In this study, salvinorin A, the active compound in S. divinorum, was extracted from S. divinorum plant leaves using a 5-min extraction with dichloromethane. Four additional Salvia species (Salvia officinalis, Salvia guaranitica, Salvia splendens, and Salvia nemorosa) were extracted using this procedure, and all extracts were analyzed by gas chromatography-mass spectrometry. Differentiation of S. divinorum from other Salvia species was successful based on visual assessment of the resulting chromatograms. To provide a more objective comparison, the total ion chromatograms (TICs) were subjected to principal components analysis (PCA). Prior to PCA, the TICs were subjected to a series of data pretreatment procedures to minimize non-chemical sources of variance in the data set. Successful discrimination of S. divinorum from the other four Salvia species was possible based on visual assessment of the PCA scores plot. To provide a numerical assessment of the discrimination, a series of statistical procedures such as Euclidean distance measurement, hierarchical cluster analysis, Student's t tests, Wilcoxon rank-sum tests, and Pearson product moment correlation were also applied to the PCA scores. The statistical procedures were then compared to determine the advantages and disadvantages for forensic applications.
NASA Astrophysics Data System (ADS)
Buddhachat, Kittisak; Thitaram, Chatchote; Brown, Janine L.; Klinhom, Sarisa; Bansiddhi, Pakkanut; Penchart, Kitichaya; Ouitavon, Kanita; Sriaksorn, Khanittha; Pa-in, Chalermpol; Kanchanasaka, Budsabong; Somgird, Chaleamchat; Nganvongpanit, Korakot
2016-04-01
We describe the use of handheld X-ray fluorescence (XRF) for elephant tusk species identification. Asian (n = 72) and African (n = 85) elephant tusks were scanned, and we utilized the species differences in elemental composition to develop a functional model differentiating between species with high precision. Spatially, the majority of measured elements (n = 26) exhibited a homogeneous distribution in cross-section, but a more heterogeneous pattern in the longitudinal direction. Twenty-one of twenty-four elements differed between Asian and African samples. Data were subjected to hierarchical cluster analysis followed by a stepwise discriminant analysis, which identified elements for the functional equation. The best equation consisted of ratios of Si, S, Cl, Ti, Mn, Ag, Sb and W, with Zr as the denominator. Next, Bayesian binary regression model analysis was conducted to predict the probability that a tusk would be of African origin. A cut-off value was established to improve discrimination. This Bayesian hybrid classification model was then validated by scanning an additional 30 Asian and 41 African tusks, which showed high accuracy (94%) and precision (95%) rates. We conclude that handheld XRF is an accurate, non-invasive method to discriminate the origin of elephant tusks that provides rapid results applicable to use in the field.
NASA Astrophysics Data System (ADS)
Basalto, Nicolas; Bellotti, Roberto; de Carlo, Francesco; Facchi, Paolo; Pantaleo, Ester; Pascazio, Saverio
2008-10-01
A clustering algorithm based on the Hausdorff distance is analyzed and compared to the single, complete, and average linkage algorithms. The four clustering procedures are applied to a toy example and to the time series of financial data. The dendrograms are scrutinized and their features compared. The Hausdorff linkage relies on firm mathematical grounds and turns out to be very effective when one has to discriminate among complex structures.
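The idea of a Hausdorff-based linkage can be sketched as follows. This is only an illustrative reading of the abstract, assuming that the distance between two clusters is taken to be the symmetric Hausdorff distance between their point sets and that clusters are merged greedily; the function names and the toy points are hypothetical, and the published algorithm may differ in its details.

```python
import math

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty sets of points."""
    def directed(p, q):
        # Largest distance from a point of p to its nearest neighbour in q.
        return max(min(math.dist(x, y) for y in q) for x in p)
    return max(directed(a, b), directed(b, a))

def hausdorff_linkage(points, n_clusters):
    """Naive agglomerative clustering: repeatedly merge the two clusters
    with the smallest Hausdorff distance until n_clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda ij: hausdorff(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters

# Made-up toy data: two visibly separated groups in the plane.
toy = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0)]
result = hausdorff_linkage(toy, 2)
```

Single, complete and average linkage differ only in the cluster-to-cluster distance passed to the merge step, so the same loop could be used to compare them on the same data.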
Using SVD on Clusters to Improve Precision of Interdocument Similarity Measure.
Zhang, Wen; Xiao, Fan; Li, Bin; Zhang, Siguang
2016-01-01
Recently, LSI (Latent Semantic Indexing) based on SVD (Singular Value Decomposition) was proposed to overcome the problems of polysemy and homonymy in traditional lexical matching. However, it is often criticized for low discriminative power in representing documents, although it has been validated as having good representative quality. In this paper, SVD on clusters is proposed to improve the discriminative power of LSI. The contribution of this paper is threefold. First, we survey existing linear algebra methods for LSI, including both SVD-based and non-SVD-based methods. Second, we propose SVD on clusters for LSI and explain theoretically that dimension expansion of document vectors and dimension projection using SVD are the two manipulations involved in SVD on clusters. Moreover, we develop updating processes to fold new documents and terms into a matrix decomposed by SVD on clusters. Third, two corpora, a Chinese corpus and an English corpus, are used to evaluate the performance of the proposed methods. Experiments demonstrate that, to some extent, SVD on clusters can improve the precision of the interdocument similarity measure in comparison with other SVD-based LSI methods.
Using SVD on Clusters to Improve Precision of Interdocument Similarity Measure
Xiao, Fan; Li, Bin; Zhang, Siguang
2016-01-01
Recently, LSI (Latent Semantic Indexing) based on SVD (Singular Value Decomposition) was proposed to overcome the problems of polysemy and homonymy in traditional lexical matching. However, it is often criticized for low discriminative power in representing documents, although it has been validated as having good representative quality. In this paper, SVD on clusters is proposed to improve the discriminative power of LSI. The contribution of this paper is threefold. First, we survey existing linear algebra methods for LSI, including both SVD-based and non-SVD-based methods. Second, we propose SVD on clusters for LSI and explain theoretically that dimension expansion of document vectors and dimension projection using SVD are the two manipulations involved in SVD on clusters. Moreover, we develop updating processes to fold new documents and terms into a matrix decomposed by SVD on clusters. Third, two corpora, a Chinese corpus and an English corpus, are used to evaluate the performance of the proposed methods. Experiments demonstrate that, to some extent, SVD on clusters can improve the precision of the interdocument similarity measure in comparison with other SVD-based LSI methods. PMID:27579031
Facial measurement differences between patients with schizophrenia and non-psychiatric controls.
Compton, Michael T; Brudno, Jennifer; Kryda, Aimee D; Bollini, Annie M; Walker, Elaine F
2007-07-01
Several previous reports suggest that facial measurements in patients with schizophrenia differ from those of non-psychiatric controls. Because the face and brain develop in concert from the same ectodermal tissue, the study of quantitative craniofacial abnormalities may give clues to genetic and/or environmental factors predisposing to schizophrenia. Using a predominantly African American sample, the present research question was two-fold: (1) Do patients differ from controls in terms of a number of specific facial measurements?, and (2) Does cluster analysis based on these facial measurements reveal distinct facial morphologies that significantly discriminate patients from controls? Facial dimensions were measured in 73 patients with schizophrenia and related psychotic disorders (42 males and 31 females) and 69 non-psychiatric controls (35 males and 34 females) using a 25-cm head and neck caliper. Due to differences in facial dimensions by gender, separate independent samples Student's t-tests and logistic regression analyses were employed to discern differences in facial measures between the patient and control groups in women and men. Findings were further explored using cluster analysis. Given an association between age and some facial dimensions, the effect of age was controlled. In unadjusted bivariate tests, female patients differed from female controls on several facial dimensions, though male patients did not differ significantly from male controls for any facial measure. Controlling for age using logistic regression, female patients had a greater mid-facial depth (tragus-subnasale) compared to female controls; male patients had lesser upper facial (trichion-glabella) and lower facial (subnasale-gnathion) heights compared to male controls. 
Among females, cluster analysis revealed two facial morphologies that significantly discriminated patients from controls, though this finding was not evident when employing further cluster analyses using secondary distance measures. When the sample was restricted to African Americans, results were similar and consistent. These findings indicate that, in a predominantly African American sample, some facial measurements differ between patients with schizophrenia and non-psychiatric controls, and these differences appear to be gender-specific. Further research on gender-specific quantitative craniofacial measurement differences between cases and controls could suggest gender-specific differences in embryologic/fetal neurodevelopmental processes underpinning schizophrenia.
Enhancement of Plant Metabolite Fingerprinting by Machine Learning
Scott, Ian M.; Vermeer, Cornelia P.; Liakata, Maria; Corol, Delia I.; Ward, Jane L.; Lin, Wanchang; Johnson, Helen E.; Whitehead, Lynne; Kular, Baldeep; Baker, John M.; Walsh, Sean; Dave, Anuja; Larson, Tony R.; Graham, Ian A.; Wang, Trevor L.; King, Ross D.; Draper, John; Beale, Michael H.
2010-01-01
Metabolite fingerprinting of Arabidopsis (Arabidopsis thaliana) mutants with known or predicted metabolic lesions was performed by 1H-nuclear magnetic resonance, Fourier transform infrared, and flow injection electrospray-mass spectrometry. Fingerprinting enabled processing of five times more plants than conventional chromatographic profiling and was competitive for discriminating mutants, other than those affected in only low-abundance metabolites. Despite their rapidity and complexity, fingerprints yielded metabolomic insights (e.g. that effects of single lesions were usually not confined to individual pathways). Among fingerprint techniques, 1H-nuclear magnetic resonance discriminated the most mutant phenotypes from the wild type and Fourier transform infrared discriminated the fewest. To maximize information from fingerprints, data analysis was crucial. One-third of distinctive phenotypes might have been overlooked had data models been confined to principal component analysis score plots. Among several methods tested, machine learning (ML) algorithms, namely support vector machine or random forest (RF) classifiers, were unsurpassed for phenotype discrimination. Support vector machines were often the best performing classifiers, but RFs yielded some particularly informative measures. First, RFs estimated margins between mutant phenotypes, whose relations could then be visualized by Sammon mapping or hierarchical clustering. Second, RFs provided importance scores for the features within fingerprints that discriminated mutants. These scores correlated with analysis of variance F values (as did Kruskal-Wallis tests, true- and false-positive measures, mutual information, and the Relief feature selection algorithm). ML classifiers, as models trained on one data set to predict another, were ideal for focused metabolomic queries, such as the distinctiveness and consistency of mutant phenotypes. Accessible software for use of ML in plant physiology is highlighted. 
PMID:20566707
Identification of Hard X-ray Sources in Galactic Globular Clusters: Simbol-X Simulations
NASA Astrophysics Data System (ADS)
Servillat, M.
2009-05-01
Globular clusters harbour an excess of X-ray sources compared to the number of X-ray sources in the Galactic plane. It has been proposed that many of these X-ray sources are cataclysmic variables that have an intermediate magnetic field, i.e. intermediate polars, which remains to be confirmed and understood. We present here several methods to identify intermediate polars in globular clusters from multiwavelength analysis. First, we report on XMM-Newton, Chandra and HST observations of the very dense Galactic globular cluster NGC 2808. By comparing UV and X-ray properties of the cataclysmic variable candidates, the fraction of intermediate polars in this cluster can be estimated. We also present the optical spectra of two cataclysmic variables in the globular cluster M 22. The He II (4686 Å) emission line in these spectra could be related to the presence of a magnetic field in these objects. Simulations of Simbol-X observations indicate that the angular resolution is sufficient to study X-ray sources in the core of close, less dense globular clusters, such as M 22. The sensitivity of Simbol-X in an extended energy band up to 80 keV will allow us to discriminate between hard X-ray sources (such as magnetic cataclysmic variables) and soft X-ray sources (such as chromospherically active binaries).
Figueira, José; Câmara, Hugo; Pereira, Jorge; Câmara, José S
2014-02-15
To gain insight into the effect of cultivar on the volatile metabolomic expression of different tomato (Lycopersicon esculentum L.) cultivars (Plum, Campari, Grape, Cherry and Regional) grown under similar edaphoclimatic conditions, and to identify the most discriminating volatile marker metabolites related to the cultivar, the chromatographic profiles resulting from headspace solid phase microextraction (HS-SPME) and gas chromatography-mass spectrometry (GC-qMS) analysis were investigated in combination with multivariate analysis. The data set composed of the 77 volatile metabolites identified in the target tomato cultivars, 5 of which (2,2,6-trimethylcyclohexanone, 2-methyl-6-methyleneoctan-2-ol, 4-octadecyl-morpholine, (Z)-methyl-3-hexenoate and 3-octanone) are reported for the first time in tomato volatile metabolomic composition, was evaluated by chemometrics. First, principal component analysis was carried out to visualise data trends and clusters, and then linear discriminant analysis was used to detect the set of volatile metabolites able to differentiate groups according to tomato cultivar. The results revealed a perfect discrimination between the different Lycopersicon esculentum L. cultivars considered. The assignment success rate was 100% in classification and 80% in prediction ability using a "leave-one-out" cross-validation procedure. The volatile profile was able to differentiate all five cultivars and revealed complex interactions between them, including participation in the same biosynthetic pathway. The volatile metabolomic platform for tomato samples obtained by HS-SPME/GC-qMS described here, and the interrelationships detected among the volatile metabolites, can be used as a roadmap for biotechnological applications, namely to improve tomato aroma and its acceptance by the final consumer, and for traceability studies. Copyright © 2013 Elsevier Ltd. All rights reserved.
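The "leave-one-out" procedure mentioned above can be illustrated with a short sketch. It assumes a nearest-centroid classifier as a simple stand-in for the linear discriminant analysis used in the study, and it runs on made-up two-feature data; all names and values are hypothetical.

```python
import math

def nearest_centroid_predict(train_x, train_y, sample):
    """Assign `sample` to the class whose mean feature vector is closest."""
    groups = {}
    for x, y in zip(train_x, train_y):
        groups.setdefault(y, []).append(x)
    centroids = {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in groups.items()
    }
    return min(centroids, key=lambda label: math.dist(centroids[label], sample))

def leave_one_out_accuracy(xs, ys):
    """Hold out each sample in turn, train on the rest, and score the prediction."""
    hits = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if nearest_centroid_predict(train_x, train_y, xs[i]) == ys[i]:
            hits += 1
    return hits / len(xs)

# Made-up data for two hypothetical cultivars "A" and "B" (not from the study).
xs = [[1.0, 1.1], [0.9, 1.0], [1.2, 0.8], [4.0, 4.2], [3.8, 4.1], [4.1, 3.9]]
ys = ["A", "A", "A", "B", "B", "B"]
loo = leave_one_out_accuracy(xs, ys)
```

Because each sample is scored by a model that never saw it, the leave-one-out rate is usually lower than the rate obtained by classifying the training data itself, which matches the gap between the two figures reported in the abstract.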
Li, Yan; Zhang, Ji; Zhao, Yanli; Liu, Honggao; Wang, Yuanzhong; Jin, Hang
2016-01-01
In this study the geographical differentiation of dried sclerotia of the medicinal mushroom Wolfiporia extensa, obtained from different regions in Yunnan Province, China, was explored using Fourier-transform infrared (FT-IR) spectroscopy coupled with multivariate data analysis. The FT-IR spectra of 97 samples were obtained for wave numbers ranging from 4000 to 400 cm⁻¹. Then, the fingerprint region of 1800-600 cm⁻¹ of the FT-IR spectrum, rather than the full spectrum, was analyzed. Different pretreatments were applied on the spectra, and a discriminant analysis model based on the Mahalanobis distance was developed to select an optimal pretreatment combination. Two unsupervised pattern recognition procedures, principal component analysis and hierarchical cluster analysis, were applied to enhance the authenticity of discrimination of the specimens. The results showed that excellent classification could be obtained after optimizing spectral pretreatment. The tested samples were successfully discriminated according to their geographical locations. The chemical properties of dried sclerotia of W. extensa were clearly dependent on the mushroom's geographical origins. Furthermore, an interesting finding implied that the elevations of collection areas may have effects on the chemical components of wild W. extensa sclerotia. Overall, this study highlights the feasibility of FT-IR spectroscopy combined with multivariate data analysis in particular for exploring the distinction of different regional W. extensa sclerotia samples. This research could also serve as a basis for the exploitation and utilization of medicinal mushrooms.
Froehle, A W; Kellner, C M; Schoeninger, M J
2012-03-01
Using a sample of published archaeological data, we expand on an earlier bivariate carbon model for diet reconstruction by adding bone collagen nitrogen stable isotope values (δ¹⁵N), which provide information on trophic level and consumption of terrestrial vs. marine protein. The bivariate carbon model (δ¹³C(apatite) vs. δ¹³C(collagen)) provides detailed information on the isotopic signatures of whole diet and dietary protein, but is limited in its ability to distinguish between C₄ and marine protein. Here, using cluster analysis and discriminant function analysis, we generate a multivariate diet reconstruction model that incorporates δ¹³C(apatite), δ¹³C(collagen), and δ¹⁵N holistically. Inclusion of the δ¹⁵N data proves useful in resolving protein-related limitations of the bivariate carbon model, and splits the sample into five distinct dietary clusters. Two significant discriminant functions account for 98.8% of the sample variance, providing a multivariate model for diet reconstruction. Both carbon variables dominate the first function, while δ¹⁵N most strongly influences the second. Independent support for the functions' ability to accurately classify individuals according to diet comes from a small sample of experimental rats, which cluster as expected from their diets. The new model also provides a statistical basis for distinguishing between food sources with similar isotopic signatures, as in a previously analyzed archaeological population from Saipan (see Ambrose et al.: AJPA 104(1997) 343-361). Our model suggests that the Saipan islanders' ¹³C-enriched signal derives mainly from sugarcane, not seaweed. Further development and application of this model can similarly improve dietary reconstructions in archaeological, paleontological, and primatological contexts. Copyright © 2011 Wiley Periodicals, Inc.
Kleinman, Ana; Caetano, Sheila Cavalcante; Brentani, Helena; Rocca, Cristiana Castanho de Almeida; dos Santos, Bernardo; Andrade, Enio Roberto; Zeni, Cristian Patrick; Tramontina, Silzá; Rohde, Luis Augusto Paim; Lafer, Beny
2015-03-01
The National Institute of Mental Health has initiated the Research Domain Criteria (RDoC) project. Instead of using disorder categories as the basis for grouping individuals, the RDoC suggests finding relevant dimensions that can cut across traditional disorders. Our aim was to use the RDoC's framework to study patterns of attention deficit based on results of Conners' Continuous Performance Test (CPT II) in youths diagnosed with bipolar disorder (BD), attention-deficit/hyperactivity disorder (ADHD), BD+ADHD and controls. Eighteen healthy controls, 23 patients with ADHD, 10 with BD and 33 BD+ADHD aged 12-17 years old were assessed. Pattern recognition was used to partition subjects into clusters based simultaneously on their performance in all CPT II variables. A Fisher's linear discriminant analysis was used to build a classifier. Using cluster analysis, the entire sample set was best clustered into two new groups, A and B, independently of the original diagnoses. ADHD and BD+ADHD were divided almost 50% in each subgroup, and there was an agglomeration of controls and BD in group B. Group A presented a greater impairment with higher means in all CPT II variables and lower Children's Global Assessment Scale. We found a high cross-validated classification accuracy for groups A and B: 95.2%. Variability of response time was the strongest CPT II measure in the discriminative pattern between groups A and B. Our classificatory exercise supports the concept behind new approaches, such as the RDoC framework, for child and adolescent psychiatry. Our approach was able to define clinical subgroups that could be used in future pathophysiological and treatment studies. © The Royal Australian and New Zealand College of Psychiatrists 2014.
Tanaka, Nao; Hasui, Chieko; Uji, Masayo; Hiramura, Hidetoshi; Chen, Zi; Shikai, Noriko; Kitamura, Toshinori
2008-02-01
To identify the psychosocial correlates of adolescents. Unmarried university students (n = 4226) aged 18-23 years were examined in a questionnaire survey. Four clusters of people (indifferent, secure, fearful, and preoccupied) identified by cluster analysis were plotted in 2-D using discriminant function analysis with the first function (father's and mother's Care, Cooperativeness, and family Cohesion on the positive end and Harm Avoidance and father's and mother's Overprotection on the negative end) representing the Self-model and the second function (Reward Dependence and experience of Peer Victimization on the positive end and Self-directedness on the negative end) representing the Other model. These findings partially support Bartholomew's notion that adult attachment is based on the good versus bad representations of the self and the other and that it is influenced by psychosocial environments experienced over the course of development.
Santos, D N; Nunes, C F; Setotaw, T A; Pio, R; Pasqual, M; Cançado, G M A
2016-12-19
Cambuci (Campomanesia phaea) belongs to the Myrtaceae family and is native to the Atlantic Forest of Brazil. It has ecological and social appeal but is exposed to problems associated with environmental degradation and expansion of agricultural activities in the region. Comprehensive studies on this species are rare, making its conservation and genetic improvement difficult. Thus, it is important to develop research activities to understand the current situation of the species as well as to make recommendations for its conservation and use. This study was performed to characterize the cambuci accessions found in the germplasm bank of Coordenadoria de Assistência Técnica Integral using inter-simple sequence repeat markers, with the goal of understanding the plant's population structure. The results showed the existence of some level of genetic diversity among the cambuci accessions that could be exploited for the genetic improvement of the species. Principal coordinate analysis and discriminant analysis clustered the 80 accessions into three groups, whereas Bayesian model-based clustering analysis clustered them into two groups. The formation of two cluster groups and the high membership coefficients within the groups pointed out the importance of further collection to cover more areas and more genetic variability within the species. The study also showed the lack of conservation activities; therefore, more attention from the appropriate organizations is needed to plan and implement natural and ex situ conservation activities.
Menz, Hylton B; Allan, Jamie J; Bonanno, Daniel R; Landorf, Karl B; Murley, George S
2017-01-01
Foot orthoses are widely used in the prevention and treatment of foot disorders. The aim of this study was to describe characteristics of custom-made foot orthosis prescriptions from an Australian podiatric orthotic laboratory. One thousand consecutive foot orthosis prescription forms were obtained from a commercial prescription foot orthosis laboratory located in Melbourne, Victoria, Australia (Footwork Podiatric Laboratory). Each item from the prescription form was documented in relation to orthosis type, cast correction, arch fill technique, cast modifications, shell material, shell modifications and cover material. Cluster analysis and discriminant function analysis were applied to identify patterns in the prescription data. Prescriptions were obtained from 178 clinical practices across Australia and Hong Kong, with patients ranging in age from 5 to 92 years. Three broad categories ('clusters') were observed that were indicative of increasing 'control' of rearfoot pronation. A combination of five variables (rearfoot cast correction, cover shape, orthosis type, forefoot cast correction and plantar fascial accommodation) was able to identify these clusters with an accuracy of 70%. Significant differences between clusters were observed in relation to age and sex of the patient and the geographic location of the prescribing clinician. Foot orthosis prescriptions are complex, but can be broadly classified into three categories. Selection of these prescription subtypes appears to be influenced by both patient factors (age and sex) and clinician factors (clinic location).
Differentiation of lard, chicken fat, beef fat and mutton fat by GCMS and EA-IRMS techniques.
Ahmad Nizar, Nina Naquiah; Nazrim Marikkar, Jalaldeen Mohamed; Hashim, Dzulkifly Mat
2013-01-01
A study was conducted to differentiate lard, chicken fat, beef fat and mutton fat using Gas Chromatography Mass Spectrometry (GC-MS) and Elemental Analyzer-Isotope Ratio Mass Spectrometry (EA-IRMS). The comparison of overall fatty acid data showed that lard and chicken fat share common characteristics by having palmitic, oleic and linoleic acid as major fatty acids, while beef and mutton fats share common characteristics by possessing palmitic, stearic and oleic acid as major fatty acids. Direct comparison of the fatty acid data, therefore, may not be suitable for discrimination of different animal fats. When the fatty acid distributional data were subjected to Principal Component Analysis (PCA), stearic, oleic and linoleic acids were shown to be the most discriminating parameters in the clustering of animal fats into four subclasses. The bulk carbon analysis of animal fats using EA-IRMS showed that determination of the carbon isotope ratios (δ¹³C) would be a good indicator for discriminating lard, chicken fat, beef fat and mutton fat. This would lead to a faster and more efficient method to ascertain the source of origin of fats used in food products.
Seismic Source Scaling and Discrimination in Diverse Tectonic Environments
2008-09-30
...least affected by travel through the earth. But finding well recorded earthquakes with -perfect- EGF events for direct wave analysis is difficult...North America. Each cluster contains a M- 2, and two contain M-3. as well as smaller aftershocks. We find that the corner frequencies and stress
Freitas, Daniela Fonseca; Coimbra, Susana; Marturano, Edna Maria; Marques, Susana C; Oliveira, José Egídio; Fontaine, Anne Marie
2017-08-01
Victimisation has a negative effect on psychosocial functioning. Based on the resilience theory, and with a sample of 2975 Portuguese students, the present study aims to: i) identify patterns of adjustment in the face of peer victimisation and perceptions of discrimination; ii) explore the association between the patterns of adjustment and the characteristics of participants (the who) and of the victimisation (the when and why). Cluster analysis revealed five patterns of adjustment: Unchallenged; Externally Maladjusted; Internally Maladjusted; Resilient, and At-Risk. The results suggest that there is no complete resilience in the face of social victimisation. Group differences were found regarding: i) gender, type of course, sexual orientation, ethnicity, nationality, parental educational level and religious beliefs; ii) the age at which peer victimisation was more frequent, and; iii) the motives underlying discrimination. Globally considered, peer victimisation is representative of the wider cultural environment and interventions should also target social prejudices. Copyright © 2017. Published by Elsevier Ltd.
Linear regression models and k-means clustering for statistical analysis of fNIRS data
Bonomini, Viola; Zucchelli, Lucia; Re, Rebecca; Ieva, Francesca; Spinelli, Lorenzo; Contini, Davide; Paganoni, Anna; Torricelli, Alessandro
2015-01-01
We propose a new algorithm, based on a linear regression model, to statistically estimate the hemodynamic activations in fNIRS data sets. The main concern guiding the algorithm development was the minimization of assumptions and approximations made on the data set for the application of statistical tests. Further, we propose a K-means method to cluster fNIRS data (i.e. channels) as activated or not activated. The methods were validated both on simulated and in vivo fNIRS data. A time domain (TD) fNIRS technique was preferred because of its high performances in discriminating cortical activation and superficial physiological changes. However, the proposed method is also applicable to continuous wave or frequency domain fNIRS data sets. PMID:25780751
Hammond, R W
2003-06-01
Isolates of Prunus necrotic ringspot virus (PNRSV) were examined to establish the level of naturally occurring sequence variation in the coat protein (CP) gene and to identify group-specific genome features that may prove valuable for the generation of diagnostic reagents. Phylogenetic analysis of a 452 bp sequence of 68 virus isolates, 20 obtained from the European Union Ilarvirus Ringtest held in October 1998, confirmed the clustering of the isolates into three distinct groups. Although no correlation was found between the sequence and host or geographic origin, there was a general trend for severe isolates to cluster into one group. Group-specific features have been identified for discrimination between virus strains.
Zhang, Jingjing; Dennis, Todd E.
2015-01-01
We present a simple framework for classifying mutually exclusive behavioural states within the geospatial lifelines of animals. This method involves use of three sequentially applied statistical procedures: (1) behavioural change point analysis to partition movement trajectories into discrete bouts of same-state behaviours, based on abrupt changes in the spatio-temporal autocorrelation structure of movement parameters; (2) hierarchical multivariate cluster analysis to determine the number of different behavioural states; and (3) k-means clustering to classify inferred bouts of same-state location observations into behavioural modes. We demonstrate application of the method by analysing synthetic trajectories of known ‘artificial behaviours’ comprised of different correlated random walks, as well as real foraging trajectories of little penguins (Eudyptula minor) obtained by global-positioning-system telemetry. Our results show that the modelling procedure correctly classified 92.5% of all individual location observations in the synthetic trajectories, demonstrating reasonable ability to successfully discriminate behavioural modes. Most individual little penguins were found to exhibit three unique behavioural states (resting, commuting/active searching, area-restricted foraging), with variation in the timing and locations of observations apparently related to ambient light, bathymetry, and proximity to coastlines and river mouths. Addition of k-means clustering extends the utility of behavioural change point analysis, by providing a simple means through which the behaviours inferred for the location observations comprising individual movement trajectories can be objectively classified. PMID:25922935
Zhang, Jingjing; O'Reilly, Kathleen M; Perry, George L W; Taylor, Graeme A; Dennis, Todd E
2015-01-01
We present a simple framework for classifying mutually exclusive behavioural states within the geospatial lifelines of animals. This method involves use of three sequentially applied statistical procedures: (1) behavioural change point analysis to partition movement trajectories into discrete bouts of same-state behaviours, based on abrupt changes in the spatio-temporal autocorrelation structure of movement parameters; (2) hierarchical multivariate cluster analysis to determine the number of different behavioural states; and (3) k-means clustering to classify inferred bouts of same-state location observations into behavioural modes. We demonstrate application of the method by analysing synthetic trajectories of known 'artificial behaviours' comprised of different correlated random walks, as well as real foraging trajectories of little penguins (Eudyptula minor) obtained by global-positioning-system telemetry. Our results show that the modelling procedure correctly classified 92.5% of all individual location observations in the synthetic trajectories, demonstrating reasonable ability to successfully discriminate behavioural modes. Most individual little penguins were found to exhibit three unique behavioural states (resting, commuting/active searching, area-restricted foraging), with variation in the timing and locations of observations apparently related to ambient light, bathymetry, and proximity to coastlines and river mouths. Addition of k-means clustering extends the utility of behavioural change point analysis, by providing a simple means through which the behaviours inferred for the location observations comprising individual movement trajectories can be objectively classified.
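The third step described above, k-means classification of bouts into behavioural modes, can be sketched as follows. This is a minimal illustration of Lloyd's algorithm in plain Python, not the authors' implementation; the two bout features (mean speed and turning variability), the sample values, and the choice to seed the centroids with the first k points are all assumptions made for the example.

```python
import math


def kmeans(points, k, max_iterations=100):
    """Lloyd's algorithm; centroids are seeded with the first k points (an assumption)."""
    centroids = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the centroids are stable
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centroids


# Hypothetical bouts as (mean speed, turning variability); not data from the study.
bouts = [
    (0.10, 0.20), (5.00, 0.10), (1.00, 2.00),
    (0.20, 0.10), (5.20, 0.20), (1.10, 2.20),
    (0.15, 0.15), (4.90, 0.15), (0.90, 1.90),
]
labels, centroids = kmeans(bouts, k=3)
print(labels)
```

In practice the number of clusters k would come from the preceding hierarchical cluster analysis rather than being fixed by hand as it is here.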
Sadsad, Rosemarie; Martinez, Elena; Jelfs, Peter; Hill-Cawthorne, Grant A.; Gilbert, Gwendolyn L.; Marais, Ben J.; Sintchenko, Vitali
2016-01-01
Background Improved tuberculosis control and the need to contain the spread of drug-resistant strains provide a strong rationale for exploring tuberculosis transmission dynamics at the population level. Whole-genome sequencing provides optimal strain resolution, facilitating detailed mapping of potential transmission pathways. Methods We sequenced 22 isolates from a Mycobacterium tuberculosis cluster in New South Wales, Australia, identified during routine 24-locus mycobacterial interspersed repetitive unit typing. Following high-depth paired-end sequencing using the Illumina HiSeq 2000 platform, two independent pipelines were employed for analysis, both employing read mapping onto reference genomes as well as de novo assembly, to control biases in variant detection. In addition to single-nucleotide polymorphisms, the analyses also sought to identify insertions, deletions and structural variants. Results Isolates were highly similar, with a distance of 13 variants between the most distant members of the cluster. The most sensitive analysis classified the 22 isolates into 18 groups. Four of the isolates did not appear to share a recent common ancestor with the largest clade; another four isolates had an uncertain ancestral relationship with the largest clade. Conclusion Whole genome sequencing, with analysis of single-nucleotide polymorphisms, insertions, deletions, structural variants and subpopulations, enabled the highest possible level of discrimination between cluster members, clarifying likely transmission pathways and exposing the complexity of strain origin. The analysis provides a basis for targeted public health intervention and enhanced classification of future isolates linked to the cluster. PMID:26938641
Babouee, B.; Frei, R.; Schultheiss, E.; Widmer, A. F.; Goldenberger, D.
2011-01-01
The emergence of methicillin-resistant Staphylococcus aureus (MRSA) has become an increasing problem worldwide in recent decades. Molecular typing methods have been developed to identify clonality of strains and monitor spread of MRSA. We compared a new commercially available DiversiLab (DL) repetitive element PCR system with spa typing, spa clonal cluster analysis, and pulsed-field gel electrophoresis (PFGE) in terms of discriminatory power and concordance. A collection of 106 well-defined MRSA strains isolated in our hospital between 1994 and 2006 was analyzed. In addition, we analyzed 6 USA300 strains collected in our institution. DL typing separated the 106 MRSA isolates into 10 distinct clusters and 8 singleton patterns. Clustering analysis into spa clonal complexes resulted in 3 clusters: spa-CC 067/548, spa-CC 008, and spa-CC 012. The discriminatory powers (Simpson's index of diversity) were 0.982, 0.950, 0.846, and 0.757 for PFGE, spa typing, DL typing, and spa clonal clustering, respectively. DL typing and spa clonal clustering showed the highest concordance, calculated by adjusted Rand's coefficients. The 6 USA300 isolates grouped homogeneously into distinct PFGE and DL clusters, and all belonged to spa type t008 and spa-CC 008. Among the three methods, DL proved to be rapid and easy to perform. DL typing qualifies for initial screening during outbreak investigation. However, compared to PFGE and spa typing, DL typing has limited discriminatory power and therefore should be complemented by more discriminative methods in isolates that share identical DL patterns. PMID:21307215
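The discriminatory power quoted above can be illustrated with a short calculation. The sketch below assumes the Hunter-Gaston form of Simpson's index of diversity, D = 1 - Σ n_j(n_j - 1) / (N(N - 1)), which is widely used for typing methods; the abstract does not state which form was applied, and the isolate labels are hypothetical.

```python
from collections import Counter


def simpsons_index_of_diversity(type_labels):
    """Probability that two isolates drawn without replacement have different types."""
    n_total = len(type_labels)
    if n_total < 2:
        raise ValueError("need at least two isolates")
    type_sizes = Counter(type_labels).values()
    same_type_pairs = sum(n * (n - 1) for n in type_sizes)
    return 1.0 - same_type_pairs / (n_total * (n_total - 1))


# Hypothetical typing result for 10 isolates; not data from the study.
isolate_types = ["A", "A", "A", "A", "B", "B", "B", "C", "C", "D"]
diversity = simpsons_index_of_diversity(isolate_types)
print(round(diversity, 3))
```

A value near 1 means the method assigns almost every isolate its own type, while a value near 0 means most isolates fall into a single type.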
Nguyen, Thanh; Khosravi, Abbas; Creighton, Douglas; Nahavandi, Saeid
2014-12-30
Understanding neural functions requires knowledge from analysing electrophysiological data. The process of assigning spikes of a multichannel signal into clusters, called spike sorting, is one of the important problems in such analysis. There have been various automated spike sorting techniques with both advantages and disadvantages regarding accuracy and computational costs. Therefore, developing spike sorting methods that are highly accurate and computationally inexpensive is always a challenge in biomedical engineering practice. An automatic unsupervised spike sorting method is proposed in this paper. The method uses features extracted by the locality preserving projection (LPP) algorithm. These features afterwards serve as inputs for the landmark-based spectral clustering (LSC) method. Gap statistics (GS) is employed to evaluate the number of clusters before the LSC can be performed. The proposed LPP-LSC is a highly accurate and computationally inexpensive spike sorting approach. LPP spike features are very discriminative, thereby boosting the performance of clustering methods. Furthermore, the LSC method exhibits its efficiency when integrated with the cluster evaluator GS. The proposed method's accuracy is approximately 13% superior to that of the benchmark combination of wavelet transformation and superparamagnetic clustering (WT-SPC). Additionally, LPP-LSC computing time is six times less than that of the WT-SPC. LPP-LSC thus demonstrates a win-win spike sorting solution meeting both accuracy and computational cost criteria. LPP and LSC are linear algorithms that help reduce computational burden, and thus their combination can be applied to real-time spike analysis. Copyright © 2014 Elsevier B.V. All rights reserved.
Operational foreshock forecasting: Fifteen years after
NASA Astrophysics Data System (ADS)
Ogata, Y.
2010-12-01
We are concerned with operational forecasting of the probability that events are foreshocks of a forthcoming earthquake that is significantly larger (mainshock). Specifically, we define foreshocks as the preshocks substantially smaller than the mainshock by a magnitude gap of 0.5 or larger. The probability gain of foreshock forecast is extremely high compared to long-term forecast by renewal processes or various alarm-based intermediate-term forecasts because of a large event’s low occurrence rate in a short period and a narrow target region. Thus, it is desired to establish operational foreshock probability forecasting as seismologists have done for aftershocks. When a series of earthquakes occurs in a region, we attempt to discriminate foreshocks from a swarm or mainshock-aftershock sequence. Namely, after real time identification of an earthquake cluster using methods such as the single-link algorithm, the probability is calculated by applying statistical features that discriminate foreshocks from other types of clusters, by considering the events' stronger proximity in time and space and tendency towards chronologically increasing magnitudes. These features were modeled for probability forecasting and the coefficients of the model were estimated in Ogata et al. (1996) for the JMA hypocenter data (M≧4, 1926-1993). Currently, fifteen years have passed since the publication of the above-stated work so that we are able to present the performance and validation of the forecasts (1994-2009) by using the same model. Taking isolated events into consideration, the probability of the first events in a potential cluster being a foreshock varies in a range between 0+% and 10+% depending on their locations. This conditional forecasting performs significantly better than the unconditional (average) foreshock probability of 3.7% throughout the Japan region.
Furthermore, when we have the additional events in a cluster, the forecast probabilities range more widely from nearly 0% to about 40% depending on the discrimination features among the events in the cluster. This conditional forecasting further performs significantly better than the unconditional foreshock probability of 7.3%, which is the average probability of the plural events in the earthquake clusters. Indeed, the frequency ratios of the actual foreshocks are consistent with the forecasted probabilities. Reference: Ogata, Y., Utsu, T. and Katsura, K. (1996). Statistical discrimination of foreshocks from other earthquake clusters, Geophys. J. Int. 127, 17-30.
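The cluster-identification step mentioned above (single-link grouping of events) can be sketched as follows. This is a minimal illustration only: it assumes each event is a point whose coordinates, including time, have already been scaled into one common distance unit, and the threshold and event values are hypothetical. The space-time metric actually used in the cited work is not given in the abstract.

```python
import math


def single_link_clusters(events, max_link_distance):
    """Join events into one cluster whenever a chain of links no longer than
    max_link_distance connects them (single-link rule), using union-find."""
    parent = list(range(len(events)))

    def find(i):
        # Follow parent pointers to the cluster root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(events)):
        for j in range(i + 1, len(events)):
            if math.dist(events[i], events[j]) <= max_link_distance:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(events)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Hypothetical events as (x, y, scaled time); not catalogue data.
events = [
    (0.0, 0.0, 0.0), (1.0, 0.0, 0.5), (2.0, 0.0, 1.0),
    (50.0, 50.0, 0.2), (51.0, 50.0, 0.4),
    (200.0, 0.0, 3.0),
]
clusters = single_link_clusters(events, max_link_distance=2.0)
print(clusters)
```

Note that events 0 and 2 are farther apart than the threshold but still share a cluster because event 1 links them; the last event has no neighbour within the threshold and is returned as an isolated event.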
H, Maulidiani; Khatib, Alfi; Shaari, Khozirah; Abas, Faridah; Shitan, Mahendran; Kneer, Ralf; Neto, Victor; Lajis, Nordin H
2012-01-11
The metabolites of three species of Apiaceae, also known as Pegaga, were analyzed utilizing ¹H NMR spectroscopy and multivariate data analysis. Principal component analysis (PCA) and hierarchical cluster analysis (HCA) resolved the species, Centella asiatica, Hydrocotyle bonariensis, and Hydrocotyle sibthorpioides, into three clusters. The saponins, asiaticoside and madecassoside, along with chlorogenic acids were the metabolites that contributed most to the separation. Furthermore, the effects of growth-lighting conditions on metabolite contents were also investigated. The extracts of C. asiatica grown in full-day light exposure exhibited a stronger radical scavenging activity and contained more triterpenes (asiaticoside and madecassoside), flavonoids, and chlorogenic acids as compared to plants grown in 50% shade. This study established the potential of using a combination of ¹H NMR spectroscopy and multivariate data analyses in differentiating three closely related species and the effects of growth lighting, based on their metabolite contents and identification of the markers contributing to their differences.
Descriptor Fingerprints and Their Application to White Wine Clustering and Discrimination.
NASA Astrophysics Data System (ADS)
Bangov, I. P.; Moskovkina, M.; Stojanov, B. P.
2018-03-01
This study continues the attempt to apply statistical processing to large-scale analytical data. A group of 3898 white wines, each with 11 analytical laboratory benchmarks, was analyzed by a fingerprint similarity search in order to be grouped into separate clusters. A characterization of the wine's quality in each individual cluster was carried out according to individual laboratory parameters.
Low Back Pain Subgroups using Fear-Avoidance Model Measures: Results of a Cluster Analysis
Beneciuk, Jason M.; Robinson, Michael E.; George, Steven Z.
2012-01-01
Objectives The purpose of this secondary analysis was to test the hypothesis that an empirically derived psychological subgrouping scheme based on multiple Fear-Avoidance Model (FAM) constructs would provide additional capabilities for clinical outcomes in comparison to a single FAM construct. Methods Patients (n = 108) with acute or sub-acute low back pain (LBP) enrolled in a clinical trial comparing behavioral physical therapy interventions to classification based physical therapy completed baseline questionnaires for pain catastrophizing (PCS), fear-avoidance beliefs (FABQ-PA, FABQ-W), and patient-specific fear (FDAQ). Clinical outcomes were pain intensity and disability measured at baseline, 4-weeks, and 6-months. A hierarchical agglomerative cluster analysis was used to create distinct cluster profiles among FAM measures and discriminant analysis was used to interpret clusters. Changes in clinical outcomes were investigated with repeated measures ANOVA and differences in results based on cluster membership were compared to FABQ-PA subgrouping used in the original trial. Results Three distinct FAM subgroups (Low Risk, High Specific Fear, and High Fear & Catastrophizing) emerged from cluster analysis. Subgroups differed on baseline pain and disability (p’s<.01) with the High Fear & Catastrophizing subgroup associated with greater pain than the Low Risk subgroup (p<.01) and the greatest disability (p’s<.05). Subgroup × time interactions were detected for both pain and disability (p’s<.05) with the High Fear & Catastrophizing subgroup reporting greater changes in pain and disability than other subgroups (p’s<.05). In contrast, FABQ-PA subgroups used in the original trial were not associated with interactions for clinical outcomes. Discussion These data suggest that subgrouping based on multiple FAM measures may provide additional information on clinical outcomes in comparison to determining subgroup status by FABQ-PA alone. 
Subgrouping methods for patients with LBP should include multiple psychological factors to further explore if patients can be matched with appropriate interventions. PMID:22510537
Children's attitudes toward violence on television.
Hough, K J; Erwin, P G
1997-07-01
Children's attitudes toward television violence were studied. A 47-item questionnaire collecting attitudinal and personal information was administered to 316 children aged 11 to 16 years. Cluster analysis was used to split the participants into two groups based on their attitudes toward television violence. A stepwise discriminant function analysis was performed to determine which personal characteristics would predict group membership. The only significant predictor of attitudes toward violence on television was the amount of television watched on school days (p < .05), but we also found that the impact of other predictor variables may have been mediated by this factor.
Maric, Mark; Harvey, Lauren; Tomcsak, Maren; Solano, Angelique; Bridge, Candice
2017-06-30
In comparison to other violent crimes, sexual assaults suffer from very low prosecution and conviction rates, especially in the absence of DNA evidence. As a result, the forensic community needs to utilize other forms of trace contact evidence, like lubricant evidence, in order to provide a link between the victim and the assailant. In this study, 90 personal bottled and condom lubricants from the three main marketing types, silicone-based, water-based and condoms, were characterized by direct analysis in real time-time of flight mass spectrometry (DART-TOFMS). The instrumental data were analyzed by multivariate statistics including hierarchical cluster analysis, principal component analysis, and linear discriminant analysis. By interpreting the mass spectral data with multivariate statistics, 12 discrete groupings were identified, indicating inherent chemical diversity not only between but within the three main marketing groups. A number of unique chemical markers, both major and minor, were identified, other than the three main chemical components (i.e. PEG, PDMS and nonoxynol-9) currently used for lubricant classification. The data were validated by a stratified 20% withheld cross-validation which demonstrated that there was minimal overlap between the groupings. Based on the groupings identified and unique features of each group, a highly discriminating statistical model was then developed that aims to provide the foundation for the development of a forensic lubricant database that may eventually be applied to casework. Copyright © 2017 John Wiley & Sons, Ltd.
Song, Xiaowei; Wang, Yajun; Tang, Yezhong
2013-01-01
As one of the most conserved genes in vertebrates, FoxP2 is widely involved in a number of important physiological and developmental processes. We systematically studied the evolutionary history and functional adaptations of FoxP2 in teleosts. The duplicated FoxP2 genes (FoxP2a and FoxP2b), which were identified in teleosts using synteny and paralogon analysis on genome databases of eight organisms, were probably generated in the teleost-specific whole genome duplication event. A credible classification with FoxP2, FoxP2a and FoxP2b in phylogenetic reconstructions confirmed the teleost-specific FoxP2 duplication. The unavailability of FoxP2b in Danio rerio suggests that the gene was deleted through nonfunctionalization of the redundant copy after the Otocephala-Euteleostei split. Heterogeneity in evolutionary rates among clusters consisting of FoxP2 in Sarcopterygii (Cluster 1), FoxP2a in Teleostei (Cluster 2) and FoxP2b in Teleostei (Cluster 3), particularly between Clusters 2 and 3, reveals asymmetric functional divergence after the gene duplication. Hierarchical cluster analyses of hydrophobicity profiles demonstrated significant structural divergence among the three clusters with verification of subsequent stepwise discriminant analysis, in which FoxP2 of Leucoraja erinacea and Lepisosteus oculatus were classified into Cluster 1, whereas FoxP2b of Salmo salar was grouped into Cluster 2 rather than Cluster 3. The simulated thermodynamic stability variations of the forkhead box domain (monomer and homodimer) showed remarkable divergence in FoxP2, FoxP2a and FoxP2b clusters. Relaxed purifying selection and positive Darwinian selection probably were complementary driving forces for the accelerated evolution of FoxP2 in ray-finned fishes, especially for the adaptive evolution of FoxP2a and FoxP2b in teleosts subsequent to the teleost-specific gene duplication. PMID:24349554
The Unfolding of LGBT Lives: Key Events Associated With Health and Well-being in Later Life.
Fredriksen-Goldsen, Karen I; Bryan, Amanda E B; Jen, Sarah; Goldsen, Jayn; Kim, Hyun-Jun; Muraco, Anna
2017-02-01
Life events are associated with the health and well-being of older adults. Using the Health Equity Promotion Model, this article explores historical and environmental context as it frames life experiences and adaptation of lesbian, gay, bisexual, and transgender (LGBT) older adults. This was the largest study to date of LGBT older adults to identify life events related to identity development, work, and kin relationships and their associations with health and quality of life (QOL). Using latent profile analysis (LPA), clusters of life events were identified and associations between life event clusters were tested. On average, LGBT older adults first disclosed their identities in their 20s; many experienced job-related discrimination. More had been in opposite-sex marriage than in same-sex marriage. Four clusters emerged: "Retired Survivors" were the oldest and one of the most prevalent groups; "Midlife Bloomers" first disclosed their LGBT identities in mid-40s, on average; "Beleaguered At-Risk" had high rates of job-related discrimination and few social resources; and "Visibly Resourced" had a high degree of identity visibility and were socially and economically advantaged. Clusters differed significantly in mental and physical health and QOL, with the Visibly Resourced faring best and Beleaguered At-Risk faring worst on most indicators; Retired Survivors and Midlife Bloomers showed similar health and QOL. Historical and environmental contexts frame normative and non-normative life events. Future research will benefit from the use of longitudinal data and an assessment of timing and sequencing of key life events in the lives of LGBT older adults. © The Author 2017. Published by Oxford University Press on behalf of The Gerontological Society of America. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
The Unfolding of LGBT Lives: Key Events Associated With Health and Well-being in Later Life
Fredriksen-Goldsen, Karen I.; Bryan, Amanda E. B.; Jen, Sarah; Goldsen, Jayn; Kim, Hyun-Jun; Muraco, Anna
2017-01-01
Purpose of the Study: Life events are associated with the health and well-being of older adults. Using the Health Equity Promotion Model, this article explores historical and environmental context as it frames life experiences and adaptation of lesbian, gay, bisexual, and transgender (LGBT) older adults. Design and Methods: This was the largest study to date of LGBT older adults to identify life events related to identity development, work, and kin relationships and their associations with health and quality of life (QOL). Using latent profile analysis (LPA), clusters of life events were identified and associations between life event clusters were tested. Results: On average, LGBT older adults first disclosed their identities in their 20s; many experienced job-related discrimination. More had been in opposite-sex marriage than in same-sex marriage. Four clusters emerged: “Retired Survivors” were the oldest and one of the most prevalent groups; “Midlife Bloomers” first disclosed their LGBT identities in mid-40s, on average; “Beleaguered At-Risk” had high rates of job-related discrimination and few social resources; and “Visibly Resourced” had a high degree of identity visibility and were socially and economically advantaged. Clusters differed significantly in mental and physical health and QOL, with the Visibly Resourced faring best and Beleaguered At-Risk faring worst on most indicators; Retired Survivors and Midlife Bloomers showed similar health and QOL. Implications: Historical and environmental contexts frame normative and non-normative life events. Future research will benefit from the use of longitudinal data and an assessment of timing and sequencing of key life events in the lives of LGBT older adults. PMID:28087792
Lau, Brian C; Collins, Michael W; Lovell, Mark R
2011-06-01
Concussions affect an estimated 136 000 high school athletes yearly. Computerized neurocognitive testing has been shown to be appropriately sensitive and specific in diagnosing concussions, but no studies have assessed its utility to predict length of recovery. Determining prognosis during subacute recovery after sports concussion will help clinicians more confidently address return-to-play and academic decisions. The purpose was to quantify the prognostic ability of computerized neurocognitive testing in combination with symptoms during the subacute recovery phase from sports-related concussion. Study design: cohort study (prognosis); level of evidence, 2. In sum, 108 male high school football athletes completed a computer-based neurocognitive test battery within 2.23 days of injury and were followed until they returned to play as set by international guidelines. Athletes were grouped into protracted recovery (>14 days; n = 50) or short recovery (≤14 days; n = 58). Separate discriminant function analyses were performed using total symptom score on Post-Concussion Symptom Scale, symptom clusters (migraine, cognitive, sleep, neuropsychiatric), and Immediate Postconcussion Assessment and Cognitive Testing neurocognitive scores (verbal memory, visual memory, reaction time, processing speed). Multiple discriminant function analyses revealed that the combination of 4 symptom clusters and 4 neurocognitive composite scores had the highest sensitivity (65.22%), specificity (80.36%), positive predictive value (73.17%), and negative predictive value (73.80%) in predicting protracted recovery. Discriminant function analyses of total symptoms on the Post-Concussion Symptom Scale alone had a sensitivity of 40.81%; specificity, 79.31%; positive predictive value, 62.50%; and negative predictive value, 61.33%. Discriminant function analyses of the 4 symptom clusters alone had a sensitivity of 46.94%; specificity, 77.20%; positive predictive value, 63.90%; and negative predictive value, 62.86%. Discriminant function analyses of the 4 computerized neurocognitive scores alone had a sensitivity of 53.20%; specificity, 75.44%; positive predictive value, 64.10%; and negative predictive value, 66.15%. The use of computerized neurocognitive testing in conjunction with symptom clusters improves the sensitivity, specificity, positive predictive value, and negative predictive value of predicting protracted recovery compared with each used alone. There is also a net increase in sensitivity of 24.41 percentage points when using neurocognitive testing and symptom clusters together compared with using total symptoms on Post-Concussion Symptom Scale alone.
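The four figures reported for each model all derive from a 2×2 table of predicted versus observed recovery group. The following is a minimal sketch of that arithmetic, not the authors' analysis: the helper name `diagnostic_metrics` and the counts passed to it are hypothetical (the abstract does not report the underlying tables), and only the two sensitivity percentages at the end are taken from the text.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return sensitivity, specificity, PPV and NPV as fractions.

    tp/fn: protracted recoveries predicted correctly / missed.
    tn/fp: short recoveries predicted correctly / flagged as protracted.
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Hypothetical counts chosen only to match the group sizes in the abstract
# (50 protracted, 58 short); they are NOT the study's actual table.
metrics = diagnostic_metrics(tp=33, fn=17, tn=46, fp=12)

# Sensitivity gain quoted in the abstract: combined model versus
# total symptom score alone, using the published percentages.
combined_sensitivity = 65.22
total_symptom_sensitivity = 40.81
gain = round(combined_sensitivity - total_symptom_sensitivity, 2)

for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
print(f"sensitivity gain: {gain} percentage points")
```

The gain is a difference between two percentages, so it is most naturally read in percentage points rather than as a relative increase.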
The Node Deployment of Intelligent Sensor Networks Based on the Spatial Difference of Farmland Soil.
Liu, Naisen; Cao, Weixing; Zhu, Yan; Zhang, Jingchao; Pang, Fangrong; Ni, Jun
2015-11-11
Considering that agricultural production is characterized by vast areas, scattered fields and long crop growth cycles, intelligent wireless sensor networks (WSNs) are suitable for monitoring crop growth information. Cost and coverage are the key indexes for WSN applications. The differences in crop conditions are influenced by the spatial distribution of soil nutrients. If the nutrients are distributed evenly, the crop conditions are expected to be approximately uniform; otherwise, crop conditions will differ greatly. In accordance with the differences in the spatial distribution of soil information in farmland, fuzzy c-means clustering was applied to divide the farmland into several areas, where the soil fertility of each area is nearly uniform. Then the crop growth information in each area could be monitored with complete coverage by deploying a sensor node there, which could greatly reduce the number of deployed sensor nodes. Moreover, in order to accurately judge the optimal cluster number of fuzzy c-means clustering, a discriminant function for the Normalized Intra-Cluster Coefficient of Variation (NICCV) was established. The sensitivity analysis indicates that NICCV is insensitive to the fuzzy weighting exponent, but it shows a strong sensitivity to the number of clusters.
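As an illustration of the clustering step, the sketch below implements the textbook fuzzy c-means update rules in plain Python. It is an assumption-laden example rather than the authors' code: the function name, the sample coordinates, and the default fuzzy weighting exponent `m = 2.0` are all hypothetical, and the NICCV criterion is not implemented because the abstract does not define it.

```python
import math
import random


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Textbook fuzzy c-means; returns (centers, memberships)."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Start from random memberships; each row is normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(n_iter):
        # Centers are means of the points weighted by membership ** m.
        centers = []
        for i in range(n_clusters):
            w = [u[k][i] ** m for k in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[k] * points[k][d] for k in range(n)) / total
                 for d in range(dim)]
            )
        # Memberships are proportional to distance ** (-2 / (m - 1)).
        shift = 0.0
        for k in range(n):
            inv = [max(math.dist(points[k], c), 1e-12) ** (-2.0 / (m - 1.0))
                   for c in centers]
            total = sum(inv)
            new_row = [v / total for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, u[k])))
            u[k] = new_row
        if shift < tol:  # stop once memberships have stabilised
            break
    return centers, u


# Hypothetical two-feature soil samples forming two well-separated groups.
samples = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
           (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, memberships = fuzzy_c_means(samples, n_clusters=2)
# Hard zone label per sample: the cluster with the largest membership.
labels = [row.index(max(row)) for row in memberships]
print(labels)
```

In a deployment setting of the kind the abstract describes, each resulting zone would receive one sensor node; choosing the number of zones would require a validity index such as the NICCV, which is outside this sketch.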
The Nature of Red-Sequence Cluster Spiral Galaxies
NASA Astrophysics Data System (ADS)
Kashur, Lane; Barkhouse, Wayne; Sultanova, Madina; Kalawila Vithanage, Sandanuwa; Archer, Haylee; Foote, Gregory; Mathew, Elijah; Rude, Cody; Lopez-Cruz, Omar
2017-01-01
Preliminary analysis of the red-sequence galaxy population from a sample of 57 low-redshift galaxy clusters observed using the KPNO 0.9m telescope and 74 clusters from the WINGS dataset indicates that a small fraction of red-sequence galaxies have a morphology consistent with spiral systems. For spiral galaxies to acquire the color of elliptical/S0s at a similar luminosity, they must either have been stripped of their star-forming gas at an earlier epoch, or contain a larger than normal fraction of dust. To test these ideas we have compiled a sample of red-sequence spiral galaxies and examined their infrared properties as measured by 2MASS, WISE, Spitzer, and Herschel. These IR data allow us to estimate the amount of dust in each of our red-sequence spiral galaxies. We compare the estimated dust mass in each of these red-sequence late-type galaxies with spiral galaxies located in the same cluster field but having colors inconsistent with the red-sequence. We thus provide a statistical measure to discriminate between purely passive spiral galaxy evolution and dusty spirals to explain the presence of these late-type systems in cluster red-sequences.
Sex Discrimination in Professional Employment: A Case Study.
ERIC Educational Resources Information Center
Osterman, Paul
1979-01-01
A study analyzed sex discrimination with data on over 700 professional employees in a metropolitan publishing firm. It was found that the sex differential in earnings within clusters of similar jobs was much greater if marriage and children variables were excluded: men received a large "payoff" from being married and having children. (JH)
NASA Astrophysics Data System (ADS)
Ness, M.; Rix, H.-W.; Hogg, David W.; Casey, A. R.; Holtzman, J.; Fouesneau, M.; Zasowski, G.; Geisler, D.; Shetrone, M.; Minniti, D.; Frinchaboy, Peter M.; Roman-Lopes, Alexandre
2018-02-01
We explore to what extent stars within Galactic disk open clusters resemble each other in the high-dimensional space of their photospheric element abundances and contrast this with pairs of field stars. Our analysis is based on abundances for 20 elements, homogeneously derived from APOGEE spectra (with carefully quantified uncertainties of typically 0.03 dex). We consider 90 red giant stars in seven open clusters and find that most stars within a cluster have abundances in most elements that are indistinguishable (in a χ²-sense) from those of the other members, as expected for stellar birth siblings. An analogous analysis among pairs of > 1000 field stars shows that highly significant abundance differences in the 20-dimensional space can be established for the vast majority of these pairs, and that the APOGEE-based abundance measurements have high discriminating power. However, pairs of field stars whose abundances are indistinguishable even at 0.03 dex precision exist: ∼0.3% of all field star pairs and ∼1.0% of field star pairs at the same (solar) metallicity [Fe/H] = 0 ± 0.02. Most of these pairs are presumably not birth siblings from the same cluster, but rather doppelgängers. Our analysis implies that “chemical tagging” in the strict sense, identifying birth siblings for typical disk stars through their abundance similarity alone, will not work with such data. However, our approach shows that abundances have extremely valuable information for probabilistic chemo-orbital modeling, and combined with velocities, we have identified new cluster members from the field.
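To make the idea of abundances being indistinguishable in a χ² sense concrete, the sketch below computes one common form of the statistic: the squared abundance difference in each element divided by the summed variances of the two measurements, added over all elements. This form is an assumption for illustration and is not taken from the paper; the function name and the abundance values are hypothetical, with only the 20 elements and the 0.03 dex uncertainty drawn from the abstract.

```python
def chi2_distance(a, b, sigma_a, sigma_b):
    """Sum over elements of (a_i - b_i)^2 / (sigma_a_i^2 + sigma_b_i^2)."""
    return sum(
        (x - y) ** 2 / (sx ** 2 + sy ** 2)
        for x, y, sx, sy in zip(a, b, sigma_a, sigma_b)
    )


n_elements = 20               # number of abundances, as in the abstract
sigma = [0.03] * n_elements   # typical uncertainty in dex, as in the abstract

# Hypothetical pair of stars offset by exactly one sigma in every element.
star_a = [0.00] * n_elements
star_b = [0.03] * n_elements

chi2 = chi2_distance(star_a, star_b, sigma, sigma)
# Each element contributes 0.5, so 20 elements give a total of 10.
print(chi2)
```

For two stars that truly share the same abundances and differ only by measurement noise, a statistic of this form is expected to be close to the number of elements, so much larger values point to a real abundance difference.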
Catherine, Faget-Agius; Aurélie, Vincenti; Eric, Guedj; Pierre, Michel; Raphaëlle, Richieri; Marine, Alessandrini; Pascal, Auquier; Christophe, Lançon; Laurent, Boyer
2017-12-30
This study aims to define functioning levels of patients with schizophrenia by using a method of interpretable clustering based on a specific functioning scale, the Functional Remission Of General Schizophrenia (FROGS) scale, and to test their validity regarding clinical and neuroimaging characterization. In this observational study, patients with schizophrenia have been classified using a hierarchical top-down method called clustering using unsupervised binary trees (CUBT). Socio-demographic, clinical, and neuroimaging SPECT perfusion data were compared between the different clusters to ensure their clinical relevance. A total of 242 patients were analyzed. A four-group functioning level structure has been identified: 54 are classified as "minimal", 81 as "low", 64 as "moderate", and 43 as "high". The clustering shows satisfactory statistical properties, including reproducibility and discriminancy. The 4 clusters consistently differentiate patients. "High" functioning level patients reported significantly the lowest scores on the PANSS and the CDSS, and the highest scores on the GAF, the MARS and S-QoL 18. Functioning levels were significantly associated with cerebral perfusion of two relevant areas: the left inferior parietal cortex and the anterior cingulate. Our study provides relevant functioning levels in schizophrenia, and may enhance the use of functioning scale. Copyright © 2017 Elsevier B.V. All rights reserved.
Identifying the ideal profile of French yogurts for different clusters of consumers.
Masson, M; Saint-Eve, A; Delarue, J; Blumenthal, D
2016-05-01
Identifying the sensory properties that affect consumer preferences for food products is an important feature of product development. Different methods, such as external preference mapping or partial least squares regression, are used to establish relationships between sensory data and consumer preferences and to identify sensory attributes that drive consumer preferences, by highlighting optimum products. Plain French yogurts were evaluated by a sensory profiling method performed by 12 trained judges. In parallel, 180 consumers were asked to score their overall liking and complete a cognitive restraint questionnaire. After hierarchical cluster analysis on the liking scores, preference mapping using a quadratic regression model was performed. Five clusters of consumers were identified as a function of different preference patterns. Contrary to our expectations, fat levels were not discriminating. For each cluster, the results of preference mapping enabled the identification of optimum products. A comparison of the 5 sensory profiles revealed numerous differences between key sensory attributes. For example, one consumer cluster had a strong preference for products perceived as very thick, grainy, but with a less flowing texture, less sticky, whey presence and color, in contrast to other clusters. In addition, each segment of consumers was characterized according to the results of the cognitive restraint questionnaire. Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Morphological estimators on Sunyaev-Zel'dovich maps of MUSIC clusters of galaxies
NASA Astrophysics Data System (ADS)
Cialone, Giammarco; De Petris, Marco; Sembolini, Federico; Yepes, Gustavo; Baldi, Anna Silvia; Rasia, Elena
2018-06-01
The determination of the morphology of galaxy clusters has important repercussions for cosmological and astrophysical studies of them. In this paper, we address the morphological characterization of synthetic maps of the Sunyaev-Zel'dovich (SZ) effect for a sample of 258 massive clusters (Mvir > 5 × 10¹⁴ h⁻¹ M⊙ at z = 0), extracted from the MUSIC hydrodynamical simulations. Specifically, we use five known morphological parameters (which are already used in X-ray) and two newly introduced ones, and we combine them in a single parameter. We analyse two sets of simulations obtained with different prescriptions of the gas physics (non-radiative and with cooling, star formation and stellar feedback) at four redshifts between 0.43 and 0.82. For each parameter, we test its stability and efficiency in discriminating the true cluster dynamical state, measured by theoretical indicators. The combined parameter is more efficient at discriminating between relaxed and disturbed clusters. This parameter had a mild correlation with the hydrostatic mass (∼0.3) and a strong correlation (∼0.8) with the offset between the SZ centroid and the cluster centre of mass. The latter quantity is, thus, the most accessible and efficient indicator of the dynamical state for SZ studies.
Mothers of young children cluster into 4 groups based on psychographic food decision influencers.
Byrd-Bredbenner, Carol; Abbot, Jaclyn Maurer; Cussler, Ellen
2008-08-01
This study explored how mothers grouped into clusters according to multiple psychographic food decision influencers and how the clusters differed in nutrient intake and nutrient content of their household food supply. Mothers (n = 201) completed a survey assessing basic demographic characteristics, food shopping and meal preparation activities, self and spouse employment, exposure to formal food or nutrition education, education level and occupation, weight status, nutrition and food preparation knowledge and skill, family member health and nutrition status, food decision influencer constructs, and dietary intake. In addition, an in-home inventory of 100 participants' household food supplies was conducted. Four distinct clusters presented when 26 psychographic food choice influencers were evaluated. These clusters appear to be valid and robust classifications of mothers in that they discriminated well on the psychographic variables used to construct the clusters as well as numerous other variables not used in the cluster analysis. In addition, the clusters appear to transcend demographic variables that often segment audiences (eg, race, mother's age, socioeconomic status), thereby adding a new dimension to the way in which this audience can be characterized. Furthermore, psychographically defined clusters predicted dietary quality. This study demonstrates that mothers are not a homogenous group and need to have their unique characteristics taken into consideration when designing strategies to promote health. These results can help health practitioners better understand factors affecting food decisions and tailor interventions to better meet the needs of mothers.
Tchabo, William; Ma, Yongkun; Kwaw, Emmanuel; Zhang, Haining; Xiao, Lulu; Apaliya, Maurice T
2018-01-15
The four different methods of color measurement of wine proposed by Boulton, Giusti, Glories and the Commission Internationale de l'Eclairage (CIE) were applied to assess the statistical relationship between the phytochemical profile and chromatic characteristics of sulfur dioxide-free mulberry (Morus nigra) wine submitted to non-thermal maturation processes. The alteration in chromatic properties and phenolic composition of non-thermal aged mulberry wine was examined, aided by the use of Pearson correlation, cluster and principal component analysis. The results revealed a positive effect of non-thermal processes on phytochemical families of wines. From Pearson correlation analysis, relationships between chromatic indexes and flavonols as well as anthocyanins were established. Cluster analysis highlighted similarities between Boulton and Giusti parameters, as well as Glories and CIE parameters, in the assessment of chromatic properties of wines. Finally, principal component analysis was able to discriminate wines subjected to different maturation techniques on the basis of their chromatic and phenolic characteristics. Copyright © 2017. Published by Elsevier Ltd.
Application of Artificial Intelligence For Euler Solutions Clustering
NASA Astrophysics Data System (ADS)
Mikhailov, V.; Galdeano, A.; Diament, M.; Gvishiani, A.; Agayan, S.; Bogoutdinov, Sh.; Graeva, E.; Sailhac, P.
Results of Euler deconvolution strongly depend on the selection of viable solutions. Synthetic calculations using multiple causative sources show that Euler solutions cluster in the vicinity of causative bodies even when they do not group densely about the perimeter of the bodies. We have developed a clustering technique to serve as a tool for selecting appropriate solutions. The method RODIN, employed in this study, is based on artificial intelligence and was originally designed for problems of classification of large data sets. It is based on a geometrical approach to study object concentration in a finite metric space of any dimension. The method uses a formal definition of cluster and includes free parameters that facilitate the search for clusters of given properties. Tests on synthetic and real data showed that the clustering technique outlines causative bodies more accurately than other methods of discriminating Euler solutions. In complicated field cases such as the magnetic field in the Gulf of Saint Malo region (Brittany, France), the method provides geologically insightful solutions. Other advantages of the clustering method application are:
- Clusters provide solutions associated with particular bodies or parts of bodies, permitting the analysis of different clusters of Euler solutions separately. This may allow computation of average parameters for individual causative bodies.
- Those measurements of the anomalous field that yield clusters also form dense clusters themselves. The application of the clustering technique thus outlines areas where the influence of different causative sources is more prominent. This allows one to focus on areas for reinterpretation, using different window sizes, structural indices and so on.
AUTOMATIC DIRT TRAIL ANALYSIS IN DERMOSCOPY IMAGES
Cheng, Beibei; Stanley, R. Joe; Stoecker, William V.; Osterwise, Christopher T.P.; Stricklin, Sherea M.; Hinton, Kristen A.; Moss, Randy H.; Oliviero, Margaret; Rabinovitz, Harold S.
2011-01-01
Basal cell carcinoma (BCC) is the most common cancer in the U.S. Dermatoscopes are devices used by physicians to facilitate the early detection of these cancers based on the identification of skin lesion structures often specific to BCCs. One new lesion structure, referred to as dirt trails, has the appearance of dark gray, brown or black dots and clods of varying sizes distributed in elongated clusters with indistinct borders, often appearing as curvilinear trails. In this research, we explore a dirt trail detection and analysis algorithm for extracting, measuring, and characterizing dirt trails based on size, distribution, and color in dermoscopic skin lesion images. These dirt trails are then used to automatically discriminate BCC from benign skin lesions. For an experimental data set of 35 BCC images with dirt trails and 79 benign lesion images, a neural network-based classifier achieved a 0.902 area under a receiver operating characteristic curve using a leave-one-out approach, demonstrating the potential of dirt trails for BCC lesion discrimination. PMID:22233099
Fernandes, Telmo J R; Costa, Joana; Oliveira, M Beatriz P P; Mafra, Isabel
2017-09-01
This work aimed to exploit the use of DNA mini-barcodes combined with high resolution melting (HRM) for the authentication of gadoid species: Atlantic cod (Gadus morhua), Pacific cod (Gadus macrocephalus), Alaska pollock (Theragra chalcogramma) and saithe (Pollachius virens). Two DNA barcode regions, namely cytochrome c oxidase subunit I (COI) and cytochrome b (cytb), were analysed in silico to identify genetic variability among the four species and used, subsequently, to develop a real-time PCR method coupled with HRM analysis. The cytb mini-barcode enabled best discrimination of the target species with a high level of confidence (99.3%). The approach was applied successfully to identify gadoid species in 30 fish-containing foods, 30% of which were not as declared on the label. Herein, a novel approach for rapid, simple and cost-effective discrimination/clustering, as a tool to authenticate Gadidae fish species, according to their genetic relationship, is proposed. Copyright © 2017 Elsevier Ltd. All rights reserved.
Syazwan, AI; Rafee, B Mohd; Juahir, Hafizan; Azman, AZF; Nizar, AM; Izwyn, Z; Syahidatussyakirah, K; Muhaimin, AA; Yunos, MA Syafiq; Anita, AR; Hanafiah, J Muhamad; Shaharuddin, MS; Ibthisham, A Mohd; Hasmadi, I Mohd; Azhar, MN Mohamad; Azizan, HS; Zulfadhli, I; Othman, J; Rozalini, M; Kamarul, FT
2012-01-01
Purpose: To analyze and characterize a multidisciplinary, integrated indoor air quality checklist for evaluating the health risk of building occupants in a nonindustrial workplace setting. Design: A cross-sectional study based on a participatory occupational health program conducted by the National Institute of Occupational Safety and Health (Malaysia) and Universiti Putra Malaysia. Method: A modified version of the indoor environmental checklist published by the Department of Occupational Health and Safety, based on the literature and discussion with occupational health and safety professionals, was used in the evaluation process. Summated scores were given according to the cluster analysis and principal component analysis in the characterization of risk. Environmetric techniques were used to classify the risk of variables in the checklist. Identification of the possible source of item pollutants was also evaluated from a semiquantitative approach. Result: Hierarchical agglomerative cluster analysis resulted in the grouping of factorial components into three clusters (high complaint, moderate-high complaint, moderate complaint), which were further analyzed by discriminant analysis. From this, 15 major variables that influence indoor air quality were determined. Principal component analysis of each cluster revealed that the main factors influencing the high complaint group were fungal-related problems, chemical indoor dispersion, detergent, renovation, thermal comfort, and location of fresh air intake. The moderate-high complaint group showed significant high loading on ventilation, air filters, and smoking-related activities. The moderate complaint group showed high loading on dampness, odor, and thermal comfort. Conclusion: This semiquantitative assessment, which graded risk from low to high based on the intensity of the problem, shows promising and reliable results. It should be used as an important tool in the preliminary assessment of indoor air quality and as a categorizing method for further IAQ investigations and complaints procedures. PMID:23055779
Quality evaluation of yellow peach chips prepared by explosion puffing drying.
Lyu, Jian; Zhou, Lin-Yan; Bi, Jin-Feng; Liu, Xuan; Wu, Xin-Ye
2015-12-01
Nineteen evaluation indicators in 15 yellow peach chips prepared by explosion puffing drying were analyzed, including color, rehydration ratio, texture, and so on. Principal component analysis (PCA), analytic hierarchy process (AHP), K-means cluster (KC) analysis and discriminant analysis (DA) were used to analyze the comprehensive quality of the yellow peach chips. The coefficient of variation of the 19 evaluation indicators varied from 3.58 to 852.89 %, suggesting significant differences among yellow peach cultivars. The characteristic evaluation indicators, namely, reducing sugar content, out-put ratio, water content, a value and L value, were identified by PCA, and their weights 0.0429, 0.1140, 0.4816, 1.1807 and 0.1807 were obtained by AHP. The levels of the 15 cultivars were effectively classified by discriminant functions obtained by KC and DA. The results suggested that the yellow peach chips could be divided into three levels of comprehensive quality; the highest synthesis score was observed in "senggelin" (11.1037), while the lowest was found in "goldbaby" (-3.7600).
Fang, Guihua; Goh, Jing Yeen; Tay, Manjun; Lau, Hiu Fung; Li, Sam Fong Yau
2013-06-01
The correct identification of oils and fats is important to consumers from both commercial and health perspectives. Proton nuclear magnetic resonance (¹H NMR) spectroscopy, gas chromatography-mass spectrometry (GC/MS) fingerprinting and chemometrics were employed successfully for the quality control of oils and fats. Principal component analysis (PCA) of both techniques showed group clustering of 14 types of oils and fats. Partial least squares discriminant analysis (PLS-DA) and orthogonal projections to latent structures discriminant analysis (OPLS-DA) using GC/MS data had excellent classification sensitivity and specificity compared to models using NMR data. Depending on the availability of the instruments, data from either technique can effectively be applied for the establishment of an oils and fats database to identify unknown samples. Partial least squares (PLS) models were successfully established for the detection of as low as 5% of lard and beef tallow spiked into canola oil, thus illustrating possible applications in Islamic and Jewish countries. Copyright © 2012 Elsevier Ltd. All rights reserved.
Video based object representation and classification using multiple covariance matrices.
Zhang, Yurong; Liu, Quan
2017-01-01
Video-based object recognition and classification has been widely studied in the computer vision and image processing areas. One main issue of this task is to develop an effective representation for video. This problem can generally be formulated as image set representation. In this paper, we present a new method called Multiple Covariance Discriminative Learning (MCDL) for the image set representation and classification problem. The core idea of MCDL is to represent an image set using multiple covariance matrices, with each covariance matrix representing one cluster of images. First, we use the Nonnegative Matrix Factorization (NMF) method to cluster the images within each image set, and then adopt Covariance Discriminative Learning on each cluster (subset) of images. Finally, we adopt KLDA and a nearest neighbor classification method for image set classification. Promising experimental results on several datasets show the effectiveness of our MCDL method.
Wall, Martin; Casswell, Sally
2017-05-01
The aim was to identify a typology of drinkers in New Zealand based on alcohol consumption, beverage choice, and public versus private drinking locations and investigate the relationship between drinker types, harms experienced, and policy-related variables. Model-based cluster analysis of male and female drinkers including volumes of alcohol consumed in the form of beer, wine, spirits, and ready-to-drinks (RTDs) in off- and on-premise settings. Cluster membership was then related to harm measures: alcohol dependence, self-rated health; and to 3 policy-relevant variables: liking for alcohol adverts, price paid for alcohol, and time of purchase. Males and females were analyzed separately. Men fell into 4 and women into 14 clearly discriminated clusters. The male clusters consumed a relatively high proportion of alcohol in the form of beer. Women had a number of small extreme clusters and some consumed mainly spirits-based RTDs, while others drank mainly wine. Those in the higher consuming clusters were more likely to have signs of alcohol dependency, to report lower satisfaction with their health, to like alcohol ads, and to have purchased late at night. Consumption patterns are sufficiently distinctive to identify typologies of male and female alcohol consumers. Women drinkers are more heterogeneous than men. The clusters relate differently to policy-related variables. Copyright © 2017 by the Research Society on Alcoholism.
Gender-related dimensions of childhood adversities in the general population.
Coêlho, Bruno M; Santana, Geilson L; Viana, Maria C; Andrade, Laura H; Wang, Yuan-Pang
2018-06-11
Childhood adversities (CAs) comprise a group of negative experiences individuals may suffer in their lifetimes. The goal of the present study was to investigate the cluster discrimination of CAs through psychometric determination of the common attributes of such experiences for men and women. Parental mental illness, substance misuse, criminality, death, divorce, other parental loss, family violence, physical abuse, sexual abuse, neglect, physical illness, and economic adversity were assessed in a general-population sample (n=5,037). Exploratory and confirmatory factor analysis determined gender-related dimensions of CA. The contribution of each individual adversity was explored through Rasch analysis. Adversities were reported by 53.6% of the sample. A three-factor model of CA dimensions fit the data better for men, and a two-factor model for women. For both genders, the dimension of family maladjustment - encompassing physical abuse, neglect, parental mental disorders, and family violence - was the core cluster of CAs. Women endorsed more CAs than men. Rasch analysis found that sexual abuse, physical illness, parental criminal behavior, parental divorce, and economic adversity were difficult to report in face-to-face interviews. CAs embrace sensitive personal information, clustering of which differed by gender. Acknowledging CAs may have an impact on medical and psychiatric outcomes in adulthood.
A cluster pattern algorithm for the analysis of multiparametric cell assays.
Kaufman, Menachem; Bloch, David; Zurgil, Naomi; Shafran, Yana; Deutsch, Mordechai
2005-09-01
The issue of multiparametric analysis of complex single cell assays of both static and flow cytometry (SC and FC, respectively) has become common in recent years. In such assays, the analysis of changes, applying common statistical parameters and tests, often fails to detect significant differences between the investigated samples. The cluster pattern similarity (CPS) measure between two sets of gated clusters is based on computing the difference between their density distribution functions' set points. The CPS was applied for the discrimination between two observations in a four-dimensional parameter space. The similarity coefficient (r) ranges from 0 (perfect similarity) to 1 (dissimilar). Three CPS validation tests were carried out: on the same stock samples of fluorescent beads, yielding very low r's (0, 0.066); and on two cell models: mitogenic stimulation of peripheral blood mononuclear cells (PBMC), and apoptosis induction in the Jurkat T cell line by H2O2. In both latter cases, r indicated similarity (r < 0.23) within the same group, and dissimilarity (r > 0.48) otherwise. This classification and algorithm approach offers a measure of similarity between samples. It relies on the multidimensional pattern of the sample parameters. The algorithm compensates for environmental drifts in this apparatus and assay; it also may be applied to more than four dimensions.
Ebrahimzadeh, Farzad; Hajizadeh, Ebrahim; Vahabi, Nasim; Almasian, Mohammad; Bakhteyar, Katayoon
2015-01-01
Background: Unwanted pregnancy not intended by at least one of the parents has undesirable consequences for the family and the society. In the present study, three classification models were used and compared to predict unwanted pregnancies in an urban population. Methods: In this cross-sectional study, 887 pregnant mothers referring to health centers in Khorramabad, Iran, in 2012 were selected by the stratified and cluster sampling; relevant variables were measured and for prediction of unwanted pregnancy, logistic regression, discriminant analysis, and probit regression models and SPSS software version 21 were used. To compare these models, indicators such as sensitivity, specificity, the area under the ROC curve, and the percentage of correct predictions were used. Results: The prevalence of unwanted pregnancies was 25.3%. The logistic and probit regression models indicated that parity and pregnancy spacing, contraceptive methods, household income and number of living male children were related to unwanted pregnancy. The performance of the models based on the area under the ROC curve was 0.735, 0.733, and 0.680 for logistic regression, probit regression, and linear discriminant analysis, respectively. Conclusion: Given the relatively high prevalence of unwanted pregnancies in Khorramabad, it seems necessary to revise family planning programs. Despite the similar accuracy of the models, if the researcher is interested in the interpretability of the results, the use of the logistic regression model is recommended. PMID:26793655
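The comparison indicators named in this abstract (sensitivity, specificity, percentage of correct predictions, and area under the ROC curve) can be sketched in a few lines. This is a minimal illustration, not the study's SPSS procedure: the function names, the 0.5 cut-off, and the toy scores are all assumptions made for the example.

```python
def confusion_rates(y_true, y_pred):
    """Return (sensitivity, specificity, accuracy) for binary labels coded 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    sensitivity = tp / (tp + fn)          # share of true positives detected
    specificity = tn / (tn + fp)          # share of true negatives detected
    accuracy = (tp + tn) / len(y_true)    # share of correct predictions
    return sensitivity, specificity, accuracy


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical predicted probabilities from one fitted model (not study data).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.5, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 cut-off

sens, spec, acc = confusion_rates(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

Running the same two functions on the predicted probabilities of each candidate model gives directly comparable figures, which is the kind of side-by-side comparison the abstract reports.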
Silva, D M; Siqueira, M V B M; Carrasco, N F; Mantello, C C; Nascimento, W F; Veasey, E A
2016-05-23
Dioscorea is the largest genus in the Dioscoreaceae family, and includes a number of economically important species including the air yam, D. bulbifera L. This study aimed to develop new single sequence repeat primers and characterize the genetic diversity of local varieties that originated in several municipalities of Brazil. We developed an enriched genomic library for D. bulbifera resulting in seven primers, six of which were polymorphic, and added four polymorphic loci developed for other Dioscorea species. This resulted in 10 polymorphic primers to evaluate 42 air yam accessions. Thirty-three alleles (bands) were found, with an average of 3.3 alleles per locus. The discrimination power ranged from 0.113 to 0.834, with an average of 0.595. Both principal coordinate and cluster analyses (using the Jaccard Index) failed to clearly separate the accessions according to their origins. However, the 13 accessions from Conceição dos Ouros, Minas Gerais State were clustered above zero on the principal coordinate 2 axis, and were also clustered into one subgroup in the cluster analysis. Accessions from Ubatuba, São Paulo State were clustered below zero on the same principal coordinate 2 axis, except for one accession, although they were scattered in several subgroups in the cluster analysis. Therefore, we found little spatial structure in the accessions, although those from Conceição dos Ouros and Ubatuba exhibited some spatial structure, and that there is a considerable level of genetic diversity in D. bulbifera maintained by traditional farmers in Brazil.
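The Jaccard Index used for the cluster and principal coordinate analyses compares two accessions by the bands they share. The sketch below assumes each accession is coded as a presence/absence (1/0) vector over the scored bands; the accession names and band profiles are invented for illustration and are not data from the study.

```python
def jaccard_index(a, b):
    """Jaccard similarity of two equal-length 0/1 band profiles.

    Shared absences (0, 0) are ignored, as is usual for marker band data.
    """
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    # Assumption: two profiles with no bands at all are treated as identical.
    return shared / either if either else 1.0


def jaccard_distance_matrix(profiles):
    """Pairwise Jaccard distances (1 - similarity) keyed by accession name."""
    names = list(profiles)
    return {
        (p, q): 1.0 - jaccard_index(profiles[p], profiles[q])
        for p in names
        for q in names
    }


# Hypothetical band profiles for three accessions (six bands each).
profiles = {
    "acc_a": [1, 1, 0, 1, 0, 0],
    "acc_b": [1, 0, 0, 1, 1, 0],
    "acc_c": [0, 0, 1, 0, 0, 1],
}
distances = jaccard_distance_matrix(profiles)
```

A distance matrix of this form is the usual input to both hierarchical clustering and principal coordinate analysis.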
Clustering Gene Expression Regulators: New Approach to Disease Subtyping
Pyatnitskiy, Mikhail; Mazo, Ilya; Shkrob, Maria; Schwartz, Elena; Kotelnikova, Ekaterina
2014-01-01
One of the main challenges in modern medicine is to stratify different patient groups in terms of underlying disease molecular mechanisms as to develop more personalized approach to therapy. Here we propose novel method for disease subtyping based on analysis of activated expression regulators on a sample-by-sample basis. Our approach relies on Sub-Network Enrichment Analysis algorithm (SNEA) which identifies gene subnetworks with significant concordant changes in expression between two conditions. Subnetwork consists of central regulator and downstream genes connected by relations extracted from global literature-extracted regulation database. Regulators found in each patient separately are clustered together and assigned activity scores which are used for final patients grouping. We show that our approach performs well compared to other related methods and at the same time provides researchers with complementary level of understanding of pathway-level biology behind a disease by identification of significant expression regulators. We have observed the reasonable grouping of neuromuscular disorders (triggered by structural damage vs triggered by unknown mechanisms), that was not revealed using standard expression profile clustering. For another experiment we were able to suggest the clusters of regulators, responsible for colorectal carcinoma vs adenoma discrimination and identify frequently genetically changed regulators that could be of specific importance for the individual characteristics of cancer development. Proposed approach can be regarded as biologically meaningful feature selection, reducing tens of thousands of genes down to dozens of clusters of regulators. Obtained clusters of regulators make possible to generate valuable biological hypotheses about molecular mechanisms related to a clinical outcome for individual patient. PMID:24416320
Fernández-Arjona, María del Mar; Grondona, Jesús M.; Granados-Durán, Pablo; Fernández-Llebrez, Pedro; López-Ávalos, María D.
2017-01-01
It is known that microglia morphology and function are closely related, but only a few studies have objectively described different morphological subtypes. To address this issue, morphological parameters of microglial cells were analyzed in a rat model of aseptic neuroinflammation. After the injection of a single dose of the enzyme neuraminidase (NA) within the lateral ventricle (LV), an acute inflammatory process occurs. Sections from NA-injected animals and sham controls were immunolabeled with the microglial marker IBA1, which highlights ramifications and features of the cell shape. Using images obtained by section scanning, individual microglial cells were sampled from various regions (septofimbrial nucleus, hippocampus and hypothalamus) at different times post-injection (2, 4 and 12 h). Each cell yielded a set of 15 morphological parameters by means of image analysis software. Five initial parameters (including fractal measures) were statistically different in cells from NA-injected rats (most of them IL-1β positive, i.e., M1-state) compared to those from control animals (none of them IL-1β positive, i.e., surveillant state). However, additional multimodal parameters proved more suitable for hierarchical cluster analysis (HCA). This method classified the microglia population into four clusters. Furthermore, a linear discriminant analysis (LDA) suggested three specific parameters to objectively classify any microglia by a decision tree. In addition, a principal components analysis (PCA) revealed two extra valuable variables that allowed microglia to be further classified into a total of eight sub-clusters or types. The spatio-temporal distribution of these different morphotypes in our rat inflammation model allowed specific morphotypes to be related to microglial activation status and brain location. An objective method for microglia classification based on morphological parameters is proposed.
Main points: Microglia undergo a quantifiable morphological change upon neuraminidase-induced inflammation. Hierarchical cluster and principal components analysis allow morphological classification of microglia. Brain location of microglia is a relevant factor. PMID:28848398
Brief Report: Clusters and Trajectories Across the Autism and/or ADHD Spectrum.
LaBianca, S; Pagsberg, A K; Jakobsen, K D; Demur, A B; Bartalan, M; LaBianca, J; Werge, T
2018-06-07
Autism Spectrum Disorder (ASD) and Attention Deficit Hyperactivity Disorder (ADHD) frequently co-occur and show high genetic correlation. With the introduction of DSM-5, there is a new concept of an ASD and/or ADHD spectrum (ASD/ADHD). This study aimed to identify predictors of severity and need of healthcare within this spectrum. 39 families with multiple individuals affected by ASD/ADHD were recruited from a psychiatric clinic. Diagnoses, functional and demographic characteristics were retrieved from journals while hospital admissions were identified in the Danish health register. An estimated fraction of 31% ASD/ADHD patients had never been hospitalized and 35% remained undiagnosed despite hospitalization. Cluster analysis identified trajectories that discriminate age of diagnosis, educational attainment to degree of severity, need of hospitalization and genetic risk.
Pulley, Simon; Foster, Ian; Collins, Adrian L
2017-06-01
The objective classification of sediment source groups is at present an under-investigated aspect of source tracing studies, which has the potential to statistically improve discrimination between sediment sources and reduce uncertainty. This paper investigates this potential using three different source group classification schemes. The first classification scheme was simple surface and subsurface groupings (Scheme 1). The tracer signatures were then used in a two-step cluster analysis to identify the sediment source groupings naturally defined by the tracer signatures (Scheme 2). The cluster source groups were then modified by splitting each one into a surface and subsurface component to suit catchment management goals (Scheme 3). The schemes were tested using artificial mixtures of sediment source samples. Controlled corruptions were made to some of the mixtures to mimic the potential causes of tracer non-conservatism present when using tracers in natural fluvial environments. It was determined how accurately the known proportions of sediment sources in the mixtures were identified after unmixing modelling using the three classification schemes. The cluster analysis derived source groups (2) significantly increased tracer variability ratios (inter-/intra-source group variability) (up to 2122%, median 194%) compared to the surface and subsurface groupings (1). As a result, the composition of the artificial mixtures was identified an average of 9.8% more accurately on the 0-100% contribution scale. It was found that the cluster groups could be reclassified into a surface and subsurface component (3) with no significant increase in composite uncertainty (a 0.1% increase over Scheme 2). The far smaller effects of simulated tracer non-conservatism for the cluster analysis based schemes (2 and 3) were primarily attributed to the increased inter-group variability producing a far larger sediment source signal than the non-conservatism noise (1). Modified cluster analysis based classification methods have the potential to reduce composite uncertainty significantly in future source tracing studies. Copyright © 2016 Elsevier Ltd. All rights reserved.
Infant Discrimination of a Morphologically Relevant Word-Final Contrast
ERIC Educational Resources Information Center
Fais, Laurel; Kajikawa, Sachiyo; Amano, Shigeaki; Werker, Janet F.
2009-01-01
Six-, 12-, and 18-month-old English-hearing infants were tested on their ability to discriminate nonword forms ending in the final stop consonants /k/ and /t/ from their counterparts with final /s/ added, resulting in final clusters /ks/ and /ts/, in a habituation-dishabituation, looking time paradigm. Infants at all 3 ages demonstrated an ability…
Syed Abdul Mutalib, Sharifah Norsukhairin; Juahir, Hafizan; Azid, Azman; Mohd Sharif, Sharifah; Latif, Mohd Talib; Aris, Ahmad Zaharin; Zain, Sharifuddin M; Dominick, Doreena
2013-09-01
The objective of this study is to identify spatial and temporal patterns in the air quality at three selected Malaysian air monitoring stations based on an eleven-year database (January 2000-December 2010). Four statistical methods, Discriminant Analysis (DA), Hierarchical Agglomerative Cluster Analysis (HACA), Principal Component Analysis (PCA) and Artificial Neural Networks (ANNs), were selected to analyze the datasets of five air quality parameters, namely: SO2, NO2, O3, CO and particulate matter with a diameter size of below 10 μm (PM10). The three selected air monitoring stations share the characteristic of being located in highly urbanized areas and are surrounded by a number of industries. The DA results show that spatial characterizations allow successful discrimination between the three stations, while HACA shows the temporal pattern from the monthly and yearly factor analysis which correlates with severe haze episodes that have happened in this country at certain periods of time. The PCA results show that the major source of air pollution is mostly due to the combustion of fossil fuel in motor vehicles and industrial activities. The spatial pattern recognition (S-ANN) results show a better prediction performance in discriminating between the regions, with an excellent percentage of correct classification compared to DA. This study presents the necessity and usefulness of environmetric techniques for the interpretation of large datasets aiming to obtain better information about air quality patterns based on spatial and temporal characterizations at the selected air monitoring stations.
LIFE EVENTS AND SOMATOFORM DISORDERS
Chandrashekhar, C.R.; Reddy, Venkataswamy; Isaac, Mohan K.
1997-01-01
Presumptive Stressful Life Events Scale (PSLES) was administered to 69 physically ill, 23 patients with somatoform disorders and 45 patients with psychiatric disorders other than somatoform disorders who sought medical help in primary health care settings. The 137 patients were cluster analysed in order to obtain the patterns of distribution of 39 life events. Five clusters emerged. All the patients in cluster V had somatoform disorders and life events had a significant occurrence and discrimination. PMID:21584065
Ishii, Genichiro; Aoyagi, Kazuhiko; Sasaki, Hiroki; Ochiai, Atsushi
2015-01-01
Background: Fibroblasts are the principal stromal cells that exist in whole organs and play vital roles in many biological processes. Although the functional diversity of fibroblasts has been estimated, a comprehensive analysis of fibroblasts from the whole body has not been performed and their transcriptional diversity has not been sufficiently explored. The aim of this study was to elucidate the transcriptional diversity of human fibroblasts within the whole body. Methods: Global gene expression analysis was performed on 63 human primary fibroblasts from 13 organs. Of these, 32 fibroblasts from gastrointestinal organs (gastrointestinal fibroblasts: GIFs) were obtained from a pair of 2 anatomical sites: the submucosal layer (submucosal fibroblasts: SMFs) and the subperitoneal layer (subperitoneal fibroblasts: SPFs). Using hierarchical clustering analysis, we elucidated identifiable subgroups of fibroblasts and analyzed the transcriptional character of each subgroup. Results: In unsupervised clustering, 2 major clusters that separate GIFs and non-GIFs were observed. Organ- and anatomical site-dependent clusters within GIFs were also observed. The signature genes that discriminated GIFs from non-GIFs, SMFs from SPFs, and the fibroblasts of one organ from another organ consisted of genes associated with transcriptional regulation, signaling ligands, and extracellular matrix remodeling. Conclusions: GIFs are characteristic fibroblasts with specific gene expressions from transcriptional regulation, signaling ligands, and extracellular matrix remodeling related genes. In addition, the anatomical site- and organ-dependent diversity of GIFs was also discovered. These features of GIFs contribute to their specific physiological function and homeostatic maintenance, and create a functional diversity of the gastrointestinal tract. PMID:26046848
Discrimination of lichen genera and species using element concentrations
Bennett, James P.
2008-01-01
The importance of organic chemistry in the classification of lichens is well established, but inorganic chemistry has been largely overlooked. Six lichen species were studied over a period of 23 years that were growing in 11 protected areas of the northern Great Lakes ecoregion, which were not greatly influenced by anthropogenic particulates or gaseous air pollutants. The elemental data from these studies were aggregated in order to test the hypothesis that differences among species in tissue element concentrations were large enough to discriminate between taxa faithfully. Concentrations of 16 chemical elements that were found in tissue samples from Cladonia rangiferina, Evernia mesomorpha, Flavopunctelia flaventior, Hypogymnia physodes, Parmelia sulcata, and Punctelia rudecta were analyzed statistically using multivariate discriminant functions and CART analyses, as well as t-tests. Genera and species were clearly separated in element space, and elemental discriminant functions were able to classify 91-100% of the samples correctly into species. At the broadest level, a Zn concentration of 51 ppm in tissues of four of the lichen species effectively discriminated foliose from fruticose species. Similarly, a S concentration of 680 ppm discriminated C. rangiferina and E. mesomorpha, and a Ca concentration of 10 436 ppm discriminated H. physodes from P. sulcata. For the three parmelioid species, a Ca concentration >32 837 ppm discriminated Punctelia rudecta from the other two species, while a Zn concentration of 56 ppm discriminated Parmelia sulcata from F. flaventior. Foliose species also had higher concentrations than did fruticose species of all elements except Na. Elemental signatures for each of the six species were developed using standardized means. Twenty-four mechanisms explaining the differences among species are summarized.
Finally, the relationships of four species based on element concentrations, using additive-trees clustering of a Euclidean-distance matrix, produced identical relationships as did analyses based on secondary product chemistry that used additive-trees clustering of a Jaccard similarity matrix. At least for these six species, element composition has taxonomic significance, and may be useful for discriminating other taxa.
A new metaphor for projection-based visual analysis and data exploration
NASA Astrophysics Data System (ADS)
Schreck, Tobias; Panse, Christian
2007-01-01
In many important application domains such as Business and Finance, Process Monitoring, and Security, huge and quickly increasing volumes of complex data are collected. Strong efforts are underway developing automatic and interactive analysis tools for mining useful information from these data repositories. Many data analysis algorithms require an appropriate definition of similarity (or distance) between data instances to allow meaningful clustering, classification, and retrieval, among other analysis tasks. Projection-based data visualization is highly interesting (a) for visual discrimination analysis of a data set within a given similarity definition, and (b) for comparative analysis of similarity characteristics of a given data set represented by different similarity definitions. We introduce an intuitive and effective novel approach for projection-based similarity visualization for interactive discrimination analysis, data exploration, and visual evaluation of metric space effectiveness. The approach is based on the convex hull metaphor for visually aggregating sets of points in projected space, and it can be used with a variety of different projection techniques. The effectiveness of the approach is demonstrated by application on two well-known data sets. Statistical evidence supporting the validity of the hull metaphor is presented. We advocate the hull-based approach over the standard symbol-based approach to projection visualization, as it allows a more effective perception of similarity relationships and class distribution characteristics.
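The convex hull metaphor amounts to replacing the individual symbols of each class in the projected 2-D space with the polygon that encloses them. The sketch below uses Andrew's monotone chain algorithm as one standard way to compute such a hull; the abstract does not say which hull algorithm the authors used, and the point set is invented for illustration. The hull area is added only as one possible summary of class spread.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return hull vertices in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        # Drop points that would create a clockwise or collinear turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Shoelace formula for a simple polygon given as ordered vertices."""
    total = 0.0
    for i, (x1, y1) in enumerate(vertices):
        x2, y2 = vertices[(i + 1) % len(vertices)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Hypothetical projected coordinates of one class.
class_points = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1), (1, 0)]
hull = convex_hull(class_points)
area = polygon_area(hull)
```

Drawing one such polygon per class makes overlap between hulls a direct visual cue for how well a similarity definition separates the classes.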
NASA Astrophysics Data System (ADS)
Sybilska, Agnieszka; Łokas, Ewa Luiza; Fouquet, Sylvain
2017-03-01
We combine high-quality IFU data with a new set of numerical simulations to study low-mass early type galaxies (dEs) in dense environments. Our earlier study of dEs in the Virgo cluster has produced the first large-scale maps of kinematic and stellar population properties of dEs in those environments (Ryś et al. 2013, 2014, 2015). A quantitative discrimination between various (trans)formation processes proposed for these objects is, however, a complex issue, requiring a priori assumptions about the progenitors of galaxies we observe and study today. To bridge this gap between observations and theoretical predictions, we use the expertise gained in the IFU data analysis to look "through the eye of SAURON" at our new suite of high-resolution N-body simulations of dEs in the Virgo cluster. Mimicking the observer's perspective as closely as possible, we can also indicate the existing instrumental and viewer limitations regarding what we are/are not able to detect as observers.
Li, Yan; Rui, Xue; Li, Shuyu; Pu, Fang
2014-11-01
Graph theoretical analysis has recently become a popular research tool in neuroscience; however, there have been very few studies on brain responses to music perception, especially when culturally different styles of music are involved. Electroencephalograms were recorded from ten subjects listening to Chinese traditional music, light music and western classical music. For event-related potentials, phase coherence was calculated in the alpha band and then constructed into correlation matrices. Clustering coefficients and characteristic path lengths were evaluated for global properties, while clustering coefficients and efficiency were assessed for local network properties. Perception of light music and western classical music manifested small-world network properties, especially with a relatively low proportion of weights of correlation matrices. For local analysis, efficiency was more discernible than clustering coefficient. Nevertheless, there was no significant discrimination between Chinese traditional and western classical music perception. Perception of different styles of music introduces different network properties, both globally and locally. Research into both global and local network properties has been carried out in other areas; however, this is a preliminary investigation aimed at suggesting a possible new approach to brain network properties in music perception. Copyright © 2014 Elsevier Ltd. All rights reserved.
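The two global measures named here, the clustering coefficient and the characteristic path length, can be illustrated on a small network. The sketch assumes the coherence matrix is binarised by keeping edges at or above a threshold, which is one common convention and not necessarily the weighting scheme used in the study; the 4-channel matrix and the 0.5 threshold are invented.

```python
from collections import deque


def threshold_graph(matrix, threshold):
    """Binary undirected graph: connect i and j when matrix[i][j] >= threshold."""
    n = len(matrix)
    return {
        i: {j for j in range(n) if j != i and matrix[i][j] >= threshold}
        for i in range(n)
    }


def local_clustering(adj, node):
    """Fraction of a node's neighbour pairs that are themselves connected."""
    nbrs = list(adj[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(
        1
        for a in range(k)
        for b in range(a + 1, k)
        if nbrs[b] in adj[nbrs[a]]
    )
    return 2.0 * links / (k * (k - 1))


def mean_clustering(adj):
    """Network clustering coefficient as the mean of the local values."""
    return sum(local_clustering(adj, v) for v in adj) / len(adj)


def characteristic_path_length(adj):
    """Mean shortest-path length over all reachable ordered node pairs (BFS)."""
    total, pairs = 0, 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        for tgt, d in dist.items():
            if tgt != src:
                total += d
                pairs += 1
    return total / pairs if pairs else float("inf")


# Hypothetical 4-channel phase-coherence matrix (symmetric, unit diagonal).
coherence = [
    [1.0, 0.8, 0.7, 0.1],
    [0.8, 1.0, 0.9, 0.2],
    [0.7, 0.9, 1.0, 0.6],
    [0.1, 0.2, 0.6, 1.0],
]
graph = threshold_graph(coherence, 0.5)
clustering = mean_clustering(graph)
path_length = characteristic_path_length(graph)
```

A small-world network is usually characterised by clustering well above that of a comparable random graph together with a similarly short characteristic path length, so both values are normally compared against randomised reference networks.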
Feature Selection Using Information Gain for Improved Structural-Based Alert Correlation
Siraj, Maheyzah Md; Zainal, Anazida; Elshoush, Huwaida Tagelsir; Elhaj, Fatin
2016-01-01
Grouping and clustering alerts for intrusion detection based on the similarity of features is referred to as structural-based alert correlation and can discover a list of attack steps. Previous researchers selected different features and data sources manually based on their knowledge and experience, which led to less accurate identification of attack steps and inconsistent clustering accuracy. Furthermore, the existing alert correlation systems deal with a huge amount of data that contains null values, incomplete information, and irrelevant features, causing the analysis of the alerts to be tedious, time-consuming and error-prone. Therefore, this paper focuses on selecting accurate and significant features of alerts that are appropriate to represent the attack steps, thus enhancing the structural-based alert correlation model. A two-tier feature selection method is proposed to obtain the significant features. The first tier aims at ranking the subset of features based on high information gain entropy in decreasing order. The second tier extends additional features with a better discriminative ability than the initially ranked features. Performance analysis results show the significance of the selected features in terms of the clustering accuracy using the DARPA 2000 intrusion detection scenario-specific dataset. PMID:27893821
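The first tier described above ranks features by information gain in decreasing order. A minimal sketch of that ranking for categorical features follows; the feature names (`dst_port`, `sensor`), the attack-step labels, and the four toy alerts are hypothetical and are not taken from the DARPA dataset, and the second tier of the method is not covered.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy after splitting on a categorical feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(table, labels):
    """Return (feature name, gain) pairs sorted by gain, highest first."""
    gains = {name: information_gain(col, labels) for name, col in table.items()}
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)


# Hypothetical alert attributes and attack-step labels.
alerts = {
    "dst_port": ["22", "22", "80", "80"],
    "sensor": ["a", "b", "a", "b"],
}
attack_step = ["scan", "scan", "dos", "dos"]
ranking = rank_features(alerts, attack_step)
```

In this toy table `dst_port` separates the two attack steps perfectly (gain of 1 bit) while `sensor` carries no information (gain of 0), so a cut-off on the ranked list would keep the former and drop the latter.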
Temporal indiscriminateness: the case of cluster bombs.
Cavanaugh, T A
2010-03-01
This paper argues that current anti-personnel cluster bombs are temporally indiscriminate and, therefore, unjust weapons. The paper introduces and explains the idea of temporal indiscriminateness. It argues that to honor non-combatant immunity, in addition to not targeting civilians, one must adequately target combatants. Due to their high dud rate, cluster submunitions fail to target combatants with sufficient temporal accuracy and thereby result in avoidable serious harm to non-combatants. The paper concludes that non-combatant immunity and the principle of discrimination require a moratorium on the use of current cluster munitions.
Spyrakis, Francesca; Benedetti, Paolo; Decherchi, Sergio; Rocchia, Walter; Cavalli, Andrea; Alcaro, Stefano; Ortuso, Francesco; Baroni, Massimo; Cruciani, Gabriele
2015-10-26
The importance of taking into account protein flexibility in drug design and virtual ligand screening (VS) has been widely debated in the literature, and molecular dynamics (MD) has been recognized as one of the most powerful tools for investigating intrinsic protein dynamics. Nevertheless, deciphering the amount of information hidden in MD simulations and recognizing a significant minimal set of states to be used in virtual screening experiments can be quite complicated. Here we present an integrated MD-FLAP (molecular dynamics-fingerprints for ligand and proteins) approach, comprising a pipeline of molecular dynamics, clustering and linear discriminant analysis, for enhancing accuracy and efficacy in VS campaigns. We first extracted a limited number of representative structures from tens of nanoseconds of MD trajectories by means of the k-medoids clustering algorithm as implemented in the BiKi Life Science Suite ( http://www.bikitech.com [accessed July 21, 2015]). Then, instead of applying arbitrary selection criteria, that is, RMSD, pharmacophore properties, or enrichment performances, we allowed the linear discriminant analysis algorithm implemented in FLAP ( http://www.moldiscovery.com [accessed July 21, 2015]) to automatically choose the best performing conformational states among medoids and X-ray structures. Retrospective virtual screenings confirmed that ensemble receptor protocols outperform single rigid receptor approaches, proved that computationally generated conformations comprise the same quantity/quality of information included in X-ray structures, and pointed to the MD-FLAP approach as a valuable tool for improving VS performances.
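The step of extracting representative structures with k-medoids can be sketched on a precomputed pairwise distance matrix, for example pairwise RMSD between trajectory frames. This is a simple alternating (assign, then update) variant with a naive start from the first k frames. It is an assumption-laden illustration, not the implementation in the BiKi Life Science Suite, and the six "frames" are placed on a line purely to keep the distances easy to check.

```python
def assign(dist, medoids):
    """Assign every item to its nearest medoid; returns {medoid: [members]}."""
    clusters = {m: [] for m in medoids}
    for i in range(len(dist)):
        nearest = min(medoids, key=lambda m: dist[i][m])
        clusters[nearest].append(i)
    return clusters


def k_medoids(dist, k, max_iter=100):
    """Alternating k-medoids on a square distance matrix.

    Starts from the first k items (a naive, deterministic choice) and stops
    when the set of medoids no longer changes.
    """
    medoids = list(range(k))
    for _ in range(max_iter):
        clusters = assign(dist, medoids)
        updated = []
        for medoid, members in clusters.items():
            if not members:
                updated.append(medoid)  # keep a medoid that lost all members
                continue
            # New medoid: the member with the smallest total distance to the rest.
            best = min(members, key=lambda c: sum(dist[c][j] for j in members))
            updated.append(best)
        if set(updated) == set(medoids):
            break
        medoids = updated
    return sorted(medoids), assign(dist, medoids)


# Hypothetical frames placed on a line so distances are easy to verify.
positions = [0, 2, 1, 50, 52, 51]
dist = [[abs(a - b) for b in positions] for a in positions]
medoids, clusters = k_medoids(dist, 2)
```

Because a medoid is always an actual member of its cluster, each one corresponds to a real trajectory frame that can be used directly as a receptor conformation, which is what makes k-medoids a natural fit for this selection step.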
Saavedra, Milene T; Quon, Bradley S; Faino, Anna; Caceres, Silvia M; Poch, Katie R; Sanders, Linda A; Malcolm, Kenneth C; Nichols, David P; Sagel, Scott D; Taylor-Cousar, Jennifer L; Leach, Sonia M; Strand, Matthew; Nick, Jerry A
2018-05-01
Cystic fibrosis pulmonary exacerbations accelerate pulmonary decline and increase mortality. Previously, we identified a 10-gene leukocyte panel measured directly from whole blood, which indicates response to exacerbation treatment. We hypothesized that molecular characteristics of exacerbations could also predict future disease severity. We tested whether a 10-gene panel measured from whole blood could identify patient cohorts at increased risk for severe morbidity and mortality, beyond standard clinical measures. Transcript abundance for the 10-gene panel was measured from whole blood at the beginning of exacerbation treatment (n = 57). A hierarchical cluster analysis of subjects based on their gene expression was performed, yielding four molecular clusters. An analysis of cluster membership and outcomes incorporating an independent cohort (n = 21) was completed to evaluate robustness of cluster partitioning of genes to predict severe morbidity and mortality. The four molecular clusters were analyzed for differences in forced expiratory volume in 1 second, C-reactive protein, return to baseline forced expiratory volume in 1 second after treatment, time to next exacerbation, and time to morbidity or mortality events (defined as lung transplant referral, lung transplant, intensive care unit admission for respiratory insufficiency, or death). Clustering based on gene expression discriminated between patient groups with significant differences in forced expiratory volume in 1 second, admission frequency, and overall morbidity and mortality. At 5 years, all subjects in cluster 1 (very low risk) were alive and well, whereas 90% of subjects in cluster 4 (high risk) had suffered a major event (P = 0.0001). In multivariable analysis, the ability of gene expression to predict clinical outcomes remained significant, despite adjustment for forced expiratory volume in 1 second, sex, and admission frequency. 
The robustness of gene clustering to categorize patients appropriately in terms of clinical characteristics, and short- and long-term clinical outcomes, remained consistent, even when adding in a secondary population with significantly different clinical outcomes. Whole blood gene expression profiling allows molecular classification of acute pulmonary exacerbations, beyond standard clinical measures, providing a predictive tool for identifying subjects at increased risk for mortality and disease progression.
A Comparative Study of Land Cover Classification by Using Multispectral and Texture Data
Qadri, Salman; Khan, Dost Muhammad; Ahmad, Farooq; Qadri, Syed Furqan; Babar, Masroor Ellahi; Shahid, Muhammad; Ul-Rehman, Muzammil; Razzaq, Abdul; Shah Muhammad, Syed; Fahad, Muhammad; Ahmad, Sarfraz; Pervez, Muhammad Tariq; Naveed, Nasir; Aslam, Naeem; Jamil, Mutiullah; Rehmani, Ejaz Ahmad; Ahmad, Nazir; Akhtar Khan, Naeem
2016-01-01
The main objective of this study is to assess the importance of a machine vision approach for the classification of five types of land cover data: bare land, desert rangeland, green pasture, fertile cultivated land, and Sutlej river land. A novel spectra-statistical framework is designed to classify the subjective land cover data types accurately. Multispectral data of these land covers were acquired by using a handheld device named multispectral radiometer in the form of five spectral bands (blue, green, red, near infrared, and shortwave infrared) while texture data were acquired with a digital camera by the transformation of acquired images into 229 texture features for each image. The most discriminant 30 features of each image were obtained by integrating three statistical feature selection techniques: Fisher, Probability of Error plus Average Correlation, and Mutual Information (F + PA + MI). Selected texture data clustering was verified by nonlinear discriminant analysis while a linear discriminant analysis approach was applied for multispectral data. For classification, the texture and multispectral data were deployed to an artificial neural network (ANN: n-class). By implementing a cross validation method (80-20), we obtained an accuracy of 91.332% for texture data and 96.40% for multispectral data, respectively. PMID:27376088
Detection and monitoring of anaerobic rumen fungi using an ARISA method.
Denman, S E; Nicholson, M J; Brookman, J L; Theodorou, M K; McSweeney, C S
2008-12-01
To develop an automated ribosomal intergenic spacer region analysis (ARISA) method for the detection of anaerobic rumen fungi and also to demonstrate utility of the technique to monitor colonization and persistence of fungi, and diet-induced changes in community structure. The method could discriminate between three genera of anaerobic rumen fungal isolates, representing Orpinomyces, Piromyces and Neocallimastix species. Changes in anaerobic fungal composition were observed between animals fed a high-fibre diet compared with a grain-based diet. ARISA analysis of rumen samples from animals on grain showed a decrease in fungal diversity with a dominance of Orpinomyces and Piromyces spp. Clustering analysis of ARISA profile patterns grouped animals based on diet. A single strain of Orpinomyces was dosed into a cow and was detectable within the rumen fungal population for several weeks afterwards. The ARISA technique was capable of discriminating between pure cultures at the genus level. Diet composition has a significant influence on the diversity of anaerobic fungi in the rumen and the method can be used to monitor introduced strains. Through the use of ARISA analysis, a better understanding of the effect of diets on rumen anaerobic fungi populations is provided.
Arvanitoyannis, Ioannis S; Vlachos, Antonios
2007-01-01
The authenticity of products labeled as olive oils, and in particular as virgin olive oils, is a very important issue in terms of both health and commerce. In view of the continuously increasing interest in the therapeutic properties of virgin olive oil, the traditional methods of characterization and physical and sensory analysis were further enriched with more advanced and sophisticated methods such as HPLC-MS, HPLC-GC/C/IRMS, RPLC-GC, DEPT, and CSIA among others. The results of both traditional and "novel" methods were treated both by means of classical multivariate analysis (cluster, principal component, correspondence, canonical, and discriminant) and artificial intelligence methods showing that nowadays the adulteration of virgin olive oil with seed oil is detectable at very low percentages, sometimes even at less than 1%. Furthermore, the detection of geographical origin of olive oil is equally feasible and much more accurate in countries like Italy and Spain where databases of physical/chemical properties exist. However, this geographical origin classification can also be accomplished in the absence of such databases provided that an adequate number of oil samples are used and the parameters studied have "discriminating power."
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pang, Yuanjie, E-mail: yuanjie.p@gmail.com
Background: Natural and anthropogenic sources of metal exposure differ for urban and rural residents. We sought to identify patterns of metal mixtures which could suggest common environmental sources and/or metabolic pathways of different urinary metals, and compared metal-mixtures in two population-based studies from urban/sub-urban and rural/town areas in the US: the Multi-Ethnic Study of Atherosclerosis (MESA) and the Strong Heart Study (SHS). Methods: We studied a random sample of 308 White, Black, Chinese-American, and Hispanic participants in MESA (2000–2002) and 277 American Indian participants in SHS (1998–2003). We used principal component analysis (PCA), cluster analysis (CA), and linear discriminant analysis (LDA) to evaluate nine urinary metals (antimony [Sb], arsenic [As], cadmium [Cd], lead [Pb], molybdenum [Mo], selenium [Se], tungsten [W], uranium [U] and zinc [Zn]). For arsenic, we used the sum of inorganic and methylated species (∑As). Results: All nine urinary metals were higher in SHS compared to MESA participants. PCA and CA revealed the same patterns in SHS, suggesting 4 distinct principal components (PC) or clusters (∑As-U-W, Pb-Sb, Cd-Zn, Mo-Se). In MESA, CA showed 2 large clusters (∑As-Mo-Sb-U-W, Cd-Pb-Se-Zn), while PCA showed 4 PCs (Sb-U-W, Pb-Se-Zn, Cd-Mo, ∑As). LDA indicated that ∑As, U, W, and Zn were the most discriminant variables distinguishing MESA and SHS participants. Conclusions: In SHS, the ∑As-U-W cluster and PC might reflect groundwater contamination in rural areas, and the Cd-Zn cluster and PC could reflect common sources from meat products or metabolic interactions. Among the metals assayed, ∑As, U, W and Zn differed the most between MESA and SHS, possibly reflecting disproportionate exposure from drinking water and perhaps food in rural Native communities compared to urban communities around the US. - Highlights: • We identified and compared environmental sources of urinary metals in MESA and SHS.
• ∑As-U-W in SHS may reflect groundwater contamination in rural areas. • Cd-Zn in SHS may reflect common sources from meat products or metabolic interaction. • ∑As, U, W, and Zn differed the most between MESA and SHS participants.
A comparison of hair colour measurement by digital image analysis with reflective spectrophotometry.
Vaughn, Michelle R; van Oorschot, Roland A H; Baindur-Hudson, Swati
2009-01-10
While reflective spectrophotometry is an established method for measuring macroscopic hair colour, it can be cumbersome to use on a large number of individuals and not all reflective spectrophotometry instruments are easily portable. This study investigates the use of digital photographs to measure hair colour and compares its use to reflective spectrophotometry. An understanding of the accuracy of colour determination by these methods is of relevance when undertaking specific investigations, such as those on the genetics of hair colour. Measurements of hair colour may also be of assistance in cases where a photograph is the only evidence of hair colour available (e.g. surveillance). Using the CIE L*a*b* colour space, the hair colour of 134 individuals of European ancestry was measured by both reflective spectrophotometry and by digital image analysis (in V++). A moderate correlation was found along all three colour axes, with Pearson correlation coefficients of 0.625, 0.593 and 0.513 for L*, a* and b* respectively (p-values=0.000), with means being significantly overestimated by digital image analysis for all three colour components (by an average of 33.42, 3.38 and 8.00 for L*, a* and b* respectively). When using digital image data to group individuals into clusters previously determined by reflective spectrophotometric analysis using a discriminant analysis, individuals were classified into the correct clusters 85.8% of the time when there were two clusters. The percentage of cases correctly classified decreases as the number of clusters increases. It is concluded that, although more convenient, hair colour measurement from digital images has limited use in situations requiring accurate and consistent measurements.
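The abstract reports two separate things for each colour axis: a Pearson correlation and a mean overestimation. The sketch below, with made-up numbers rather than the study's measurements, shows why both are needed: a constant offset between two methods leaves the correlation untouched. The function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mean_offset(reference, other):
    """Average amount by which `other` exceeds `reference`."""
    return sum(b - a for a, b in zip(reference, other)) / len(reference)


# Invented illustration: the second method reads 10 units high on every sample.
spectro = [1.0, 2.0, 3.0, 4.0]
image = [11.0, 12.0, 13.0, 14.0]
r = pearson_r(spectro, image)
bias = mean_offset(spectro, image)
```

Here `r` is 1 even though every reading is biased by 10 units, so agreement between methods cannot be judged from the correlation alone.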
Shah, Yogendra; Maharjan, Bhagwan; Thapa, Jeewan; Poudel, Ajay; Diab, Hassan Mahmoud; Pandey, Basu Dev; Solo, Eddie S; Isoda, Norikazu; Suzuki, Yasuhiko; Nakajima, Chie
2017-10-01
Tuberculosis (TB) caused by Mycobacterium tuberculosis (MTB) poses a major public health problem in Nepal. Although it has been reported as one of the dominant genotypes of MTB in Nepal, little information on the Central Asian Strain (CAS) family is available, especially isolates related to multidrug resistance (MDR) cases. This study aimed to elucidate the genetic and epidemiological characteristics of MDR CAS isolates in Nepal. A total of 145 MDR CAS isolates collected in Nepal from 2008 to 2013 were characterized by spoligotyping, mycobacterial interspersed repetitive unit-variable number tandem repeat (MIRU-VNTR) analysis, and drug resistance-associated gene sequencing. Spoligotyping analysis showed CAS1_Delhi SIT26 as predominant (60/145, 41.4%). However, by combining spoligotyping and MIRU-VNTR typing, it was possible to successfully discriminate all 145 isolates into 116 different types including 18 clusters with 47 isolates (clustering rate 32.4%). About a half of these clustered isolates shared the same genetic and geographical characteristics with other isolates in each cluster, and some of them shared rare point mutations in rpoB that are thought to be associated with rifampicin resistance. Although the data obtained show little evidence that large outbreaks of MDR-TB caused by the CAS family have occurred in Nepal, they strongly suggest several MDR-MTB transmission cases. Copyright © 2017 The Author(s). Published by Elsevier Ltd.. All rights reserved.
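The percentages in this abstract can be checked directly from the reported counts. The short sketch below assumes the clustering rate is defined as clustered isolates divided by all isolates, which matches the reported 32.4%; the helper name `percent` is hypothetical.

```python
def percent(part, whole):
    """Share of `part` in `whole`, as a percentage."""
    return 100.0 * part / whole


total_isolates = 145
clustered_isolates = 47   # isolates that fall into the 18 clusters
clusters = 18
sit26_isolates = 60       # CAS1_Delhi SIT26

clustering_rate = percent(clustered_isolates, total_isolates)
sit26_share = percent(sit26_isolates, total_isolates)

# Each unclustered isolate is its own type, so the number of distinct types
# is the number of clusters plus the number of unique isolates.
distinct_types = clusters + (total_isolates - clustered_isolates)
```

The counts are internally consistent: 18 clusters plus 98 unique isolates gives the 116 types stated in the abstract.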
Ladunga, I
1992-04-01
The markedly nonuniform, even systematic distribution of sequences in the protein "universe" has been analyzed by methods of protein taxonomy. Mapping of the natural hierarchical system of proteins has revealed some dense cores, i.e., well-defined clusterings of proteins that seem to be natural structural groupings, possibly seeds for a future protein taxonomy. The aim was not to force proteins into more or less man-made categories by discriminant analysis, but to find structurally similar groups, possibly of common evolutionary origin. Single-valued distance measures between pairs of superfamilies from the Protein Identification Resource were defined by two chi-square-like methods on tripeptide frequencies and the variable-length subsequence identity method derived from dot-matrix comparisons. Distance matrices were processed by several methods of cluster analysis to detect phylogenetic continuum between highly divergent proteins. Only well-defined clusters characterized by relatively unique structural, intracellular environmental, organismal, and functional attribute states were selected as major protein groups, including subsets of viral and Escherichia coli proteins, hormones, inhibitors, plant, ribosomal, serum and structural proteins, amino acid synthases, and clusters dominated by certain oxidoreductases and apolar and DNA-associated enzymes. The limited repertoire of functional patterns due to small genome size, the high rate of recombination, specific features of the bacterial membranes, or of the virus cycle canalize certain proteins of viruses and Gram-negative bacteria, respectively, to organismal groups.
Boriollo, Marcelo Fabiano Gomes; Rosa, Edvaldo Antonio Ribeiro; Gonçalves, Reginaldo Bruno; Höfling, José Francisco
2006-03-01
The typing of C. albicans by MLEE (multilocus enzyme electrophoresis) is dependent on the interpretation of enzyme electrophoretic patterns, and the study of the epidemiological relationships of these yeasts can be conducted by cluster analysis. Therefore, the aims of the present study were to first determine the discriminatory power of genetic interpretation (deduction of the allelic composition of diploid organisms) and numerical interpretation (mere determination of the presence and absence of bands) of MLEE patterns, and then to determine the concordance (Pearson product-moment correlation coefficient) and similarity (Jaccard similarity coefficient) of the groups of strains generated by three cluster analysis models, and the discriminatory power of such models as well [model A: genetic interpretation, genetic distance matrix of Nei (d(ij)) and UPGMA dendrogram; model B: genetic interpretation, Dice similarity matrix (S(D1)) and UPGMA dendrogram; model C: numerical interpretation, Dice similarity matrix (S(D2)) and UPGMA dendrogram]. MLEE was found to be a powerful and reliable tool for the typing of C. albicans due to its high discriminatory power (>0.9). Discriminatory power indicated that numerical interpretation is a method capable of discriminating a greater number of strains (47 versus 43 subtypes), but also pointed to model B as a method capable of providing a greater number of groups, suggesting its use for the typing of C. albicans by MLEE and cluster analysis. Very good agreement was only observed between the elements of the matrices S(D1) and S(D2), but a large majority of the groups generated in the three UPGMA dendrograms showed similarity S(J) between 4.8% and 75%, suggesting disparities in the conclusions obtained by the cluster assays.
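The similarity coefficients named in this abstract are easy to state on band-presence data. The sketch below is illustrative only: it treats each strain as a set of observed bands, and it assumes that "discriminatory power" means Simpson's index of diversity in the Hunter and Gaston form, which the abstract does not state explicitly. All function names and the toy data are hypothetical.

```python
def dice(a, b):
    """Dice similarity of two band sets: 2|A and B| / (|A| + |B|)."""
    return 2 * len(a & b) / (len(a) + len(b))


def jaccard(a, b):
    """Jaccard similarity of two band sets: |A and B| / |A or B|."""
    return len(a & b) / len(a | b)


def discriminatory_index(type_sizes):
    """Probability that two strains drawn without replacement differ in type.

    Assumed form: D = 1 - sum(n_j * (n_j - 1)) / (N * (N - 1)),
    where n_j is the number of strains of type j and N the total.
    """
    n = sum(type_sizes)
    return 1 - sum(s * (s - 1) for s in type_sizes) / (n * (n - 1))


# Toy band patterns for two strains (band identifiers are arbitrary).
strain_a = {1, 2, 3}
strain_b = {2, 3, 4}
s_dice = dice(strain_a, strain_b)
s_jaccard = jaccard(strain_a, strain_b)

all_unique = discriminatory_index([1, 1, 1, 1])  # every strain its own type
two_pairs = discriminatory_index([2, 2])         # four strains, two types
```

For any pair of sets the two coefficients are linked by J = S / (2 - S), so Dice always gives the larger value for partially overlapping patterns.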
Cluster analysis reveals subclinical subgroups with shared autistic and schizotypal traits.
Ford, Talitha C; Apputhurai, Pragalathan; Meyer, Denny; Crewther, David P
2018-07-01
Autism and schizophrenia spectrum research is typically based on coarse diagnostic classification, which overlooks individual variation within clinical groups. This method limits the identification of underlying cognitive, genetic and neural correlates of specific symptom dimensions. This study, therefore, aimed to identify homogenous subclinical subgroups of specific autistic and schizotypal traits dimensions, that may be utilised to establish more effective diagnostic and treatment practices. Latent profile analysis of subscale scores derived from an autism-schizotypy questionnaire, completed by 1678 subclinical adults aged 18-40 years (1250 females), identified a local optimum of eight population clusters: High, Moderate and Low Psychosocial Difficulties; High, Moderate and Low Autism-Schizotypy; High Psychosis-Proneness; and Moderate Schizotypy. These subgroups represent the convergent and discriminant dimensions of autism and schizotypy in the subclinical population, and highlight the importance of examining subgroups of specific symptom characteristics across these spectra in order to identify the underlying genetic and neural correlates that can be utilised to advance diagnostic and treatment practices. Copyright © 2018 Elsevier B.V. All rights reserved.
Multivariate analysis in a genetic divergence study of Psidium guajava.
Nogueira, A M; Ferreira, M F S; Guilhen, J H S; Ferreira, A
2014-12-18
The family Myrtaceae is widespread in the Atlantic Forest and is well-represented in the Espírito Santo State in Brazil. In the genus Psidium of this family, guava (Psidium guajava L.) is the most economically important species. Guava is widely cultivated in tropical and subtropical countries; however, the widespread cultivation of only a small number of guava tree cultivars may cause the genetic vulnerability of this crop, making the search for promising genotypes in natural populations important for breeding programs and conservation. In this study, the genetic diversity of 66 guava trees sampled in the southern region of Espírito Santo and in Caparaó, MG, Brazil were evaluated. A total of 28 morphological descriptors (11 quantitative and 17 multicategorical) and 18 microsatellite markers were used. Principal component, discriminant and cluster analyses, descriptive analyses, and genetic diversity analyses using simple sequence repeats were performed. Discrimination of accessions using molecular markers resulted in clustering of genotypes of the same origin, which was not observed using morphological data. Genetic diversity was detected between and within the localities evaluated, regardless of the methodology used. Genetic differentiation among the populations using morphological and molecular data indicated the importance of the study area for species conservation, genetic erosion estimation, and exploitation in breeding programs.
Washio, Kana; Oka, Takashi; Abdalkader, Lamia; Muraoka, Michiko; Shimada, Akira; Oda, Megumi; Sato, Hiaki; Takata, Katsuyoshi; Kagami, Yoshitoyo; Shimizu, Norio; Kato, Seiichi; Kimura, Hiroshi; Nishizaki, Kazunori; Yoshino, Tadashi; Tsukahara, Hirokazu
2017-11-01
The human herpes virus, Epstein-Barr virus (EBV), is a known oncogenic virus and plays important roles in life-threatening T/NK-cell lymphoproliferative disorders (T/NK-cell LPD) such as hypersensitivity to mosquito bite (HMB), chronic active EBV infection (CAEBV), and NK/T-cell lymphoma/leukemia. During the clinical courses of HMB and CAEBV, patients frequently develop malignant lymphomas and the diseases passively progress sequentially. In the present study, gene expression of CD16(-)CD56(+), EBV(+) HMB, CAEBV, NK-lymphoma, and NK-leukemia cell lines, which were established from patients, was analyzed using oligonucleotide microarrays and compared to that of CD56(bright)CD16(dim/-) NK cells from healthy donors. Principal components analysis showed that CAEBV and NK-lymphoma cells were relatively closely located, indicating that they had similar expression profiles. Unsupervised hierarchical clustering analyses of microarray data and gene ontology analysis revealed specific gene clusters and identified several candidate genes responsible for disease that can be used to discriminate each category of NK-LPD and NK-cell lymphoma/leukemia.
Chen, Lin; Liu, Yuetao; Guo, Qingfeng; Zheng, Qingxia; Zhang, Wancun
2018-05-11
A systematic study on the metabolome differences between wild Ophiocordyceps sinensis and artificially cultured Cordyceps militaris was conducted using liquid chromatography-mass spectrometry. Principal component analysis and orthogonal projection on latent structure-discriminant analysis results showed that C. militaris grown on solid rice medium (R-CM) and C. militaris grown on tussah pupa (T-CM) clearly separated from each other and from wild O. sinensis, indicating metabolome differences among wild O. sinensis, R-CM and T-CM. The metabolome differences between R-CM and T-CM indicated that C. militaris could adapt to the culture medium by differential metabolic regulation. Hierarchical clustering analysis was further performed to cluster the differential metabolites and samples based on their metabolic similarity. The higher content of amino acids (pyroglutamic acid, glutamic acid, histidine, phenylalanine and arginine), unsaturated fatty acids (linolenic acid and linoleic acid), peptides, mannitol, adenosine and succinoadenosine in O. sinensis makes it an excellent choice as a traditional Chinese medicine for invigoration or nutritional supplementation. Its similar composition to O. sinensis and its easy cultivation make artificially cultured C. militaris a possible alternative to O. sinensis. Copyright © 2018 John Wiley & Sons, Ltd.
Nyarko, Esmond; Donnelly, Catherine
2015-03-01
Fourier transform infrared (FT-IR) spectroscopy was used to differentiate mixed strains of Listeria monocytogenes and mixed strains of L. monocytogenes and Listeria innocua. FT-IR spectroscopy was also applied to investigate the hypothesis that heat-injured and acid-injured cells would return to their original physiological integrity following repair. Thin smears of cells on infrared slides were prepared from cultures for mixed strains of L. monocytogenes, mixed strains of L. monocytogenes and L. innocua, and each individual strain. Heat-injured and acid-injured cells were prepared by exposing harvested cells of L. monocytogenes strain R2-764 to a temperature of 56 ± 0.2°C for 10 min or lactic acid at pH 3 for 60 min, respectively. Cellular repair involved incubating aliquots of acid-injured and heat-injured cells separately in Trypticase soy broth supplemented with 0.6% yeast extract for 22 to 24 h; bacterial thin smears on infrared slides were prepared for each treatment. Spectral collection was done using 250 scans at a resolution of 4 cm⁻¹ in the mid-infrared wavelength region. Application of multivariate discriminant analysis to the wavelength region from 1,800 to 900 cm⁻¹ separated the individual L. monocytogenes strains. Mixed strains of L. monocytogenes and L. monocytogenes cocultured with L. innocua were successfully differentiated from the individual strains when the discriminant analysis was applied. Different mixed strains of L. monocytogenes were also successfully separated when the discriminant analysis was applied. A data set for injury and repair analysis resulted in the separation of acid-injured, heat-injured, and intact cells; repaired cells clustered closer to intact cells when the discriminant analysis (1,800 to 600 cm⁻¹) was applied. FT-IR spectroscopy can be used for the rapid source tracking of L. monocytogenes strains because it can differentiate between different mixed strains and individual strains of the pathogen.
The Node Deployment of Intelligent Sensor Networks Based on the Spatial Difference of Farmland Soil
Liu, Naisen; Cao, Weixing; Zhu, Yan; Zhang, Jingchao; Pang, Fangrong; Ni, Jun
2015-01-01
Considering that agricultural production is characterized by vast areas, scattered fields and long crop growth cycles, intelligent wireless sensor networks (WSNs) are suitable for monitoring crop growth information. Cost and coverage are the key indexes for WSN applications. The differences in crop conditions are influenced by the spatial distribution of soil nutrients. If the nutrients are distributed evenly, the crop conditions are expected to be approximately uniform with little difference; otherwise, there will be great differences in crop conditions. In accordance with the differences in the spatial distribution of soil information in farmland, fuzzy c-means clustering was applied to divide the farmland into several areas, where the soil fertility of each area is nearly uniform. Then the crop growth information in each area could be monitored with complete coverage by deploying a sensor node there, which could greatly decrease the number of deployed sensor nodes. Moreover, in order to accurately judge the optimal cluster number of fuzzy c-means clustering, a discriminant function for Normalized Intra-Cluster Coefficient of Variation (NICCV) was established. The sensitivity analysis indicates that NICCV is insensitive to the fuzzy weighting exponent, but it shows a strong sensitivity to the number of clusters. PMID:26569243
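The partitioning step can be illustrated with a small fuzzy c-means example. This is a minimal one-dimensional sketch using the standard membership and centre updates, with invented soil values and hand-picked starting centres; it does not reproduce the paper's NICCV criterion, whose formula is not given in the abstract. The function names are hypothetical.

```python
def memberships(points, centers, m=2.0):
    """Fuzzy membership of every point in every cluster (each row sums to 1)."""
    power = 2.0 / (m - 1.0)
    rows = []
    for x in points:
        # Guard against a zero distance when a point sits exactly on a centre.
        d = [max(abs(x - c), 1e-12) for c in centers]
        rows.append([1.0 / sum((dj / dk) ** power for dk in d) for dj in d])
    return rows


def fuzzy_c_means(points, centers, m=2.0, n_iter=50):
    """1-D fuzzy c-means; `m` is the fuzzy weighting exponent (m > 1)."""
    centers = list(centers)
    for _ in range(n_iter):
        u = memberships(points, centers, m)
        # Each centre becomes the mean of the points weighted by membership^m.
        centers = [
            sum(u[i][j] ** m * x for i, x in enumerate(points))
            / sum(u[i][j] ** m for i in range(len(points)))
            for j in range(len(centers))
        ]
    return centers, memberships(points, centers, m)


# Invented soil-fertility readings forming two nearly uniform zones.
soil = [0.8, 1.0, 1.2, 4.8, 5.0, 5.2]
centers, u = fuzzy_c_means(soil, centers=[0.0, 6.0])
```

In a deployment like the one described, each resulting zone would receive one sensor node, so the number of clusters directly sets the node count.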
Individual bioaerosol particle discrimination by multi-photon excited fluorescence.
Kiselev, Denis; Bonacina, Luigi; Wolf, Jean-Pierre
2011-11-21
Femtosecond laser induced multi-photon excited fluorescence (MPEF) from individual airborne particles is tested for the first time for discriminating bioaerosols. The fluorescence spectra, analysed in 32 channels, exhibit a composite character originating from simultaneous two-photon and three-photon excitation at 790 nm. Simulants of bacteria aggregates (clusters of dyed polystyrene microspheres) and different pollen particles (Ragweed, Pecan, Mulberry) are clearly discriminated by their MPEF spectra. This demonstration experiment opens the way to more sophisticated spectroscopic schemes like pump-probe and coherent control. © 2011 Optical Society of America
A novel complex networks clustering algorithm based on the core influence of nodes.
Tong, Chao; Niu, Jianwei; Dai, Bin; Xie, Zhongyu
2014-01-01
In complex networks, cluster structure, identified by the heterogeneity of nodes, has become a common and important topological property. Network clustering methods are thus significant for the study of complex networks. Currently, many typical clustering algorithms have weaknesses such as inaccuracy and slow convergence. In this paper, we propose a clustering algorithm based on calculating the core influence of nodes. The clustering process is a simulation of the process of cluster formation in sociology. The algorithm detects the nodes with core influence through their betweenness centrality, and builds the cluster's core structure by discriminant functions. Next, the algorithm obtains the final cluster structure by clustering the rest of the nodes in the network with an optimization method. Experiments on different datasets show that the clustering accuracy of this algorithm is superior to that of the classical clustering algorithm (Fast-Newman algorithm). It clusters faster and plays a positive role in revealing the real cluster structure of complex networks precisely.
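Betweenness centrality, which the abstract uses to find nodes with core influence, can be computed with Brandes' algorithm. The sketch below covers only that first ingredient for an unweighted, undirected graph given as an adjacency dictionary; it is not the paper's clustering algorithm, and the graph and function name are hypothetical.

```python
from collections import deque


def betweenness(adj):
    """Unnormalised betweenness centrality via Brandes' algorithm.

    `adj` maps each node to a list of neighbours (undirected, unweighted).
    """
    cb = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        stack = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate pair dependencies in order of decreasing distance.
        delta = {v: 0.0 for v in adj}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                cb[w] += delta[w]
    # Every unordered pair was visited from both endpoints.
    return {v: c / 2 for v, c in cb.items()}


# Toy star graph: the hub lies on every shortest path between two leaves.
star = {"hub": ["a", "b", "c"], "a": ["hub"], "b": ["hub"], "c": ["hub"]}
scores = betweenness(star)
core = max(scores, key=scores.get)
```

A node with a high score sits on many shortest paths, which is one plausible reading of "core influence"; the discriminant functions that build cluster cores are not specified in the abstract and are not sketched here.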
Career paths in physicians' postgraduate training - an eight-year follow-up study.
Buddeberg-Fischer, Barbara; Stamm, Martina; Klaghofer, Richard
2010-10-06
To date, there are hardly any studies on the choice of career path in medical school graduates. The present study aimed to investigate what career paths can be identified in the course of postgraduate training of physicians; what factors have an influence on the choice of a career path; and in what way the career paths are correlated with career-related factors as well as with work-life balance aspirations. The data reported originates from five questionnaire surveys of the prospective SwissMedCareer Study, beginning in 2001 (T1, last year of medical school). The study sample consisted of 358 physicians (197 females, 55%; 161 males, 45%) participating at each assessment from T2 (2003, first year of residency) to T5 (2009, seventh year of residency), answering the question: What career do you aspire to have? Furthermore, personal characteristics, chosen specialty, career motivation, mentoring experience, work-life balance as well as workload, career success and career satisfaction were assessed. Career paths were analysed with cluster analysis, and differences between clusters analysed with multivariate methods. The cluster analysis revealed four career clusters which discriminated distinctly between each other: (1) career in practice, (2) hospital career, (3) academic career, and (4) changing career goal. From T3 (third year of residency) to T5, respondents in Cluster 1-3 were rather stable in terms of their career path aspirations, while those assigned to Cluster 4 showed a high fluctuation in their career plans. Physicians in Cluster 1 showed high values in extraprofessional concerns and often consider part-time work. Cluster 2 and 3 were characterised by high instrumentality, intrinsic and extrinsic career motivation, career orientation and high career success. No cluster differences were seen in career satisfaction. In Cluster 1 and 4, females were overrepresented. 
Trainees should be supported in staying on the career path that best suits their personal and professional profile. Attention should be paid to the subgroup of physicians in Cluster 4, who switch from one career goal to another in the course of their postgraduate training.
Selemetas, Nikolaos; de Waal, Theo
2015-04-30
Fasciolosis caused by Fasciola hepatica (liver fluke) can cause significant economic and production losses in dairy cow farms. The aim of the current study was to identify important weather and environmental predictors of the exposure risk to liver fluke by detecting clusters of fasciolosis in Ireland. During autumn 2012, bulk-tank milk samples from 4365 dairy farms were collected throughout Ireland. Using an in-house antibody-detection ELISA, the analysis of BTM samples showed that 83% (n=3602) of dairy farms had been exposed to liver fluke. The Getis-Ord Gi* statistic identified 74 high-risk and 130 low-risk significant (P<0.01) clusters of fasciolosis. The low-risk clusters were mostly located in the southern regions of Ireland, whereas the high-risk clusters were mainly situated in the western part. Several climatic variables (monthly and seasonal mean rainfall and temperatures, total wet days and rain days) and environmental datasets (soil types, enhanced vegetation index and normalised difference vegetation index) were used to investigate dissimilarities in the exposure to liver fluke between clusters. Rainfall, total wet days and rain days, and soil type were the significant classes of climatic and environmental variables explaining the differences between significant clusters. A discriminant function analysis was used to predict the exposure risk to liver fluke using 80% of data for modelling and the remaining subset of 20% for post hoc model validation. The most significant predictors of the model risk function were total rainfall in August and September and total wet days. The risk model presented 100% sensitivity and 91% specificity and an accuracy of 95% correctly classified cases. A risk map of exposure to liver fluke was constructed with higher probability of exposure in western and north-western regions. 
The results of this study identified differences between clusters of fasciolosis in Ireland regarding climatic and environmental variables and detected significant predictors of the exposure risk to liver fluke. Copyright © 2015 Elsevier B.V. All rights reserved.
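The sensitivity, specificity and accuracy figures quoted for the risk model follow from a two-class confusion matrix. A minimal Python sketch of that arithmetic is given below; the farm counts and the "high"/"low" labels are invented for illustration and are not the study's validation data.

```python
def classification_metrics(actual, predicted, positive="high"):
    """Return sensitivity, specificity and accuracy for a two-class problem."""
    tp = fp = tn = fn = 0
    for truth, guess in zip(actual, predicted):
        if truth == positive and guess == positive:
            tp += 1          # high-risk farm correctly flagged
        elif truth == positive:
            fn += 1          # high-risk farm missed
        elif guess == positive:
            fp += 1          # low-risk farm wrongly flagged
        else:
            tn += 1          # low-risk farm correctly cleared
    total = tp + fp + tn + fn
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
        "accuracy": (tp + tn) / total if total else float("nan"),
    }


# Hypothetical hold-out set: 10 high-risk and 11 low-risk farms (invented counts).
actual = ["high"] * 10 + ["low"] * 11
# Every high-risk farm is caught; one low-risk farm is wrongly flagged as high.
predicted = ["high"] * 10 + ["high"] + ["low"] * 10

metrics = classification_metrics(actual, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these invented counts the sketch reports a sensitivity of 1.000, a specificity of 0.909 and an accuracy of 0.952, which shows how a model can miss no high-risk case and still fall short of perfect accuracy.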
Chou, A; Burke, J
1999-05-01
DNA sequence clustering has become a valuable method in support of gene discovery and gene expression analysis. Our interest lies in leveraging the sequence diversity within clusters of expressed sequence tags (ESTs) to model gene structure for the study of gene variants that arise from, among other things, alternative mRNA splicing, polymorphism, and divergence after gene duplication, fusion, and translocation events. In previous work, CRAW was developed to discover gene variants from assembled clusters of ESTs. Most importantly, novel gene features (the differing units between gene variants, for example alternative exons, polymorphisms, transposable elements, etc.) that are specialized to tissue, disease, population, or developmental states can be identified when these tools collate DNA source information with gene variant discrimination. While the goal is complete automation of novel feature and gene variant detection, current methods are far from perfect, and hence the development of effective tools for visualization and exploratory data analysis is of paramount importance in the process of sifting through candidate genes and validating targets. We present CRAWview, a Java-based visualization extension to CRAW. Features that vary between gene forms are displayed using an automatically generated color-coded index. The reporting format of CRAWview gives a brief, high-level summary report to display overlap and divergence within clusters of sequences, as well as the ability to 'drill down' and see detailed information concerning regions of interest. Additionally, the alignment viewing and editing capabilities of CRAWview make it possible to interactively correct frame-shifts and otherwise edit cluster assemblies. We have implemented CRAWview as a Java application across Windows NT/95 and UNIX platforms. A beta version of CRAWview will be freely available to academic users from Pangea Systems (http://www.pangeasystems.com).
Are There Subtypes of Panic Disorder? An Interpersonal Perspective
Zilcha-Mano, Sigal; McCarthy, Kevin S.; Dinger, Ulrike; Chambless, Dianne L.; Milrod, Barbara L.; Kunik, Lauren; Barber, Jacques P.
2015-01-01
Objective Panic disorder (PD) is associated with significant personal, social, and economic costs. However, little is known about specific interpersonal dysfunctions that characterize the PD population. The current study systematically examined these interpersonal dysfunctions. Method The present analyses included 194 patients with PD out of a sample of 201 who were randomized to cognitive-behavioral therapy, panic-focused psychodynamic psychotherapy, or applied relaxation training. Interpersonal dysfunction was measured using the Inventory of Interpersonal Problems–Circumplex (Horowitz, Alden, Wiggins, & Pincus, 2000). Results Individuals with PD reported greater levels of interpersonal distress than that of a normative cohort (especially when PD was accompanied by agoraphobia), but lower than that of a cohort of patients with major depression. There was no single interpersonal profile that characterized PD patients. Symptom-based clusters (with versus without agoraphobia) could not be discriminated on core or central interpersonal problems. Rather, as revealed by cluster analysis based on the pathoplasticity framework, there were two empirically derived interpersonal clusters among PD patients which were not accounted for by symptom severity and were opposite in nature: domineering-intrusive and nonassertive. The empirically derived interpersonal clusters appear to be of clinical utility in predicting alliance development throughout treatment: While the domineering-intrusive cluster did not show any changes in the alliance throughout treatment, the non-assertive cluster showed a process of significant strengthening of the alliance. Conclusions Empirically derived interpersonal clusters in PD provide clinically useful and non-redundant information about individuals with PD. PMID:26030762
Challenges in discriminating profanity from hate speech
NASA Astrophysics Data System (ADS)
Malmasi, Shervin; Zampieri, Marcos
2018-03-01
In this study, we approach the problem of distinguishing general profanity from hate speech in social media, something which has not been widely considered. Using a new dataset annotated specifically for this task, we employ supervised classification along with a set of features that includes n-grams, skip-grams and clustering-based word representations. We apply approaches based on single classifiers as well as more advanced ensemble classifiers and stacked generalisation, achieving the best result of ? accuracy for this 3-class classification task. Analysis of the results reveals that discriminating hate speech and profanity is not a simple task, which may require features that capture a deeper understanding of the text not always possible with surface n-grams. The variability of gold labels in the annotated data, due to differences in the subjective adjudications of the annotators, is also an issue. Other directions for future work are discussed.
Phillips, Anastasia; Sotomayor, Cristina; Wang, Qinning; Holmes, Nadine; Furlong, Catriona; Ward, Kate; Howard, Peter; Octavia, Sophie; Lan, Ruiting; Sintchenko, Vitali
2016-09-15
Salmonella Typhimurium (STM) is an important cause of foodborne outbreaks worldwide. Subtyping of STM remains critical to outbreak investigation, yet current techniques (e.g. multilocus variable number tandem repeat analysis, MLVA) may provide insufficient discrimination. Whole genome sequencing (WGS) offers potentially greater discriminatory power to support infectious disease surveillance. We performed WGS on 62 STM isolates of a single, endemic MLVA type associated with two epidemiologically independent, food-borne outbreaks along with sporadic cases in New South Wales, Australia, during 2014. Genomes of case and environmental isolates were sequenced using HiSeq (Illumina) and the genetic distance between them was assessed by single nucleotide polymorphism (SNP) analysis. SNP analysis was compared to the epidemiological context. The WGS analysis supported epidemiological evidence and genomes of within-outbreak isolates were nearly identical. Sporadic cases differed from outbreak cases by a small number of SNPs, although their close relationship to outbreak cases may represent an unidentified common food source that may warrant further public health follow up. Previously unrecognised mini-clusters were detected. WGS of STM can discriminate foodborne community outbreaks within a single endemic MLVA clone. Our findings support the translation of WGS into public health laboratory surveillance of salmonellosis.
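The genetic distance used in this kind of analysis is, at its simplest, a count of positions at which two aligned genomes differ. The sketch below illustrates that count in Python; the isolate names and ten-base sequences are hypothetical, and the choice to skip ambiguous bases ("N") and gaps ("-") is an assumption rather than the study's pipeline.

```python
from itertools import combinations


def snp_distance(seq_a, seq_b, ignore="N-"):
    """Count differing positions between two aligned sequences of equal length."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a, seq_b)
        if a != b and a not in ignore and b not in ignore
    )


def pairwise_snp_matrix(isolates):
    """Map each unordered pair of isolate names to its SNP distance."""
    return {
        (first, second): snp_distance(isolates[first], isolates[second])
        for first, second in combinations(sorted(isolates), 2)
    }


# Hypothetical aligned core-genome fragments (invented for illustration).
isolates = {
    "outbreak_A1": "ACGTACGTAC",
    "outbreak_A2": "ACGTACGTAC",
    "sporadic_1": "ACGTTCGTAC",
    "outbreak_B1": "ACCTACGAAC",
}

matrix = pairwise_snp_matrix(isolates)
for pair, distance in matrix.items():
    print(pair, distance)
```

In this toy data the two isolates from outbreak A are identical, the sporadic isolate sits one SNP away from them, and the outbreak B isolate sits two SNPs away, mirroring the pattern the abstract describes in qualitative terms.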
Yang, Yan-Qin; Yin, Hong-Xu; Yuan, Hai-Bo; Jiang, Yong-Wen; Dong, Chun-Wang; Deng, Yu-Liang
2018-01-01
In the present work, a novel infrared-assisted extraction coupled to headspace solid-phase microextraction (IRAE-HS-SPME) followed by gas chromatography-mass spectrometry (GC-MS) was developed for rapid determination of the volatile components in green tea. The extraction parameters such as fiber type, sample amount, infrared power, extraction time, and infrared lamp distance were optimized by orthogonal experimental design. Under optimum conditions, a total of 82 volatile compounds in 21 green tea samples from different geographical origins were identified. Compared with classical water-bath heating, the proposed technique has remarkable advantages of considerably reducing the analytical time and high efficiency. In addition, an effective classification of green teas based on their volatile profiles was achieved by partial least square-discriminant analysis (PLS-DA) and hierarchical clustering analysis (HCA). Furthermore, the application of a dual criterion based on the variable importance in the projection (VIP) values of the PLS-DA models and on the category from one-way univariate analysis (ANOVA) allowed the identification of 12 potential volatile markers, which were considered to make the most important contribution to the discrimination of the samples. The results suggest that IRAE-HS-SPME/GC-MS technique combined with multivariate analysis offers a valuable tool to assess geographical traceability of different tea varieties.
Steingass, Christof Björn; Jutzi, Manfred; Müller, Jenny; Carle, Reinhold; Schmarr, Hans-Georg
2015-03-01
Ripening-dependent changes of pineapple volatiles were studied in a nontargeted profiling analysis. Volatiles were isolated via headspace solid phase microextraction and analyzed by comprehensive 2D gas chromatography and mass spectrometry (HS-SPME-GC×GC-qMS). Profile patterns presented in the contour plots were evaluated applying image processing techniques and subsequent multivariate statistical data analysis. Statistical methods comprised unsupervised hierarchical cluster analysis (HCA) and principal component analysis (PCA) to classify the samples. Supervised partial least squares discriminant analysis (PLS-DA) and partial least squares (PLS) regression were applied to discriminate different ripening stages and describe the development of volatiles during postharvest storage, respectively. Hereby, substantial chemical markers allowing for class separation were revealed. The workflow permitted the rapid distinction between premature green-ripe pineapples and postharvest-ripened sea-freighted fruits. Volatile profiles of fully ripe air-freighted pineapples were similar to those of green-ripe fruits postharvest ripened for 6 days after simulated sea freight export, after PCA with only two principal components. However, PCA considering also the third principal component allowed differentiation between air-freighted fruits and the four progressing postharvest maturity stages of sea-freighted pineapples.
Yin, Hong-Xu; Yuan, Hai-Bo; Jiang, Yong-Wen; Dong, Chun-Wang; Deng, Yu-Liang
2018-01-01
In the present work, a novel infrared-assisted extraction coupled to headspace solid-phase microextraction (IRAE-HS-SPME) followed by gas chromatography-mass spectrometry (GC-MS) was developed for rapid determination of the volatile components in green tea. The extraction parameters such as fiber type, sample amount, infrared power, extraction time, and infrared lamp distance were optimized by orthogonal experimental design. Under optimum conditions, a total of 82 volatile compounds in 21 green tea samples from different geographical origins were identified. Compared with classical water-bath heating, the proposed technique has remarkable advantages of considerably reducing the analytical time and high efficiency. In addition, an effective classification of green teas based on their volatile profiles was achieved by partial least square-discriminant analysis (PLS-DA) and hierarchical clustering analysis (HCA). Furthermore, the application of a dual criterion based on the variable importance in the projection (VIP) values of the PLS-DA models and on the category from one-way univariate analysis (ANOVA) allowed the identification of 12 potential volatile markers, which were considered to make the most important contribution to the discrimination of the samples. The results suggest that IRAE-HS-SPME/GC-MS technique combined with multivariate analysis offers a valuable tool to assess geographical traceability of different tea varieties. PMID:29494626
NASA Astrophysics Data System (ADS)
Li, Hui; Yu, Jun-Ling; Yu, Le-An; Sun, Jie
2014-05-01
Case-based reasoning (CBR) is one of the main methods in business forecasting; it performs well in prediction and can give explanations for its results. In business failure prediction (BFP), the number of failed enterprises is relatively small compared with the number of non-failed ones. However, the loss is huge when an enterprise fails. Therefore, it is necessary to develop methods (trained on imbalanced samples) that forecast well for this small proportion of failed enterprises while also achieving high total accuracy. Commonly used methods constructed on the assumption of balanced samples do not perform well in predicting minority samples on imbalanced samples consisting of the minority/failed enterprises and the majority/non-failed ones. This article develops a new method called clustering-based CBR (CBCBR), which integrates clustering analysis, an unsupervised process, with CBR, a supervised process, to enhance the efficiency of retrieving information from both minority and majority in CBR. In CBCBR, various case classes are first generated through hierarchical clustering of the stored experienced cases, and class centres are calculated by integrating the information of the cases in the same clustered class. When predicting the label of a target case, its nearest clustered case class is first retrieved by ranking similarities between the target case and each clustered case class centre. Then, the nearest neighbours of the target case in the determined clustered case class are retrieved. Finally, the labels of the nearest experienced cases are used in prediction. In the empirical experiment with two imbalanced samples from China, the performance of CBCBR was compared with the classical CBR, a support vector machine, a logistic regression and a multivariate discriminant analysis.
The results show that compared with the other four methods, CBCBR performed significantly better in terms of sensitivity for identifying the minority samples and generated high total accuracy meanwhile. The proposed approach makes CBR useful in imbalanced forecasting.
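The retrieval steps described for CBCBR (cluster the stored cases, compute class centres, pick the nearest centre, then vote among the nearest neighbours inside that class) can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: Euclidean distance stands in for the similarity measure, the clustering is a naive centroid-linkage agglomeration, the final step is a majority vote, and the function names and toy cases are hypothetical.

```python
import math
from collections import Counter


def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    return tuple(sum(column) / len(points) for column in zip(*points))


def agglomerate(points, n_clusters):
    """Naive centroid-linkage agglomerative clustering; returns lists of indices."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                gap = math.dist(
                    centroid([points[k] for k in clusters[i]]),
                    centroid([points[k] for k in clusters[j]]),
                )
                if best is None or gap < best[0]:
                    best = (gap, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
        del clusters[j]
    return clusters


def cbcbr_predict(cases, labels, target, n_clusters=2, k=3):
    """Predict a label from neighbours inside the nearest clustered case class."""
    clusters = agglomerate(cases, n_clusters)
    centres = [centroid([cases[i] for i in members]) for members in clusters]
    nearest = min(range(len(clusters)), key=lambda c: math.dist(target, centres[c]))
    neighbours = sorted(clusters[nearest], key=lambda i: math.dist(target, cases[i]))[:k]
    votes = Counter(labels[i] for i in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical imbalanced case base: 2 failed and 6 non-failed enterprises.
cases = [(0.1, 0.2), (0.2, 0.1), (0.8, 0.9), (0.9, 0.8),
         (0.85, 0.85), (0.7, 0.9), (0.9, 0.7), (0.75, 0.75)]
labels = ["failed", "failed"] + ["non-failed"] * 6

print(cbcbr_predict(cases, labels, (0.15, 0.2), n_clusters=2, k=5))
print(cbcbr_predict(cases, labels, (0.8, 0.8), n_clusters=2, k=5))
```

In this toy case base, a plain five-nearest-neighbour vote over all eight cases would be outvoted by non-failed cases for the first target, whereas restricting the vote to the nearest clustered class keeps the minority label. That is the intuition behind the sensitivity gain the abstract reports, shown here only on invented data.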
Performance Analysis of Entropy Methods on K Means in Clustering Process
NASA Astrophysics Data System (ADS)
Dicky Syahputra Lubis, Mhd.; Mawengkang, Herman; Suwilo, Saib
2017-12-01
K Means is a non-hierarchical data clustering method that attempts to partition existing data into one or more clusters (groups). The method partitions the data so that data with the same characteristics are grouped into the same cluster and data with different characteristics are grouped into other clusters. The purpose of this clustering is to minimize the objective function set in the clustering process, which generally attempts to minimize variation within a cluster and maximize the variation between clusters. However, the main disadvantage of this method is that the number k is often not known beforehand. Furthermore, randomly chosen starting points may lead to two points that lie close to each other being selected as two centroids. Therefore, the entropy method is used to determine the starting points in K Means; this is a method that can be used to determine weights and to take a decision from a set of alternatives. Entropy is able to investigate the harmony in discrimination among a multitude of data sets. Under the entropy criterion, the attribute with the highest variation in values receives the highest weight. The entropy method can thus help K Means determine its starting points, which are usually chosen at random. As a result, clustering with K Means converges more quickly with the help of the entropy method, with fewer iterations than the standard K Means process. On the postoperative patient dataset from the UCI Machine Learning Repository, using only 12 records as an example calculation, the entropy method obtained the desired final result after only 2 iterations.
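The abstract does not spell out how the entropy weights are turned into starting centroids, so the Python sketch below stops at the weighting step. It assumes the commonly used entropy weight method (column proportions, normalised Shannon entropy, weights proportional to one minus entropy); the function name and the three-record toy table are hypothetical.

```python
import math


def entropy_weights(rows):
    """Entropy weight method: attributes whose values vary more get larger weights."""
    n = len(rows)
    if n < 2:
        raise ValueError("need at least two records")
    divergences = []
    for column in zip(*rows):
        total = sum(column)
        if total <= 0 or min(column) < 0:
            raise ValueError("columns must be non-negative with a positive sum")
        entropy = 0.0
        for value in column:
            if value > 0:                  # treat 0 * log(0) as 0
                share = value / total
                entropy -= share * math.log(share)
        entropy /= math.log(n)             # normalise to the range [0, 1]
        divergences.append(max(0.0, 1.0 - entropy))
    spread = sum(divergences)
    if spread == 0:                        # every attribute is constant
        return [1.0 / len(divergences)] * len(divergences)
    return [d / spread for d in divergences]


# Hypothetical records: the first attribute is constant, the second varies.
records = [(10, 1), (10, 5), (10, 9)]
weights = entropy_weights(records)
print(weights)
```

On this toy table nearly all of the weight falls on the second attribute, which is consistent with the statement that the attribute with the highest variation receives the highest weight.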
Papaioannou, Vasilios E; Chouvarda, Ioanna G; Maglaveras, Nikos K; Pneumatikos, Ioannis A
2012-12-12
Even though temperature is a continuous quantitative variable, its measurement has been considered a snapshot of a process, indicating whether a patient is febrile or afebrile. Recently, other diagnostic techniques have been proposed for the association between different properties of the temperature curve with severity of illness in the Intensive Care Unit (ICU), based on complexity analysis of continuously monitored body temperature. In this study, we tried to assess temperature complexity in patients with systemic inflammation during a suspected ICU-acquired infection, by using wavelets transformation and multiscale entropy of temperature signals, in a cohort of mixed critically ill patients. Twenty-two patients were enrolled in the study. In five, systemic inflammatory response syndrome (SIRS, group 1) developed, 10 had sepsis (group 2), and seven had septic shock (group 3). All temperature curves were studied during the first 24 hours of an inflammatory state. A wavelet transformation was applied, decomposing the signal in different frequency components (scales) that have been found to reflect neurogenic and metabolic inputs on temperature oscillations. Wavelet energy and entropy per different scales associated with complexity in specific frequency bands and multiscale entropy of the whole signal were calculated. Moreover, a clustering technique and a linear discriminant analysis (LDA) were applied for permitting pattern recognition in data sets and assessing diagnostic accuracy of different wavelet features among the three classes of patients. Statistically significant differences were found in wavelet entropy between patients with SIRS and groups 2 and 3, and in specific ultradian bands between SIRS and group 3, with decreased entropy in sepsis. Cluster analysis using wavelet features in specific bands revealed concrete clusters closely related with the groups in focus. 
LDA after wrapper-based feature selection was able to classify with an accuracy of more than 80% SIRS from the two sepsis groups, based on multiparametric patterns of entropy values in the very low frequencies and indicating reduced metabolic inputs on local thermoregulation, probably associated with extensive vasodilatation. We suggest that complexity analysis of temperature signals can assess inherent thermoregulatory dynamics during systemic inflammation and has increased discriminating value in patients with infectious versus noninfectious conditions, probably associated with severity of illness.
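Wavelet energy per scale and a wavelet entropy over those scales can be illustrated with a small Python sketch. It assumes a Haar wavelet and the usual definition of entropy over relative scale energies; the abstract names neither, and the temperature readings below are invented, so this shows the general idea rather than the study's processing chain.

```python
import math


def haar_detail_energies(signal, levels):
    """Energy of the Haar detail coefficients at each decomposition level."""
    approx = list(signal)
    energies = []
    for _ in range(levels):
        if len(approx) < 2 or len(approx) % 2:
            raise ValueError("signal length must be divisible by 2 ** levels")
        pairs = list(zip(approx[0::2], approx[1::2]))
        detail = [(a - b) / math.sqrt(2) for a, b in pairs]   # fluctuations at this scale
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]   # smoothed signal for the next scale
        energies.append(sum(d * d for d in detail))
    return energies


def wavelet_entropy(energies):
    """Shannon entropy of the relative energy distribution across scales."""
    total = sum(energies)
    if total == 0:
        return 0.0
    return -sum((e / total) * math.log(e / total) for e in energies if e > 0)


# Hypothetical temperature readings in degrees Celsius (invented for illustration).
regular = [37.0, 37.2] * 4                                   # one fast oscillation only
irregular = [37.0, 37.4, 37.1, 37.9, 38.2, 38.0, 37.6, 37.3]

entropy_regular = wavelet_entropy(haar_detail_energies(regular, 3))
entropy_irregular = wavelet_entropy(haar_detail_energies(irregular, 3))
print(entropy_regular, entropy_irregular)
```

The regular series concentrates all of its energy in a single scale and so has zero entropy, while the irregular series spreads energy over several scales and scores higher. Reduced entropy in a signal therefore indicates that its variability is confined to fewer frequency bands.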
2012-01-01
Background Even though temperature is a continuous quantitative variable, its measurement has been considered a snapshot of a process, indicating whether a patient is febrile or afebrile. Recently, other diagnostic techniques have been proposed for the association between different properties of the temperature curve with severity of illness in the Intensive Care Unit (ICU), based on complexity analysis of continuously monitored body temperature. In this study, we tried to assess temperature complexity in patients with systemic inflammation during a suspected ICU-acquired infection, by using wavelets transformation and multiscale entropy of temperature signals, in a cohort of mixed critically ill patients. Methods Twenty-two patients were enrolled in the study. In five, systemic inflammatory response syndrome (SIRS, group 1) developed, 10 had sepsis (group 2), and seven had septic shock (group 3). All temperature curves were studied during the first 24 hours of an inflammatory state. A wavelet transformation was applied, decomposing the signal in different frequency components (scales) that have been found to reflect neurogenic and metabolic inputs on temperature oscillations. Wavelet energy and entropy per different scales associated with complexity in specific frequency bands and multiscale entropy of the whole signal were calculated. Moreover, a clustering technique and a linear discriminant analysis (LDA) were applied for permitting pattern recognition in data sets and assessing diagnostic accuracy of different wavelet features among the three classes of patients. Results Statistically significant differences were found in wavelet entropy between patients with SIRS and groups 2 and 3, and in specific ultradian bands between SIRS and group 3, with decreased entropy in sepsis. Cluster analysis using wavelet features in specific bands revealed concrete clusters closely related with the groups in focus. 
LDA after wrapper-based feature selection was able to classify with an accuracy of more than 80% SIRS from the two sepsis groups, based on multiparametric patterns of entropy values in the very low frequencies and indicating reduced metabolic inputs on local thermoregulation, probably associated with extensive vasodilatation. Conclusions We suggest that complexity analysis of temperature signals can assess inherent thermoregulatory dynamics during systemic inflammation and has increased discriminating value in patients with infectious versus noninfectious conditions, probably associated with severity of illness. PMID:22424316
Swartz, J R; Miller, B L; Lesser, I M; Booth, R; Darby, A; Wohl, M; Benson, D F
1997-04-01
Often patients in the early stages of Alzheimer's disease (AD), frontotemporal dementia (FTD), and late-life depression can be difficult to differentiate clinically. Although subtle cognitive distinctions exist between these disorders, noncognitive behavioral phenomenology may provide additional discriminating power. In 19 subjects with AD, 19 with FTD, 16 with late-life psychotic depression (LLPD), and 19 with late-life nonpsychotic depression (LLNPD), noncognitive behavioral symptoms were quantified retrospectively using the Schedules for Clinical Assessment in Neuropsychiatry (SCAN) and compared using both a one-way ANOVA and a multivariate stepwise discriminant analysis, which utilized a jackknife procedure. The FTD group showed the highest mean total SCAN score, while the AD group showed the lowest. ANOVA showed significant differences in the mean total SCAN scores between the four diagnostic groups (P < .0001). With the discriminant analysis, the four disorders demonstrated different clusters of behavioral abnormalities and were differentiated by these symptoms (P < .0001). A subset of 14 SCAN item group symptoms was identified that collectively classified the following percentages of subjects in each diagnostic category: AD 94.7%, FTD 100%, LLPD 87.5%, and LLNPD 100%. These results indicate that AD, FTD, LLPD, and LLNPD were distinguished retrospectively by the SCAN without using cognitive data. Better definition of the longitudinal course of noncognitive behavioral symptoms in different dementias and psychiatric disorders will be valuable both for diagnosis and to help define behavioral syndromes that are associated with selective neuroanatomic and neurochemical brain pathology.
Frías, Sergio; Conde, José E; Rodríguez-Bencomo, Juan J; García-Montelongo, Francisco; Pérez-Trujillo, Juan P
2003-02-06
Eleven elements, K, Na, Ca, Mg, Fe, Cu, Zn, Mn, Sr, Li and Rb, were determined in dry and sweet wines bearing the denominations of origin of El Hierro, La Palma and Lanzarote islands (Canary Islands, Spain). Analyses were performed by flame atomic absorption spectrophotometry, with the exceptions of lithium and rubidium for which flame atomic emission spectrophotometry was used. Sweet wines from La Palma were elaborated as naturally sweet with over-ripe grapes and significant differences were found in all the analysed elements with the exceptions of sodium, iron and rubidium with regard to dry wines from the same island. Contrarily, sweet wines from Lanzarote elaborated with grapes in a similar ripening state to dry wines did not present significant differences between them with the exception of strontium, the content of which was greater in dry wines. Among the three islands, significant differences in mean content were found with the exceptions of iron and copper. Cluster analysis and principal component analysis show differences in wines according to the island of origin and the ripening state of the grapes. Linear discriminant analysis using rubidium, sodium, manganese and strontium, the four most discriminant elements, gave 100% recognition ability and 95.6% prediction ability. The sensitivity and specificity obtained using soft independent modelling of class analogy (SIMCA) as a modelling multivariate technique were both 100% for El Hierro and Lanzarote, and 100 and 95%, respectively, for La Palma. The modelling and discriminant capacities of the different metals were also studied.
Bruno, Rossella; Alì, Greta; Giannini, Riccardo; Proietti, Agnese; Lucchi, Marco; Chella, Antonio; Melfi, Franca; Mussi, Alfredo; Fontanini, Gabriella
2017-01-10
Malignant pleural mesothelioma (MPM) is a rare asbestos related cancer, aggressive and unresponsive to therapies. Histological examination of pleural lesions is the gold standard of MPM diagnosis, although it is sometimes hard to discriminate the epithelioid type of MPM from benign mesothelial hyperplasia (MH). This work aims to define a new molecular tool for the differential diagnosis of MPM, using the expression profile of 117 genes deregulated in this tumour. The gene expression analysis was performed by nanoString System on tumour tissues from 36 epithelioid MPM and 17 MH patients, and on 14 mesothelial pleural samples analysed in a blind way. Data analysis included raw nanoString data normalization, unsupervised cluster analysis by Pearson correlation, non-parametric Mann Whitney U-test and molecular classification by the Uncorrelated Shrunken Centroid (USC) Algorithm. The Mann-Whitney U-test found 35 genes upregulated and 31 downregulated in MPM. The unsupervised cluster analysis revealed two clusters, one composed only of MPM and one only of MH samples, thus revealing class-specific gene profiles. The Uncorrelated Shrunken Centroid algorithm identified two classifiers, one including 22 genes and the other 40 genes, able to properly classify all the samples as benign or malignant using gene expression data; both classifiers were also able to correctly determine, in a blind analysis, the diagnostic categories of all the 14 unknown samples. In conclusion we delineated a diagnostic tool combining molecular data (gene expression) and computational analysis (USC algorithm), which can be applied in the clinical practice for the differential diagnosis of MPM.
[Identification of two varieties of Citri Fructus by fingerprint and chemometrics].
Su, Jing-hua; Zhang, Chao; Sun, Lei; Gu, Bing-ren; Ma, Shuang-cheng
2015-06-01
Identification of Citri Fructus by fingerprint and chemometrics was investigated in this paper. Twenty-three Citri Fructus samples were collected, representing the two varieties recorded in the Chinese Pharmacopoeia, Citrus wilsonii and C. medica. HPLC chromatograms were obtained. The components were partly identified using reference substances, and a common pattern was then established for chemometric analysis. Similarity analysis, principal component analysis (PCA), partial least squares-discriminant analysis (PLS-DA) and a hierarchical cluster analysis heatmap were applied. The results indicated that C. wilsonii and C. medica could be well classified using a common pattern containing twenty-five characteristic peaks. In addition, preliminary pattern recognition verified the chemometric results. Absolute peak area (APA) was used for the relevant quantitative analysis; the results showed differences between the two varieties and are valuable for further quality control, for example in the selection of characteristic components.
NASA Astrophysics Data System (ADS)
Su, Rongguo; Chen, Xiaona; Wu, Zhenzhen; Yao, Peng; Shi, Xiaoyong
2015-07-01
The feasibility of using fluorescence excitation-emission matrix (EEM) along with parallel factor analysis (PARAFAC) and the nonnegative least squares (NNLS) method for the differentiation of phytoplankton taxonomic groups was investigated. Forty-one phytoplankton species belonging to 28 genera of five divisions were studied. First, the PARAFAC model was applied to EEMs, and 15 fluorescence components were generated. Second, 15 fluorescence components were found to have a strong discriminating capability based on Bayesian discriminant analysis (BDA). Third, all spectra of the fluorescence component compositions for the 41 phytoplankton species were spectrographically sorted into 61 reference spectra using hierarchical cluster analysis (HCA), and then, the reference spectra were used to establish a database. Finally, the phytoplankton taxonomic groups were differentiated by the reference spectra database using the NNLS method. The five phytoplankton groups were differentiated with correct discrimination ratios (CDRs) of 100% for single-species samples at the division level. The CDRs for the mixtures were above 91% for the dominant phytoplankton species and above 73% for the subdominant phytoplankton species. Sixteen of the 85 field samples collected from the Changjiang River estuary were analyzed by both HPLC-CHEMTAX and the fluorometric technique developed. The results of both methods reveal that Bacillariophyta was the dominant algal group in these 16 samples and that the subdominant algal groups comprised Dinophyta, Chlorophyta and Cryptophyta. The differentiation results by the fluorometric technique were in good agreement with those from HPLC-CHEMTAX. The results indicate that the fluorometric technique could differentiate algal taxonomic groups accurately at the division level.
In vivo diagnosis of mammary adenocarcinoma using Raman spectroscopy: an animal model study
NASA Astrophysics Data System (ADS)
Bitar, R. A.; Ribeiro, D. G.; dos Santos, E. A. P.; Ramalho, L. N. Z.; Ramalho, F. S.; Martin, A. A.; Martinho, H. S.
2010-02-01
Breast cancer is the most frequent cancer type in women worldwide. Sensitivity and specificity of clinical breast examinations have been estimated from clinical trials to be approximately 54% and 94%, respectively. Further, approximately 95% of all positive breast cancer screenings turn out to be false-positive. The optimal method for early detection should be both highly sensitive to ensure that all cancers are detected, and also highly specific to avoid the humanistic and economic costs associated with false-positive results. In vivo optical spectroscopy techniques, Raman in particular, have been pointed out as promising tools to improve the accuracy of screening mammography. The aim of the present study was to apply FT-Raman spectroscopy to discriminate normal and adenocarcinoma breast tissues of Sprague-Dawley female rats. The study was performed on 32 rats divided into control (N=5) and experimental (N=27) groups. Histological analysis indicated that mammary hyperplasia, cribriform, papillary and solid adenocarcinomas were found in the experimental group subjects. The spectral collection was made using a commercial FT-Raman Spectrometer (Bruker RFS100) equipped with a fiber-optic probe (RamProbe), and the spectral region between 900 and 1800 cm⁻¹ was analyzed. Principal Components Analysis, Cluster Analysis, and Linear Discriminant Analysis with cross-validation were applied as spectral classification algorithms. In conclusion, it is shown that discrimination between normal and adenocarcinoma tissues was possible (the proportion correctly classified was 80.80% for the transcutaneous collection mode and 91.70% for the "Open Sky" mode); however, a conclusive diagnosis among the four lesion subtypes was not possible.
Martínez Bueno, María Jesús; Díaz-Galiano, Francisco José; Rajski, Łukasz; Cutillas, Víctor; Fernández-Alba, Amadeo R
2018-04-20
In the last decade, the consumption of organic food has increased dramatically worldwide. However, the lack of reliable chemical markers to discriminate between organic and conventional products makes this market susceptible to food fraud in products labeled as "organic". The metabolomic fingerprinting approach has been demonstrated to be the best option for a full characterization of the metabolome occurring in plants, since its pattern may reflect the impact of both endogenous and exogenous factors. In the present study, advanced technologies based on high performance liquid chromatography-high-resolution accurate mass spectrometry (HPLC-HRAMS) have been used for marker search in organic and conventional tomatoes grown in greenhouses under controlled agronomic conditions. The screening of unknown compounds comprised the retrospective analysis of all tomato samples throughout the studied period and data processing using databases (mzCloud, ChemSpider and PubChem). In addition, stable nitrogen isotope analysis (δ¹⁵N) was assessed as a possible indicator to support discrimination between both production systems using crop/fertilizer correlations. Pesticide residue analyses were also applied as a well-established way to evaluate organic production. Finally, the evaluation by combined chemometric analysis of high-resolution accurate mass spectrometry (HRAMS) and δ¹⁵N data provided a robust classification model in accordance with the agricultural practices. Principal component analysis (PCA) showed a sample clustering according to farming systems, and significant differences in the sample profile were observed for six bioactive components (L-tyrosyl-L-isoleucyl-L-threonyl-L-threonine, trilobatin, phloridzin, tomatine, phloretin and echinenone). Copyright © 2018 Elsevier B.V. All rights reserved.
[Advances in clustered regularly interspaced short palindromic repeats--a review].
Wang, Lili; He, Jin; Wang, Jieping
2011-08-01
The recently discovered Clustered Regularly Interspaced Short Palindromic Repeats (CRISPRs) can provide bacteria and archaea with adaptive and heritable defense systems against the invasion of phage- and plasmid-associated mobile genetic elements. Here, we review the structure, diversity, mechanism of interference and self versus non-self discrimination of CRISPR systems. We also discuss the potential applications of this novel interference system.
Effect of the statin therapy on biochemical laboratory tests--a chemometrics study.
Durceková, Tatiana; Mocák, Ján; Boronová, Katarína; Balla, Ján
2011-01-05
Statins are the first-line choice for lowering total and LDL cholesterol levels and very important medicaments for reducing the risk of coronary artery disease. The aim of this study is therefore to assess the results of biochemical tests characterizing the condition of 172 patients before and after administration of statins. For this purpose, several chemometric tools, namely principal component analysis, cluster analysis, discriminant analysis, logistic regression, KNN classification, ROC analysis, descriptive statistics and ANOVA, were used. Mutual relations of 11 biochemical laboratory tests, the patient's age and gender were investigated in detail. The results obtained make it possible to evaluate the extent of the statin treatment in each individual case. They may also help in monitoring the dynamic progression of the disease. Copyright © 2010 Elsevier B.V. All rights reserved.
González-Álvarez, Mariana; Noguerol-Pato, Raquel; González-Barreiro, Carmen; Cancho-Grande, Beatriz; Simal-Gándara, Jesús
2014-02-15
The effect of winemaking procedures on the sensory modification of sweet wines was investigated. Garnacha Tintorera-based sweet wines were obtained by two different processes: by using raisins for vinification to obtain a naturally sweet wine and by using freshly harvested grapes with the stoppage of the fermentation by the addition of alcohol. Eight international sweet wines were also subjected to sensory analysis for comparative description purposes. Wines were described with a sensory profile by 12 trained panellists on 70 sensory attributes by employing the frequency of citation method. Analysis of variance of the descriptive data confirmed the existence of subtle sensory differences among Garnacha Tintorera-based sweet wines depending on the procedure used for their production. Cluster analysis emphasised discriminated attributes between the Garnacha Tintorera-based and the commercial groups of sweet wines for both those obtained by raisining and by fortification. Several kinds of discriminant functions were used to separate groups of sweet wines--obtained by botrytisation, raisining and fortification--to show the key descriptors that contribute to their separation and define the sensory perception of each type of wine. Copyright © 2013 Elsevier Ltd. All rights reserved.
Feasibility of laser-induced breakdown spectroscopy (LIBS) for classification of sea salts.
Tan, Man Minh; Cui, Sheng; Yoo, Jonghyun; Han, Song-Hee; Ham, Kyung-Sik; Nam, Sang-Ho; Lee, Yonghoon
2012-03-01
We have investigated the feasibility of laser-induced breakdown spectroscopy (LIBS) as a fast, reliable classification tool for sea salts. For 11 kinds of sea salts, potassium (K), magnesium (Mg), calcium (Ca), and aluminum (Al) concentrations were measured by inductively coupled plasma-atomic emission spectroscopy (ICP-AES), and the LIBS spectra were recorded in the narrow wavelength region between 760 and 800 nm where K (I), Mg (I), Ca (II), Al (I), and cyanide (CN) band emissions are observed. The ICP-AES measurements revealed that the K, Mg, Ca, and Al concentrations varied significantly with the provenance of each salt. The relative intensities of the K (I), Mg (I), Ca (II), and Al (I) peaks observed in the LIBS spectra are consistent with the ICP-AES results. The principal component analysis of the LIBS spectra provided a score plot with quite a high degree of clustering. This indicates that classification of sea salts by chemometric analysis of LIBS spectra is very promising. Classification models were developed by partial least squares discriminant analysis (PLS-DA) and evaluated. In addition, the Al (I) peaks enabled us to discriminate between different production methods of the salts. © 2012 Society for Applied Spectroscopy
VizieR Online Data Catalog: Clusters of galaxies in SDSS-III (Wen+, 2012)
NASA Astrophysics Data System (ADS)
Wen, Z. L.; Han, J. L.; Liu, F. S.
2012-06-01
Wen et al. (2009, Cat. J/ApJS/183/197) identified 39668 galaxy clusters from the SDSS DR6 by the discrimination of member galaxies of clusters using photometric redshifts of galaxies. Wen & Han (2011ApJ...734...68W) improved the method and successfully identified the high-redshift clusters from the deep fields of the Canada-France-Hawaii Telescope (CFHT) Wide survey, the CFHT Deep survey, the Cosmic Evolution Survey, and the Spitzer Wide-area InfraRed Extragalactic survey. Here, we follow and improve the algorithm to identify clusters from SDSS-III (SDSS Data Release 8; Aihara et al. 2011ApJS..193...29A, see Cat. II/306). (1 data file).
Uchikoga, Nobuyuki; Hirokawa, Takatsugu
2010-05-11
Protein-protein docking for proteins with large conformational changes was analyzed by using interaction fingerprints, one of the scales for measuring similarities among complex structures, utilized especially for searching near-native protein-ligand or protein-protein complex structures. Here, we have proposed a combined method for analyzing protein-protein docking by taking large conformational changes into consideration. This combined method consists of ensemble soft docking with multiple protein structures, refinement of complexes, and cluster analysis using interaction fingerprints and energy profiles. To test the applicability of this combined method, various CaM-ligand complexes were reconstructed from the NMR structures of unbound CaM. For the purpose of reconstruction, we used three known CaM-ligands, namely, the CaM-binding peptides of cyclic nucleotide gateway (CNG), CaM kinase kinase (CaMKK) and the plasma membrane Ca2+ ATPase pump (PMCA), and thirty-one structurally diverse CaM conformations. For each ligand, 62000 CaM-ligand complexes were generated in the docking step, and the relationship between their energy profiles and structural similarities to the native complex was analyzed using interaction fingerprint and RMSD. Near-native clusters were obtained in the case of CNG and CaMKK. The interaction fingerprint method discriminated near-native structures better than the RMSD method in cluster analysis. We showed that a combined method that includes the interaction fingerprint is very useful for protein-protein docking analysis in certain cases.
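As a rough illustration of how an interaction fingerprint can score a docked pose against a native complex, the sketch below treats a fingerprint as a set of contact identifiers and compares two fingerprints with the Tanimoto (Jaccard) coefficient. The abstract does not state which similarity measure the authors used, so the choice of Tanimoto, the function name, and the contact labels are all assumptions made for illustration.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto (Jaccard) similarity between two sets of contact identifiers."""
    a, b = set(fp_a), set(fp_b)
    if not a and not b:
        return 1.0  # two empty fingerprints are treated as identical
    return len(a & b) / len(a | b)

# Hypothetical contacts, written as (receptor residue, ligand residue) pairs.
native = {("Glu11", "Lys5"), ("Met124", "Trp3"), ("Phe92", "Leu8")}
docked = {("Met124", "Trp3"), ("Phe92", "Leu8"), ("Glu84", "Arg1")}

print(tanimoto(native, docked))  # two shared contacts out of four distinct ones
```

Unlike RMSD, a score of this kind depends only on which contacts are formed, so it is insensitive to rigid shifts of the ligand that preserve the interface.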
Aggressive behavior in children: the role of temperament and family socialization.
González-Peña, Paloma; Egido, Begoña Delgado; Carrasco, Miguel Á; Tello, Francisco Pablo Holgado
2013-01-01
This study's objective is to analyze temperament and parenting variables as they relate to proactive and reactive aggression in children. To be specific, profiles based on these variables were analyzed in children with high levels of proactive versus reactive aggression. The sample was made up of two groups: 482 children (52.3% boys) between 1 and 3 years-old, and 422 children (42.42% boys) 3 to 6 years-old. Statistical analyses of the two age groups included: Pearson's correlations to explore the relationships among variables, Cluster Analysis to create groups with different levels of aggression, and finally discriminant analysis to determine which variables discriminate between groups. The results show that high levels of frustration/negative affect in the 1-3 year-old group and low effortful control in children 3 to 6 years old are the most relevant variables in differentiating between aggressive and non-aggressive subjects. Nevertheless, differential profiles of subjects with high levels of proactive versus reactive aggression were not observed. The implications of these different types of aggression in terms of development and prevention are discussed.
Wang, Yulan; Tang, Huiru; Nicholson, Jeremy K; Hylands, Peter J; Sampson, J; Holmes, Elaine
2005-01-26
A metabonomic strategy, utilizing high-resolution 1H NMR spectroscopy in conjunction with chemometric methods (discriminant analysis with orthogonal signal correction), has been applied to the study of human biological responses to chamomile tea ingestion. Daily urine samples were collected from volunteers during a 6-week period incorporating a 2-week baseline period, 2 weeks of daily chamomile tea ingestion, and a 2-week post-treatment phase. Although strong intersubject variation in metabolite profiles was observed, clear differentiation between the samples obtained before and after chamomile ingestion was achieved on the basis of increased urinary excretion of hippurate and glycine with depleted creatinine concentration. Samples obtained up to 2 weeks after daily chamomile intake formed an isolated cluster in the discriminant analysis map, from which it was inferred that the metabolic effects of chamomile ingestion were prolonged during the 2-week postdosing period. This study highlights the potential for metabonomic technology in the assessment of nutritional interventions, despite the high degree of variation from genetic and environmental sources.
Zhang, Yu-Dang; Shen, Chun-Mei; Jin, Rui; Li, Ya-Ni; Wang, Bo; Ma, Li-Xia; Meng, Hao-Tian; Yan, Jiang-Wei; Wang, Hong-Dan; Yang, Ze-Long; Zhu, Bo-Feng
2015-05-01
Insertion/deletion polymorphisms have become a research hot spot in forensic science in recent years due to their tremendous potential. In the present study, we investigated 30 indel loci in a Chinese Yi ethnic group. The frequencies of the short allele of the 30 indel loci were in the range of 0.1025-0.9221. The power of discrimination values ranged from 0.2630 (HLD111 locus) to 0.6607 (HLD70 locus) and the probability of exclusion values ranged from 0.0189 (HLD111 locus) to 0.2343 (HLD56 locus). The combined power of discrimination and power of exclusion for the 30 loci in the studied Yi group were 0.99999999995713 and 0.97746, respectively, which showed tremendous potential for forensic personal identification in the Yi group. Moreover, the DA distances, phylogenetic tree, principal component analysis, and cluster analysis showed the Yi group had close genetic relationships with the Tibetan, South Korean, Chinese Han, and She groups. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
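The combined values reported above are conventionally obtained from the per-locus values by assuming the loci are independent, so that the combined power is one minus the product of the per-locus failure probabilities. The sketch below shows that calculation; the function name is hypothetical, and apart from the two extreme values quoted in the abstract, the per-locus numbers in the list are invented placeholders.

```python
from math import prod

def combined_power(per_locus_values):
    """Combine per-locus probabilities, assuming independent loci.

    Computes 1 - prod(1 - p_i), the usual form for a combined power of
    discrimination or combined power of exclusion.
    """
    return 1.0 - prod(1.0 - p for p in per_locus_values)

# 0.2630 and 0.6607 are the extremes quoted in the abstract;
# the other two values are placeholders for illustration only.
pd_values = [0.2630, 0.6607, 0.55, 0.60]
cpd = combined_power(pd_values)
print(f"combined power of discrimination over {len(pd_values)} loci: {cpd:.4f}")
```

Because each additional locus multiplies the residual probability by a factor below one, even modestly informative loci push the combined value close to 1 when 30 of them are used together.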
Discrimination of lichen genera and species using element concentrations
Bennett, J.P.
2008-01-01
The importance of organic chemistry in the classification of lichens is well established, but inorganic chemistry has been largely overlooked. Six lichen species growing in 11 protected areas of the northern Great Lakes ecoregion, which were not greatly influenced by anthropogenic particulates or gaseous air pollutants, were studied over a period of 23 years. The elemental data from these studies were aggregated in order to test the hypothesis that differences among species in tissue element concentrations were large enough to discriminate between taxa faithfully. Concentrations of 16 chemical elements that were found in tissue samples from Cladonia rangiferina, Evernia mesomorpha, Flavopunctelia flaventior, Hypogymnia physodes, Parmelia sulcata, and Punctelia rudecta were analyzed statistically using multivariate discriminant functions and CART analyses, as well as t-tests. Genera and species were clearly separated in element space, and elemental discriminant functions were able to classify 91-100% of the samples correctly into species. At the broadest level, a Zn concentration of 51 ppm in tissues of four of the lichen species effectively discriminated foliose from fruticose species. Similarly, a S concentration of 680 ppm discriminated C. rangiferina and E. mesomorpha, and a Ca concentration of 10 436 ppm discriminated H. physodes from P. sulcata. For the three parmelioid species, a Ca concentration >32 837 ppm discriminated Punctelia rudecta from the other two species, while a Zn concentration of 56 ppm discriminated Parmelia sulcata from F. flaventior. Foliose species also had higher concentrations than did fruticose species of all elements except Na. Elemental signatures for each of the six species were developed using standardized means. Twenty-four mechanisms explaining the differences among species are summarized. Finally, the relationships of four species based on element concentrations, using additive-trees clustering of a Euclidean-distance matrix, were identical to those produced by analyses based on secondary product chemistry that used additive-trees clustering of a Jaccard similarity matrix. At least for these six species, element composition has taxonomic significance, and may be useful for discriminating other taxa. © 2008 British Lichen Society.
Genetic variation in Southern USA rice genotypes for seedling salinity tolerance
De Leon, Teresa B.; Linscombe, Steven; Gregorio, Glenn; Subudhi, Prasanta K.
2015-01-01
The success of a rice breeding program in developing salt tolerant varieties depends on genetic variation and the salt stress response of adapted and donor rice germplasm. In this study, we used a combination of morphological and physiological traits in multivariate analyses to elucidate the phenotypic and genetic variation in salinity tolerance of 30 Southern USA rice genotypes, along with 19 donor genotypes with varying degrees of tolerance. Significant genotypic variation and correlations were found among the salt injury score (SIS), ion leakage, chlorophyll reduction, shoot length reduction, shoot K+ concentration, and shoot Na+/K+ ratio. Using these parameters, the combined methods of cluster analysis and discriminant analysis validated the salinity response of known genotypes and classified most of the USA varieties into sensitive groups, except for three and seven varieties placed in the tolerant and moderately tolerant groups, respectively. Discriminant function and MANOVA delineated the differences in tolerance and suggested no differences between sensitive and highly sensitive (HS) groups. DNA profiling using simple sequence repeat markers showed narrow genetic diversity among USA genotypes. However, the overall genetic clustering was mostly due to subspecies and grain type differentiation and not to varietal grouping based on salinity tolerance. Among the donor genotypes, Nona Bokra, Pokkali, and its derived breeding lines remained the donors of choice for improving salinity tolerance during the seedling stage. However, due to undesirable agronomic attributes and photosensitivity of these donors, alternative genotypes such as TCCP266, Geumgangbyeo, and R609 are recommended as useful and novel sources of salinity tolerance for USA rice breeding programs. PMID:26074937
NASA Astrophysics Data System (ADS)
Wang, Guoqing; Bu, Tong; Zako, Tamotsu; Watanabe-Tamaki, Ryoko; Tanaka, Takuo; Maeda, Mizuo
2017-09-01
Due to the potential of gold nanoparticle (AuNP)-based trace analysis, the discrimination of small AuNP clusters with different assembling stoichiometry is a subject of fundamental and technological importance. Here we prepare oligomerized AuNPs with controlled stoichiometry through DNA-directed assembly, and demonstrate that AuNP monomers, dimers and trimers can be clearly distinguished using dark field microscopy (DFM). The scattering intensity of AuNP structures with stoichiometry ranging from 1 to 3 agrees well with our theoretical calculations. This study demonstrates the potential of utilizing the DFM approach in ultra-sensitive detection as well as the use of DNA-directed assembly for plasmonic nano-architectures.
Kalkreuth, W.; Holz, M.; Kern, M.; Machado, G.; Mexias, A.; Silva, M.B.; Willett, J.; Finkelman, R.; Burger, H.
2006-01-01
Hierarchical cluster analysis identified three groups of major minerals and seven groups of trace elements based on similarity levels. On a regional scale, the coalfields can be separated by the differences in rank (Candiota and Leão-Butiá versus Santa Terezinha) and by applying discriminant analysis based on 4 trace elements (Li, As, Sr, Sb). Highest Rb and Sr values occur at Candiota and are linked to syngenetic volcanism of the area, whereas high Y and Sr values at Santa Terezinha can be related to the frequent diabase intrusions in that area.
The development and cross-validation of an MMPI typology of murderers.
Holcomb, W R; Adams, N A; Ponder, H M
1985-06-01
A sample of 80 male offenders charged with premeditated murder was divided into five personality types using MMPI scores. A hierarchical clustering procedure was used with a subsequent internal cross-validation analysis using a second sample of 80 premeditated murderers. A Discriminant Analysis resulted in a 96.25% correct classification of subjects from the second sample into the five types. Clinical data from a mental status interview schedule supported the external validity of these types. There were significant differences among the five types in hallucinations, disorientation, hostility, depression, and paranoid thinking. Both similarities and differences of the present typology with prior research were discussed. Additional research questions were suggested.
NASA Astrophysics Data System (ADS)
Zheng, Yongjun; Zheng, Kang; Li, Yantuan
2012-09-01
In order to investigate the relationship between the trace elements and the characteristics of the oysters, we analyzed the trace elements present in the germplasm of oysters from different producing areas in the Jiaozhou Bay. The element fingerprints were established to reflect the elemental characteristics of the oysters. Concentration patterns of the elements were deciphered by principal component analysis (PCA) and hierarchical cluster analysis (HCA). The six regions were accurately discriminated using HCA and PCA based on the concentration of 16 trace elements. The elements were viewed as characteristic elements of the oysters and the fingerprints of these elements could be used to distinguish the quality of the oysters.
Cui, G F; Wu, L F; Wang, X N; Jia, W J; Duan, Q; Ma, L L; Jiang, Y L; Wang, J H
2014-07-29
Inter-simple sequence repeat (ISSR) markers were used to discriminate 62 lily cultivars of 5 hybrid series. Eight ISSR primers generated 104 bands in total, which all showed 100% polymorphism, and an average of 13 bands were amplified by each primer. Two software packages, POPGENE 1.32 and NTSYSpc 2.1, were used to analyze the data matrix. Our results showed that the observed number of alleles (NA), effective number of alleles (NE), Nei's genetic diversity (H), and Shannon's information index (I) were 1.9630, 1.4179, 0.2606, and 0.4080, respectively. The highest genetic similarity (0.9601) was observed between the Oriental x Trumpet and Oriental lilies, which indicated that the two hybrids had a close genetic relationship. An unweighted pair-group method with arithmetic means dendrogram showed that the 62 lily cultivars clustered into two discrete groups. The first group included the Oriental and OT cultivars, while the Asiatic, LA, and Longiflorum lilies were placed in the second cluster. The distribution of individuals in the principal component analysis was consistent with the clustering of the dendrogram. Fingerprints of all lily cultivars built from 8 primers could be separated completely. This study confirmed the effect and efficiency of ISSR identification in lily cultivars.
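The diversity statistics named above have simple closed forms for a single locus once allele frequencies are known. The sketch below uses the standard definitions (effective number of alleles as the inverse of summed squared frequencies, Nei's gene diversity as one minus that sum, and Shannon's index with natural logarithms). It is an illustration only: ISSR markers are dominant, so the allele frequencies must first be estimated from band presence and absence, and how POPGENE performs that step is not reproduced here. The function name and the example frequencies are assumptions.

```python
from math import log

def diversity_indices(allele_freqs):
    """Return (Ne, H, I) for one locus from allele frequencies summing to 1."""
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in allele_freqs)
    ne = 1.0 / homozygosity          # effective number of alleles
    h = 1.0 - homozygosity           # Nei's gene diversity
    i = -sum(p * log(p) for p in allele_freqs if p > 0)  # Shannon's index
    return ne, h, i

# Hypothetical two-allele locus with equal frequencies (maximum diversity).
ne, h, i = diversity_indices([0.5, 0.5])
print(f"Ne={ne:.4f} H={h:.4f} I={i:.4f}")
```

Averaging such per-locus values over all scored bands gives population-level figures of the kind reported in the abstract.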
Size and shape variations of the bony components of sperm whale cochleae.
Schnitzler, Joseph G; Frédérich, Bruno; Früchtnicht, Sven; Schaffeld, Tobias; Baltzer, Johannes; Ruser, Andreas; Siebert, Ursula
2017-04-25
Several mass strandings of sperm whales occurred in the North Sea during January and February 2016. Twelve animals were necropsied and sampled around 48 h after their discovery on German coasts of Schleswig Holstein. The present study aims to explore the morphological variation of the primary sensory organ of sperm whales, the left and right auditory system, using high-resolution computerised tomography imaging. We performed a quantitative analysis of size and shape of cochleae using landmark-based geometric morphometrics to reveal inter-individual anatomical variations. A hierarchical cluster analysis based on thirty-one external morphometric characters classified these 12 individuals in two stranding clusters. A relative amount of shape variation could be attributable to geographical differences among stranding locations and clusters. Our geometric data allowed the discrimination of distinct bachelor schools among sperm whales that stranded on German coasts. We argue that the cochleae are individually shaped, varying greatly in dimensions and that the intra-specific variation observed in the morphology of the cochleae may partially reflect their affiliation to their bachelor school. There are increasing concerns about the impact of noise on cetaceans and describing the auditory periphery of odontocetes is a key conservation issue to further assess the effect of noise pollution.
Schrader, Alexandra; Meyer, Katharina; Walther, Neele; Stolz, Ailine; Feist, Maren; Hand, Elisabeth; von Bonin, Frederike; Evers, Maurits; Kohler, Christian; Shirneshan, Katayoon; Vockerodt, Martina; Klapper, Wolfram; Szczepanowski, Monika; Murray, Paul G.; Bastians, Holger; Trümper, Lorenz; Spang, Rainer; Kube, Dieter
2016-01-01
To discover new regulatory pathways in B lymphoma cells, we performed a combined analysis of experimental, clinical and global gene expression data. We identified a specific cluster of genes that was coherently expressed in primary lymphoma samples and suppressed by activation of the B cell receptor (BCR) through αIgM treatment of lymphoma cells in vitro. This gene cluster, which we called BCR.1, includes numerous cell cycle regulators. A reduced expression of BCR.1 genes after BCR activation was observed in different cell lines and also in CD10+ germinal center B cells. We found that BCR activation led to a delayed entry to and progression of mitosis and defects in metaphase. Cytogenetic changes were detected upon long-term αIgM treatment. Furthermore, an inverse correlation of BCR.1 genes with c-Myc co-regulated genes in distinct groups of lymphoma patients was observed. Finally, we showed that the BCR.1 index discriminates activated B cell-like and germinal centre B cell-like diffuse large B cell lymphoma supporting the functional relevance of this new regulatory circuit and the power of guided clustering for biomarker discovery. PMID:27166259
Warren, Janet I; South, Susan C
2009-01-01
The psychometric properties and structure of the Cluster B Personality Disorder criteria (Antisocial, Borderline, Histrionic, and Narcissistic) are examined in a sample of 261 female inmates using a self-report screen followed by a full diagnostic interview. The results of the structural analyses in this sample demonstrated good internal consistency and convergence, but poor discriminant validity between disorders. An exploratory factor analysis found that the structure of these disorders was best accounted for by a four-factor solution that paralleled the Diagnostic and Statistical Manual (DSM-IV-TR; APA, 2000) classification scheme with some significant and notable exceptions. Using the factor scores generated from the factor analysis, the personality profiles of the women were compared with several behavioral indices, including instant offense, institutional infractions, and self-report violence and victimization within the prison. Of particular importance was the consistent relationship observed between narcissistic personality traits and threatening and violent behavior within the prison combined with the impulsive but less malignant presentation of antisocial personality traits among this sample of women. Results are discussed as they inform our understanding of the structural integrity of the four Cluster B diagnostic categories and the relationship of these personality disorders to different types of criminality and violence.
Wallace, C.S.A.; Marsh, S.E.
2005-01-01
Our study used geostatistics to extract measures that characterize the spatial structure of vegetated landscapes from satellite imagery for mapping endangered Sonoran pronghorn habitat. Fine spatial resolution IKONOS data provided information at the scale of individual trees or shrubs that permitted analysis of vegetation structure and pattern. We derived images of landscape structure by calculating local estimates of the nugget, sill, and range variogram parameters within 25 × 25-m image windows. These variogram parameters, which describe the spatial autocorrelation of the 1-m image pixels, are shown in previous studies to discriminate between different species-specific vegetation associations. We constructed two independent models of pronghorn landscape preference by coupling the derived measures with Sonoran pronghorn sighting data: a distribution-based model and a cluster-based model. The distribution-based model used the descriptive statistics for variogram measures at pronghorn sightings, whereas the cluster-based model used the distribution of pronghorn sightings within clusters of an unsupervised classification of derived images. Both models define similar landscapes, and validation results confirm they effectively predict the locations of an independent set of pronghorn sightings. Such information, although not a substitute for field-based knowledge of the landscape and associated ecological processes, can provide valuable reconnaissance information to guide natural resource management efforts. © 2005 Taylor & Francis Group Ltd.
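The nugget, sill, and range mentioned above are read from a model fitted to an empirical semivariogram. As a minimal sketch of the first step only, the code below computes the empirical semivariance along an evenly spaced one-dimensional transect of pixel values; the study worked on two-dimensional image windows and fitted variogram models, neither of which is reproduced here. The function name and the sample values are hypothetical.

```python
def semivariance(values, lag):
    """Empirical semivariance of an evenly spaced 1-D transect at an integer lag.

    gamma(h) = sum((z_i - z_{i+h})^2) / (2 * N(h)), with N(h) the number of pairs.
    """
    pairs = [(values[i], values[i + lag]) for i in range(len(values) - lag)]
    if not pairs:
        raise ValueError("lag is too large for this transect")
    return sum((a - b) ** 2 for a, b in pairs) / (2 * len(pairs))

# Hypothetical pixel brightness alternating between shrub and bare soil.
transect = [0, 1, 0, 1, 0, 1, 0, 1]
for h in (1, 2, 3):
    print(h, semivariance(transect, h))
```

For this alternating pattern the semivariance is high at odd lags and zero at even lags, which is the kind of periodic structure a variogram exposes; in real imagery the curve typically rises from the nugget toward the sill over a distance called the range.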
Srůtková, Dagmar; Spanova, Alena; Spano, Miroslav; Dráb, Vladimír; Schwarzer, Martin; Kozaková, Hana; Rittich, Bohuslav
2011-10-01
Bifidobacterium longum is considered to play an important role in health maintenance of the human gastrointestinal tract. Probiotic properties of bifidobacterial isolates are strictly strain-dependent, and reliable methods for the identification and discrimination of this species at both subspecies and strain levels are thus required. Differentiation between B. longum ssp. longum and B. longum ssp. infantis is difficult due to high genomic similarities. In this study, four molecular-biological methods (species- and subspecies-specific PCRs, random amplified polymorphic DNA (RAPD) method using 5 primers, repetitive sequence-based (rep)-PCR with BOXA1R and (GTG)5 primers and amplified ribosomal DNA restriction analysis (ARDRA)) and biochemical analysis were compared for the classification of 30 B. longum strains (28 isolates and 2 collection strains) at the subspecies level. Strains originally isolated from the faeces of breast-fed healthy infants (25) and healthy adults (3) showed a high degree of genetic homogeneity by PCR with subspecies-specific primers and rep-PCR. When analysed by RAPD, the strains formed many separate clusters without any potential for subspecies discrimination. These methods together with arabinose/melezitose fermentation analysis clearly differentiated only the collection strains into B. longum ssp. longum and B. longum ssp. infantis at the subspecies level. On the other hand, ARDRA analysis differentiated the strains into the B. longum/infantis subspecies using the cleavage analysis of a genus-specific amplicon with just one enzyme, Sau3AI. According to our results, the majority of the strains (75%) belong to B. longum ssp. infantis. Therefore we suggest ARDRA using the Sau3AI restriction enzyme as the first method of choice for distinguishing between B. longum ssp. longum and B. longum ssp. infantis. Copyright © 2011 Elsevier B.V. All rights reserved.
Van Rheenen, Tamsyn E; Bryce, Shayden; Tan, Eric J; Neill, Erica; Gurvich, Caroline; Louise, Stephanie; Rossell, Susan L
2016-03-01
Despite known overlaps in the pattern of cognitive impairments in individuals with bipolar disorder (BD), schizophrenia (SZ) and schizoaffective disorder (SZA), few studies have examined the extent to which cognitive performance validates traditional diagnostic boundaries in these groups. Individuals with SZ (n=49), schizoaffective disorder (n=33) and BD (n=35) completed a battery of cognitive tests measuring the domains of processing speed, immediate memory, semantic memory, learning, working memory, executive function and sustained attention. A discriminant functions analysis revealed a significant function comprising semantic memory, immediate memory and processing speed that maximally separated patients with SZ from those with BD. Initial classification scores on the basis of this function showed modest diagnostic accuracy, owing in part to the misclassification of SZA patients as having SZ. When SZA patients were removed from the model, a second cross-validated classifier yielded slightly improved diagnostic accuracy and a single function solution, of which semantic memory loaded most heavily. A cluster of non-executive cognitive processes appears to have some validity in mapping onto traditional nosological boundaries. However, since semantic memory performance was the primary driver of the discrimination between BD and SZ, it is possible that performance differences between the disorders in this cognitive domain in particular, index separate underlying aetiologies. Copyright © 2015 Elsevier B.V. All rights reserved.
Sharp, Michael D; Kocaoglu-Vurma, Nurdan A; Langford, Vaughan; Rodriguez-Saona, Luis E; Harper, W James
2012-03-01
Vanilla beans have been shown to contain over 200 compounds, which can vary in concentration depending on the region where the beans are harvested. Several compounds including vanillin, p-hydroxybenzaldehyde, guaiacol, and anise alcohol have been found to be important for the aroma profile of vanilla. Our objective was to evaluate the performance of selected ion flow tube mass spectrometry (SIFT-MS) and Fourier-transform infrared (FTIR) spectroscopy for rapid discrimination and characterization of vanilla bean extracts. Vanilla extracts were obtained from different countries including Uganda, Indonesia, Papua New Guinea, Madagascar, and India. Multivariate data analysis (soft independent modeling of class analogy, SIMCA) was utilized to determine the clustering patterns between samples. Both methods provided differentiation between samples for all vanilla bean extracts. FTIR differentiated on the basis of functional groups, whereas the SIFT-MS method provided more specific information about the chemical basis of the differentiation. SIMCA's discriminating power showed that the most important compounds responsible for the differentiation between samples by SIFT-MS were vanillin, anise alcohol, 4-methylguaiacol, p-hydroxybenzaldehyde/trimethylpyrazine, p-cresol/anisole, guaiacol, isovaleric acid, and acetic acid. ATR-IR spectroscopy analysis showed that the classification of samples was related to major bands at 1523, 1573, 1516, 1292, 1774, 1670, 1608, and 1431 cm-1, associated with vanillin and vanillin derivatives. © 2012 Institute of Food Technologists®
Distefano, Gaetano; Caruso, Marco; La Malfa, Stefano; Gentile, Alessandra; Wu, Shu-Biao
2012-01-01
High resolution melting curve analysis (HRM) has been used as an efficient, accurate and cost-effective tool to detect single nucleotide polymorphisms (SNPs) or insertions or deletions (INDELs). However, its efficiency, accuracy and applicability to discriminate microsatellite polymorphism have not been extensively assessed. The traditional protocols used for SSR genotyping include PCR amplification of the DNA fragment and the separation of the fragments on an electrophoresis-based platform. However, post-PCR handling processes are laborious and costly. Furthermore, SNPs present in the sequences flanking the repeat motif cannot be detected by polyacrylamide-gel-electrophoresis based methods. In the present study, we compared the discriminating power of HRM with the traditional electrophoresis-based methods and provided a panel of primers for HRM genotyping in Citrus. The results showed that sixteen SSR markers produced distinct polymorphic melting curves among the Citrus spp investigated through HRM analysis. Among those, 10 showed more genotypes by HRM analysis than by capillary electrophoresis owing to the presence of SNPs in the amplicons. For the SSR markers without SNPs present in the flanking region, HRM also gave distinct melting curves which detected the same genotypes as were shown in capillary electrophoresis (CE) analysis. Moreover, HRM analysis allowed the discrimination of most of the 15 citrus genotypes and the resulting genetic distance analysis clustered them into three main branches. In conclusion, it has been shown that HRM is not only an efficient and cost-effective alternative to electrophoresis-based methods for SSR markers, but also a method to uncover more polymorphisms contributed by SNPs present in SSRs. It was therefore suggested that the panel of SSR markers could be used in a variety of applications in citrus biodiversity and breeding programs using HRM analysis. Furthermore, we speculate that HRM analysis can be employed to analyse SSR markers in a wide range of applications in all other species.
NASA Astrophysics Data System (ADS)
Zhang, Haiying; Bai, Jiaojiao; Li, Zhengjie; Liu, Yan; Liu, Kunhong
2017-06-01
The detection and discrimination of small, dim infrared targets is a challenge in automatic target recognition (ATR), because there is no salient information on size, shape and texture. Many researchers focus on mining more discriminative information about targets in the temporal-spatial domain. However, such information may not be available when the imaging environment changes, and target size and intensity keep changing with imaging distance. In this paper, we therefore propose a novel scheme using density-based clustering and a backtracking strategy. In this scheme, the speeded up robust feature (SURF) detector is first applied to capture candidate targets in single frames. These points are then mapped into one frame, so that target traces form a local aggregation pattern. In order to isolate the targets from noise, a recently proposed density-based clustering algorithm, fast search and find of density peaks (FSFDP for short), is employed to cluster targets by their dense spatial distribution. Two important factors of the algorithm, percent and γ, are fully exploited to determine the clustering scale automatically, so as to extract the trace with the highest clutter suppression ratio. In the final step, a backtracking algorithm is designed to detect and discriminate the target trace as well as to eliminate clutter. The consistency and continuity of the short-time target trajectory in the temporal-spatial domain are incorporated into the bounding function to speed up the pruning. Compared with several state-of-the-art methods, our algorithm is more effective for dim targets with lower signal-to-clutter ratio (SCR). Furthermore, it avoids constructing the candidate target trajectory search space, so its time complexity is limited to a polynomial level. The extensive experimental results show that it has superior performance in probability of detection (Pd) and false alarm suppression rate across a variety of complex backgrounds.
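The density-peak step named in the abstract above can be illustrated with a minimal sketch. It assumes the commonly cited density-peak formulation, in which each point gets a local density rho (the number of neighbours within a cutoff distance), a separation delta (the distance to the nearest point of higher density), and a score gamma = rho * delta. The function name `density_peaks`, the fixed cutoff `d_c` (passed in directly rather than derived from a percent parameter), and the toy points are assumptions for illustration, not the authors' implementation.

```python
import math


def density_peaks(points, d_c):
    """Return (rho, delta, gamma) for each 2-D point.

    rho   : number of other points closer than the cutoff d_c
    delta : distance to the nearest point with strictly higher rho
            (for a point with no denser neighbour, its largest distance)
    gamma : rho * delta; large values suggest cluster centres
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(higher) if higher else max(dist[i]))
    gamma = [r * d for r, d in zip(rho, delta)]
    return rho, delta, gamma


# Hypothetical candidate detections: two compact groups of points.
candidates = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1),
              (10, 10), (11, 10), (10, 11), (9, 10)]
rho, delta, gamma = density_peaks(candidates, d_c=1.2)
centres = sorted(range(len(candidates)), key=lambda i: -gamma[i])[:2]
```

In this toy case the two highest gamma values fall on the middle point of each group, which is how density peaks are usually picked as cluster centres.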
Inductively coupled plasma mass spectrometer with axial field in a quadrupole reaction cell.
Bandura, Dmitry R; Baranov, Vladimir I; Tanner, Scott D
2002-10-01
A novel reaction cell for ICP-MS with an electric field provided inside the quadrupole along its axis is described. The field is implemented via a DC bias applied to additional auxiliary electrodes inserted between the rods of the quadrupole. The field reduces the settling time of the pressurized quadrupole when its mass bandpass is dynamically tuned. It also improves the transmission of analyte ions. It is shown that for the pressurized cell with the field activated, the recovery time for a change in quadrupole operating parameters is reduced to <4 ms, which allows fast tuning of the mass bandpass in concert with and at the speed of the analyzing quadrupole. When the cell is operated with ammonia, the field reduces ion-ammonia cluster formation, further enhancing the transmission of atomic ions that have a high cluster formation rate. Ni x (NH3)n+ cluster formation in a cell operated with a wide bandpass (i.e., Ni+ precursors are stable in the cell) is shown to be dependent on the axial field strength. Clusters at n = 2-4 can be suppressed by 9, 1200, and >610 times, respectively. The use of a retarding axial field for in-situ energy discrimination against cluster and polyatomic ions is shown. When the cell is pressurized with O2 for suppression of 129Xe+, the formation of 127IH2+ by reactions with gas impurities limits the detection of 129I to isotopic abundance of approximately 10^-6. In-cell energy discrimination against 127IH2+ utilizing a retarding axial field is shown to reduce the abundance of the background at m/z = 129 to ca. 3 × 10^-8 of the 127I+ signal. In-cell energy discrimination against 127IH2+ is shown to cause less I+ loss than a post-cell potential energy barrier for the same degree of 127IH2+ suppression.
NASA Astrophysics Data System (ADS)
Ives Torres-Silva, Ana; Eder, Wolfgang; Hohenegger, Johann; Briguglio, Antonino
2017-04-01
No other larger benthic foraminifera (LBF) group in the Caribbean realm has led to such diverse opinions and controversy about its classification as the nummulitids. Unlike the Tethys species, where delimitation and details of evolutionary changes within species are well known, intraspecific evolution in the Caribbean remains understudied and generic nomenclature has not yet reached consensus. Morphometric studies appear to be the most appropriate methods for resolving this unsatisfactory taxonomic situation. For every proposed species, morphological variations correlating with paleoecological factors and precise stratigraphic occurrence and range have to be studied in detail. Thus, the morphology in equatorial sections of nummulitids without chamber partitions was quantified at seven localities from Western and Central Cuba and interpreted by eleven growth-independent and/or growth-invariant characters and attributes. 102 isolated megalospheric individuals originating from Cuban localities, spanning the time interval from lower Middle Eocene to lower Oligocene, were classified by nonmetric multidimensional scaling and cluster analysis. Thirteen Caribbean specimens, which are considered as type material, were included. Two clearly separated morphogroups could be differentiated according to cluster and ordination analysis into the genera Nummulites and Palaeonummulites. Main differences in morphological characters between the morphogroups were confirmed by discriminant analysis. Nummulites differs from Palaeonummulites in a weak increase of the marginal radius and weak backbend angles. All specimens of Nummulites s.stricto from different localities were regarded as Nummulites striatoreticulatus. Based on discriminant analysis, N. striatoreticulatus specimens with similar depositional environments, but of different stratigraphic occurrence, are strongly separated.
The older forms have a smaller backbend angle, perimeter ratio and proloculus nominal diameter, thus documenting stratigraphic and evolutionary trends. The species Nummulites striatoreticulatus in the Cuban sections ranges from lower middle Eocene to lower Priabonian. Within the Palaeonummulites group, the exceptional range of morphological variation tends to obscure the fact that there are several well-defined morphological species. Based on discriminant analysis the species P. willcoxi, P. trinitatensis, P. floridensis, P. ocalanus and P. soldadensis were classified, ranging from tightly coiled individuals that are very similar to Nummulites to loosely coiled morphotypes. Major separators between the species are the marginal radius, proloculus nominal diameter, spiral chamber height increase and the length of the first chamber. Stratigraphic trends within species were not clearly detectable, but paleogeographic differences and the morphological overlap between morphogroups in certain species are obvious. Palaeonummulites species have long stratigraphic ranges from late Middle Eocene to probably lower Oligocene.
Spike sorting based upon machine learning algorithms (SOMA).
Horton, P M; Nicol, A U; Kendrick, K M; Feng, J F
2007-02-15
We have developed a spike sorting method, using a combination of various machine learning algorithms, to analyse electrophysiological data and automatically determine the number of sampled neurons from an individual electrode, and discriminate their activities. We discuss extensions to a standard unsupervised learning algorithm (Kohonen), as using a simple application of this technique would only identify a known number of clusters. Our extra techniques automatically identify the number of clusters within the dataset, and their sizes, thereby reducing the chance of misclassification. We also discuss a new pre-processing technique, which transforms the data into a higher dimensional feature space revealing separable clusters. Using principal component analysis (PCA) alone may not achieve this. Our new approach appends the features acquired using PCA with features describing the geometric shapes that constitute a spike waveform. To validate our new spike sorting approach, we have applied it to multi-electrode array datasets acquired from the rat olfactory bulb and from the sheep infero-temporal cortex, and to simulated data. The SOMA software is available at http://www.sussex.ac.uk/Users/pmh20/spikes.
Metabolic Response to XD14 Treatment in Human Breast Cancer Cell Line MCF-7
Pan, Daqiang; Kather, Michel; Willmann, Lucas; Schlimpert, Manuel; Bauer, Christoph; Lagies, Simon; Schmidtkunz, Karin; Eisenhardt, Steffen U.; Jung, Manfred; Günther, Stefan; Kammerer, Bernd
2016-01-01
XD14 is a 4-acyl pyrrole derivative, which was discovered by a high-throughput virtual screening experiment. XD14 inhibits bromodomain and extra-terminal domain (BET) proteins (BRD2, BRD3, BRD4 and BRDT) and consequently suppresses cell proliferation. In this study, metabolic profiling reveals the molecular effects in the human breast cancer cell line MCF-7 (Michigan Cancer Foundation-7) treated with XD14. A three-day time series experiment with two concentrations of XD14 was performed. Gas chromatography-mass spectrometry (GC-MS) was applied for untargeted profiling of treated and non-treated MCF-7 cells. The resulting data sets were evaluated by several statistical methods: analysis of variance (ANOVA), clustering analysis, principal component analysis (PCA), and partial least squares discriminant analysis (PLS-DA). Cell proliferation was strongly inhibited by treatment with 50 µM XD14. Samples could be discriminated by time and XD14 concentration using PLS-DA. From the 117 identified metabolites, 67 were significantly altered after XD14 treatment. These metabolites include amino acids, fatty acids, Krebs cycle and glycolysis intermediates, as well as compounds of purine and pyrimidine metabolism. This massive intervention in energy metabolism and the lack of available nucleotides could explain the decreased proliferation rate of the cancer cells. PMID:27783056
Zeng, Yanling; Lu, Yang; Chen, Zhao; Tan, Jiawei; Bai, Jie; Li, Pengyue; Wang, Zhixin; Du, Shouying
2018-05-11
Bolbostemma paniculatum is a traditional Chinese medicine (TCM) that shows various therapeutic effects. Owing to its complex chemical composition, few investigations have acquired a comprehensive understanding of the chemical profiles of this herb or explained the differences between samples collected from different places. In this study, a strategy based on UPLC tandem LTQ-Orbitrap MSn was established for characterizing chemical components of B. paniculatum. Through a systematic identification strategy, a total of 60 components in B. paniculatum were rapidly separated in 30 min and identified. Then, based on peak intensities of all the characterized components, principal component analysis (PCA) and hierarchical cluster analysis (HCA) were employed to classify 18 batches of B. paniculatum into four groups, which were highly consistent with the four climate types of their places of origin. Five compounds were finally screened out as chemical markers to discriminate the internal quality of B. paniculatum. As the first study to systematically characterize the chemical components of B. paniculatum by UPLC-MSn, the above results could offer essential data for its pharmacological research. The current strategy could also provide a useful reference for future investigations on the discovery of important chemical constituents in TCM, as well as the establishment of quality control and evaluation methods.
NASA Astrophysics Data System (ADS)
Wu, Xia; Zheng, Kang; Zhao, Fengjia; Zheng, Yongjun; Li, Yantuan
2014-08-01
Meretricis concha is a kind of marine traditional Chinese medicine (TCM), and has been commonly used for the treatment of asthma and scald burns. In order to investigate the relationship between the inorganic elemental fingerprint and the geographical origin identification of Meretricis concha, the elemental contents of M. concha from five sampling points in Rushan Bay have been determined by means of inductively coupled plasma optical emission spectrometry (ICP-OES). Based on the contents of 14 inorganic elements (Al, As, Cd, Co, Cr, Cu, Fe, Hg, Mn, Mo, Ni, Pb, Se, and Zn), the inorganic elemental fingerprint which well reflects the elemental characteristics was constructed. All the data from the five sampling points were accurately discriminated through hierarchical cluster analysis (HCA) and principal component analysis (PCA), indicating that a four-factor model which could explain approximately 80% of the detection data was established, and the elements Al, As, Cd, Cu, Ni and Pb could be viewed as the characteristic elements. This investigation suggests that the inorganic elemental fingerprint combined with multivariate statistical analysis is a promising method for verifying the geographical origin of M. concha, and this strategy should be valuable for the authenticity discrimination of some marine TCM.
Li, Yan; Zhang, Ji; Jin, Hang; Liu, Honggao; Wang, Yuanzhong
2016-08-05
A quality assessment system comprised of a tandem technique of ultraviolet (UV) spectroscopy and ultra-fast liquid chromatography (UFLC) aided by multivariate analysis was presented for the determination of geographic origin of Wolfiporia extensa collected from five regions in Yunnan Province of China. Characteristic UV spectroscopic fingerprints of samples were determined based on its methanol extract. UFLC was applied for the determination of pachymic acid (a biomarker) presented in individual test samples. The spectrum data matrix and the content of pachymic acid were integrated and analyzed by partial least squares discriminant analysis (PLS-DA) and hierarchical cluster analysis (HCA). The results showed that chemical properties of samples were clearly dominated by the epidermis and inner part as well as geographical origins. The relationships among samples obtained from these five regions have been also presented. Moreover, an interesting finding implied that geographical origins had much greater influence on the chemical properties of epidermis compared with that of the inner part. This study demonstrated that a rapid tool for accurate discrimination of W. extensa by UV spectroscopy and UFLC could be available for quality control of complicated medicinal mushrooms. Copyright © 2016 Elsevier B.V. All rights reserved.
Liu, Xuemei; Gu, Zhixin; Guo, Yuan; Liu, Jingjing; Ma, Ming; Chen, Bo; Wang, Liping
2017-04-15
Paper spray-mass spectrometry (PS-MS) is a rapid, solvent-efficient, and high-throughput analytical method for analyzing complex samples. In this study, a PS-MS method was developed to obtain MS profiles of Aurantii Fructus Immaturus (aka Zhishi in Chinese) in positive and negative ion modes. In combination with multivariate analyses, including principal component analysis and cluster analysis, the PS-MS profiles of 25 batches of Zhishi were discriminated from 25 batches of Citri Reticulatae Pericarpium Viride (aka Qingpi in Chinese; an adulterant of Zhishi). Moreover, a rapid quantitative analysis of synephrine, a prescriptive quality control component of Zhishi listed in the Chinese Pharmacopoeia, was conducted with PS-MS using synephrine-d2 as an internal standard (IS). The linearity range was 1.68-16.8 μg/mL (R² = 0.9985), and the limit of quantitation was 0.5 μg/mL. Relative standard deviations in the intra- and inter-day precision of the MS were 4.87 and 4.90%, respectively. Compared with HPLC results, there was no significant difference in the quantitation of synephrine. This study demonstrated that the PS-MS method is useful for the rapid discrimination and quality control of Zhishi samples. Copyright © 2017 Elsevier B.V. All rights reserved.
Event Networks and the Identification of Crime Pattern Motifs
2015-01-01
In this paper we demonstrate the use of network analysis to characterise patterns of clustering in spatio-temporal events. Such clustering is of both theoretical and practical importance in the study of crime, and forms the basis for a number of preventative strategies. However, existing analytical methods show only that clustering is present in data, while offering little insight into the nature of the patterns present. Here, we show how the classification of pairs of events as close in space and time can be used to define a network, thereby generalising previous approaches. The application of graph-theoretic techniques to these networks can then offer significantly deeper insight into the structure of the data than previously possible. In particular, we focus on the identification of network motifs, which have clear interpretation in terms of spatio-temporal behaviour. Statistical analysis is complicated by the nature of the underlying data, and we provide a method by which appropriate randomised graphs can be generated. Two datasets are used as case studies: maritime piracy at the global scale, and residential burglary in an urban area. In both cases, the same significant 3-vertex motif is found; this result suggests that incidents tend to occur not just in pairs, but in fact in larger groups within a restricted spatio-temporal domain. In the 4-vertex case, different motifs are found to be significant in each case, suggesting that this technique is capable of discriminating between clustering patterns at a finer granularity than previously possible. PMID:26605544
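The network construction described in the abstract above can be sketched in a few lines. This is only an illustration under stated assumptions: events are hypothetical (x, y, t) tuples, closeness is defined by two illustrative thresholds (`max_dist`, `max_lag`), edges are treated as undirected, and the only motif counted is the triangle, which is one possible 3-vertex pattern and not necessarily the motif reported in the paper. The randomised-graph significance test is not reproduced.

```python
import math
from itertools import combinations


def build_event_network(events, max_dist, max_lag):
    """Link every pair of events that is close in both space and time.

    events is a list of (x, y, t) tuples; the result is a set of
    index pairs (i, j) with i < j.
    """
    edges = set()
    for i, j in combinations(range(len(events)), 2):
        x1, y1, t1 = events[i]
        x2, y2, t2 = events[j]
        near_in_space = math.hypot(x1 - x2, y1 - y2) <= max_dist
        near_in_time = abs(t1 - t2) <= max_lag
        if near_in_space and near_in_time:
            edges.add((i, j))
    return edges


def count_triangles(n_events, edges):
    """Count fully connected triples of events (a simple 3-vertex motif)."""
    return sum(
        1
        for a, b, c in combinations(range(n_events), 3)
        if (a, b) in edges and (a, c) in edges and (b, c) in edges
    )


# Hypothetical incidents: three close together, one far away.
incidents = [(0, 0, 0), (1, 0, 1), (0, 1, 2), (50, 50, 1)]
network = build_event_network(incidents, max_dist=2, max_lag=3)
n_triangles = count_triangles(len(incidents), network)
```

A motif count like this only becomes meaningful when compared against counts from suitably randomised networks, which is the step the abstract highlights as non-trivial.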
Stracy, Mathew; Lesterlin, Christian; Garza de Leon, Federico; Uphoff, Stephan; Zawadzki, Pawel; Kapanidis, Achillefs N.
2015-01-01
Despite the fundamental importance of transcription, a comprehensive analysis of RNA polymerase (RNAP) behavior and its role in the nucleoid organization in vivo is lacking. Here, we used superresolution microscopy to study the localization and dynamics of the transcription machinery and DNA in live bacterial cells, at both the single-molecule and the population level. We used photoactivated single-molecule tracking to discriminate between mobile RNAPs and RNAPs specifically bound to DNA, either on promoters or transcribed genes. Mobile RNAPs can explore the whole nucleoid while searching for promoters, and spend 85% of their search time in nonspecific interactions with DNA. On the other hand, the distribution of specifically bound RNAPs shows that low levels of transcription can occur throughout the nucleoid. Further, clustering analysis and 3D structured illumination microscopy (SIM) show that dense clusters of transcribing RNAPs form almost exclusively at the nucleoid periphery. Treatment with rifampicin shows that active transcription is necessary for maintaining this spatial organization. In faster growth conditions, the fraction of transcribing RNAPs increases, as well as their clustering. Under these conditions, we observed dramatic phase separation between the densest clusters of RNAPs and the densest regions of the nucleoid. These findings show that transcription can cause spatial reorganization of the nucleoid, with movement of gene loci out of the bulk of DNA as levels of transcription increase. This work provides a global view of the organization of RNA polymerase and transcription in living cells. PMID:26224838
Goutaudier, N; Chauchard, E; Melioli, T; Valls, M; van Leeuwen, N; Chabrol, H
2015-09-01
The aim of the study was to explore the typology of adolescents with an immigrant background based on acculturation orientations and to assess the psychosocial adaptation of the various subtypes. A sample of 228 French high school students with an immigrant background completed a questionnaire assessing acculturation orientations (Immigrant Acculturation Scale; Barrette et al., 2004), antisocial behaviors, depressive symptoms and self-esteem. Cluster analysis based on acculturation orientations was performed using the k-means method. Cluster analysis produced four distinct acculturation profiles: bicultural (31%), separated (28%), marginalized (21%), and assimilated-individualistic (20%). Adolescents in the separated and marginalized clusters, both characterized by rejection of the host culture, reported higher levels of antisocial behavior. Depressive symptoms and self-esteem did not differ between clusters. Several hypotheses may explain the association between separation and delinquency. First, separation and rejection of the host culture may lead to rebellious behavior such as delinquency. Conversely, delinquent behavior may provoke rejection or discrimination by peers or school, or legal sanctions that induce a reciprocal process of rejection of the host culture and separation. The relationship between separation and antisocial behavior may be bidirectional, each one reinforcing the other, resulting in a negative spiral. This study confirms the value of studying acculturation orientations for understanding the antisocial behavior of adolescents with an immigrant background. Copyright © 2014 L’Encéphale, Paris. Published by Elsevier Masson SAS. All rights reserved.
Metabolomic analysis of primary metabolites in citrus leaf during defense responses.
Asai, Tomonori; Matsukawa, Tetsuya; Kajiyama, Shin'ichiro
2017-03-01
Mechanical damage is one of the unavoidable environmental stresses to plant growth and development. Plants induce a variety of reactions which defend against natural enemies and/or heal the wounded sites. Jasmonic acid (JA) and salicylic acid (SA), defense-related plant hormones, are well known to be involved in induction of defense reactions and play important roles as signal molecules. However, defense-related metabolites are so numerous and diverse that the roles of individual compounds are still to be elucidated. In this report, we carried out a comprehensive analysis of metabolic changes during wound response in citrus plants, which are one of the most commercially important fruit tree families. Changes in amino acid, sugar, and organic acid profiles in leaves were surveyed after wounding, JA and SA treatments using gas chromatography-mass spectrometry (GC/MS) in seven citrus species, Citrus sinensis, Citrus limon, Citrus paradisi, Citrus unshiu, Citrus kinokuni, Citrus grandis, and Citrus hassaku. GC/MS data were applied to multivariate analyses including hierarchical cluster analysis (HCA), principal component analysis (PCA), and orthogonal partial least squares-discriminant analysis (OPLS-DA) to extract stress-related compounds. HCA showed an amino acid cluster including phenylalanine and tryptophan, suggesting that amino acids in this cluster are concertedly regulated during responses against treatments. OPLS-DA showed that tryptophan was accumulated after wounding and JA treatments in all species tested, while serine was down-regulated. Our results suggest that tryptophan and serine are common biomarker candidates in citrus plants for wound stress. Copyright © 2016 The Society for Biotechnology, Japan. Published by Elsevier B.V. All rights reserved.
ERIC Educational Resources Information Center
Snellings, Patrick; van der Leij, Aryan; Blok, Henk; de Jong, Peter F.
2010-01-01
This study investigated the role of speech perception accuracy and speed in fluent word decoding of reading disabled (RD) children. A same-different phoneme discrimination task with natural speech tested the perception of single consonants and consonant clusters by young but persistent RD children. RD children were slower than chronological age…
A superior edge preserving filter with a systematic analysis
NASA Technical Reports Server (NTRS)
Holladay, Kenneth W.; Rickman, Doug
1991-01-01
A new, adaptive, edge preserving filter for use in image processing is presented. It had superior performance when compared to other filters. Termed the contiguous K-average, it aggregates pixels by examining all pixels contiguous to an existing cluster and adding the pixel closest to the mean of the existing cluster. The process is iterated until K pixels are accumulated. Rather than simply compare the visual results of processing with this operator to other filters, some approaches were developed which allow quantitative evaluation of how well a filter performs. Particular attention is given to the standard deviation of noise within a feature and the stability of imagery under iterative processing. Demonstrations illustrate the performance of several filters to discriminate against noise and retain edges, the effect of filtering as a preprocessing step, and the utility of the contiguous K-average filter when used with remote sensing data.
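The region-growing rule in the abstract above can be illustrated for a single pixel. This is a minimal sketch, not the authors' implementation: it assumes 4-connectivity for "contiguous", assumes the filter output is the mean of the K accumulated pixels, and breaks ties by pixel coordinate. The function name `contiguous_k_average` and the toy image are hypothetical.

```python
def contiguous_k_average(image, row, col, k):
    """Grow a K-pixel cluster from (row, col) and return its mean.

    At each step the candidate pixels are those 4-connected to the
    current cluster; the one whose value is closest to the cluster
    mean is added. Growth stops at k pixels or when no candidates remain.
    """
    n_rows, n_cols = len(image), len(image[0])
    cluster = {(row, col)}
    total = float(image[row][col])
    frontier = set()

    def add_neighbours(r, c):
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < n_rows and 0 <= nc < n_cols
            if inside and (nr, nc) not in cluster:
                frontier.add((nr, nc))

    add_neighbours(row, col)
    while len(cluster) < k and frontier:
        mean = total / len(cluster)
        # Closest value to the current mean; coordinates break ties.
        best = min(frontier, key=lambda p: (abs(image[p[0]][p[1]] - mean), p))
        frontier.remove(best)
        cluster.add(best)
        total += image[best[0]][best[1]]
        add_neighbours(*best)
    return total / len(cluster)


# Toy image with a vertical edge between dark (about 10) and bright (90).
toy = [
    [10, 10, 90],
    [10, 12, 90],
    [10, 10, 90],
]
smoothed_centre = contiguous_k_average(toy, 1, 1, k=4)
```

For the centre pixel the cluster grows only into the dark side, so the result stays near 10, whereas a plain 3×3 mean would be pulled toward the bright column and blur the edge.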
Patient clusters in acute, work-related back pain based on patterns of disability risk factors.
Shaw, William S; Pransky, Glenn; Patterson, William; Linton, Steven J; Winters, Thomas
2007-02-01
To identify subgroups of patients with work-related back pain based on disability risk factors. Patients with work-related back pain (N = 528) completed a 16-item questionnaire of potential disability risk factors before their initial medical evaluation. Outcomes of pain, functional limitation, and work disability were assessed 1 and 3 months later. A K-Means cluster analysis of 5 disability risk factors (pain, depressed mood, fear avoidant beliefs, work inflexibility, and poor expectations for recovery) resulted in 4 sub-groups: low risk (n = 182); emotional distress (n = 103); severe pain/fear avoidant (n = 102); and concerns about job accommodation (n = 141). Pain and disability outcomes at follow-up were superior in the low-risk group and poorest in the severe pain/fear avoidant group. Patients with acute back pain can be discriminated into subgroups depending on whether disability is related to pain beliefs, emotional distress, or workplace concerns.
Model-Free Reconstruction of Excitatory Neuronal Connectivity from Calcium Imaging Signals
Stetter, Olav; Battaglia, Demian; Soriano, Jordi; Geisel, Theo
2012-01-01
A systematic assessment of global neural network connectivity through direct electrophysiological assays has remained technically infeasible, even in simpler systems like dissociated neuronal cultures. We introduce an improved algorithmic approach based on Transfer Entropy to reconstruct structural connectivity from network activity monitored through calcium imaging. We focus in this study on the inference of excitatory synaptic links. Based on information theory, our method requires no prior assumptions on the statistics of neuronal firing and neuronal connections. The performance of our algorithm is benchmarked on surrogate time series of calcium fluorescence generated by the simulated dynamics of a network with known ground-truth topology. We find that the functional network topology revealed by Transfer Entropy depends qualitatively on the time-dependent dynamic state of the network (bursting or non-bursting). Thus by conditioning with respect to the global mean activity, we improve the performance of our method. This allows us to focus the analysis to specific dynamical regimes of the network in which the inferred functional connectivity is shaped by monosynaptic excitatory connections, rather than by collective synchrony. Our method can discriminate between actual causal influences between neurons and spurious non-causal correlations due to light scattering artifacts, which inherently affect the quality of fluorescence imaging. Compared to other reconstruction strategies such as cross-correlation or Granger Causality methods, our method based on improved Transfer Entropy is remarkably more accurate. In particular, it provides a good estimation of the excitatory network clustering coefficient, allowing for discrimination between weakly and strongly clustered topologies. 
Finally, we demonstrate the applicability of our method to analyses of real recordings of in vitro disinhibited cortical cultures where we suggest that excitatory connections are characterized by an elevated level of clustering compared to a random graph (although not extreme) and can be markedly non-local. PMID:22927808
Monitoring Fatigue Status with HRV Measures in Elite Athletes: An Avenue Beyond RMSSD?
Schmitt, Laurent; Regnard, Jacques; Millet, Grégoire P
2015-01-01
Among the tools proposed to assess the athlete's "fatigue," the analysis of heart rate variability (HRV) provides an indirect evaluation of the settings of autonomic control of heart activity. HRV analysis is performed through assessment of time-domain indices; the square root of the mean of the squared differences between adjacent normal R-R intervals (RMSSD), measured during short (5 min) recordings in supine position upon awakening in the morning, and particularly its logarithm (LnRMSSD), has been proposed as the most useful resting HRV indicator. However, although RMSSD can help the practitioner to identify a global "fatigue" level, it does not allow discrimination between different types of fatigue. Recent results using spectral HRV analysis highlighted firstly that HRV profiles assessed in supine and standing positions are independent and complementary, and secondly that using these postural profiles allows the clustering of distinct sub-categories of "fatigue." Since cardiovascular control settings are different in standing and lying postures, using the HRV figures of both postures to cluster fatigue states embeds information on the dynamics of control responses. Thus, HRV spectral analysis appears more sensitive and enlightening than time-domain HRV indices. The richer information provided by this spectral analysis should improve the monitoring of the adaptive training-recovery process in athletes.
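The time-domain index defined in the abstract above is simple enough to compute directly. The sketch below assumes the input is a list of already cleaned normal R-R intervals in milliseconds; the function names and the sample values are hypothetical, and artifact removal is not shown.

```python
import math


def rmssd(rr_intervals_ms):
    """Root mean square of successive differences between R-R intervals."""
    diffs = [b - a for a, b in zip(rr_intervals_ms, rr_intervals_ms[1:])]
    if not diffs:
        raise ValueError("at least two R-R intervals are required")
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def ln_rmssd(rr_intervals_ms):
    """Natural logarithm of RMSSD (LnRMSSD)."""
    return math.log(rmssd(rr_intervals_ms))


# Hypothetical short recording: successive differences are 10, -20, 10 ms.
rr = [800, 810, 790, 800]
value = rmssd(rr)        # sqrt((100 + 400 + 100) / 3) = sqrt(200)
log_value = ln_rmssd(rr)
```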
Joint spatial-spectral hyperspectral image clustering using block-diagonal amplified affinity matrix
NASA Astrophysics Data System (ADS)
Fan, Lei; Messinger, David W.
2018-03-01
The large number of spectral channels in a hyperspectral image (HSI) produces a fine spectral resolution to differentiate between materials in a scene. However, difficult classes that have similar spectral signatures are often confused while merely exploiting information in the spectral domain. Therefore, in addition to spectral characteristics, the spatial relationships inherent in HSIs should also be considered for incorporation into classifiers. The growing availability of high spectral and spatial resolution of remote sensors provides rich information for image clustering. Besides the discriminating power in the rich spectrum, contextual information can be extracted from the spatial domain, such as the size and the shape of the structure to which one pixel belongs. In recent years, spectral clustering has gained popularity compared to other clustering methods due to the difficulty of accurate statistical modeling of data in high dimensional space. The joint spatial-spectral information could be effectively incorporated into the proximity graph for spectral clustering approach, which provides a better data representation by discovering the inherent lower dimensionality from the input space. We embedded both spectral and spatial information into our proposed local density adaptive affinity matrix, which is able to handle multiscale data by automatically selecting the scale of analysis for every pixel according to its neighborhood of the correlated pixels. Furthermore, we explored the "conductivity method," which aims at amplifying the block diagonal structure of the affinity matrix to further improve the performance of spectral clustering on HSI datasets.
NASA Astrophysics Data System (ADS)
Xu, M. L.; Yu, Y.; Ramaswamy, H. S.; Zhu, S. M.
2017-01-01
Chinese liquor aroma components were characterized during the aging process using gas chromatography (GC). Principal component analysis (PCA) and cluster analysis (CA) were used to discriminate the age of Chinese liquor, which has great economic value. Of a total of 21 major aroma components identified and quantified, 13 components, which included several acids, alcohols, esters, aldehydes and furans, decreased significantly in the first year of aging, maintained the same levels (p > 0.05) for the next three years and decreased again (p < 0.05) in the fifth year. In contrast, a significant increase was observed in propionic acid, furfural and phenylethanol. Ethyl lactate was found to be the most stable aroma component during the aging process. Results of PCA and CA demonstrated that young (fresh) liquor and aged liquors were well separated from each other, which is consistent with the evolution of aroma components during the aging process. These findings provide a quantitative basis for discriminating the age of Chinese liquor, a scientific basis for further research on elucidating the liquor aging process, and a possible tool to guard against counterfeit and defective products.
Takamura, Ayari; Watanabe, Ken; Akutsu, Tomoko; Ozawa, Takeaki
2018-05-31
Body fluid (BF) identification is a critical part of a criminal investigation because of its ability to suggest how the crime was committed and to provide reliable origins of DNA. In contrast to current methods using serological and biochemical techniques, vibrational spectroscopic approaches provide alternative advantages for forensic BF identification, such as non-destructivity and versatility for various BF types and analytical interests. However, unexplored issues remain for its practical application to forensics; for example, a specific BF needs to be discriminated from all other suspicious materials as well as other BFs, and the method should be applicable even to aged BF samples. Herein, we describe an innovative modeling method for discriminating the ATR FT-IR spectra of various BFs, including peripheral blood, saliva, semen, urine and sweat, to meet the practical demands described above. Spectra from unexpected non-BF samples were efficiently excluded as outliers by adopting the Q-statistics technique. The robustness of the models against aged BFs was significantly improved by using the discrimination scheme of a dichotomous classification tree with hierarchical clustering. The present study advances the use of vibrational spectroscopy and a chemometric strategy for forensic BF identification.
White-spot Lesions and Gingivitis Microbiotas in Orthodontic Patients
Tanner, A.C.R.; Sonis, A.L.; Lif Holgerson, P.; Starr, J.R.; Nunez, Y.; Kressirer, C.A.; Paster, B.J.; Johansson, I.
2012-01-01
White-spot lesions (WSL) associated with orthodontic appliances are a cosmetic problem and increase risk for cavities. We characterized the microbiota of WSL, accounting for confounding due to gingivitis. Participants were 60 children with fixed appliances, aged between 10 and 19 yrs, half with WSL. Plaque samples were assayed by a 16S rRNA-based microarray (HOMIM) and by PCR. Mean gingival index was positively associated with WSL (p = 0.018). Taxa associated with WSL by microarray included Granulicatella elegans (p = 0.01), Veillonellaceae sp. HOT 155 (p < 0.01), and Bifidobacterium Cluster 1 (p = 0.11), and by qPCR, Streptococcus mutans (p = 0.008) and Scardovia wiggsiae (p = 0.04). Taxa associated with gingivitis by microarray included: Gemella sanguinis (p = 0.002), Actinomyces sp. HOT 448 (p = 0.003), Prevotella cluster IV (p = 0.021), and Streptococcus sp. HOT 071/070 (p = 0.023); and levels of S. mutans (p = 0.02) and Bifidobacteriaceae (p = 0.012) by qPCR. Species’ associations with WSL were minimally changed with adjustment for gingivitis level. Partial least-squares discriminant analysis yielded good discrimination between children with and those without WSL. Granulicatella, Veillonellaceae and Bifidobacteriaceae, in addition to S. mutans and S. wiggsiae, were associated with the presence of WSL in adolescents undergoing orthodontic treatment. Many taxa showed a stronger association with gingivitis than with WSL. PMID:22837552
White-spot lesions and gingivitis microbiotas in orthodontic patients.
Tanner, A C R; Sonis, A L; Lif Holgerson, P; Starr, J R; Nunez, Y; Kressirer, C A; Paster, B J; Johansson, I
2012-09-01
White-spot lesions (WSL) associated with orthodontic appliances are a cosmetic problem and increase risk for cavities. We characterized the microbiota of WSL, accounting for confounding due to gingivitis. Participants were 60 children with fixed appliances, aged between 10 and 19 yrs, half with WSL. Plaque samples were assayed by a 16S rRNA-based microarray (HOMIM) and by PCR. Mean gingival index was positively associated with WSL (p = 0.018). Taxa associated with WSL by microarray included Granulicatella elegans (p = 0.01), Veillonellaceae sp. HOT 155 (p < 0.01), and Bifidobacterium Cluster 1 (p = 0.11), and by qPCR, Streptococcus mutans (p = 0.008) and Scardovia wiggsiae (p = 0.04). Taxa associated with gingivitis by microarray included: Gemella sanguinis (p = 0.002), Actinomyces sp. HOT 448 (p = 0.003), Prevotella cluster IV (p = 0.021), and Streptococcus sp. HOT 071/070 (p = 0.023); and levels of S. mutans (p = 0.02) and Bifidobacteriaceae (p = 0.012) by qPCR. Species' associations with WSL were minimally changed with adjustment for gingivitis level. Partial least-squares discriminant analysis yielded good discrimination between children with and those without WSL. Granulicatella, Veillonellaceae and Bifidobacteriaceae, in addition to S. mutans and S. wiggsiae, were associated with the presence of WSL in adolescents undergoing orthodontic treatment. Many taxa showed a stronger association with gingivitis than with WSL.
A new physical performance classification system for elite handball players: cluster analysis
Chirosa, Ignacio J.; Robinson, Joseph E.; van der Tillaar, Roland; Chirosa, Luis J.; Martín, Isidoro Martínez
2016-01-01
The aim of the present study was to identify different cluster groups of handball players according to their physical performance level assessed in a series of physical assessments, which could then be used to design a training program based on individual strengths and weaknesses, and to determine which of these variables best identified elite performance in a group of under-19 [U19] national level handball players. Players of the U19 National Handball team (n=16) performed a set of tests to determine: 10 m (ST10) and 20 m (ST20) sprint time, ball release velocity (BRv), countermovement jump (CMJ) height and squat jump (SJ) height. All players also performed an incremental-load bench press test to determine the 1 repetition maximum (1RMest), the load corresponding to maximum mean power (LoadMP), the mean propulsive phase power at LoadMP (PMPPMP) and the peak power at LoadMP (PPEAKMP). Cluster analyses of the test results generated four groupings of players. The variables best able to discriminate physical performance were BRv, ST20, 1RMest, PPEAKMP and PMPPMP. These variables could help coaches identify talent or monitor the physical performance of athletes in their team. Each cluster of players has a particular weakness related to physical performance and therefore, the cluster results can be applied to a specific training program based on individual needs. PMID:28149376
Impact of missing data imputation methods on gene expression clustering and classification.
de Souto, Marcilio C P; Jaskowiak, Pablo A; Costa, Ivan G
2015-02-26
Several missing value imputation methods for gene expression data have been proposed in the literature. In the past few years, researchers have been putting a great deal of effort into presenting systematic evaluations of the different imputation algorithms. Initially, most algorithms were assessed with an emphasis on the accuracy of the imputation, using metrics such as the root mean squared error. However, it has become clear that the success of the estimation of the expression value should be evaluated in more practical terms as well. One can consider, for example, the ability of the method to preserve the significant genes in the dataset, or its discriminative/predictive power for classification/clustering purposes. We performed a broad analysis of the impact of five well-known missing value imputation methods on three clustering and four classification methods, in the context of 12 cancer gene expression datasets. We employed a statistical framework, for the first time in this field, to assess whether different imputation methods improve the performance of the clustering/classification methods. Our results suggest that the imputation methods evaluated have a minor impact on the classification and downstream clustering analyses. Simple methods such as replacing the missing values by mean or the median values performed as well as more complex strategies. The datasets analyzed in this study are available at http://costalab.org/Imputation/ .
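The simple baseline mentioned above, replacing missing values by the mean or median, can be sketched as follows. This is an illustrative sketch rather than the study's code: the function name, the use of None for a missing value, and the choice to impute per row (one row per gene) are assumptions.

```python
from statistics import mean, median

def impute(matrix, strategy="mean"):
    """Replace None entries in each row with that row's mean or median.

    Assumes each row holds one gene's expression values and contains at
    least one observed value (hypothetical layout, not from the study).
    """
    fill = mean if strategy == "mean" else median
    result = []
    for row in matrix:
        observed = [v for v in row if v is not None]
        value = fill(observed)  # statistic over the observed values only
        result.append([value if v is None else v for v in row])
    return result

# One gene measured in four samples, with the second sample missing
expression = [[1.0, None, 3.0, 8.0]]
mean_filled = impute(expression, "mean")
median_filled = impute(expression, "median")
```

Here the observed values are 1.0, 3.0 and 8.0, so the gap is filled with 4.0 under the mean strategy and 3.0 under the median strategy.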
Arnaud-Haond, Sophie; Moalic, Yann; Barnabé, Christian; Ayala, Francisco José; Tibayrenc, Michel
2014-01-01
Micropathogens (viruses, bacteria, fungi, parasitic protozoa) share a common trait, which is partial clonality, with wide variance in the respective influence of clonality and sexual recombination on the dynamics and evolution of taxa. The discrimination of distinct lineages and the reconstruction of their phylogenetic history are key information to infer their biomedical properties. However, the phylogenetic picture is often clouded by occasional events of recombination across divergent lineages, limiting the relevance of classical phylogenetic analysis and dichotomic trees. We have applied a network analysis based on graph theory to illustrate the relationships among genotypes of Trypanosoma cruzi, the parasitic protozoan responsible for Chagas disease, to identify major lineages and to unravel their past history of divergence and possible recombination events. At the scale of T. cruzi subspecific diversity, graph theory-based networks applied to 22 isoenzyme loci (262 distinct Multi-Locus-Enzyme-Electrophoresis -MLEE) and 19 microsatellite loci (66 Multi-Locus-Genotypes -MLG) fully confirm the high clustering of genotypes into major lineages or "near-clades". The release of the dichotomic constraint associated with phylogenetic reconstruction usually applied to multilocus data allows identification of putative hybrids and their parental lineages. Reticulate topology suggests a slightly different history for some of the main "near-clades", and a possibly more complex origin for the putative hybrids than hitherto proposed. Finally, the sub-network of the near-clade T. cruzi I (28 MLG) shows a clustering subdivision into three differentiated lesser near-clades ("Russian doll pattern"), which confirms the hypothesis recently proposed by other investigators. The present study broadens and clarifies the hypotheses previously obtained from classical markers on the same sets of data, which demonstrates the added value of this approach.
This underlines the potential of graph theory-based network analysis for describing the nature and relationships of major pathogens, thereby opening stimulating prospects to unravel the organization, dynamics and history of major micropathogen lineages.
Analysis of PETT images in psychiatric disorders
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brodie, J.D.; Gomez-Mont, F.; Volkow, N.D.
1983-01-01
A quantitative method is presented for studying the pattern of metabolic activity in a set of Positron Emission Transaxial Tomography (PETT) images. Using complex Fourier coefficients as a feature vector for each image, cluster, principal components, and discriminant function analyses are used to empirically describe metabolic differences between control subjects and patients with DSM III diagnosis for schizophrenia or endogenous depression. We also present data on the effects of neuroleptic treatment on the local cerebral metabolic rate of glucose utilization (LCMRGI) in a group of chronic schizophrenics using the region of interest approach. 15 references, 4 figures, 3 tables.
NASA Astrophysics Data System (ADS)
Goodacre, Royston; Rooney, Paul J.; Kell, Douglas B.
1998-04-01
FTIR spectra were obtained from 15 methicillin-resistant and 22 methicillin-susceptible Staphylococcus aureus strains using our DRASTIC approach. Cluster analysis showed that the major source of variation between the IR spectra was not due to their resistance or susceptibility to methicillin; indeed early studies using pyrolysis mass spectrometry had shown that this unsupervised analysis gave information on the phage group of the bacteria. By contrast, artificial neural networks, based on supervised learning, could be trained to recognize those aspects of the IR spectra which differentiated methicillin-resistant from methicillin-susceptible strains. These results give the first demonstration that the combination of FTIR with neural networks can provide a very rapid and accurate antibiotic susceptibility testing technique.
NASA Astrophysics Data System (ADS)
Malik, Riffat Naseem; Hashmi, Muhammad Zaffar
2017-10-01
Streams in the Himalayan foothills of Pakistan play an important role in water supply for daily living and irrigation of farmlands; thus, their water quality is closely related to public health. Multivariate techniques were applied to examine spatial and seasonal trends and sources of metal contamination in these streams. Grab surface water samples were collected from different sites (5-15 cm water depth) in pre-washed polyethylene containers. A Fast Sequential Atomic Absorption Spectrophotometer (Varian FSAA-240) was used to measure metal concentrations. Concentrations of Ni, Cu, and Mn were higher in the pre-monsoon season than in the post-monsoon season. Cluster analysis identified impaired, moderately impaired and least impaired clusters based on water parameters. Discriminant function analysis indicated that spatial variability in water quality was due to temperature, electrical conductivity, nitrates, iron and lead, whereas seasonal variations were correlated with 16 physicochemical parameters. Factor analysis identified municipal and poultry waste, automobile activities, surface runoff, and soil weathering as major sources of contamination. Levels of Mn, Cr, Fe, Pb, Cd, Zn and alkalinity were above the WHO and USEPA standards for surface water. The results of the present study will help higher authorities manage the Himalayan foothills streams.
Anti-inflammatory drugs and prediction of new structures by comparative analysis.
Bartzatt, Ronald
2012-01-01
Nonsteroidal anti-inflammatory drugs (NSAIDs) are a group of agents important for their analgesic, anti-inflammatory, and antipyretic properties. This study presents several approaches to predict and elucidate new molecular structures of NSAIDs based on 36 known and proven anti-inflammatory compounds. Based on 36 known NSAIDs, the mean value of Log P is found to be 3.338 (standard deviation = 1.237), the mean value of polar surface area is 63.176 Å² (standard deviation = 20.951 Å²), and the mean value of molecular weight is 292.665 (standard deviation = 55.627). Nine molecular properties are determined for these 36 NSAID agents, including Log P, number of -OH and -NHn, violations of Rule of 5, number of rotatable bonds, and number of oxygens and nitrogens. Statistical analysis of these nine molecular properties provides numerical parameters to conform to in the design of novel NSAID drug candidates. Multiple regression analysis is accomplished using these properties of 36 agents, followed by examples of predicted molecular weight based on minimum and maximum property values. Hierarchical cluster analysis indicated that licofelone, tolfenamic acid, meclofenamic acid, droxicam, and aspirin are substantially distinct from all remaining NSAIDs. Analysis of similarity (ANOSIM) produced R = 0.4947, which indicates a low to moderate level of dissimilarity between these 36 NSAIDs. Non-hierarchical K-means cluster analysis separated the 36 NSAIDs into four groups having members of greatest similarity. Likewise, discriminant analysis divided the 36 agents into two groups indicating the greatest level of distinction (discrimination) based on nine properties. These two multivariate methods together provide investigators a means to compare novel drug designs with 36 proven compounds and ascertain to which of those they are most analogous in pharmacodynamics.
In addition, artificial neural network modeling is demonstrated as an approach to predict numerous molecular properties of new drug designs that is based on neural training from 36 proven NSAIDs. Comprehensive and effective approaches are presented in this study for the design of new NSAID type agents which are so very important for inhibition of COX-2 and COX-1 isoenzymes.
Does the 1H-NMR plasma metabolome reflect the host-tumor interactions in human breast cancer?
Richard, Vincent; Conotte, Raphaël; Mayne, David; Colet, Jean-Marie
2017-07-25
Breast cancer (BC) is the most commonly diagnosed cancer and the leading cause of cancer death in women worldwide. There is an obvious need for a better understanding of BC biology. Alterations in the serum metabolome of BC patients have been identified but their clinical significance remains elusive. We evaluated, by 1H-Nuclear Magnetic Resonance (1H-NMR) spectroscopy, the filtered plasma metabolome of 50 early (EBC) and 15 metastatic BC (MBC) patients. Using Principal Component Analysis, Partial Least-Squares Discriminant Analysis and Hierarchical Clustering we show that plasma levels of glucose, lactate, pyruvate, alanine, leucine, isoleucine, glutamate, glutamine, valine, lysine, glycine, threonine, tyrosine, phenylalanine, acetate, acetoacetate, β-hydroxy-butyrate, urea, creatine and creatinine are modulated across patient clusters. In particular, lactate levels are inversely correlated with the tumor size in the EBC cohort (Pearson correlation r = -0.309; p = 0.044). We suggest that, in BC patients, tumor cells could induce modulation of the whole patient's metabolism even at early stages. If confirmed in a larger study, these observations could be of clinical importance.
Adhikari, S; Biswas, A; Bandyopadhyay, T K; Ghosh, P D
2014-06-01
Pointed gourd (Trichosanthes dioica Roxb.) is an economically important cucurbit and is extensively propagated through vegetative means, viz., vine and root cuttings. As the accessions are poorly characterized, it is important at the beginning of a breeding programme to discriminate among available genotypes to establish the level of genetic diversity. The genetic diversity of 10 pointed gourd races, referred to as accessions, was evaluated. DNA profiles were generated using 10 sequence-independent RAPD markers. A total of 58 scorable loci were observed, of which 18 (31.03%) were considered polymorphic. Genetic diversity parameters [average and effective number of alleles, Shannon's index, percent polymorphism, Nei's gene diversity, polymorphic information content (PIC)] for RAPD along with UPGMA clustering based on Jaccard's coefficient were estimated. In the UPGMA dendrogram constructed from the RAPD analysis, the 10 pointed gourd accessions grouped in a single cluster and may represent members of one heterotic group. RAPD analysis showed promise as an effective tool in estimating genetic polymorphism in different accessions of pointed gourd.
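The percent polymorphism reported above (18 of 58 loci) and the Jaccard coefficient used for the clustering can be checked with a short sketch. The band profiles below are invented for illustration and are not data from the study; the function names and the convention of returning 1.0 when neither profile has any band are assumptions.

```python
def percent_polymorphic(polymorphic_loci, total_loci):
    """Share of scored loci that are polymorphic, in percent."""
    return 100.0 * polymorphic_loci / total_loci

def jaccard(profile_a, profile_b):
    """Jaccard similarity of two 0/1 band profiles.

    Shared absences (0, 0) are ignored: similarity is the number of bands
    present in both divided by the number present in at least one.
    """
    pairs = list(zip(profile_a, profile_b))
    shared = sum(1 for x, y in pairs if x == 1 and y == 1)
    either = sum(1 for x, y in pairs if x == 1 or y == 1)
    return shared / either if either else 1.0  # assumed convention for empty profiles

polymorphism = round(percent_polymorphic(18, 58), 2)

# Hypothetical presence/absence scores at four loci for two accessions
similarity = jaccard([1, 1, 0, 1], [1, 0, 0, 1])
```

The first call reproduces the 31.03% figure from the abstract. In the toy profiles, two bands are shared and three are present in at least one accession, giving a similarity of 2/3.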
Korkmaz, Selcuk; Zararsiz, Gokmen; Goksuluk, Dincer
2015-01-01
Virtual screening is an important step in the early phase of the drug discovery process. Since there are thousands of compounds, this step should be both fast and effective in order to distinguish drug-like and nondrug-like molecules. Statistical machine learning methods are widely used in drug discovery studies for classification purposes. Here, we aim to develop a new tool that can classify molecules as drug-like or nondrug-like based on various machine learning methods, including discriminant, tree-based, kernel-based, ensemble and other algorithms. To construct this tool, the performances of twenty-three different machine learning algorithms were first compared using ten different measures; the ten best-performing algorithms were then selected based on principal component and hierarchical cluster analysis results. Besides classification, this application can also create heat maps and dendrograms for visual inspection of the molecules through hierarchical cluster analysis. Moreover, users can connect to the PubChem database to download molecular information and to create two-dimensional structures of compounds. This application is freely available through www.biosoft.hacettepe.edu.tr/MLViS/. PMID:25928885
Healthcare managers' leadership profiles in relation to perceptions of work stressors and stress.
Lornudd, Caroline; Bergman, David; Sandahl, Christer; von Thiele Schwarz, Ulrica
2016-05-03
Purpose: The purpose of this study is to investigate the relationship between leadership profiles and differences in managers' own levels of work stress symptoms and perceptions of work stressors causing stress. Design/methodology/approach: Cross-sectional data were used. Healthcare managers (n = 188) rated three dimensions of their leadership behavior and levels of work stressors and stress. Hierarchical cluster analysis was performed to identify leadership profiles based on leadership behaviors. Differences in stress-related outcomes between profiles were assessed using one-way analysis of variance. Findings: Four distinct clusters of leadership profiles were found. They differed in perception of work stressors and stress: the profile distinguished by the lowest mean in all behavior dimensions exhibited a pattern with significantly more negative ratings compared to the other profiles. Practical implications: This paper proposes that leadership profile is an individual factor involved in the stress process, including work stressors and stress, which may inform targeted health-promoting interventions for healthcare managers. Originality/value: This is the first study to investigate the relationship between leadership profiles and work stressors and stress in healthcare managers.
Comprehensive Biothreat Cluster Identification by PCR/Electrospray-Ionization Mass Spectrometry
Sampath, Rangarajan; Mulholland, Niveen; Blyn, Lawrence B.; Massire, Christian; Whitehouse, Chris A.; Waybright, Nicole; Harter, Courtney; Bogan, Joseph; Miranda, Mary Sue; Smith, David; Baldwin, Carson; Wolcott, Mark; Norwood, David; Kreft, Rachael; Frinder, Mark; Lovari, Robert; Yasuda, Irene; Matthews, Heather; Toleno, Donna; Housley, Roberta; Duncan, David; Li, Feng; Warren, Robin; Eshoo, Mark W.; Hall, Thomas A.; Hofstadler, Steven A.; Ecker, David J.
2012-01-01
Technology for comprehensive identification of biothreats in environmental and clinical specimens is needed to protect citizens in the case of a biological attack. This is a challenge because there are dozens of bacterial and viral species that might be used in a biological attack and many have closely related near-neighbor organisms that are harmless. The biothreat agent, along with its near neighbors, can be thought of as a biothreat cluster or a biocluster for short. The ability to comprehensively detect the important biothreat clusters with resolution sufficient to distinguish the near neighbors with an extremely low false positive rate is required. A technological solution to this problem can be achieved by coupling biothreat group-specific PCR with electrospray ionization mass spectrometry (PCR/ESI-MS). The biothreat assay described here detects ten bacterial and four viral biothreat clusters on the NIAID priority pathogen and HHS/USDA select agent lists. Detection of each of the biothreat clusters was validated by analysis of a broad collection of biothreat organisms and near neighbors prepared by spiking biothreat nucleic acids into nucleic acids extracted from filtered environmental air. Analytical experiments were carried out to determine breadth of coverage, limits of detection, linearity, sensitivity, and specificity. Further, the assay breadth was demonstrated by testing a diverse collection of organisms from each biothreat cluster. The biothreat assay as configured was able to detect all the target organism clusters and did not misidentify any of the near-neighbor organisms as threats. Coupling biothreat cluster-specific PCR to electrospray ionization mass spectrometry simultaneously provides the breadth of coverage, discrimination of near neighbors, and an extremely low false positive rate due to the requirement that an amplicon with a precise base composition of a biothreat agent be detected by mass spectrometry. PMID:22768032
Wang, Quan; Wu, Xianhua; Zhao, Bin; Qin, Jie; Peng, Tingchun
2015-01-01
Understanding spatial and temporal variations in river water quality and quantitatively evaluating the trend of changes are important in order to study and efficiently manage water resources. In this study, an analysis of Water Pollution Index (WPI), Daniel Trend Test, Cluster Analysis and Discriminant Analysis are applied as an integrated approach to quantitatively explore the spatial and temporal variations and the latent sources of water pollution in the Shanchong River basin, Northwest Basin of Lake Fuxian, China. We group all field surveys into 2 clusters (dry season and rainy season). Moreover, 14 sampling sites have been grouped into 3 clusters for the rainy season (highly polluted, moderately polluted and less polluted sites) and 2 clusters for the dry season (highly polluted and less polluted sites) based on their similarities and the level of pollution during the two seasons. The results show that the main trend of pollution was aggravated during the transition from the dry to the rainy season. The Water Pollution Index of Total Nitrogen is the highest of all pollution parameters, whereas the Chemical Oxygen Demand (Chromium) is the lowest. Our results also show that the main sources of pollution are farming activities alongside the Shanchong River, soil erosion and fish culture at the Shanchong River reservoir area, and domestic sewage from scattered rural residential areas. Our results suggest that strategies to prevent water pollution at the Shanchong River basin need to focus on non-point pollution control by employing appropriate fertilizer formulas in farming, taking soil and water conservation measures at the Shanchong reservoir area, and purifying sewage from scattered villages. PMID:25837673
X-Ray Detection of the Cluster Containing the Cepheid S Mus
NASA Astrophysics Data System (ADS)
Evans, Nancy Remage; Pillitteri, Ignazio; Wolk, Scott; Guinan, Edward; Engle, Scott; Bond, Howard E.; Schaefer, Gail H.; Karovska, Margarita; DePasquale, Joseph; Tingle, Evan
2014-04-01
The galactic Cepheid S Muscae has recently been added to the important list of Cepheids linked to open clusters, in this case the sparse young cluster ASCC 69. Low-mass members of a young cluster are expected to have rapid rotation and X-ray activity, making X-ray emission an excellent way to discriminate them from old field stars. We have made an XMM-Newton observation centered on S Mus and identified a population of X-ray sources whose near-IR Two Micron All Sky Survey counterparts lie at locations in the J, (J - K) color-magnitude diagram consistent with cluster membership at the distance of S Mus. Their median energy and X-ray luminosity are consistent with young cluster members as distinct from field stars. These strengthen the association of S Mus with the young cluster, making it a potential Leavitt law (period-luminosity relation) calibrator.
Self-descriptions on LinkedIn: Recruitment or friendship identity?
Garcia, Danilo; Cloninger, Kevin M; Granjard, Alexandre; Molander-Söderholm, Kristian; Amato, Clara; Sikström, Sverker
2018-04-26
We used quantitative semantics to find clusters of words in LinkedIn users' self-descriptions to an employer or a friend. Some of these clusters discriminated between worker and friend conditions (e.g., flexible vs. caring) and between LinkedIn users with high and low education (e.g., analytical vs. messy). © 2018 The Institute of Psychology, Chinese Academy of Sciences and John Wiley & Sons Australia, Ltd.
Li, Jie; Zhang, Ji; Zuo, Zhitian; Huang, Hengyu; Wang, Yuanzhong
2018-05-09
Background: Swertia nervosa (Wall. ex G. Don) C. B. Clarke, a promising traditional herbal medicine for the treatment of liver disorders, is endangered due to its extensive collection and unsustainable harvesting practices. Objective: The aim of this study is to discuss the diversity of metabolites (loganic acid, sweroside, swertiamarin, and gentiopicroside) at different growth stages and in different organs of Swertia nervosa using ultra-high-performance LC (UPLC)/UV coupled with chemometric methods. Methods: UPLC data, UV data, and data fusion were treated separately to find more useful information by partial least-squares discriminant analysis (PLS-DA). Hierarchical cluster analysis (HCA), an unsupervised method, was then employed for validating the results from PLS-DA. Results: Three strategies displayed different chemical information associated with the sample discrimination. UV information mainly contributed to the classification of different organs; UPLC information was prominently responsible for both organs and growth periods; the data fusion did not perform with apparent superiority compared with single data analysis, although it provided useful information to differentiate leaves that could not be recognized by UPLC. The quantification result showed that the content of swertiamarin was the highest compared with the other three metabolites, especially in leaves at the rooted stage (19.57 ± 5.34 mg/g). Therefore, we speculated that interactive transformations occurred among these four metabolites, facilitated by root formation. Conclusions: This work will contribute to exploitation of bioactive compounds of S. nervosa, as well as its large-scale propagation. Highlights: Root formation may influence the distribution and accumulation of metabolites.
Jin, Qing; Jiao, Chunyan; Sun, Shiwei; Song, Cheng; Cai, Yongping; Lin, Yi; Fan, Honghong; Zhu, Yanfang
2016-01-01
Metabolomics technology has enabled an important method for the identification and quality control of Traditional Chinese Medical materials. In this study, we isolated metabolites from cultivated Dendrobium officinale and Dendrobium huoshanense stems of different growth years in the methanol/water phase and identified them using gas chromatography coupled with mass spectrometry (GC-MS). First, a metabolomics technology platform for Dendrobium was constructed. The metabolites in the Dendrobium methanol/water phase were mainly sugars and glycosides, amino acids, organic acids, and alcohols. D. officinale and D. huoshanense and their growth years were distinguished by cluster analysis in combination with multivariate statistical analysis, including principal component analysis (PCA) and orthogonal partial least squares discriminant analysis (OPLS-DA). Eleven metabolites that contributed significantly to this differentiation were subjected to t-tests (P<0.05) to identify biomarkers that discriminate between D. officinale and D. huoshanense, including sucrose, glucose, galactose, succinate, fructose, hexadecanoate, oleanitrile, myo-inositol, and glycerol. Metabolic profiling of the chemical compositions of Dendrobium species revealed that the polysaccharide content of D. huoshanense was higher than that of D. officinale, indicating that D. huoshanense was of higher quality. Based on the accumulation of Dendrobium metabolites, the optimal harvest time for Dendrobium was in the third year. This initial metabolic profiling platform for Dendrobium provides an important foundation for the further study of secondary metabolites (pharmaceutical active ingredients) and metabolic pathways. PMID:26752292
Jenson, David; Bowers, Andrew L.; Harkrider, Ashley W.; Thornton, David; Cuellar, Megan; Saltuklaroglu, Tim
2014-01-01
Activity in anterior sensorimotor regions is found in speech production and some perception tasks. Yet, how sensorimotor integration supports these functions is unclear due to a lack of data examining the timing of activity from these regions. Beta (~20 Hz) and alpha (~10 Hz) spectral power within the EEG μ rhythm are considered indices of motor and somatosensory activity, respectively. In the current study, perception conditions required discrimination (same/different) of syllables pairs (/ba/ and /da/) in quiet and noisy conditions. Production conditions required covert and overt syllable productions and overt word production. Independent component analysis was performed on EEG data obtained during these conditions to (1) identify clusters of μ components common to all conditions and (2) examine real-time event-related spectral perturbations (ERSP) within alpha and beta bands. 17 and 15 out of 20 participants produced left and right μ-components, respectively, localized to precentral gyri. Discrimination conditions were characterized by significant (pFDR < 0.05) early alpha event-related synchronization (ERS) prior to and during stimulus presentation and later alpha event-related desynchronization (ERD) following stimulus offset. Beta ERD began early and gained strength across time. Differences were found between quiet and noisy discrimination conditions. Both overt syllable and word productions yielded similar alpha/beta ERD that began prior to production and was strongest during muscle activity. Findings during covert production were weaker than during overt production. One explanation for these findings is that μ-beta ERD indexes early predictive coding (e.g., internal modeling) and/or overt and covert attentional/motor processes. μ-alpha ERS may index inhibitory input to the premotor cortex from sensory regions prior to and during discrimination, while μ-alpha ERD may index sensory feedback during speech rehearsal and production. PMID:25071633
Nakamura, Sayaka; Sato, Hiroaki; Tanaka, Reiko; Kusuya, Yoko; Takahashi, Hiroki; Yaguchi, Takashi
2017-04-26
Accurate identification of Aspergillus species is a very important subject. Mass spectral fingerprinting using matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS) is generally employed for the rapid identification of fungal isolates. However, the results are based on simple mass spectral pattern-matching, with no peak assignment and no taxonomic input. We propose here a ribosomal subunit protein (RSP) typing technique using MALDI-TOF MS for the identification and discrimination of Aspergillus species. The results are concluded to be phylogenetic in that they reflect the molecular evolution of housekeeping RSPs. The amino acid sequences of RSPs of genome-sequenced strains of Aspergillus species were first verified and compared to compile a reliable biomarker list for the identification of Aspergillus species. In this process, we revealed that many amino acid sequences of RSPs (about 10-60%, depending on strain) registered in the public protein databases needed to be corrected or newly added. The verified RSPs were allocated to RSP types based on their mass. Peak assignments of RSPs of each sample strain as observed by MALDI-TOF MS were then performed to set RSP type profiles, which were then further processed by means of cluster analysis. The resulting dendrogram based on RSP types showed a relatively good concordance with the tree based on β-tubulin gene sequences. RSP typing was able to further discriminate the strains belonging to Aspergillus section Fumigati. The RSP typing method could be applied to identify Aspergillus species, even for species within section Fumigati. The discrimination power of RSP typing appears to be comparable to conventional β-tubulin gene analysis. This method would therefore be suitable for species identification and discrimination at the strain to species level. 
Because RSP typing can characterize the strains within section Fumigati, this method has potential as a powerful and reliable tool in the field of clinical microbiology.
Angel, Roey; Nepel, Maximilian; Panhölzl, Christopher; Schmidt, Hannes; Herbold, Craig W.; Eichorst, Stephanie A.; Woebken, Dagmar
2018-01-01
Diazotrophic microorganisms introduce biologically available nitrogen (N) to the global N cycle through the activity of the nitrogenase enzyme. The genetically conserved dinitrogenase reductase (nifH) gene is phylogenetically distributed across four clusters (I–IV) and is widely used as a marker gene for N2 fixation, permitting investigators to study the genetic diversity of diazotrophs in nature and target potential participants in N2 fixation. To date there have been limited, standardized pipelines for analyzing the nifH functional gene, which is in stark contrast to the 16S rRNA gene. Here we present a bioinformatics pipeline for processing nifH amplicon datasets – NifMAP (“NifH MiSeq Illumina Amplicon Analysis Pipeline”), which as a novel aspect uses Hidden-Markov Models to filter out homologous genes to nifH. By using this pipeline, we evaluated the broadly inclusive primer pairs (Ueda19F–R6, IGK3–DVV, and F2–R6) that target the nifH gene. To evaluate any systematic biases, the nifH gene was amplified with the aforementioned primer pairs in a diverse collection of environmental samples (soils, rhizosphere and roots samples, biological soil crusts and estuarine samples), in addition to a nifH mock community consisting of six phylogenetically diverse members. We noted that all primer pairs co-amplified nifH homologs to varying degrees; up to 90% of the amplicons were nifH homologs with IGK3–DVV in some samples (rhizosphere and roots from tall oat-grass). In regards to specificity, we observed some degree of bias across the primer pairs. For example, primer pair F2–R6 discriminated against cyanobacteria (amongst others), yet captured many sequences from subclusters IIIE and IIIL-N. These aforementioned subclusters were largely missing by the primer pair IGK3–DVV, which also tended to discriminate against Alphaproteobacteria, but amplified sequences within clusters IIIC (affiliated with Clostridia) and clusters IVB and IVC. 
Primer pair Ueda19F–R6 exhibited the least bias and successfully captured diazotrophs in cluster I and subclusters IIIE, IIIL, IIIM, and IIIN, but tended to discriminate against Firmicutes and subcluster IIIC. Taken together, our newly established bioinformatics pipeline, NifMAP, along with our systematic evaluations of nifH primer pairs permit more robust, high-throughput investigations of diazotrophs in diverse environments. PMID:29760683
Khamis, Fathiya M.; Masiga, Daniel K.; Mohamed, Samira A.; Salifu, Daisy; de Meyer, Marc; Ekesi, Sunday
2012-01-01
In 2003, a new fruit fly pest species was recorded for the first time in Kenya and has subsequently been found in 28 countries across tropical Africa. The insect was described as Bactrocera invadens, due to its rapid invasion of the African continent. In this study, the morphometry and DNA Barcoding of different populations of B. invadens distributed across the species range of tropical Africa and a sample from the pest's putative aboriginal home of Sri Lanka was investigated. Morphometry using wing veins and tibia length was used to separate B. invadens populations from other closely related Bactrocera species. The Principal component analysis yielded 15 components which correspond to the 15 morphometric measurements. The first two principal axes contributed to 90.7% of the total variance and showed partial separation of these populations. Canonical discriminant analysis indicated that only the first five canonical variates were statistically significant. The first two canonical variates contributed a total of 80.9% of the total variance clustering B. invadens with other members of the B. dorsalis complex while distinctly separating B. correcta, B. cucurbitae, B. oleae and B. zonata. The largest Mahalanobis squared distance (D2 = 122.9) was found to be between B. cucurbitae and B. zonata, while the lowest was observed between B. invadens populations against B. kandiensis (8.1) and against B. dorsalis s.s (11.4). Evolutionary history inferred by the Neighbor-Joining method clustered the Bactrocera species populations into four clusters. First cluster consisted of the B. dorsalis complex (B. invadens, B. kandiensis and B. dorsalis s. s.), branching from the same node while the second group was paraphyletic clades of B. correcta and B. zonata. The last two are monophyletic clades, consisting of B. cucurbitae and B. oleae, respectively. Principal component analysis using the genetic distances confirmed the clustering inferred by the NJ tree. PMID:23028649
Gad, Haidy A; El-Ahmady, Sherweit H; Abou-Shoer, Mohamed I; Al-Azizi, Mohamed M
2013-01-01
Recently, the fields of chemometrics and multivariate analysis have been widely implemented in the quality control of herbal drugs to produce precise results, which is crucial in the field of medicine. Thyme represents an essential medicinal herb that is constantly adulterated due to its resemblance to many other plants with similar organoleptic properties. To establish a simple model for the quality assessment of Thymus species using UV spectroscopy together with known chemometric techniques. The success of this model may also serve as a technique for the quality control of other herbal drugs. The model was constructed using 30 samples of authenticated Thymus vulgaris and challenged with 20 samples of different botanical origins. The methanolic extracts of all samples were assessed using UV spectroscopy together with chemometric techniques: principal component analysis (PCA), soft independent modeling of class analogy (SIMCA) and hierarchical cluster analysis (HCA). The model was able to discriminate T. vulgaris from other Thymus, Satureja, Origanum, Plectranthus and Eriocephalus species, all traded in the Egyptian market as different types of thyme. The model was also able to classify closely related species in clusters using PCA and HCA. The model was finally used to classify 12 commercial thyme varieties into clusters of species incorporated in the model as thyme or non-thyme. The model constructed is highly recommended as a simple and efficient method for distinguishing T. vulgaris from other related species as well as the classification of marketed herbs as thyme or non-thyme. Copyright © 2013 John Wiley & Sons, Ltd.
Colorimetric sensing of anions in water using ratiometric indicator-displacement assay.
Feng, Liang; Li, Hui; Li, Xiao; Chen, Liang; Shen, Zheng; Guan, Yafeng
2012-09-19
The analysis of anions in water presents a difficult challenge due to their low charge-to-radius ratio, and the ability to discriminate among similar anions often remains problematic. The use of a 3×6 ratiometric indicator-displacement assay (RIDA) array for the colorimetric detection and identification of ten anions in water is reported. The sensor array consists of different combinations of colorimetric indicators and metal cations. The colorimetric indicators chelate with metal cations, forming the color changes. Upon the addition of anions, anions compete with the indicator ligands according to solubility product constants (K(sp)). The indicator-metal chelate compound changes color back dramatically when the competition of anions wins. The color changes of the RIDA array were used as a digital representation of the array response and analyzed with standard statistical methods, including principal component analysis and hierarchical clustering analysis. No confusion or errors in classification by hierarchical clustering analysis were observed in 44 trials. The limit of detection was calculated approximately, and most limits of detections of anions are well below μM level using our RIDA array. The pH effect, temperature influence, interfering anions were also investigated, and the RIDA array shows the feasibility of real sample testing. Copyright © 2012 Elsevier B.V. All rights reserved.
An exploratory study of organization design configurations in health care delivery organizations.
Sheppeck, Mick; Militello, Jack
2014-01-01
Organizations are configurations of variables that support each other to achieve customer satisfaction. Based on Treacy and Wiersema (1995), we predicted the emergence of two configurations, one supporting a product leadership stance and one predicting the customer intimate approach from a set of 73 for profit health care clinics. In addition, we predicted the emergence of a configuration where the scores on most variables were near the mean for each variable. Using cluster analysis and discriminant function analysis, we identified three configurations: one a "master of two" strategy, one "stuck-in-the-middle," and one showing scores well below the mean on most variables. The implications for organization design and manager actions in the health care industry are discussed.
Kmeans-ICA based automatic method for ocular artifacts removal in a motor imagery classification.
Bou Assi, Elie; Rihana, Sandy; Sawan, Mohamad
2014-01-01
Electroencephalogram (EEG) recordings are used as inputs of motor imagery based BCI systems. Eye blinks contaminate the spectral content of the EEG signals. Independent Component Analysis (ICA) has already proven effective for removing these artifacts, whose frequency band overlaps with that of the EEG of interest. However, existing ICA-based methods use a reference lead such as the electrooculogram (EOG) to identify the ocular artifact components. In this study, artifactual components were identified using adaptive thresholding by means of K-means clustering. The denoised EEG signals were fed into a feature extraction algorithm that extracts the band power, the coherence, and the phase locking value, and these features were passed to a linear discriminant analysis classifier for motor imagery classification.
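The adaptive thresholding step described above can be illustrated with a minimal sketch. It assumes each independent component has already been reduced to one scalar score (the abstract does not say which feature was used, so `scores` is a hypothetical input), and it uses a plain two-cluster K-means in one dimension rather than the authors' actual implementation.

```python
def two_means_threshold(scores, iterations=50):
    """Cluster 1-D scores into two groups with K-means; return the midpoint of the two centroids."""
    lo, hi = min(scores), max(scores)
    if lo == hi:
        raise ValueError("all scores are identical; nothing to separate")
    for _ in range(iterations):
        # Assign every score to the nearer of the two centroids.
        low_group = [s for s in scores if abs(s - lo) <= abs(s - hi)]
        high_group = [s for s in scores if abs(s - lo) > abs(s - hi)]
        new_lo = sum(low_group) / len(low_group)
        new_hi = sum(high_group) / len(high_group)
        if (new_lo, new_hi) == (lo, hi):
            break  # centroids stopped moving
        lo, hi = new_lo, new_hi
    return (lo + hi) / 2.0


def flag_artifact_components(scores):
    """Return indices of components whose score falls in the high-valued cluster."""
    threshold = two_means_threshold(scores)
    return [i for i, s in enumerate(scores) if s > threshold]


# Hypothetical per-component scores; components 3 and 5 stand out.
example_scores = [0.1, 0.2, 0.15, 3.0, 0.12, 2.8]
flagged = flag_artifact_components(example_scores)
```

Because the threshold is derived from the data of each recording, no fixed cutoff or EOG reference channel is needed in this sketch.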
NASA Astrophysics Data System (ADS)
Yamashita, S.; Nakajo, T.; Naruse, H.
2009-12-01
In this study, we statistically classified the grain size distribution of the bottom surface sediment on a microtidal sand flat to analyze the depositional processes of the sediment. Multiple classification analysis revealed that two types of sediment populations exist in the bottom surface sediment. Then, we employed the sediment trend model developed by Gao and Collins (1992) for the estimation of sediment transport pathways. As a result, we found that statistical discrimination of the bottom surface sediment provides useful information for the sediment trend model while dealing with various types of sediment transport processes. The microtidal sand flat along the Kushida River estuary, Ise Bay, central Japan, was investigated, and 102 bottom surface sediment samples were obtained. Then, their grain size distribution patterns were measured by the settling tube method, and each grain size distribution parameter (mud and gravel contents, mean grain size, coefficient of variance (CV), skewness, kurtosis, and the 5th, 25th, 50th, 75th, and 95th percentiles) was calculated. Here, CV is the normalized sorting value, that is, the sorting value divided by the mean grain size. Two classical statistical methods, principal component analysis (PCA) and fuzzy cluster analysis, were applied. The results of PCA showed that the bottom surface sediment of the study area is mainly characterized by grain size (mean grain size and the 5th to 95th percentiles) and the CV value, which have predominantly large absolute values of factor loadings in principal component (PC) 1. PC1 is interpreted as being indicative of the grain-size trend, in which a finer grain-size distribution indicates better size sorting. The frequency distribution of PC1 has a bimodal shape and suggests the existence of two types of sediment populations. Therefore, we applied fuzzy cluster analysis, the results of which revealed two groupings of the sediment (Cluster 1 and Cluster 2). Cluster 1 shows a lower value of PC1, indicating coarse and poorly sorted sediments. 
Cluster 1 sediments are distributed around the branched channel from Kushida River and show an expanding distribution from the river mouth toward the northeast direction. Cluster 2 shows a higher value of PC1, indicating fine and well-sorted sediments; this cluster is distributed in a distant area from the river mouth, including the offshore region. Therefore, Cluster 1 and Cluster 2 are interpreted as being deposited by fluvial and wave processes, respectively. Finally, on the basis of this distribution pattern, the sediment trend model was applied in areas dominated separately by fluvial and wave processes. Resultant sediment transport patterns showed good agreement with those obtained by field observations. The results of this study provide an important insight into the numerical models of sediment transport.
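As a small worked illustration of the CV parameter defined in this abstract (sorting divided by mean grain size), the sketch below computes the moment-based mean, sorting, and CV from a binned grain-size distribution. The function name, the use of linear size units such as millimetres, and the moment method itself are assumptions made for illustration; the abstract does not state how the parameters were computed.

```python
import math


def grain_size_cv(sizes, weights):
    """Return (mean, sorting, CV) for a binned grain-size distribution.

    sizes   -- representative grain size of each bin (assumed linear units, e.g. mm)
    weights -- relative abundance of each bin (any positive scale)
    """
    total = sum(weights)
    mean = sum(s * w for s, w in zip(sizes, weights)) / total
    variance = sum(w * (s - mean) ** 2 for s, w in zip(sizes, weights)) / total
    sorting = math.sqrt(variance)  # standard deviation of the distribution
    return mean, sorting, sorting / mean  # CV = sorting / mean grain size


# Hypothetical three-bin sample.
mean, sorting, cv = grain_size_cv([1.0, 2.0, 3.0], [1.0, 2.0, 1.0])
```

Dividing by the mean makes the sorting values of coarse and fine samples comparable, which is presumably why CV rather than raw sorting enters the PCA.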
Demir, Özlem; Baronio, Roberta; Salehi, Faezeh; Wassman, Christopher D.; Hall, Linda; Hatfield, G. Wesley; Chamberlin, Richard; Kaiser, Peter; Lathrop, Richard H.; Amaro, Rommie E.
2011-01-01
The tumor suppressor protein p53 can lose its function upon single-point missense mutations in the core DNA-binding domain (“cancer mutants”). Activity can be restored by second-site suppressor mutations (“rescue mutants”). This paper relates the functional activity of p53 cancer and rescue mutants to their overall molecular dynamics (MD), without focusing on local structural details. A novel global measure of protein flexibility for the p53 core DNA-binding domain, the number of clusters at a certain RMSD cutoff, was computed by clustering over 0.7 µs of explicitly solvated all-atom MD simulations. For wild-type p53 and a sample of p53 cancer or rescue mutants, the number of clusters was a good predictor of in vivo p53 functional activity in cell-based assays. This number-of-clusters (NOC) metric was strongly correlated (r2 = 0.77) with reported values of experimentally measured ΔΔG protein thermodynamic stability. Interpreting the number of clusters as a measure of protein flexibility: (i) p53 cancer mutants were more flexible than wild-type protein, (ii) second-site rescue mutations decreased the flexibility of cancer mutants, and (iii) negative controls of non-rescue second-site mutants did not. This new method reflects the overall stability of the p53 core domain and can discriminate which second-site mutations restore activity to p53 cancer mutants. PMID:22028641
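The number-of-clusters (NOC) idea can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not name the clustering algorithm, so a simple greedy leader-style clustering stands in for it, and the `rmsd` helper assumes the frames are already superimposed (no alignment is performed). All names and the toy coordinates are hypothetical.

```python
import math


def rmsd(a, b):
    """Root-mean-square deviation between two equally sized, pre-aligned coordinate sets."""
    squared = sum((x - y) ** 2 for p, q in zip(a, b) for x, y in zip(p, q))
    return math.sqrt(squared / len(a))


def number_of_clusters(frames, cutoff):
    """Count clusters by greedy leader clustering at a fixed RMSD cutoff.

    A frame starts a new cluster only if it is farther than `cutoff`
    from every existing cluster leader.
    """
    leaders = []
    for frame in frames:
        if all(rmsd(frame, leader) > cutoff for leader in leaders):
            leaders.append(frame)
    return len(leaders)


# Toy two-atom "trajectories": one nearly rigid, one that drifts widely.
rigid = [[(0, 0, 0), (1, 0, 0)], [(0, 0, 0.1), (1, 0, 0.1)], [(0, 0, -0.1), (1, 0, -0.1)]]
flexible = [[(0, 0, 0), (1, 0, 0)], [(0, 0, 2), (1, 0, 2)], [(0, 0, 4), (1, 0, 4)]]
noc_rigid = number_of_clusters(rigid, cutoff=0.5)
noc_flexible = number_of_clusters(flexible, cutoff=0.5)
```

In this toy case the drifting trajectory yields more clusters than the rigid one at the same cutoff, which mirrors the interpretation of NOC as a flexibility measure.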
NASA Astrophysics Data System (ADS)
Shi, Yue; Huang, Wenjiang; Zhou, Xianfeng
2017-04-01
Hyperspectral absorption features are important indicators of characterizing plant biophysical variables for the automatic diagnosis of crop diseases. Continuous wavelet analysis has proven to be an advanced hyperspectral analysis technique for extracting absorption features; however, specific wavelet features (WFs) and their relationship with pathological characteristics induced by different infestations have rarely been summarized. The aim of this research is to determine the most sensitive WFs for identifying specific pathological lesions from yellow rust and powdery mildew in winter wheat, based on 314 hyperspectral samples measured in field experiments in China in 2002, 2003, 2005, and 2012. The resultant WFs could be used as proxies to capture the major spectral absorption features caused by infestation of yellow rust or powdery mildew. Multivariate regression analysis based on these WFs outperformed conventional spectral features in disease detection; meanwhile, a Fisher discrimination model exhibited considerable potential for generating separable clusters for each infestation. Optimal classification returned an overall accuracy of 91.9% with a Kappa of 0.89. This paper also emphasizes the WFs and their relationship with pathological characteristics in order to provide a foundation for the further application of this approach in monitoring winter wheat diseases at the regional scale.
Velasco-Tapia, Fernando
2014-01-01
Magmatic processes have usually been identified and evaluated using qualitative or semiquantitative geochemical or isotopic tools based on a restricted number of variables. However, a more complete and quantitative view could be reached by applying multivariate analysis, mass balance techniques, and statistical tests. As an example, in this work a statistical and quantitative scheme is applied to analyze the geochemical features of the Sierra de las Cruces (SC) volcanic range (Mexican Volcanic Belt). In this locality, the volcanic activity (3.7 to 0.5 Ma) was dominantly dacitic, but the presence of spheroidal andesitic enclaves and/or diverse disequilibrium features in the majority of lavas confirms the operation of magma mixing/mingling. New discriminant-function-based multidimensional diagrams were used to discriminate tectonic setting. Statistical tests of discordancy and significance were applied to evaluate the influence of the subducting Cocos plate, which seems to be rather negligible for the SC magmas in relation to several major and trace elements. A cluster analysis following Ward's linkage rule was carried out to classify the SC volcanic rocks into geochemical groups. Finally, two mass-balance schemes were applied for the quantitative evaluation of the proportion of the end-member components (dacitic and andesitic magmas) in the comingled lavas (binary mixtures).
Anomaly metrics to differentiate threat sources from benign sources in primary vehicle screening.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cohen, Israel Dov; Mengesha, Wondwosen
2011-09-01
Discrimination of benign sources from threat sources at Ports of Entry (POE) is of great importance for efficient screening of cargo and vehicles using Radiation Portal Monitors (RPM). Currently, the ability of RPMs to distinguish these radiological sources is seriously hampered by the energy resolution of the deployed RPMs. As naturally occurring radioactive materials (NORM) are ubiquitous in commerce, false alarms are problematic because they require additional resources in secondary inspection and have impacts on commerce. To increase the sensitivity of such detection systems without increasing false alarm rates, alarm metrics need to incorporate the ability to distinguish benign and threat sources. Principal component analysis (PCA) and clustering techniques were implemented in the present study. Such techniques were investigated for their potential to lower false alarm rates and/or increase sensitivity to weaker threat sources without loss of specificity. Results of the investigation demonstrated improved sensitivity and specificity in discriminating benign sources from threat sources.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kempson, Ivan M.; Henry, Dermot; Francis, James
Advanced analytical techniques have been used to characterize arsenic in taxidermy specimens. Arsenic was examined to aid in discriminating its use as a preservative from that incorporated by ingestion and hence indicate poisoning (in the case of historical figures). The results are relevant to museum curators, occupational and environmental exposure concerns, toxicological and anthropological investigations. Hair samples were obtained from six taxidermy specimens preserved with arsenic in the late 1800s and early 1900s to investigate the arsenic incorporation. The presence of arsenic poses a potential hazard in museum and private collections. For one sample, arsenic was confirmed to be present on the hair with time-of-flight secondary ion mass spectrometry and then measured with neutron activation analysis to comprise 176 μg g⁻¹. The hair cross section was analysed with synchrotron micro-X-ray fluorescence to investigate the transverse distribution of topically applied arsenic. It was found that the arsenic had significantly penetrated all hair samples. Association with melanin clusters and the medulla was observed. Lead and mercury were also identified in one sample. X-ray absorption near-edge spectroscopy of the As K-edge indicated that an arsenate species predominantly existed in all samples; however, analysis was hindered by very rapid photoreduction of the arsenic. It would be difficult to discriminate arsenic consumption from topically applied arsenic based on the physical transverse distribution. Longitudinal distributions and chemical speciation may still allow differentiation.
Phylogenetic Relationships of Citrus and Its Relatives Based on matK Gene Sequences
Penjor, Tshering; Uehara, Miki; Ide, Manami; Matsumoto, Natsumi; Matsumoto, Ryoji
2013-01-01
The genus Citrus includes mandarin, orange, lemon, grapefruit and lime, which have high economic and nutritional value. The family Rutaceae can be divided into 7 subfamilies, including Aurantioideae. The genus Citrus belongs to the subfamily Aurantioideae. In this study, we sequenced the chloroplast matK genes of 135 accessions from 22 genera of Aurantioideae and analyzed them phylogenetically. Our study includes many accessions that have not been examined in other studies. The subfamily Aurantioideae has been classified into 2 tribes, Clauseneae and Citreae, and our current molecular analysis clearly discriminate Citreae from Clauseneae by using only 1 chloroplast DNA sequence. Our study confirms previous observations on the molecular phylogeny of Aurantioideae in many aspects. However, we have provided novel information on these genetic relationships. For example, inconsistent with the previous observation, and consistent with our preliminary study using the chloroplast rbcL genes, our analysis showed that Feroniella oblata is not nested in Citrus species and is closely related with Feronia limonia. Furthermore, we have shown that Murraya paniculata is similar to Merrillia caloxylon and is dissimilar to Murraya koenigii. We found that “true citrus fruit trees” could be divided into 2 subclusters. One subcluster included Citrus, Fortunella, and Poncirus, while the other cluster included Microcitrus and Eremocitrus. Compared to previous studies, our current study is the most extensive phylogenetic study of Citrus species since it includes 93 accessions. The results indicate that Citrus species can be classified into 3 clusters: a citron cluster, a pummelo cluster, and a mandarin cluster. Although most mandarin accessions belonged to the mandarin cluster, we found some exceptions. We also obtained the information on the genetic background of various species of acid citrus grown in Japan. 
Because the genus Citrus contains many important accessions, we have comprehensively discussed the classification of this genus. PMID:23638116
NASA Technical Reports Server (NTRS)
La Duc, Myron T.; Satomi, Masataka; Agata, Norio; Venkateswaran, Kasthuri
2004-01-01
Bacillus anthracis, the causative agent of the human disease anthrax, Bacillus cereus, a food-borne pathogen capable of causing human illness, and Bacillus thuringiensis, a well-characterized insecticidal toxin producer, all cluster together within a very tight clade (B. cereus group) phylogenetically and are indistinguishable from one another via 16S rDNA sequence analysis. As new pathogens are continually emerging, it is imperative to devise a system capable of rapidly and accurately differentiating closely related, yet phenotypically distinct species. Although the gyrB gene has proven useful in discriminating closely related species, its sequence analysis has not yet been validated by DNA:DNA hybridization, the taxonomically accepted "gold standard". We phylogenetically characterized the gyrB sequences of various species and serotypes encompassed in the "B. cereus group," including lab strains and environmental isolates. Results were compared to those obtained from analyses of phenotypic characteristics, 16S rDNA sequence, DNA:DNA hybridization, and virulence factors. The gyrB gene proved more highly differential than 16S, while, at the same time, as analytical as costly and laborious DNA:DNA hybridization techniques in differentiating species within the B. cereus group.
NASA Astrophysics Data System (ADS)
Jacob, Rinku; Harikrishnan, K. P.; Misra, R.; Ambika, G.
2018-01-01
Recurrence networks and the associated statistical measures have become important tools in the analysis of time series data. In this work, we test how effective the recurrence network measures are in analyzing real world data involving two main types of noise, white noise and colored noise. We use two prominent network measures as discriminating statistic for hypothesis testing using surrogate data for a specific null hypothesis that the data is derived from a linear stochastic process. We show that the characteristic path length is especially efficient as a discriminating measure with the conclusions reasonably accurate even with limited number of data points in the time series. We also highlight an additional advantage of the network approach in identifying the dimensionality of the system underlying the time series through a convergence measure derived from the probability distribution of the local clustering coefficients. As examples of real world data, we use the light curves from a prominent black hole system and show that a combined analysis using three primary network measures can provide vital information regarding the nature of temporal variability of light curves from different spectroscopic classes.
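The two kinds of network measures mentioned in this abstract, the characteristic path length and the local clustering coefficients, can be sketched for a small recurrence network as follows. This is a minimal illustration, not the authors' code: the time-delay embedding parameters (`dim`, `delay`), the fixed recurrence threshold `eps`, and the convention of averaging path lengths over connected pairs only are assumptions chosen here for simplicity.

```python
import math
from collections import deque


def recurrence_network(series, dim, delay, eps):
    """Adjacency sets linking time-delay embedded points that lie closer than eps."""
    n = len(series) - (dim - 1) * delay
    points = [[series[i + k * delay] for k in range(dim)] for i in range(n)]
    adj = [set() for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) < eps:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def local_clustering(adj, i):
    """Fraction of pairs of neighbours of node i that are themselves linked."""
    nbrs = list(adj[i])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a in range(k) for b in range(a + 1, k) if nbrs[b] in adj[nbrs[a]])
    return 2.0 * links / (k * (k - 1))


def characteristic_path_length(adj):
    """Mean shortest-path length over all connected node pairs, via breadth-first search."""
    total, pairs = 0, 0
    for src in range(len(adj)):
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else float("inf")


# Toy series; with eps = 1.5 the network is a simple four-node chain.
chain = recurrence_network([0.0, 1.0, 2.0, 3.0], dim=1, delay=1, eps=1.5)
cpl = characteristic_path_length(chain)
```

In a surrogate-data test, a statistic such as `cpl` would be computed for the original series and for each surrogate, and the null hypothesis rejected if the original value lies outside the surrogate distribution.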
NASA Astrophysics Data System (ADS)
Liu, Dan; Li, Yong-Guo; Xu, Hong; Sun, Su-Qin; Wang, Zheng-Tao
2008-07-01
Ginseng is one of the most widely used herbal medicines. Based on the growth environment and the cultivation method, three kinds of ginseng are distinguished: Cultivated Ginseng (CG), Mountain Cultivated Ginseng (MCG) and Mountain Wild Ginseng (MWG). A novel method was developed and established to discriminate and identify the three kinds of ginseng using Fourier transform infrared spectroscopy (FT-IR), second derivative IR spectra and two-dimensional correlation infrared spectroscopy (2D-IR). The findings indicated that the relative content of starch in CG was higher than that in MCG and MWG, while the relative contents of calcium oxalate and lipids in MWG were higher than those in CG and MCG, and the relative content of fatty acid in MCG was higher than that in CG and MWG. Hierarchical cluster analysis was applied to the data of CG, MCG and MWG, which could be classified successfully. The results demonstrated that the macroscopic IR fingerprint method, including FT-IR, second derivative IR and 2D-IR, can be applied to discriminate different ginsengs rapidly, effectively and non-destructively.
Tomazzoli, Maíra M; Pai Neto, Remi D; Moresco, Rodolfo; Westphal, Larissa; Zeggio, Amelia R S; Specht, Leandro; Costa, Christopher; Rocha, Miguel; Maraschin, Marcelo
2015-12-01
Propolis is a chemically complex biomass produced by honeybees (Apis mellifera) from plant resins combined with salivary enzymes, beeswax, and pollen. The biological activities described for propolis have also been identified in the donor plants' resins, but a major challenge for standardizing the chemical composition and biological effects of propolis is to better understand the influence of seasonality on the chemical constituents of that raw material. Since propolis quality depends, among other variables, on the local flora, which is strongly influenced by (a)biotic factors over the seasons, unraveling the effect of harvest season on the propolis chemical profile is an issue of recognized importance. Fast, cheap, and robust analytical techniques seem to be the best choice for large-scale quality control processes in the most demanding markets, e.g., human health applications. To that end, UV-Visible (UV-Vis) scanning spectrophotometry was applied to hydroalcoholic extracts (HE) of seventy-three propolis samples collected over the seasons in 2014 (summer, spring, autumn, and winter) and 2015 (summer and autumn) in Southern Brazil. Machine learning and chemometrics techniques were then applied to the UV-Vis dataset to gain insight into the effect of seasonality on the claimed chemical heterogeneity of propolis samples, as determined by changes in the flora of the geographic region under study. Descriptive and classification models were built following a chemometric approach, i.e., principal component analysis (PCA) and hierarchical clustering analysis (HCA), supported by scripts written in the R language. The UV-Vis profiles, combined with chemometric analysis, allowed a typical pattern to be identified in propolis samples collected in the summer.
Importantly, the discrimination based on PCA could be improved by using the dataset of the fingerprint region of phenolic compounds (λ = 280-400 nm), suggesting that, besides their biological activities, those secondary metabolites also play a relevant role in the discrimination and classification of that complex matrix through bioinformatics tools. Finally, a series of machine learning approaches, e.g., partial least squares-discriminant analysis (PLS-DA), k-Nearest Neighbors (kNN), and Decision Trees, proved complementary to PCA and HCA, yielding relevant information for sample discrimination.
Tomazzoli, Maíra Maciel; Pai Neto, Remi Dal; Moresco, Rodolfo; Westphal, Larissa; Zeggio, Amélia Regina Somensi; Specht, Leandro; Costa, Christopher; Rocha, Miguel; Maraschin, Marcelo
2015-10-21
Propolis is a chemically complex biomass produced by honeybees (Apis mellifera) from plant resins combined with salivary enzymes, beeswax, and pollen. The biological activities described for propolis have also been identified in the donor plants' resins, but a major challenge for standardizing the chemical composition and biological effects of propolis is to better understand the influence of seasonality on the chemical constituents of that raw material. Since propolis quality depends, among other variables, on the local flora, which is strongly influenced by (a)biotic factors over the seasons, unraveling the effect of harvest season on the propolis chemical profile is an issue of recognized importance. Fast, cheap, and robust analytical techniques seem to be the best choice for large-scale quality control processes in the most demanding markets, e.g., human health applications. To that end, UV-Visible (UV-Vis) scanning spectrophotometry was applied to hydroalcoholic extracts (HE) of seventy-three propolis samples collected over the seasons in 2014 (summer, spring, autumn, and winter) and 2015 (summer and autumn) in Southern Brazil. Machine learning and chemometrics techniques were then applied to the UV-Vis dataset to gain insight into the effect of seasonality on the claimed chemical heterogeneity of propolis samples, as determined by changes in the flora of the geographic region under study. Descriptive and classification models were built following a chemometric approach, i.e., principal component analysis (PCA) and hierarchical clustering analysis (HCA), supported by scripts written in the R language. The UV-Vis profiles, combined with chemometric analysis, allowed a typical pattern to be identified in propolis samples collected in the summer.
Importantly, the discrimination based on PCA could be improved by using the dataset of the fingerprint region of phenolic compounds (λ = 280-400 nm), suggesting that, besides their biological activities, those secondary metabolites also play a relevant role in the discrimination and classification of that complex matrix through bioinformatics tools. Finally, a series of machine learning approaches, e.g., partial least squares-discriminant analysis (PLS-DA), k-Nearest Neighbors (kNN), and Decision Trees, proved complementary to PCA and HCA, yielding relevant information for sample discrimination.
Petrofacies Analysis - A Petrophysical Tool for Geologic/Engineering Reservoir Characterization
Watney, W.L.; Guy, W.J.; Doveton, J.H.; Bhattacharya, S.; Gerlach, P.M.; Bohling, Geoffrey C.; Carr, T.R.
1998-01-01
Petrofacies analysis is defined as the characterization and classification of pore types and fluid saturations as revealed by petrophysical measurements of a reservoir. The word "petrofacies" makes an explicit link between petroleum engineers' concerns with pore characteristics as arbiters of production performance and the facies paradigm of geologists as a methodology for genetic understanding and prediction. In petrofacies analysis, the porosity and resistivity axes of the classical Pickett plot are used to map water saturation, bulk volume water, and estimated permeability, as well as capillary pressure information where it is available. When data points are connected in order of depth within a reservoir, the characteristic patterns reflect reservoir rock character and its interplay with the hydrocarbon column. A third variable can be presented at each point on the crossplot by assigning a color scale that is based on other well logs, often gamma ray or photoelectric effect, or other derived variables. Contrasts between reservoir pore types and fluid saturations are reflected in changing patterns on the crossplot and can help discriminate and characterize reservoir heterogeneity. Many hundreds of analyses of well logs facilitated by spreadsheet and object-oriented programming have provided the means to distinguish patterns typical of certain complex pore types (size and connectedness) for sandstones and carbonate reservoirs, occurrences of irreducible water saturation, and presence of transition zones. The result has been an improved means to evaluate potential production, such as bypassed pay behind pipe and in old exploration wells, or to assess zonation and continuity of the reservoir. Petrofacies analysis in this study was applied to distinguishing flow units and including discriminating pore type as an assessment of reservoir conformance and continuity. 
The analysis is facilitated through the use of color image cross sections depicting depositional sequences, natural gamma ray, porosity, and permeability. Also, cluster analysis was applied to discriminate petrophysically similar reservoir rock.
Quantitative diagnosis of bladder cancer by morphometric analysis of HE images
NASA Astrophysics Data System (ADS)
Wu, Binlin; Nebylitsa, Samantha V.; Mukherjee, Sushmita; Jain, Manu
2015-02-01
In clinical practice, histopathological analysis of biopsied tissue is the main method for bladder cancer diagnosis and prognosis. The diagnosis is performed by a pathologist based on the morphological features in the image of a hematoxylin and eosin (HE) stained tissue sample. This manuscript proposes algorithms to perform morphometric analysis on the HE images, quantify the features in the images, and discriminate bladder cancers with different grades, i.e. high grade and low grade. The nuclei are separated from the background and other types of cells such as red blood cells (RBCs) and immune cells using manual outlining, color deconvolution and image segmentation. A mask of nuclei is generated for each image for quantitative morphometric analysis. The features of the nuclei in the mask image including size, shape, orientation, and their spatial distributions are measured. To quantify local clustering and alignment of nuclei, we propose a 1-nearest-neighbor (1-NN) algorithm which measures nearest neighbor distance and nearest neighbor parallelism. The global distributions of the features are measured using statistics of the proposed parameters. A linear support vector machine (SVM) algorithm is used to classify the high grade and low grade bladder cancers. The results show using a particular group of nuclei such as large ones, and combining multiple parameters can achieve better discrimination. This study shows the proposed approach can potentially help expedite pathological diagnosis by triaging potentially suspicious biopsies.
Nonlinear dimensionality reduction methods for synthetic biology biobricks' visualization.
Yang, Jiaoyun; Wang, Haipeng; Ding, Huitong; An, Ning; Alterovitz, Gil
2017-01-19
Visualizing data by dimensionality reduction is an important strategy in bioinformatics, which can help to discover hidden data properties and detect data quality issues, e.g., data noise and inappropriately labeled data. As crowdsourcing-based synthetic biology databases face similar data quality issues, we propose to visualize biobricks to tackle them. However, existing dimensionality reduction methods cannot be directly applied to biobrick datasets. We therefore use normalized edit distance to enhance dimensionality reduction methods, including Isomap and Laplacian Eigenmaps. Biobricks were extracted from the synthetic biology database Registry of Standard Biological Parts, and six combinations of various types of biobricks were tested. The visualization graphs illustrate discriminated biobricks and inappropriately labeled biobricks. The K-means clustering algorithm is adopted to quantify the reduction results. The average clustering accuracies for Isomap and Laplacian Eigenmaps are 0.857 and 0.844, respectively. In addition, Laplacian Eigenmaps is 5 times faster than Isomap, and its visualization graph is more concentrated for discriminating biobricks. By combining normalized edit distance with Isomap and Laplacian Eigenmaps, synthetic biology biobricks are successfully visualized in two-dimensional space. Various types of biobricks can be discriminated and inappropriately labeled biobricks can be identified, which could help to assess the quality of crowdsourcing-based synthetic biology databases and to select biobricks.
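The normalized edit distance mentioned in this abstract can be sketched as follows. This is only an illustration: the abstract does not say which normalization the authors used, so dividing the Levenshtein distance by the length of the longer sequence is an assumption, and the function names and the toy sequences are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling one-row dynamic programme."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


def normalized_edit_distance(a: str, b: str) -> float:
    """Edit distance divided by the longer length (assumed normalization), in [0, 1]."""
    longest = max(len(a), len(b))
    return 0.0 if longest == 0 else edit_distance(a, b) / longest


def distance_matrix(sequences):
    """Symmetric matrix of pairwise normalized edit distances."""
    n = len(sequences)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = normalized_edit_distance(sequences[i], sequences[j])
            matrix[i][j] = matrix[j][i] = d
    return matrix


# Toy sequences for illustration only, not entries from the Registry.
toy = ["ACGT", "ACGA", "TTTT"]
pairwise = distance_matrix(toy)
```

A matrix like `pairwise` is the kind of input that a manifold learning method accepting precomputed distances could use in place of Euclidean distances between feature vectors.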
Trace Element Study of H Chondrites: Evidence for Meteoroid Streams.
NASA Astrophysics Data System (ADS)
Wolf, Stephen Frederic
1993-01-01
Multivariate statistical analyses, both linear discriminant analysis and logistic regression, of the volatile trace elemental concentrations in H4-6 chondrites reveal compositionally distinguishable subpopulations. The observed difference in volatile trace element composition between Antarctic and non-Antarctic H4-6 chondrites (Lipschutz and Samuels, 1991) can be explained by a compositionally distinct subpopulation found in Victoria Land, Antarctica. This population of H4-6 chondrites is compositionally distinct from non-Antarctic H4-6 chondrites and from Antarctic H4-6 chondrites from Queen Maud Land. Comparisons of Queen Maud Land H4-6 chondrites with non-Antarctic H4-6 chondrites give no reason to believe that these two populations are distinguishable from each other on the basis of the ten volatile trace element concentrations measured. ANOVA indicates that these differences are not the result of trivial causes such as weathering and analytical bias. The thermoluminescence properties of these populations parallel the results of the volatile trace element comparisons. Given the differences in terrestrial age between Victoria Land, Queen Maud Land, and modern H4-6 chondrite falls, these results are consistent with a variation in H4-6 chondrite flux on a 300 ky timescale. This conclusion requires the existence of co-orbital meteoroid streams. Statistical analyses of the volatile trace elemental concentrations in non-Antarctic modern falls of H4-6 chondrites also demonstrate that a group of 13 H4-6 chondrites, Cluster 1, selected exclusively for their distinct fall parameters (Dodd, 1992), is compositionally distinguishable from a control group of 45 non-Antarctic modern H4-6 chondrites on the basis of the ten volatile trace element concentrations measured. Model-independent randomization-simulations based on both linear discriminant analysis and logistic regression verify these results.
While ANOVA identifies two possible causes for this difference, analytical bias and group classification, a test validation experiment verifies that group classification is the more significant cause of compositional difference between Cluster 1 and non-Cluster 1 modern H4-6 chondrite falls. The thermoluminescence properties of these populations parallel the results of the volatile trace element comparisons. This suggests that these meteorites are fragments of a co-orbital meteorite stream derived from a single parent body.
The interest of gait markers in the identification of subgroups among fibromyalgia patients.
Auvinet, Bernard; Chaleil, Denis; Cabane, Jean; Dumolard, Anne; Hatron, Pierre; Juvin, Robert; Lanteri-Minet, Michel; Mainguy, Yves; Negre-Pages, Laurence; Pillard, Fabien; Riviere, Daniel; Maugars, Yves-Michel
2011-11-11
Fibromyalgia (FM) is a heterogeneous syndrome and its classification into subgroups calls for broad-based discussion. FM subgrouping, which aims to adapt treatment according to different subgroups, relies in part on psychological and cognitive dysfunctions. Since motor control of gait is closely related to cognitive function, we hypothesized that gait markers could be of interest in the identification of FM patients' subgroups. This controlled study aimed at characterizing gait disorders in FM, and subgrouping FM patients according to gait markers such as stride frequency (SF), stride regularity (SR), and cranio-caudal power (CCP), which measures kinesia. A multicentre, observational open trial enrolled patients with primary FM (44.1 ± 8.1 y), and matched controls (44.1 ± 7.3 y). Outcome measurements and gait analyses were available for 52 pairs. A 3-step statistical analysis was carried out. A preliminary single blind analysis using k-means clustering was performed as an initial validation of gait markers. Then, in order to quantify FM patients according to psychometric and gait variables, an open descriptive analysis comparing patients and controls was made, and correlations between gait variables and main outcomes were calculated. Finally, using cluster analysis, we described subgroups for each gait variable and looked for significant differences in self-reported assessments. SF was the most discriminating gait variable (73% of patients and controls). SF, SR, and CCP were different between patients and controls. There was a non-significant association between SF, FIQ and physical components from Short-Form 36 (p = 0.06). SR was correlated to FIQ (p = 0.01) and catastrophizing (p = 0.05) while CCP was correlated to pain (p = 0.01). The SF cluster identified 3 subgroups with a particular one characterized by normal SF, low pain, high activity and hyperkinesia.
The SR cluster identified 2 distinct subgroups: the one with a reduced SR was distinguished by high FIQ, poor coping and altered affective status. Gait analysis may provide additional information in the identification of subgroups among fibromyalgia patients. Gait analysis provided relevant information about physical and cognitive status, and pain behavior. Further studies are needed to better understand gait analysis implications in FM.
The interest of gait markers in the identification of subgroups among fibromyalgia patients
2011-01-01
Background Fibromyalgia (FM) is a heterogeneous syndrome and its classification into subgroups calls for broad-based discussion. FM subgrouping, which aims to adapt treatment according to different subgroups, relies in part on psychological and cognitive dysfunctions. Since motor control of gait is closely related to cognitive function, we hypothesized that gait markers could be of interest in the identification of FM patients' subgroups. This controlled study aimed at characterizing gait disorders in FM, and subgrouping FM patients according to gait markers such as stride frequency (SF), stride regularity (SR), and cranio-caudal power (CCP), which measures kinesia. Methods A multicentre, observational open trial enrolled patients with primary FM (44.1 ± 8.1 y), and matched controls (44.1 ± 7.3 y). Outcome measurements and gait analyses were available for 52 pairs. A 3-step statistical analysis was carried out. A preliminary single blind analysis using k-means clustering was performed as an initial validation of gait markers. Then, in order to quantify FM patients according to psychometric and gait variables, an open descriptive analysis comparing patients and controls was made, and correlations between gait variables and main outcomes were calculated. Finally, using cluster analysis, we described subgroups for each gait variable and looked for significant differences in self-reported assessments. Results SF was the most discriminating gait variable (73% of patients and controls). SF, SR, and CCP were different between patients and controls. There was a non-significant association between SF, FIQ and physical components from Short-Form 36 (p = 0.06). SR was correlated to FIQ (p = 0.01) and catastrophizing (p = 0.05) while CCP was correlated to pain (p = 0.01). The SF cluster identified 3 subgroups with a particular one characterized by normal SF, low pain, high activity and hyperkinesia.
The SR cluster identified 2 distinct subgroups: the one with a reduced SR was distinguished by high FIQ, poor coping and altered affective status. Conclusion Gait analysis may provide additional information in the identification of subgroups among fibromyalgia patients. Gait analysis provided relevant information about physical and cognitive status, and pain behavior. Further studies are needed to better understand gait analysis implications in FM. PMID:22078002
NASA Astrophysics Data System (ADS)
Kathiravan, K.; Natesan, Usha; Vishnunath, R.
2017-03-01
The intention of this study was to appraise the spatial and temporal variations in the physico-chemical parameters of coastal waters of Rameswaram Island, Gulf of Mannar Marine Biosphere Reserve, south India, using multivariate statistical techniques, such as cluster analysis, factor analysis and principal component analysis. Spatio-temporal variations among the physico-chemical parameters are observed in the coastal waters of Gulf of Mannar, especially during northeast and post monsoon seasons. It is inferred that the high loadings of pH, temperature, suspended particulate matter, salinity, dissolved oxygen, biochemical oxygen demand, chlorophyll a, nutrient species of nitrogen and phosphorus strongly determine the discrimination of coastal water quality. Results highlight the important role of monsoonal variations to determine the coastal water quality around Rameswaram Island.
NASA Astrophysics Data System (ADS)
Belianinov, Alex; Ganesh, Panchapakesan; Lin, Wenzhi; Sales, Brian C.; Sefat, Athena S.; Jesse, Stephen; Pan, Minghu; Kalinin, Sergei V.
2014-12-01
Atomic level spatial variability of electronic structure in Fe-based superconductor FeTe0.55Se0.45 (Tc = 15 K) is explored using current-imaging tunneling-spectroscopy. Multivariate statistical analysis of the data differentiates regions of dissimilar electronic behavior that can be identified with the segregation of chalcogen atoms, as well as boundaries between terminations and near neighbor interactions. Subsequent clustering analysis allows identification of the spatial localization of these dissimilar regions. Similar statistical analysis of modeled calculated density of states of chemically inhomogeneous FeTe1-xSex structures further confirms that the two types of chalcogens, i.e., Te and Se, can be identified by their electronic signature and differentiated by their local chemical environment. This approach allows detailed chemical discrimination of the scanning tunneling microscopy data including separation of atomic identities, proximity, and local configuration effects and can be universally applicable to chemically and electronically inhomogeneous surfaces.
An Updated Review of Meat Authenticity Methods and Applications.
Vlachos, Antonios; Arvanitoyannis, Ioannis S; Tserkezou, Persefoni
2016-05-18
Adulteration of foods is a serious economic problem concerning most foodstuffs, and meat products in particular. Since high-priced meats demand premium prices, producers of meat-based products might be tempted to blend these products with lower cost meat. Moreover, the labeled meat contents may not be met. Both types of adulteration are difficult to detect and lead to deterioration of product quality. For the consumer, it is of utmost importance to guarantee both authenticity and compliance with product labeling. The purpose of this article is to review the state of the art of meat authenticity with analytical and immunochemical methods, with a focus on the issue of geographic origin and sensory characteristics. This review is also intended to provide an overview of the various currently applied statistical analyses (multivariate analysis (MAV), such as principal component analysis, discriminant analysis, cluster analysis, etc.) and their effectiveness for meat authenticity.
Bagur, M G; Morales, S; López-Chicano, M
2009-11-15
Unsupervised and supervised pattern recognition techniques such as hierarchical cluster analysis, principal component analysis, factor analysis and linear discriminant analysis have been applied to water samples collected in the Rodalquilar mining district (Southern Spain) in order to identify different sources of environmental pollution caused by the abandoned mining industry. The effect of the mining activity on waters was monitored by determining the concentration of eleven elements (Mn, Ba, Co, Cu, Zn, As, Cd, Sb, Hg, Au and Pb) by inductively coupled plasma mass spectrometry (ICP-MS). The Box-Cox transformation has been used to bring the data set closer to a normal distribution, since geochemical data are typically non-normally distributed. The environmental impact is affected mainly by the mining activity developed in the zone, the acid drainage and finally by the chemical treatment used for gold beneficiation.
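As a rough illustration of the Box-Cox step described above, the sketch below transforms a right-skewed sample and checks that its skewness shrinks. The data are synthetic, the grid search over lambda is an assumed simplification (statistical packages usually optimise the same profile likelihood numerically), and all function names are hypothetical rather than taken from the study.

```python
import math
import random
import statistics


def box_cox(values, lam):
    """Box-Cox transform of strictly positive values for a given lambda."""
    if abs(lam) < 1e-12:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def log_likelihood(values, lam):
    """Profile log-likelihood of lambda, assuming the transformed data are normal."""
    transformed = box_cox(values, lam)
    n = len(values)
    variance = statistics.pvariance(transformed)
    return (lam - 1.0) * sum(math.log(v) for v in values) - 0.5 * n * math.log(variance)


def best_lambda(values, grid=None):
    """Choose lambda by a coarse grid search (assumed here for simplicity)."""
    grid = grid or [i / 20.0 for i in range(-40, 41)]  # -2.0 to 2.0 in steps of 0.05
    return max(grid, key=lambda lam: log_likelihood(values, lam))


def skewness(values):
    """Population skewness: mean of cubed standardized deviations."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


# Synthetic, right-skewed "concentrations"; illustrative only, not data from the study.
rng = random.Random(0)
concentrations = [rng.lognormvariate(0.0, 1.0) for _ in range(500)]

lam = best_lambda(concentrations)
transformed = box_cox(concentrations, lam)
```

For log-normally distributed input such as this synthetic sample, the selected lambda should land near zero, which corresponds to a plain logarithmic transform.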
Monitoring Fatigue Status with HRV Measures in Elite Athletes: An Avenue Beyond RMSSD?
Schmitt, Laurent; Regnard, Jacques; Millet, Grégoire P.
2015-01-01
Among the tools proposed to assess the athlete's “fatigue,” the analysis of heart rate variability (HRV) provides an indirect evaluation of the settings of autonomic control of heart activity. HRV analysis is performed through assessment of time-domain indices; the square root of the mean of the squared differences between adjacent normal R-R intervals (RMSSD), measured during short (5 min) recordings in the supine position upon awakening in the morning, and particularly the logarithm of RMSSD (LnRMSSD), have been proposed as the most useful resting HRV indicators. However, while RMSSD can help the practitioner to identify a global “fatigue” level, it does not allow discrimination between different types of fatigue. Recent results using spectral HRV analysis highlighted firstly that HRV profiles assessed in supine and standing positions are independent and complementary, and secondly that using these postural profiles allows the clustering of distinct sub-categories of “fatigue.” Since cardiovascular control settings are different in standing and lying postures, using the HRV figures of both postures to cluster fatigue states embeds information on the dynamics of control responses. As such, HRV spectral analysis appears more sensitive and enlightening than time-domain HRV indices. The richer information provided by this spectral analysis should improve the monitoring of the adaptive training-recovery process in athletes. PMID:26635629
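The time-domain index defined in this abstract can be written out in a few lines. The sketch below follows the stated definition (square root of the mean squared difference between adjacent R-R intervals) and takes its natural logarithm; the function names and the four interval values are made up for illustration, and artifact or ectopic beat filtering is assumed to have been done beforehand.

```python
import math


def rmssd(rr_intervals_ms):
    """Root mean square of successive differences between adjacent R-R intervals."""
    if len(rr_intervals_ms) < 2:
        raise ValueError("need at least two R-R intervals")
    diffs = [b - a for a, b in zip(rr_intervals_ms, rr_intervals_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def ln_rmssd(rr_intervals_ms):
    """Natural logarithm of RMSSD (undefined when all intervals are identical)."""
    return math.log(rmssd(rr_intervals_ms))


# Made-up R-R intervals in milliseconds, for illustration only.
rr = [800.0, 810.0, 790.0, 805.0]
value = rmssd(rr)         # successive differences are 10, -20 and 15 ms
log_value = ln_rmssd(rr)
```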
Myers, C E; Gluck, M A
1996-08-01
A previous model of hippocampal region function in classical conditioning is generalized to H. Eichenbaum, A. Fagan, P. Mathews, and N.J. Cohen's (1989) and H. Eichenbaum, A. Fagan, and N.J. Cohen's (1989) simultaneous odor discrimination studies in rats. The model assumes that the hippocampal region forms new stimulus representations that compress redundant information while differentiating predictive information; the piriform (olfactory) cortex meanwhile clusters similar and co-occurring odors. Hippocampal damage interrupts the ability to differentiate odor representations, while leaving piriform-mediated odor clustering unchecked. The result is a net tendency to overcompress in the lesioned model. Behavior in the model is very similar to that of the rats, including lesion deficits, facilitation of successively learned tasks, and transfer performance. The computational mechanisms underlying model performance are consistent with the qualitative interpretations suggested by Eichenbaum et al. to explain their empirical data.
Tian, Simiao; Zhang, Xiuzhi; Xu, Yang; Dong, Huimin
2016-01-01
The body mass index (BMI) and waist circumference (WC) are commonly used anthropometric measures for predicting cardiovascular disease risk factors, but it is uncertain which specific measure might be the most appropriate predictor of a cluster of cardiometabolic abnormalities (CMA) in Chinese adults. A body shape index (ABSI) and body roundness index (BRI) have been recently developed as alternative anthropometric indices that may better reflect health status. The main aims of this study were to investigate the predictive capacity of ABSI and BRI in identifying various CMA compared to BMI, WC, waist-to-hip ratio (WHpR), and waist-to-height ratio (WHtR), and to determine whether there exists a best single predictor of all CMA. We used data from the 2009 wave of the China Health and Nutrition Survey, and the final analysis included 8126 adults aged 18 to 85 years with available fasting blood samples and anthropometric measurements. Receiver-operating characteristic (ROC) analyses were conducted to assess the best anthropometric indices to predict the risk of hypertension, diabetes, dyslipidemia, hyperuricemia, and metabolic syndrome (MetS). Logistic regression models were fit to evaluate the OR of each CMA according to anthropometric indices. In women, the ROC analysis showed that BRI and WHtR had the best predictive capability in identifying all CMA (areas under the curve [AUCs] ranged from 0.658 to 0.721). In men, BRI and WHtR were better predictors of hypertension, diabetes, and at least 1 CMA (AUC: 0.668, 0.708, and 0.698, respectively), whereas BMI and WC were more sensitive predictors of dyslipidemia, hyperuricemia, and MetS. Furthermore, the ABSI showed the lowest AUCs for each CMA. According to the multivariate logistic regression analysis, BRI and WHtR were superior in discriminating hyperuricemia and at least 1 CMA while BMI performed better in predicting hypertension, diabetes, and MetS in women.
In men, WC and BRI were the 2 best predictors of all CMA except MetS, and the ABSI was the worst. Our results showed the novel index BRI could be used as a single suitable anthropometric measure in simultaneously identifying a cluster of CMA compared to BMI and WHtR, especially in Chinese women, whereas the ABSI showed the weakest discriminative power. PMID:27559964
Tian, Simiao; Zhang, Xiuzhi; Xu, Yang; Dong, Huimin
2016-08-01
The body mass index (BMI) and waist circumference (WC) are commonly used anthropometric measures for predicting cardiovascular disease risk factors, but it is uncertain which specific measure might be the most appropriate predictor of a cluster of cardiometabolic abnormalities (CMA) in Chinese adults. A body shape index (ABSI) and body roundness index (BRI) have been recently developed as alternative anthropometric indices that may better reflect health status. The main aims of this study were to investigate the predictive capacity of ABSI and BRI in identifying various CMA compared to BMI, WC, waist-to-hip ratio (WHpR), and waist-to-height ratio (WHtR), and to determine whether there exists a best single predictor of all CMA. We used data from the 2009 wave of the China Health and Nutrition Survey, and the final analysis included 8126 adults aged 18 to 85 years with available fasting blood samples and anthropometric measurements. Receiver-operating characteristic (ROC) analyses were conducted to assess the best anthropometric indices to predict the risk of hypertension, diabetes, dyslipidemia, hyperuricemia, and metabolic syndrome (MetS). Logistic regression models were fit to evaluate the OR of each CMA according to anthropometric indices. In women, the ROC analysis showed that BRI and WHtR had the best predictive capability in identifying all CMA (areas under the curve [AUCs] ranged from 0.658 to 0.721). In men, BRI and WHtR were better predictors of hypertension, diabetes, and at least 1 CMA (AUC: 0.668, 0.708, and 0.698, respectively), whereas BMI and WC were more sensitive predictors of dyslipidemia, hyperuricemia, and MetS. Furthermore, the ABSI showed the lowest AUCs for each CMA. According to the multivariate logistic regression analysis, BRI and WHtR were superior in discriminating hyperuricemia and at least 1 CMA while BMI performed better in predicting hypertension, diabetes, and MetS in women.
In men, WC and BRI were the 2 best predictors of all CMA except MetS, and the ABSI was the worst. Our results showed the novel index BRI could be used as a single suitable anthropometric measure in simultaneously identifying a cluster of CMA compared to BMI and WHtR, especially in Chinese women, whereas the ABSI showed the weakest discriminative power.
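For readers unfamiliar with the indices compared in this abstract, the sketch below computes them for a single hypothetical person. The abstract itself gives no formulas, so the ABSI and BRI expressions used here are the commonly cited definitions and should be read as an assumption; the function name, argument names, and example measurements are illustrative only.

```python
import math


def anthropometric_indices(height_m, weight_kg, waist_m):
    """Return BMI, WHtR, ABSI and BRI from height, weight and waist circumference.

    All lengths are in metres. The ABSI and BRI formulas are the commonly
    cited definitions, assumed here because the abstract does not state them.
    """
    bmi = weight_kg / height_m ** 2
    whtr = waist_m / height_m
    # ABSI: waist circumference scaled by BMI^(2/3) and height^(1/2).
    absi = waist_m / (bmi ** (2.0 / 3.0) * math.sqrt(height_m))
    # BRI: based on the eccentricity of an ellipse with semi-axes
    # waist / (2 * pi) and height / 2.
    eccentricity_sq = 1.0 - (waist_m / (2.0 * math.pi)) ** 2 / (0.5 * height_m) ** 2
    bri = 364.2 - 365.5 * math.sqrt(eccentricity_sq)
    return {"BMI": bmi, "WHtR": whtr, "ABSI": absi, "BRI": bri}


# Hypothetical measurements, not taken from the survey data.
example = anthropometric_indices(height_m=1.70, weight_kg=70.0, waist_m=0.85)
```

Because BRI depends only on the ratio of waist to height, it is a monotonic function of WHtR, which is consistent with the two indices performing similarly in the ROC analyses reported above.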
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ford, J; Lopez, C; Tschudi, Y
Purpose: To determine whether blood oxygenation level dependent (BOLD) MRI signal measured in prostate cancer patients, in addition to quantitative diffusion and perfusion parameters from multiparametric (mp)MRI exams, can help discriminate aggressive and/or radioresistant lesions. Methods: Several ongoing clinical trials in our institution require mpMRI exam to determine eligibility (presence of identifiable tumor lesion on mpMRI) and prostate volumes for dose escalation. Upon consent, patients undergo fiducial markers placement and a T2*-weighted imaging at the time of CT sim to facilitate the fusion. In a retrospective analysis eleven clinical trial patients were identified who had undergone mpMRI on GE 3Tmore » magnet, followed by T2*-weighted imaging (time-period mean±SD = 48±20 days) using a consistent protocol (gradient echo, TR/TE=30/11.8ms, flip angle=12, matrix=256×256×75, voxel size=1.25×1.25×2.5mm). ROIs for prostate tumor lesions were automatically determined using ADC threshold ≤1200 µm2/s. Although the MR protocol was not intended for BOLD analysis, we utilized the T2*-weighted signal normalized to that in nearby muscle; likewise, T2-weighted lesion signal was normalized to muscle, following rigid registration of the T2 to T2* images. The ratio of these normalized signals, T2*/T2, is a measure of BOLD effect in the prostate tumors. Perfusion parameters (Ktrans, ve, kep) were also calculated. Results: T2*/T2 (mean±SE) was found to be substantially lower for Gleason score (GS) 8&9 (0.82±0.04) compared to GS 7 (1.08±0.07). A k-means cluster analysis of T2*/T2 versus kep = Ktrans/ve revealed two distinct clusters, one with higher T2*/T2 and lower kep, containing only GS 7 lesions, and another with lower T2*/T2 and higher kep, associated with tumor aggressiveness. This latter cluster contained all GS 8&9 lesions, as well as some GS 7. 
Conclusion: BOLD MRI, in addition to ADC and kep, may play a role (perhaps orthogonal to Gleason score) in identifying prostate lesions that would benefit from more aggressive radiotherapy.
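The two-cluster k-means step described in this abstract can be sketched as follows. This is a minimal illustration only: the (T2*/T2, kep) pairs are invented values, not study data, and seeding the centroids with the first k points is an assumption made to keep the example deterministic. In practice the two features would normally be standardized first so that neither dominates the distance.

```python
import math

def kmeans(points, k, iters=100):
    """Plain Lloyd's k-means. Centroids are seeded with the first k points
    (an assumption for reproducibility, not the study's procedure)."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical lesions as (T2*/T2 ratio, kep) pairs; values are invented.
points = [(1.10, 0.8), (0.80, 2.1), (1.05, 0.9),
          (0.85, 2.0), (1.12, 0.7), (0.78, 2.3)]
labels, centroids = kmeans(points, k=2)
print(labels)
```

With these invented points, one cluster collects the higher-ratio, lower-kep lesions and the other the lower-ratio, higher-kep lesions, mirroring the qualitative pattern the abstract reports.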
Tire traces - discrimination and classification of pyrolysis-GC/MS profiles.
Gueissaz, Line; Massonnet, Geneviève
2013-07-10
Tire traces can be observed at several crime scenes as vehicles are often used by criminals. The tread abrasion on the road, while braking or skidding, leads to the production of small rubber particles which can be collected for comparison purposes. This research focused on the statistical comparison of Py-GC/MS profiles of tire traces and tire treads. The optimisation of the analytical method was carried out using experimental designs. The aim was to determine the best pyrolysis parameters regarding the repeatability of the results. Thus, the pyrolysis factor effect could also be calculated. The pyrolysis temperature was found to be five times more important than time. Finally, pyrolysis at 650°C for 15 s was selected. Ten tires of different manufacturers and models were used for this study. Several samples were collected on each tire, and several replicates were carried out to study the variability within each tire (intravariability). More than eighty compounds were integrated for each analysis and the variability study showed that more than 75% presented a relative standard deviation (RSD) below 5% for the ten tires, thus supporting a low intravariability. The variability between the ten tires (intervariability) presented higher values and the ten most variant compounds had an RSD value above 13%, supporting their high potential of discrimination between the tires tested. Principal Component Analysis (PCA) was able to fully discriminate the ten tires with the help of the first three principal components. The ten tires were finally used to perform braking tests on a racetrack with a vehicle equipped with an anti-lock braking system. The resulting tire traces were adequately collected using sheets of white gelatine. As for tires, the intravariability for the traces was found to be lower than the intervariability.
Clustering methods were applied, and Ward's method based on the squared Euclidean distance correctly grouped all of the tire trace replicates in the same cluster as the replicates of their corresponding tire. Blind tests on traces were performed, and the traces were correctly assigned to their tire source. These results support the hypothesis that the tested tires, of different manufacturers and models, can be discriminated by a statistical comparison of their chemical profiles. The traces were found to be not differentiable from their source but differentiable from all the other tires present in the subset. The results are promising and will be extended to a larger sample set. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
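The intravariability versus intervariability comparison in this abstract rests on the relative standard deviation (RSD). A minimal sketch of that calculation is given below, assuming the sample standard deviation is used. The peak areas are invented for illustration and are not taken from the study.

```python
import statistics

def rsd_percent(values):
    """Relative standard deviation: sample SD divided by the mean, in percent."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)

# Hypothetical normalized peak areas for one compound (invented numbers).
within_tire = [100.0, 102.0, 98.0, 101.0, 99.0]     # replicates of one tire
between_tires = [100.0, 130.0, 85.0, 115.0, 70.0]   # one value per tire

intra = rsd_percent(within_tire)
inter = rsd_percent(between_tires)
print(f"intravariability RSD = {intra:.1f}%, intervariability RSD = {inter:.1f}%")
```

A compound is a useful discriminator when its RSD between tires is clearly larger than its RSD within a tire, which is the pattern the abstract describes.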
Strong influence of variable treatment on the performance of numerically defined ecological regions.
Snelder, Ton; Lehmann, Anthony; Lamouroux, Nicolas; Leathwick, John; Allenbach, Karin
2009-10-01
Numerical clustering has frequently been used to define hierarchically organized ecological regionalizations, but there has been little robust evaluation of their performance (i.e., the degree to which regions discriminate areas with similar ecological character). In this study we investigated the effect of the weighting and treatment of input variables on the performance of regionalizations defined by agglomerative clustering across a range of hierarchical levels. For this purpose, we developed three ecological regionalizations of Switzerland of increasing complexity using agglomerative clustering. Environmental data for our analysis were drawn from a 400 m grid and consisted of estimates of 11 environmental variables for each grid cell describing climate, topography and lithology. Regionalization 1 was defined from the environmental variables which were given equal weights. We used the same variables in Regionalization 2 but weighted and transformed them on the basis of a dissimilarity model that was fitted to land cover composition data derived for a random sample of cells from interpretation of aerial photographs. Regionalization 3 was a further two-stage development of Regionalization 2 where specific classifications, also weighted and transformed using dissimilarity models, were applied to 25 small scale "sub-domains" defined by Regionalization 2. Performance was assessed in terms of the discrimination of land cover composition for an independent set of sites using classification strength (CS), which measured the similarity of land cover composition within classes and the dissimilarity between classes. Regionalization 2 performed significantly better than Regionalization 1, but the largest gains in performance, compared to Regionalization 1, occurred at coarse hierarchical levels (i.e., CS did not increase significantly beyond the 25-region level). 
Regionalization 3 performed better than Regionalization 2 beyond the 25-region level and CS values continued to increase to the 95-region level. The results show that the performance of regionalizations defined by agglomerative clustering is sensitive to variable weighting and transformation. We conclude that large gains in performance can be achieved by training classifications using dissimilarity models. However, these gains are restricted to a narrow range of hierarchical levels because agglomerative clustering is unable to represent the variation in importance of variables at different spatial scales. We suggest that further advances in the numerical definition of hierarchically organized ecological regionalizations will be possible with techniques developed in the field of statistical modeling of the distribution of community composition.
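The abstract describes classification strength (CS) only in words. One common way to compute such a statistic, assumed here, is the mean between-class dissimilarity minus the mean within-class dissimilarity. The sketch below uses Bray-Curtis dissimilarity and pools all within-class pairs. Both choices are assumptions, and the land cover proportions are invented.

```python
from itertools import combinations

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two composition vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / sum(x + y for x, y in zip(a, b))

def classification_strength(sites, classes):
    """Mean between-class minus mean within-class dissimilarity (assumed form).
    Requires at least one within-class pair and one between-class pair."""
    within, between = [], []
    for i, j in combinations(range(len(sites)), 2):
        d = bray_curtis(sites[i], sites[j])
        (within if classes[i] == classes[j] else between).append(d)
    return sum(between) / len(between) - sum(within) / len(within)

# Hypothetical land cover proportions (forest, pasture) at four sites.
sites = [(0.8, 0.2), (0.7, 0.3), (0.2, 0.8), (0.1, 0.9)]
cs_good = classification_strength(sites, [0, 0, 1, 1])  # groups similar sites
cs_poor = classification_strength(sites, [0, 1, 0, 1])  # splits similar sites
print(cs_good, cs_poor)
```

A regionalization that groups similar sites together scores higher than one that separates them, which is the sense in which CS measures discrimination of land cover composition.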
NASA Astrophysics Data System (ADS)
Hass, H. Christian; Mielck, Finn; Fiorentino, Dario; Papenmeier, Svenja; Holler, Peter; Bartholomä, Alexander
2017-04-01
Marine habitats of shelf seas are in constant dynamic change and therefore need regular assessment, particularly in areas of special interest. In this study, the single-beam acoustic ground discrimination system RoxAnn served to assess seafloor hardness and roughness, and combine these parameters into one variable expressed as RGB (red green blue) color code followed by k-means fuzzy cluster analysis (FCA). The data were collected at a monitoring site west of the island of Helgoland (German Bight, SE North Sea) in the course of four surveys between September 2011 and November 2014. The study area has complex characteristics varying from outcropping bedrock to sandy and muddy sectors with mostly gradual transitions. RoxAnn data made it possible to discriminate all seafloor types that were suggested by ground-truth information (seafloor samples, video). The area appears to be quite stable overall; sediment import (including fluid mud) was detected only from the NW. Although hard substrates (boulders, bedrock) are clearly identified, the signal can be modified by inclination and biocover. Manually, six RoxAnn zones were identified; for the FCA, only three classes are suggested. The latter classification based on 'hard' boundaries would suffice for stakeholder issues, but the former classification based on 'soft' boundaries is preferred to meet state-of-the-art scientific objectives.
Diagnosing the predisposition for diabetes mellitus by means of mid-IR spectroscopy
NASA Astrophysics Data System (ADS)
Frueh, Johanna; Jacob, Stephan; Dolenko, Brion; Haering, Hans-Ullrich; Mischler, Reinhold; Quarder, Ortrud; Renn, Walter; Somorjai, Raymond L.; Staib, Arnulf; Werner, Gerhard H.; Petrich, Wolfgang H.
2002-03-01
The vicious circle of insulin resistance and hyperinsulinemia is considered to precede the manifestation of diabetes type-2 by decades and the corresponding cluster of risk factors is described as the 'insulin resistance syndrome' or 'metabolic syndrome'. Since the present diagnosis of insulin resistance is expensive, time consuming and cumbersome, there is a need for diagnostic alternatives. We conducted a clinical study on 129 healthy volunteers and 99 patients suffering from the metabolic syndrome. We applied mid-infrared spectroscopy to dried serum samples from these donors and evaluated the spectra by means of disease pattern recognition (DPR). Substantial differences were found between the spectra originating from healthy volunteers and those spectra originating from patients with the metabolic syndrome. A linear discriminant analysis was performed using approximately one half of the sample set for teaching the classification algorithm. Within this teaching set, a classification sensitivity and specificity of 84 percent and 81 percent respectively can be derived. Furthermore, the resulting discriminant function was applied to an independent validation of the remaining half of the samples. For the discrimination between 'healthy' and 'metabolic syndrome' a sensitivity and a specificity of 80 percent and 82 percent respectively is obtained upon validating the algorithm with the independent validation set.
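The sensitivity and specificity figures quoted in this abstract follow from a confusion matrix on the teaching or validation set. The sketch below shows the calculation on invented labels. The abstract does not report per-sample classifications, so the label lists are purely hypothetical.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # Sensitivity: fraction of true positives found; specificity: same for negatives.
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical validation set: 1 = metabolic syndrome, 0 = healthy.
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Reporting both numbers on an independent validation half, as the abstract does, guards against the optimism of scores computed on the teaching set alone.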
Bidargaddi, Niranjan; Sarela, Antti; Korhonen, Ilkka
2008-01-01
The objective is to identify whether it is possible to discriminate between normal and abnormal physiological state based on heart rate (HR), heart rate variability (HRV) and movement activity information in subjects with cardiovascular complications. HR, HRV and movement information were obtained from cardiac patients over a period of 6 weeks using an ambulatory activity and single lead ECG monitor. By applying k-means clustering to this information, we obtained 3 clusters in the inactive state and one cluster in the active state. Two clusters in the inactive state, characterized by (a) high HR and low HRV and (b) low HR and low HRV, could be inferred as pathological with abnormal autonomic function. Further, activity information was significant in differentiating between the normal cluster found in the active state and an abnormal cluster found in the inactive state, both with low HRV. This indicates that activity information must be taken into account when interpreting HR and HRV information.
Zhuang, Qianfen; Cao, Wei; Ni, Yongnian; Wang, Yong
2018-08-01
Most of the conventional multidimensional differential sensors currently need at least two-step fabrication, namely synthesis of probe(s) and identification of multiple analytes by mixing of analytes with probe(s), and were conducted using multiple sensing elements or several devices. In this study, we chose five different nucleobases (adenine, cytosine, guanine, thymine, and uracil) as model analytes, and found that under hydrothermal conditions, sodium citrate could react directly with various nucleobases to yield different nitrogen-doped carbon nanodots (CDs). The CDs synthesized from different nucleobases exhibited different fluorescent properties, leading to their respective characteristic fluorescence spectra. Hence, we combined the fluorescence spectra of the CDs with advanced chemometrics like principal component analysis (PCA), hierarchical cluster analysis (HCA), K-nearest neighbor (KNN) and soft independent modeling of class analogy (SIMCA), to present a conceptually novel "synthesis-identification integration" strategy to construct a multidimensional differential sensor for nucleobase discrimination. Single-wavelength excitation fluorescence spectral data, single-wavelength emission fluorescence spectral data, and fluorescence Excitation-Emission Matrices (EEMs) of the CDs were respectively used as input data of the differential sensor. The results showed that the discrimination ability of the multidimensional differential sensor with the EEM data set as input data was superior to those with single-wavelength excitation/emission fluorescence data sets, suggesting that increasing the amount of input data could improve the discrimination power. Two supervised pattern recognition methods, namely KNN and SIMCA, correctly identified the five nucleobases with a classification accuracy of 100%.
The proposed "synthesis-identification integration" strategy together with a multidimensional array of experimental data holds great promise in the construction of differential sensors. Copyright © 2018 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Luna-Pineda, Tatiana; Soto-Feliciano, Kristina; De La Cruz-Montoya, Edwin; Pacheco Londoño, Leonardo C.; Ríos-Velázquez, Carlos; Hernández-Rivera, Samuel P.
2007-04-01
FTIR, Raman spectroscopy and Surface Enhanced Raman Scattering (SERS) require a minimum of sample and allow fast identification of microorganisms. The use of these techniques for characterizing the spectroscopic signatures of these agents and their stimulants has recently gained considerable attention because they can be easily adapted for standoff detection from considerable distances. The techniques also show high sensitivity and selectivity and offer near real time detection duty cycles. This research focuses on laying the grounds for the spectroscopic differentiation of Staphylococcus spp., Pseudomonas spp., Bacillus spp., Salmonella spp., Enterobacter aerogenes, Proteus mirabilis, Klebsiella pneumoniae, and E. coli, together with identification of their subspecies. In order to achieve the proposed objective, protocols to handle, cultivate and analyze the strains have been developed. Spectroscopic similarities and marked differences have been found for Spontaneous or Normal Raman spectra and for SERS using silver nanoparticles. Principal component analysis (PCA), discriminant factor analysis (DFA) and cluster analysis were used to evaluate the efficacy of identifying potential threat bacteria from their spectra collected on single bacteria. The DFA of the bacterial Raman spectra shows little discrimination between the diverse bacterial species; however, the results obtained from SERS show it to be a highly discriminating technique. The spectroscopic study will be extended to examine the spores produced by selected strains since these are more prone to be used as Biological Warfare Agents due to their increased mobility and possibility of airborne transport. Micro infrared spectroscopy as well as fiber coupled FTIR will also be used as possible sensors of target compounds.
From bird to sparrow: Learning-induced modulations in fine-grained semantic discrimination.
De Meo, Rosanna; Bourquin, Nathalie M-P; Knebel, Jean-François; Murray, Micah M; Clarke, Stephanie
2015-09-01
Recognition of environmental sounds is believed to proceed through discrimination steps from broad to more narrow categories. Very little is known about the neural processes that underlie fine-grained discrimination within narrow categories or about their plasticity in relation to newly acquired expertise. We investigated how the cortical representation of birdsongs is modulated by brief training to recognize individual species. During a 60-minute session, participants learned to recognize a set of birdsongs; they significantly improved their performance for trained (T) but not control species (C), which were counterbalanced across participants. Auditory evoked potentials (AEPs) were recorded during pre- and post-training sessions. Pre vs. post changes in AEPs were significantly different between T and C i) at 206-232ms post stimulus onset within a cluster on the anterior part of the left superior temporal gyrus; ii) at 246-291ms in the left middle frontal gyrus; and iii) at 512-545ms in the left middle temporal gyrus as well as bilaterally in the cingulate cortex. All effects were driven by weaker activity for T than C species. Thus, expertise in discriminating T species modulated early stages of semantic processing, during and immediately after the time window that sustains the discrimination between human vs. animal vocalizations. Moreover, the training-induced plasticity is reflected by the sharpening of a left lateralized semantic network, including the anterior part of the temporal convexity and the frontal cortex. However, training to identify birdsongs also influenced the processing of C species, but at a much later stage. Correct discrimination of untrained sounds seems to require an additional step which results from lower-level features analysis such as apperception. We therefore suggest that the access to objects within an auditory semantic category is different and depends on the subject's level of expertise.
More specifically, correct intra-categorical auditory discrimination for untrained items follows the temporal hierarchy and transpires in a late stage of semantic processing. On the other hand, correct categorization of individually trained stimuli occurs earlier, during a period contemporaneous with human vs. animal vocalization discrimination, and involves a parallel semantic pathway requiring expertise. Copyright © 2015 Elsevier Inc. All rights reserved.
Yi, Siyan; Chhoun, Pheak; Suong, Samedy; Thin, Kouland; Brody, Carinne; Tuot, Sovannary
2015-01-01
Background: AIDS-related stigma and mental disorders are the most common conditions in people living with HIV (PLHIV). We therefore conducted this study to examine the association of AIDS-related stigma and discrimination with mental disorders among PLHIV in Cambodia. Methods: A two-stage cluster sampling method was used to select 1,003 adult PLHIV from six provinces. The People Living with HIV Stigma Index was used to measure stigma and discrimination, and a short version of general health questionnaire (GHQ-12) was used to measure mental disorders. Multivariate logistic regression analysis was conducted. Results: The reported experiences of discrimination in communities in the past 12 months ranged from 0.8% for reports of being denied health services to 42.3% for being aware of being gossiped about. Internal stigma was also common ranging from 2.8% for avoiding going to a local clinic and/or hospital to 59.6% for deciding not to have (more) children. The proportions of PLHIV who reported fear of stigma and discrimination ranged from 13.9% for fear of being physically assaulted to 34.5% for fear of being gossiped about. The mean score of GHQ-12 was 3.2 (SD = 2.4). After controlling for several potential confounders, higher levels of mental disorders (GHQ-12 ≥ 4) remained significantly associated with higher levels of experiences of stigma and discrimination in family and communities (AOR = 1.9, 95% CI = 1.4–2.6), higher levels of internal stigma (AOR = 1.7, 95% CI = 1.2–2.3), and higher levels of fear of stigma and discrimination in family and communities (AOR = 1.5, 95% CI = 1.1–2.2). Conclusions: AIDS-related stigma and discrimination among PLHIV in Cambodia are common and may have potential impacts on their mental health conditions. These findings indicate a need for community-based interventions to reduce stigma and discrimination in the general public and to help PLHIV to cope with this situation. PMID:25806534
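The adjusted odds ratios (AOR) and 95% confidence intervals above are the usual output of logistic regression: the AOR is the exponential of a fitted coefficient, and a Wald interval is the exponential of the coefficient plus or minus 1.96 standard errors. The sketch below illustrates that relationship. The coefficient and standard error are not reported in the abstract; they are back-calculated here from the first reported result (AOR = 1.9, CI 1.4 to 2.6) purely for illustration, which assumes the authors used a Wald interval.

```python
import math

def odds_ratio_ci(beta, se, z=1.96):
    """Odds ratio and Wald confidence limits from a logistic coefficient."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

# Assumed values, reverse-engineered from the reported AOR and interval.
beta = math.log(1.9)
se = (math.log(2.6) - math.log(1.4)) / (2 * 1.96)

aor, lower, upper = odds_ratio_ci(beta, se)
print(f"AOR = {aor:.1f}, 95% CI = {lower:.1f} to {upper:.1f}")
```

Because the interval is symmetric on the log scale rather than the odds scale, the upper limit sits farther from the AOR than the lower limit does.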
SECIMTools: a suite of metabolomics data analysis tools.
Kirpich, Alexander S; Ibarra, Miguel; Moskalenko, Oleksandr; Fear, Justin M; Gerken, Joseph; Mi, Xinlei; Ashrafi, Ali; Morse, Alison M; McIntyre, Lauren M
2018-04-20
Metabolomics has the promise to transform the area of personalized medicine with the rapid development of high throughput technology for untargeted analysis of metabolites. Open access, easy-to-use analytic tools that are broadly accessible to the biological community need to be developed. While technology used in metabolomics varies, most metabolomics studies have a set of features identified. Galaxy is an open access platform that enables scientists at all levels to interact with big data. Galaxy promotes reproducibility by saving histories and enabling the sharing of workflows among scientists. SECIMTools (SouthEast Center for Integrated Metabolomics) is a set of Python applications that are available both as standalone tools and wrapped for use in Galaxy. The suite includes a comprehensive set of quality control metrics (retention time window evaluation and various peak evaluation tools), visualization techniques (hierarchical cluster heatmap, principal component analysis, modular modularity clustering), basic statistical analysis methods (partial least squares - discriminant analysis, analysis of variance, t-test, Kruskal-Wallis non-parametric test), advanced classification methods (random forest, support vector machines), and advanced variable selection tools (least absolute shrinkage and selection operator LASSO and Elastic Net). SECIMTools leverages the Galaxy platform and enables integrated workflows for metabolomics data analysis made from building blocks designed for easy use and interpretability. Standard data formats and a set of utilities allow arbitrary linkages between tools to encourage novel workflow designs. The Galaxy framework enables future data integration for metabolomics studies with other omics data.
ERIC Educational Resources Information Center
Ranard, Donald A.; Gilzow, Douglas F.
1989-01-01
Articles in this newsletter issue examine the experiences, strengths, and problems that Amerasian refugees from Vietnam have had while living in the United States. Topics of discussion include discrimination, educational difficulties, resettlement experiences, and cultural difficulties. The concept of cluster site resettlement, a possible solution…
Testing cold dark matter models using Hubble flow variations
NASA Astrophysics Data System (ADS)
Shi, Xiangdong
1999-05-01
COBE-normalized flat (matter plus cosmological constant) and open cold dark matter (CDM) models are tested by comparing their expected Hubble flow variations and the observed variations in a Type Ia supernova sample and a Tully-Fisher cluster sample. The test provides a probe of the CDM power spectrum on scales of $0.02h\,\mathrm{Mpc}^{-1} \lesssim k \lesssim 0.2h\,\mathrm{Mpc}^{-1}$, free of the bias factor $b$. The results favour a low matter content universe, or a flat matter-dominated universe with a very low Hubble constant and/or a very small spectral index $n_{\mathrm{ps}}$, with the best fits having $\Omega_0 \sim 0.3$ to $0.4$. The test is found to be more discriminative to the open CDM models than to the flat CDM models. For example, the test results are found to be compatible with those from the X-ray cluster abundance measurements at smaller length-scales, and consistent with the galaxy and cluster correlation analysis of Peacock & Dodds at similar length-scales, if our universe is flat; but the results are marginally incompatible with the X-ray cluster abundance measurements if our universe is open. The open CDM results are consistent with that of Peacock & Dodds only if the matter density of the universe is less than about 60 per cent of the critical density. The shortcoming of the test is discussed, as are ways to minimize it.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ventura, P.; Dell’Agli, F.; D’Antona, F.
We study the formation of multiple populations in globular clusters (GCs), under the hypothesis that stars in the second generation formed from the winds of intermediate-mass stars, ejected during the asymptotic giant branch (AGB) phase, possibly diluted with pristine gas, sharing the same chemical composition of first-generation stars. To this aim, we use the recent Apache Point Observatory Galactic Evolution Experiment (APOGEE) data, which provide the surface chemistry of a large sample of giant stars, belonging to clusters that span a wide metallicity range. The APOGEE data set is particularly suitable to discriminate among the various pollution scenarios proposed so far, as it provides the surface abundances of Mg and Al, the two elements involved in a nuclear channel extremely sensitive to the temperature, hence to the metallicity of the polluters. The present analysis shows a remarkable agreement between the observations and the theoretical yields from massive AGB stars. In particular, the observed extension of the depletion of Mg and O and the increase in Al is well reproduced by the models and the trend with the metallicity is also fully accounted for. This study further supports the idea that AGB stars were the key players in the pollution of the intra-cluster medium, from which additional generations of stars formed in GCs.
NASA Astrophysics Data System (ADS)
Shao, Mingying; Li, Xuejie; Zheng, Kang; Jiang, Man; Yan, Cuiwei; Li, Yantuan
2016-04-01
The goal of this paper is to explore the relationship between the inorganic elemental fingerprint and the geographical origin identification of Meretricis concha, which is a commonly used marine traditional Chinese medicine (TCM) for the treatment of asthma and scald burns. For that, the inorganic elemental contents of Meretricis concha from five sampling points in Jiaozhou Bay have been determined by means of inductively coupled plasma optical emission spectrometry, and comparative investigations based on the contents of 14 inorganic elements (Al, As, Cd, Co, Cr, Cu, Fe, Hg, Mn, Mo, Ni, Pb, Se and Zn) of the samples from Jiaozhou Bay and the previously reported Rushan Bay were performed. It has been found that the samples from the two bays are approximately classified into two kinds using hierarchical cluster analysis, and a four-factor model based on principal component analysis could explain approximately 75% of the detection data; linear discriminant analysis can also be used to develop a prediction model to distinguish the samples from Jiaozhou Bay and Rushan Bay with an accuracy of about 93%. The results of the present investigation suggest that the inorganic elemental fingerprint based on the combination of the measured elemental content and chemometric analysis is a promising approach for verifying the geographical origin of Meretricis concha, and this strategy should be valuable for the authenticity discrimination of some marine TCM.
Golowczyc, Marina A; Gugliada, Maria J; Hollmann, Axel; Delfederico, Lucrecia; Garrote, Graciela L; Abraham, Analía G; Semorile, Liliana; De Antoni, Graciela
2008-05-01
Considering that several health promoting properties are associated with kefir consumption and that a reliable probiotic product requires a complete identification of the bacterial species, the present work evaluates several proven markers of probiotic potential in eleven isolates of homofermentative lactobacilli isolated from kefir grains, together with their molecular identification and genotypic diversity. Using restriction analysis of amplified ribosomal DNA (ARDRA) and analysis of the 16S-23S rRNA internal spacer region we confirmed that all homofermentative lactobacilli belong to the species Lactobacillus plantarum. RAPD-PCR analysis allowed the discrimination of lactobacilli in five clusters. All isolates exhibited high resistance to bile salt. High survival after one hour of exposure to pH 2.5 was observed in Lb. plantarum CIDCA 8313, 83210, 8327 and 8338. All isolates were hydrophilic and non autoaggregative. Isolate CIDCA 8337 showed the highest percentage of adhesion among strains. All tested lactobacilli had strong inhibitory power against Salmonella typhimurium and Escherichia coli. Seven out of eleven isolates showed inhibition against Sal. enterica and five isolates were effective against Sal. gallinarum. Only CIDCA 8323 and CIDCA 8327 were able to inhibit Sal. sonnei. We did not find any correlation between the five clusters based on RAPD-PCR and the probiotic properties, suggesting that these isolates have unique characteristics.
Exploiting visual search theory to infer social interactions
NASA Astrophysics Data System (ADS)
Rota, Paolo; Dang-Nguyen, Duc-Tien; Conci, Nicola; Sebe, Nicu
2013-03-01
In this paper we propose a new method to infer human social interactions using typical techniques adopted in the literature for visual search and information retrieval. The main piece of information we use to discriminate among different types of interactions is provided by proxemics cues acquired by a tracker, and used to distinguish between intentional and casual interactions. The proxemics information has been acquired through the analysis of two different metrics: on the one hand we observe the current distance between subjects, and on the other hand we measure the O-space synergy between subjects. The obtained values are taken at every time step over a temporal sliding window, and processed in the Discrete Fourier Transform (DFT) domain. The features are eventually merged into a unique array, and clustered using the K-means algorithm. The clusters are reorganized using a second larger temporal window into a Bag Of Words framework, so as to build the feature vector that will feed the SVM classifier.
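The first stage of the pipeline described above, taking a metric over a temporal sliding window and moving it to the DFT domain, can be sketched as follows. This is an illustration under stated assumptions: the window width, the step, and the use of DFT magnitudes as features are not given in the abstract, and the distance series is invented. The later K-means, Bag Of Words and SVM stages are omitted.

```python
import cmath

def dft_magnitudes(window):
    """Magnitude of each DFT coefficient of one window (direct O(n^2) DFT)."""
    n = len(window)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(window)))
        for k in range(n)
    ]

def sliding_dft_features(signal, width, step=1):
    """One DFT-magnitude feature vector per sliding-window position."""
    return [
        dft_magnitudes(signal[start:start + width])
        for start in range(0, len(signal) - width + 1, step)
    ]

# Hypothetical inter-subject distance (metres) at six time steps.
distance = [2.0, 2.0, 2.0, 2.0, 2.0, 2.0]
features = sliding_dft_features(distance, width=4)
print(len(features), [round(m, 6) for m in features[0]])
```

For a constant distance all energy sits in the zero-frequency coefficient, whereas subjects who approach and retreat would spread energy into the higher coefficients, which is what makes the DFT features informative about interaction dynamics.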
Peak, K. Kealy; Duncan, Kathleen E.; Luna, Vicki A.; King, Debra S.; McCarthy, Peter J.; Cannons, Andrew C.
2011-01-01
Bacillus strains with >99.7% 16S rRNA gene sequence similarity were characterized with DNA:DNA hybridization, cellular fatty acid (CFA) analysis, and testing of 100 phenotypic traits. When paired with the most closely related type strain, percent DNA:DNA similarities (% S) for six Bacillus strains were all far below the recommended 70% threshold value for species circumscription with Bacillus nealsonii. An apparent genomic group of four Bacillus strain pairings with 94%–70% S was contradicted by the failure of the strains to cluster in CFA- and phenotype-based dendrograms as well as by their differentiation with 9–13 species level discriminators such as nitrate reduction, temperature range, and acid production from carbohydrates. The novel Bacillus strains were monophyletic and very closely related based on 16S rRNA gene sequence. Coherent genomic groups were not however supported by similarly organized phenotypic clusters. Therefore, the strains were not effectively circumscribed within the taxonomic species definition. PMID:22046187
Female-to-male transmasculine adult health: a mixed-methods community-based needs assessment.
Reisner, Sari L; Gamarel, Kristi E; Dunham, Emilia; Hopwood, Ruben; Hwahng, Sel
2013-01-01
There is a dearth of health research about transgender people. This mixed-methods study sought to formatively investigate the health and perceived health needs of female-to-male transmasculine adults. A cross-sectional quantitative needs assessment (n = 73) and qualitative open-ended input (n = 19) were conducted in June 2011. A latent class analysis modeled six binary health indicators (depression, alcohol use, current smoking, asthma, physical inactivity, overweight status) to identify clusters of presenting health issues. Four clusters of health indicators emerged: (a) depression; (b) syndemic (all indicators); (c) alcohol use, overweight status; and (d) smoking, physical inactivity, overweight status. Transphobic discrimination in health care and avoiding care were each associated with membership in the syndemic class. Qualitative themes included personal health care needs, community needs, and resilience and protective factors. Findings fill an important gap about the health of transmasculine communities, including the need for public health efforts that holistically address concomitant health concerns.
Use of an Electronic Tongue System and Fuzzy Logic to Analyze Water Samples
NASA Astrophysics Data System (ADS)
Braga, Guilherme S.; Paterno, Leonardo G.; Fonseca, Fernando J.
2009-05-01
An electronic tongue (ET) system incorporating 8 chemical sensors was used in combination with two pattern recognition tools, namely principal component analysis (PCA) and Fuzzy logic, for discriminating and classifying water samples from different sources (tap, distilled and three brands of mineral water). The Fuzzy program exhibited a higher accuracy than the PCA and allowed the ET to correctly classify 4 of 5 types of water. The exception was one brand of mineral water, which was sometimes misclassified as tap water. On the other hand, the PCA grouped water samples in three clusters: one with the distilled water, a second with tap water and one brand of mineral water, and a third with the other two brands of mineral water. Samples in the second and third clusters could not be distinguished. Nevertheless, close grouping between repeated tests indicated that the ET system response is reproducible. The potential use of the Fuzzy logic as the data processing tool in combination with an electronic tongue system is discussed.
A HST/WFC3 Search for Substellar Companions in the Orion Nebula Cluster
NASA Astrophysics Data System (ADS)
Strampelli, Giovanni Maria; Aguilar, Jonathan; Aparicio, Antonio; Piotto, Giampaolo; Pueyo, Laurent; Robberto, Massimo
2018-01-01
We present new results relative to the population of substellar binaries in the Orion Nebula Cluster. We reprocessed HST/WFC3 data using an analysis technique developed to detect close companions in the wings of the stellar PSFs, based on the PyKLIP implementation of the KLIP PSF subtraction algorithm. Starting from a sample of ~1200 stars selected over the range J=11-15 mag, we were able to uncover ~80 candidate companions in the magnitude range J=16-23 mag. We use the presence of the 1.4 micron H2O absorption feature in the companion photosphere to discriminate 32 bona fide substellar candidates from a population of reddened background objects. We derive an estimate of the companion mass assuming a 2 Myr isochrone and the reddening of their primary. With 8 stellar companions, 19 brown dwarfs and 5 planetary mass objects, our study provides an unbiased sample of companions at the low-mass end of the IMF, probing the transition from binary to planetary systems.
Cognitive subtypes of dyslexia are characterized by distinct patterns of grey matter volume.
Jednoróg, Katarzyna; Gawron, Natalia; Marchewka, Artur; Heim, Stefan; Grabowska, Anna
2014-09-01
The variety of different causal theories together with inconsistencies about the anatomical brain markers emphasize the heterogeneity of developmental dyslexia. Attempts were made to test on a behavioral level the existence of subtypes of dyslexia showing distinguishable cognitive deficits. Importantly, no research was directly devoted to the investigation of structural brain correlates of these subtypes. Here, for the first time, we applied voxel-based morphometry (VBM) to study grey matter volume (GMV) differences in a relatively large sample (n = 46) of dyslexic children split into three subtypes based on the cognitive deficits: phonological, rapid naming, magnocellular/dorsal, and auditory attention shifting. VBM revealed GMV clusters specific for each studied group including areas of left inferior frontal gyrus, cerebellum, right putamen, and bilateral parietal cortex. In addition, using discriminant analysis on these clusters 79% of cross-validated cases were correctly re-classified into four groups (controls vs. three subtypes). Current results indicate that dyslexia may result from distinct cognitive impairments characterized by distinguishable anatomical markers.
NASA Astrophysics Data System (ADS)
Obuchowski, Nancy A.; Bullen, Jennifer A.
2018-04-01
Receiver operating characteristic (ROC) analysis is a tool used to describe the discrimination accuracy of a diagnostic test or prediction model. While sensitivity and specificity are the basic metrics of accuracy, they have many limitations when characterizing test accuracy, particularly when comparing the accuracies of competing tests. In this article we review the basic study design features of ROC studies, illustrate sample size calculations, present statistical methods for measuring and comparing accuracy, and highlight commonly used ROC software. We include descriptions of multi-reader ROC study design and analysis, address frequently seen problems of verification and location bias, discuss clustered data, and provide strategies for testing endpoints in ROC studies. The methods are illustrated with a study of transmission ultrasound for diagnosing breast lesions.
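As a rough illustration of the accuracy measures this review discusses, the sketch below computes sensitivity and specificity at one threshold and the area under the ROC curve (AUC) from raw test scores. It is a minimal sketch, not the authors' software: the function names and the toy scores are hypothetical, and the AUC is obtained through its standard rank interpretation (the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as one half).

```python
def sensitivity_specificity(scores, labels, threshold):
    """Sensitivity and specificity when 'score >= threshold' is called positive."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    sensitivity = sum(1 for s in pos if s >= threshold) / len(pos)
    specificity = sum(1 for s in neg if s < threshold) / len(neg)
    return sensitivity, specificity


def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties contribute half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = diseased, 0 = non-diseased.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
toy_labels = [1, 1, 0, 1, 0, 1, 0, 0]
print(sensitivity_specificity(toy_scores, toy_labels, 0.5))
print(roc_auc(toy_scores, toy_labels))
```

A single threshold gives one (sensitivity, specificity) pair, whereas the AUC summarizes performance over all thresholds, which is one reason it is convenient for comparing competing tests.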
Yang, Yan-Mei; Lin, Li; Lu, You-Yuan; Ma, Xiao-Hui; Jin, Ling; Zhu, Tian-Tian
2016-03-01
The study aimed to analyze the commercial specifications and grades of wild and cultivated Gentianae Macrophyllae Radix based on multi-indicative constituents. The seven main chemical components contained in Gentianae Macrophyllae Radix were determined by UPLC, and then the quality levels of the chemical components of Gentianae Macrophyllae Radix were clustered and classified by modern statistical methods (canonical correspondence analysis, Fisher discriminant analysis and so on). The quality indices were selected and their correlations were analyzed. Lastly, a comprehensive quantitative grade division of quality was made across different commodity specifications, and across different grades within the same commodity specification, for wild and cultivated material. The results provide a basis for a reasonable division of specification and grade of the commodity of Gentianae Macrophyllae Radix. The range of quality evaluation of main index components (gentiopicrin, loganin acid and swertiamarin) was proposed, and the Herbal Quality Index (HQI) was introduced. The rank discriminant function was established based on the quality by Fisher discriminant analysis. According to the analysis, the quality of wild and cultivated Luobojiao, one of the commercial specifications of Gentianae Macrophyllae Radix, was the best; Mahuajiao, another commercial specification, was average; and Xiaoqinjiao was inferior. Among grades, the quality of first-class cultivated Luobojiao was the worst, second-class was intermediate, and third-class was the best; the quality of first-class wild Luobojiao was secondary, and second-class was the best; the quality of second-class Mahuajiao was secondary, and first-class was the best; the quality of first-class Xiaoqinjiao was secondary, and second-class was the better of the two grades, although the difference was not significant.
The method provides a new approach for the comprehensive quantitative evaluation of the quality of Gentianae Macrophyllae Radix. Copyright© by the Chinese Pharmaceutical Association.
NASA Astrophysics Data System (ADS)
Wright, Dawn; Sayre, Roger; Breyer, Sean; Butler, Kevin; VanGraafeiland, Keith; Goodin, Kathy; Kavanaugh, Maria; Costello, Mark; Cressie, Noel; Basher, Zeenatul; Harris, Peter; Guinotte, John
2017-04-01
A data-derived, ecological stratification-based ecosystem mapping approach was recently demonstrated by Sayre et al. for terrestrial ecosystems, resulting in a standardized map of nearly 4000 global ecological land units (ELUs) at a base spatial resolution of 250 m. The map was commissioned by the Group on Earth Observations for eventual use by the Global Earth Observation System of Systems (GEOSS), and was also a contribution to the Climate Data Initiative of US President Barack Obama. We now present a similar environmental stratification approach for extending a global ecosystems map into the oceans through the delineation of analog global ecological marine units (EMUs). EMUs are comprised of a global point mesh framework, created from over 52 million points from NOAA's World Ocean Atlas with a spatial resolution of ¼ by ¼ degree (~27 x 27 km at the equator) at varying depths and a temporal resolution that is currently decadal. Each point carries attributes of chemical and physical oceanographic structure (temperature, salinity, dissolved oxygen, nitrate, silicate, phosphate) that are likely drivers of many marine ecosystem responses. We used a k-means statistical clustering algorithm to identify physically distinct, relatively homogenous, volumetric regions within the water column (the EMUs). Backwards stepwise discriminant analysis determined whether all six variables contributed significantly to the clustering, and a pseudo F-statistic gave us an optimum number of clusters worldwide of 37. Canonical discriminant analysis verified that all 37 clusters were significantly different from one another. A major intent of the EMUs is to support marine biodiversity conservation assessments, economic valuation studies of marine ecosystem goods and services, and studies of ocean acidification and other impacts (e.g., pollution, resource exploitation, etc.).
As such, they represent a rich geospatial accounting framework for these types of studies, as well as for scientific research on species distributions and their relationships to the marine physical environment. To further benefit the community and facilitate collaborative knowledge building, data products are shared openly and interoperably via www.esri.com/ecological-marine-units. This includes provision of 3D point mesh and EMU clusters at the surface, bottom, and within the water column in varying formats via download, web services or web apps, as well as generic algorithms and GIS workflows that scale from global to regional and local. A major aim is for community members to move the research forward with higher-resolution data from their own field studies or areas of interest, with the original EMU project team assisting with GIS implementation (especially via a new online discussion forum), or hosting of additional data products as needed.
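The pseudo F-statistic mentioned in this abstract is commonly computed as the Calinski-Harabasz ratio of between-cluster to within-cluster dispersion, each divided by its degrees of freedom. The sketch below assumes that definition (the abstract does not spell out the formula) and uses hypothetical function names and toy points; it scores a given cluster labelling rather than running k-means itself.

```python
def pseudo_f(points, labels):
    """Calinski-Harabasz pseudo-F: [B / (k - 1)] / [W / (n - k)].

    B is the size-weighted squared distance of cluster centroids from the
    overall centroid; W is the summed squared distance of points from their
    own cluster centroid. Raises ZeroDivisionError if W is exactly zero.
    """
    n = len(points)
    dims = len(points[0])
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    k = len(clusters)
    if k < 2 or n <= k:
        raise ValueError("need at least 2 clusters and more points than clusters")

    def centroid(members):
        return [sum(p[d] for p in members) / len(members) for d in range(dims)]

    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    overall = centroid(points)
    between = 0.0
    within = 0.0
    for members in clusters.values():
        center = centroid(members)
        between += len(members) * sqdist(center, overall)
        within += sum(sqdist(p, center) for p in members)
    return (between / (k - 1)) / (within / (n - k))


# Hypothetical toy data: two tight groups far apart along the first axis.
toy_points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
print(pseudo_f(toy_points, [0, 0, 1, 1]))  # labelling that follows the groups
print(pseudo_f(toy_points, [0, 1, 0, 1]))  # labelling that cuts across them
```

In practice the statistic would be evaluated for clusterings with different numbers of clusters, and the number giving the largest value taken as a candidate optimum.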
A pattern recognition approach to transistor array parameter variance
NASA Astrophysics Data System (ADS)
da F. Costa, Luciano; Silva, Filipi N.; Comin, Cesar H.
2018-06-01
The properties of semiconductor devices, including bipolar junction transistors (BJTs), are known to vary substantially in terms of their parameters. In this work, an experimental approach, including pattern recognition concepts and methods such as principal component analysis (PCA) and linear discriminant analysis (LDA), was used to investigate the variation among BJTs belonging to integrated circuits known as transistor arrays. It was shown that a good deal of the devices' variance can be captured using only two PCA axes. It was also verified that, though substantially small variation of parameters is observed for BJTs from the same array, larger variation arises between BJTs from distinct arrays, suggesting the consideration of device characteristics in more critical analog designs. As a consequence of its supervised nature, LDA was able to provide a substantial separation of the BJTs into clusters, corresponding to each transistor array. In addition, the LDA mapping into two dimensions revealed a clear relationship between the considered measurements. Interestingly, a specific mapping suggested by the PCA, involving the total harmonic distortion variation expressed in terms of the average voltage gain, yielded an even better separation between the transistor array clusters. All in all, this work yielded interesting results from both semiconductor engineering and pattern recognition perspectives.
Zweber, Zandra M.; Henning, Robert A.; Magley, Vicki J.; Faghri, Pouran
2015-01-01
One potential way that healthy organizations can impact employee health is by promoting a climate for health within the organization. Using a definition of health climate that includes support for health from multiple levels within the organization, this study examines whether all three facets of health climate—the workgroup, supervisor, and organization—work together to contribute to employee well-being. Two samples are used in this study to examine health climate at the individual level and group level in order to provide a clearer picture of the impact of the three health climate facets. k-means cluster analysis was used on each sample to determine groups of individuals based on their levels of the three health climate facets. A discriminant function analysis was then run on each sample to determine if clusters differed on a function of employee well-being variables. Results provide evidence that having strength in all three of the facets is the most beneficial in terms of employee well-being at work. Findings from this study suggest that organizations must consider how health is treated within workgroups, how supervisors support employee health, and what the organization does to support employee health when promoting employee health. PMID:26380360
Ismail, Azimah; Toriman, Mohd Ekhwan; Juahir, Hafizan; Zain, Sharifuddin Md; Habir, Nur Liyana Abdul; Retnam, Ananthy; Kamaruddin, Mohd Khairul Amri; Umar, Roslan; Azid, Azman
2016-05-15
This study presents the determination of the spatial variation and source identification of heavy metal pollution in surface water along the Straits of Malacca using several chemometric techniques. Clustering and discrimination of heavy metal compounds in surface water into two groups (northern and southern regions) are observed according to level of concentrations via the application of chemometric techniques. Principal component analysis (PCA) demonstrates that Cu and Cr dominate the source apportionment in northern region with a total variance of 57.62% and is identified with mining and shipping activities. These are the major contamination contributors in the Straits. Land-based pollution originating from vehicular emission with a total variance of 59.43% is attributed to the high level of Pb concentration in the southern region. The results revealed that one state representing each cluster (northern and southern regions) is significant as the main location for investigating heavy metal concentration in the Straits of Malacca which would save monitoring cost and time. The monitoring of spatial variation and source of heavy metals pollution at the northern and southern regions of the Straits of Malacca, Malaysia, using chemometric analysis. Copyright © 2015 Elsevier Ltd. All rights reserved.
Unsupervised EEG analysis for automated epileptic seizure detection
NASA Astrophysics Data System (ADS)
Birjandtalab, Javad; Pouyan, Maziyar Baran; Nourani, Mehrdad
2016-07-01
Epilepsy is a neurological disorder which can, if not controlled, potentially cause unexpected death. It is crucial to have accurate automatic pattern recognition and data mining techniques to detect the onset of seizures and inform care-givers to help the patients. EEG signals are the preferred biosignals for diagnosis of epileptic patients. Most of the existing pattern recognition techniques used in EEG analysis rely on supervised machine learning algorithms. Since seizure data are heavily under-represented, such techniques are not always practical, particularly when labeled data are not sufficiently available or when disease progression is rapid and the corresponding EEG footprint pattern is not robust. Furthermore, EEG pattern change is highly individual dependent and requires experienced specialists to annotate the seizure and non-seizure events. In this work, we present an unsupervised technique to discriminate seizure and non-seizure events. We employ the power spectral density of EEG signals in different frequency bands as informative features to accurately cluster seizure and non-seizure events. The experimental results obtained so far indicate more than 90% accuracy in clustering seizure and non-seizure events without any prior knowledge of the patient's history.
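The band-wise power spectral density features described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' pipeline: the abstract does not give the band edges or the spectral estimator, so the delta/theta/alpha/beta limits below are conventional choices, the estimator is a plain DFT periodogram, and all function names and the synthetic signal are hypothetical.

```python
import cmath
import math

# Assumed band edges in Hz (conventional values, not taken from the abstract).
BANDS = {"delta": (0.5, 4.0), "theta": (4.0, 8.0), "alpha": (8.0, 13.0), "beta": (13.0, 30.0)}


def periodogram(signal, fs):
    """One-sided periodogram via a direct DFT; returns (frequencies, power)."""
    n = len(signal)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))
        freqs.append(k * fs / n)
        power.append(abs(coeff) ** 2 / (fs * n))
    return freqs, power


def relative_band_powers(signal, fs):
    """Share of the summed band power falling in each assumed EEG band."""
    freqs, power = periodogram(signal, fs)
    raw = {
        name: sum(p for f, p in zip(freqs, power) if low <= f < high)
        for name, (low, high) in BANDS.items()
    }
    total = sum(raw.values())
    return {name: value / total for name, value in raw.items()}


# Synthetic one-second signal: strong 10 Hz component plus a weak 20 Hz one.
fs = 128
toy_signal = [
    math.sin(2 * math.pi * 10 * t / fs) + 0.2 * math.sin(2 * math.pi * 20 * t / fs)
    for t in range(fs)
]
features = relative_band_powers(toy_signal, fs)
print(features)
```

Each EEG window would yield one such feature vector, and the vectors could then be passed to any unsupervised clustering routine to separate seizure-like from non-seizure-like windows.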
Sun, Xiaomei; Wang, Haohao; Han, Xiaofeng; Chen, Shangwei; Zhu, Song; Dai, Jun
2014-12-19
A fingerprint analysis method has been developed for characterization and discrimination of polysaccharides from different Ganoderma by high performance liquid chromatography (HPLC) coupled with chemometrics means. The polysaccharides were extracted under ultrasonic-assisted condition, and then partly hydrolyzed with trifluoroacetic acid. Monosaccharides and oligosaccharides in the hydrolyzates were subjected to pre-column derivatization with 1-phenyl-3-methyl-5-pyrazolone and HPLC analysis, which will generate unique fingerprint information related to chemical composition and structure of polysaccharides. The peak data were imported to professional software in order to obtain standard fingerprint profiles and evaluate similarity of different samples. Meanwhile, the data were further processed by hierarchical cluster analysis and principal component analysis. Polysaccharides from different parts or species of Ganoderma or polysaccharides from the same parts of Ganoderma but from different geographical regions or different strains could be differentiated clearly. This fingerprint analysis method can be applied to identification and quality control of different Ganoderma and their products. Copyright © 2014 Elsevier Ltd. All rights reserved.
Melgarejo, Pablo; Legua, Pilar; Garcia-Sanchez, Francisco; Hernández, Francisca
2016-01-01
Background. Miguel Hernandez University (Spain) created a germplasm bank of the varieties of pomegranate from different Southeastern Spain localities in order to preserve the crop’s wide genetic diversity. Once this collection was established, the next step was to characterize the phenotype of these varieties to determine the phenotypic variability that existed among all the different pomegranate genotypes, and to understand the degree of polymorphism of the morphometric characteristics among varieties. Methods. Fifty-three pomegranate (Punica granatum L.) accessions were studied in order to determine their degree of polymorphism and to detect similarities in their genotypes. Thirty-one morphometric characteristics were measured in fruits, arils, seeds, leaves and flowers, as well as juice characteristics including content, pH, titratable acidity, total soluble solids and maturity index. ANOVA, principal component analysis, and cluster analysis showed that there was a considerable phenotypic diversity (and presumably genetic). Results. The cluster analysis produced a dendrogram with four main clusters. The dissimilarity level ranged from 1 to 25, indicating that there were varieties that were either very similar or very different from each other, with varieties from the same geographical areas being more closely related. Within each varietal group, different degrees of similarity were found, although there were no accessions that were identical. These results highlight the crop’s great genetic diversity, which can be explained not only by their different geographical origins, but also to the fact that these are native plants that have not come from genetic improvement programs. The geographic origin could be, in the cases where no exchanges of plant material took place, a key criterion for cultivar clustering. Conclusions. 
As a result of the present study, we can conclude that among all the parameters analyzed, those related to fruit and seed size as well as the juice’s acidity and pH had the highest power of discrimination, and were, therefore, the most useful for genetic characterization of this pomegranate germplasm bank. This is opposed to leaf and flower characteristics, which had a low power of discrimination. This germplasm bank, more specifically, was characterized by its considerable phenotypic (and presumably genetic) diversity among pomegranate accessions, with a greater proximity existing among the varieties from the same geographical area, suggesting that over time, there had not been an exchange of plant material among the different cultivation areas. In summary, knowledge of the extent of the genetic diversity of the collection is essential for germplasm management. In this study, these data may help in developing strategies for pomegranate germplasm management and may allow for more efficient use of this germplasm in future breeding programs for this species. PMID:27547535
Buried landmine detection using multivariate normal clustering
NASA Astrophysics Data System (ADS)
Duston, Brian M.
2001-10-01
A Bayesian classification algorithm is presented for discriminating buried land mines from buried and surface clutter in Ground Penetrating Radar (GPR) signals. This algorithm is based on multivariate normal (MVN) clustering, where feature vectors are used to identify populations (clusters) of mines and clutter objects. The features are extracted from two-dimensional images created from ground penetrating radar scans. MVN clustering is used to determine the number of clusters in the data and to create probability density models for target and clutter populations, producing the MVN clustering classifier (MVNCC). The Bayesian Information Criteria (BIC) is used to evaluate each model to determine the number of clusters in the data. An extension of the MVNCC allows the model to adapt to local clutter distributions by treating each of the MVN cluster components as a Poisson process and adaptively estimating the intensity parameters. The algorithm is developed using data collected by the Mine Hunter/Killer Close-In Detector (MH/K CID) at prepared mine lanes. The Mine Hunter/Killer is a prototype mine detecting and neutralizing vehicle developed for the U.S. Army to clear roads of anti-tank mines.
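The use of BIC to choose the number of clusters can be illustrated with a much-simplified, one-dimensional stand-in for multivariate normal clustering. The sketch below is an assumption-laden toy, not the MVNCC: it fits univariate Gaussian mixtures with a basic EM loop, scores each fit with BIC = p ln(n) - 2 ln(L), and keeps the model with the lowest score. All names and the synthetic data are hypothetical.

```python
import math
import random


def normal_pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def gmm_log_likelihood(xs, k, iters=50):
    """Fit a 1-D Gaussian mixture with k components by EM; return ln(L)."""
    n = len(xs)
    ordered = sorted(xs)
    mus = [ordered[int((j + 0.5) * n / k)] for j in range(k)]  # quantile start
    mean = sum(xs) / n
    variances = [sum((x - mean) ** 2 for x in xs) / n] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            dens = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, mus, variances)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            mus[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - mus[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var, 1e-6)  # floor to avoid a collapsed component
    return sum(
        math.log(sum(w * normal_pdf(x, m, v) for w, m, v in zip(weights, mus, variances)))
        for x in xs
    )


def bic_for_k(xs, k):
    n_params = 3 * k - 1  # k means, k variances, k - 1 free mixing weights
    return n_params * math.log(len(xs)) - 2.0 * gmm_log_likelihood(xs, k)


# Synthetic data: two well-separated populations (e.g. "target" and "clutter").
rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(100)] + [rng.gauss(10.0, 1.0) for _ in range(100)]
bic_scores = {k: bic_for_k(data, k) for k in (1, 2)}
best_k = min(bic_scores, key=bic_scores.get)
print(bic_scores, best_k)
```

The penalty term grows with the number of parameters, so an extra component is kept only when it improves the likelihood by more than the added complexity costs.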
NASA Technical Reports Server (NTRS)
Storrie-Lombardi, Michael C.; Hoover, Richard B.
2005-01-01
Last year we presented techniques for the detection of fossils during robotic missions to Mars using both structural and chemical signatures [Storrie-Lombardi and Hoover, 2004]. Analyses included lossless compression of photographic images to estimate the relative complexity of a putative fossil compared to the rock matrix [Corsetti and Storrie-Lombardi, 2003] and elemental abundance distributions to provide mineralogical classification of the rock matrix [Storrie-Lombardi and Fisk, 2004]. We presented a classification strategy employing two exploratory classification algorithms (Principal Component Analysis and Hierarchical Cluster Analysis) and a non-linear stochastic neural network to produce a Bayesian estimate of classification accuracy. We now present an extension of our previous experiments exploring putative fossil forms morphologically resembling cyanobacteria discovered in the Orgueil meteorite. Elemental abundances (C6, N7, O8, Na11, Mg12, Al13, Si14, P15, S16, Cl17, K19, Ca20, Fe26) obtained for both extant cyanobacteria and fossil trilobites produce signatures readily distinguishing them from meteorite targets. When compared to elemental abundance signatures for extant cyanobacteria, Orgueil structures exhibit decreased abundances for C6, N7, Na11, Al13, P15, Cl17, K19, Ca20 and increases in Mg12, S16, Fe26. Diatoms and silicified portions of cyanobacterial sheaths exhibiting high levels of silicon and correspondingly low levels of carbon cluster more closely with terrestrial fossils than with extant cyanobacteria. Compression indices verify that variations in random and redundant textural patterns between perceived forms and the background matrix contribute significantly to morphological visual identification. The results provide a quantitative probabilistic methodology for discriminating putative fossils from the surrounding rock matrix and from extant organisms using both structural and chemical information.
The techniques described appear applicable to the geobiological analysis of meteoritic samples or in situ exploration of the Mars regolith. Keywords: cyanobacteria, microfossils, Mars, elemental abundances, complexity analysis, multifactor analysis, principal component analysis, hierarchical cluster analysis, artificial neural networks, paleo-biosignatures
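The idea of using lossless compression as a complexity estimate can be sketched with a generic compressor. The example below is only an illustration: it assumes zlib (DEFLATE) as the compressor and raw byte strings as stand-ins for image regions, whereas the abstract does not say which compressor or image encoding the authors used; the function name and test data are hypothetical.

```python
import random
import zlib


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size; lower means more redundancy."""
    if not data:
        raise ValueError("data must be non-empty")
    return len(zlib.compress(data, 9)) / len(data)


# A highly redundant "texture" versus a noise-like one of the same length.
redundant = b"ab" * 500
rng = random.Random(0)
noisy = bytes(rng.randrange(256) for _ in range(1000))

print(compression_ratio(redundant))  # small: the repeating pattern compresses well
print(compression_ratio(noisy))      # near or above 1: little structure to exploit
```

Comparing such a ratio for a candidate structure against the ratio for the surrounding matrix gives one crude, quantitative handle on how much their textures differ in redundancy.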
Hakimzadeh, Neda; Parastar, Hadi; Fattahi, Mohammad
2014-01-24
In this study, multivariate curve resolution (MCR) and multivariate classification methods are proposed to develop a new chemometric strategy for comprehensive analysis of high-performance liquid chromatography-diode array absorbance detection (HPLC-DAD) fingerprints of sixty Salvia reuterana samples from five different geographical regions. Different chromatographic problems occurred during HPLC-DAD analysis of S. reuterana samples, such as baseline/background contribution and noise, low signal-to-noise ratio (S/N), asymmetric peaks, elution time shifts, and peak overlap are handled using the proposed strategy. In this way, chromatographic fingerprints of sixty samples are properly segmented to ten common chromatographic regions using local rank analysis and then, the corresponding segments are column-wise augmented for subsequent MCR analysis. Extended multivariate curve resolution-alternating least squares (MCR-ALS) is used to obtain pure component profiles in each segment. In general, thirty-one chemical components were resolved using MCR-ALS in sixty S. reuterana samples and the lack of fit (LOF) values of MCR-ALS models were below 10.0% in all cases. Pure spectral profiles are considered for identification of chemical components by comparing their resolved spectra with the standard ones and twenty-four components out of thirty-one components were identified. Additionally, pure elution profiles are used to obtain relative concentrations of chemical components in different samples for multivariate classification analysis by principal component analysis (PCA) and k-nearest neighbors (kNN). Inspection of the PCA score plot (explaining 76.1% of variance accounted for three PCs) showed that S. reuterana samples belong to four clusters. The degree of class separation (DCS) which quantifies the distance separating clusters in relation to the scatter within each cluster is calculated for four clusters and it was in the range of 1.6-5.8. 
These results are then confirmed by kNN. In addition, according to the PCA loading plot and kNN dendrogram of thirty-one variables, five chemical constituents of luteolin-7-o-glucoside, salvianolic acid D, rosmarinic acid, lithospermic acid and trijuganone A are identified as the most important variables (i.e., chemical markers) for clusters discrimination. Finally, the effect of different chemical markers on samples differentiation is investigated using counter-propagation artificial neural network (CP-ANN) method. It is concluded that the proposed strategy can be successfully applied for comprehensive analysis of chromatographic fingerprints of complex natural samples. Copyright © 2013 Elsevier B.V. All rights reserved.
Layfield, Lester J; Esebua, Magda; Schmidt, Robert L
2016-07-01
The separation of branchial cleft cysts from metastatic cystic squamous cell carcinomas in adults can be clinically and cytologically challenging. Diagnostic accuracy for separation is reported to be as low as 75% prompting some authors to recommend frozen section evaluation of suspected branchial cleft cysts before resection. We evaluated 19 cytologic features to determine which were useful in this distinction. Thirty-three cases (21 squamous carcinoma and 12 branchial cysts) of histologically confirmed cystic lesions of the lateral neck were graded for the presence or absence of 19 cytologic features by two cytopathologists. The cytologic features were analyzed for agreement between observers and underwent multivariate analysis for correlation with the diagnosis of carcinoma. Interobserver agreement was greatest for increased nuclear/cytoplasmic (N/C) ratio, pyknotic nuclei, and irregular nuclear membranes. Recursive partitioning analysis showed increased N/C ratio, small clusters of cells, and irregular nuclear membranes were the best discriminators. The distinction of branchial cleft cysts from cystic squamous cell carcinoma is cytologically difficult. Both digital image analysis and p16 testing have been suggested as aids in this separation, but analysis of cytologic features remains the main method for diagnosis. In an analysis of 19 cytologic features, we found that high nuclear cytoplasmic ratio, irregular nuclear membranes, and small cell clusters were most helpful in their distinction. Diagn. Cytopathol. 2016;44:561-567. © 2016 Wiley Periodicals, Inc.
Zhu, Hongbin; Wang, Chunyan; Qi, Yao; Song, Fengrui; Liu, Zhiqiang; Liu, Shuying
2012-11-08
This study presents a novel and rapid method to identify chemical markers for the quality control of Radix Aconiti Preparata, a traditional herbal medicine widely used worldwide. In the method, samples prepared with a fast extraction procedure were analyzed using direct analysis in real time mass spectrometry (DART MS) combined with multivariate data analysis. At present, the quality assessment approach for Radix Aconiti Preparata is based on the two processing methods recorded in the Chinese Pharmacopoeia for the purpose of reducing the toxicity of Radix Aconiti and ensuring its clinical therapeutic efficacy. In order to ensure safety and efficacy in clinical use, the processing degree of Radix Aconiti should be well controlled and assessed. In the paper, hierarchical cluster analysis and principal component analysis were performed to evaluate the DART MS data of Radix Aconiti Preparata samples at different processing times. The results showed that the well-processed Radix Aconiti Preparata, the unqualified processed samples and the raw Radix Aconiti could be clustered reasonably according to their constituents. The loading plot shows that the main chemical markers having the most influence on the discrimination between the qualified and unqualified samples were mainly some monoester diterpenoid aconitines and diester diterpenoid aconitines, i.e. benzoylmesaconine, hypaconitine, mesaconitine, neoline, benzoylhypaconine, benzoylaconine, fuziline, aconitine and 10-OH-mesaconitine. The established DART MS approach in combination with multivariate data analysis provides a very flexible and reliable method for quality assessment of toxic herbal medicine. Copyright © 2012 Elsevier B.V. All rights reserved.
Simulating the Birth of Massive Star Clusters: Is Destruction Inevitable?
NASA Astrophysics Data System (ADS)
Rosen, Anna
2013-10-01
Very early in its operation, the Hubble Space Telescope (HST) opened an entirely new frontier: study of the demographics and properties of star clusters far beyond the Milky Way. However, interpretation of HST's observations has proven difficult, and has led to the development of two conflicting models. One view is that most massive star clusters are disrupted during their infancy by feedback from newly formed stars (i.e., "infant mortality"), independent of cluster mass or environment. The other model is that most star clusters survive their infancy and are disrupted later by mass-dependent dynamical processes. Since observations at present have failed to discriminate between these views, we propose a theoretical investigation to provide new insight. We will perform radiation-hydrodynamic simulations of the formation of massive star clusters, including for the first time a realistic treatment of the most important stellar feedback processes. These simulations will elucidate the physics of stellar feedback, and allow us to determine whether cluster disruption is mass-dependent or -independent. We will also use our simulations to search for observational diagnostics that can distinguish bound from unbound clusters, and to predict how cluster disruption affects the cluster luminosity function in a variety of galactic environments.
NASA Astrophysics Data System (ADS)
Stauffer, R. M.; Thompson, A. M.; Young, G. S.; Oltmans, S. J.; Johnson, B.
2016-12-01
Ozone (O3) climatologies are typically created by averaging ozonesonde profiles on a monthly or seasonal basis, either for specific regions or zonally. We demonstrate the advantages of using a statistical clustering technique, self-organizing maps (SOM), over this simple averaging, through analysis of more than 4500 sonde profiles taken from the long-term US sites at Boulder, CO; Huntsville, AL; Trinidad Head, CA; and Wallops Island, VA. First, we apply SOM to O3 mixing ratios from surface to 12 km amsl. At all four sites, profiles in SOM clusters exhibit similar tropopause height, 500 hPa height and temperature, and total and tropospheric column O3. Second, when profiles from each SOM cluster are compared to monthly O3 means, near-tropopause O3 in three of the clusters is double (over +100 ppbv) the climatological O3 mixing ratio. The three clusters include 13-16% of all profiles, mostly from winter and spring. Large mid-tropospheric deviations from monthly means are found in two highly-populated clusters that represent either distinctly polluted (summer) or clean O3 (fall-winter, high tropopause) profiles. Thus, SOM indeed appear to represent US O3 profile statistics better than conventional climatologies. In the case of Trinidad Head, SOM clusters of O3 profile data from the lower troposphere (surface-6 km amsl) can discriminate background vs polluted O3 and the meteorology associated with each. Two of nine O3 clusters exhibit thin layers (100s of m thick) of high O3, typically between 1 and 4 km. Comparisons between clusters and downwind, high-altitude surface O3 measurements display a marked impact of the elevated tropospheric O3. Days corresponding to the high O3 clusters exhibit hourly surface O3 anomalies at surface sites of +5-10 ppbv compared to a climatology; the anomalies can last up to four days. We also explore applications of SOM to tropical ozonesonde profiles, where tropospheric O3 variability is generally smaller.
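The clustering idea in this abstract, grouping whole vertical profiles with a self-organizing map and comparing each cluster to a plain average, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' configuration: the profiles are synthetic, the map is one-dimensional with only two nodes, and the function names `train_som` and `assign` as well as the learning-rate and neighbourhood schedules are hypothetical choices.

```python
import numpy as np

def train_som(data, n_nodes=2, n_epochs=50, lr0=0.5, sigma0=1.0, seed=0):
    """Train a 1-D self-organizing map; returns weights (n_nodes, n_features)."""
    rng = np.random.default_rng(seed)
    # Initialise node weights from randomly chosen input profiles.
    picks = rng.choice(len(data), size=n_nodes, replace=False)
    weights = data[picks].astype(float)
    grid = np.arange(n_nodes)
    n_steps = n_epochs * len(data)
    step = 0
    for _ in range(n_epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                   # linearly decaying rate
            sigma = max(sigma0 * (1.0 - frac), 0.01)  # shrinking neighbourhood
            # Best-matching unit: the node closest to this profile.
            bmu = np.argmin(np.linalg.norm(weights - x, axis=1))
            # Gaussian neighbourhood on the map grid, centred on the BMU.
            h = np.exp(-((grid - bmu) ** 2) / (2.0 * sigma ** 2))
            weights += lr * h[:, None] * (x - weights)
            step += 1
    return weights

def assign(data, weights):
    """Return the index of the closest node for every profile."""
    d = np.linalg.norm(data[:, None, :] - weights[None, :, :], axis=-1)
    return d.argmin(axis=1)

# Synthetic O3 profiles (ppbv) on a 0-12 km grid; all numbers are invented.
rng = np.random.default_rng(1)
z = np.linspace(0.0, 12.0, 25)
background = 40.0 + 5.0 * z
bump = 100.0 * np.exp(-((z - 11.0) / 1.0) ** 2)  # near-tropopause enhancement
clean = background + rng.normal(0.0, 3.0, size=(20, z.size))
enhanced = background + bump + rng.normal(0.0, 3.0, size=(20, z.size))
profiles = np.vstack([clean, enhanced])

weights = train_som(profiles)
labels = assign(profiles, weights)

# Deviation of each cluster's mean profile from the all-profile mean,
# analogous to comparing SOM clusters against a simple climatology.
overall_mean = profiles.mean(axis=0)
for k in range(len(weights)):
    anomaly = profiles[labels == k].mean(axis=0) - overall_mean
    print(f"node {k}: {np.sum(labels == k)} profiles, "
          f"max |anomaly| = {np.abs(anomaly).max():.1f} ppbv")
```

The simple all-profile mean blends the two regimes into a profile that matches neither, while the node means recover them separately, which is the advantage the abstract claims for SOM over averaging. A study-scale analysis would use a two-dimensional map with more nodes (the abstract mentions nine clusters for Trinidad Head).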