Mixture modelling for cluster analysis.
McLachlan, G J; Chang, S U
2004-10-01
Cluster analysis via a finite mixture model approach is considered. With this approach to clustering, the data can be partitioned into a specified number of clusters g by first fitting a mixture model with g components. An outright clustering of the data is then obtained by assigning an observation to the component to which it has the highest estimated posterior probability of belonging; that is, the ith cluster consists of those observations assigned to the ith component (i = 1,..., g). The focus is on the use of mixtures of normal components for the cluster analysis of data that can be regarded as being continuous. But attention is also given to the case of mixed data, where the observations consist of both continuous and discrete variables.
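The assignment rule described in this abstract can be sketched as follows. This is a minimal illustration, assuming univariate data and a plain EM fit of g normal components; the function names, starting values, and sample data are hypothetical and are not taken from the paper.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def fit_mixture(data, means, n_iter=50):
    """Fit a g-component normal mixture by EM and return posterior probabilities."""
    g = len(means)
    means = list(means)
    variances = [1.0] * g          # assumed starting variances
    weights = [1.0 / g] * g        # assumed equal starting proportions
    post = []
    for _ in range(n_iter):
        # E-step: posterior probability that each observation belongs to each component
        post = []
        for x in data:
            dens = [weights[i] * normal_pdf(x, means[i], variances[i]) for i in range(g)]
            total = sum(dens)
            post.append([d / total for d in dens])
        # M-step: update proportions, means and variances from the posteriors
        for i in range(g):
            n_i = sum(p[i] for p in post)
            weights[i] = n_i / len(data)
            means[i] = sum(p[i] * x for p, x in zip(post, data)) / n_i
            var = sum(p[i] * (x - means[i]) ** 2 for p, x in zip(post, data)) / n_i
            variances[i] = max(var, 1e-6)  # floor keeps the variance from collapsing
    return post

def hard_assign(post):
    """Assign each observation to the component with the highest posterior."""
    return [max(range(len(p)), key=lambda i: p[i]) for p in post]

# Hypothetical data with two well-separated groups
data = [0.9, 1.1, 1.0, 1.2, 0.8, 5.0, 5.2, 4.9, 5.1, 4.8]
labels = hard_assign(fit_mixture(data, means=[0.0, 6.0]))
print(labels)
```

The hard labels are obtained only after fitting; the posterior probabilities themselves remain available as a measure of how certain each assignment is.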
Peterson, Leif E
2002-01-01
CLUSFAVOR (CLUSter and Factor Analysis with Varimax Orthogonal Rotation) 5.0 is a Windows-based computer program for hierarchical cluster and principal-component analysis of microarray-based transcriptional profiles. CLUSFAVOR 5.0 standardizes input data; sorts data according to gene-specific coefficient of variation, standard deviation, average and total expression, and Shannon entropy; performs hierarchical cluster analysis using nearest-neighbor, unweighted pair-group method using arithmetic averages (UPGMA), or furthest-neighbor joining methods, and Euclidean, correlation, or jack-knife distances; and performs principal-component analysis. PMID:12184816
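The three joining methods named in this abstract differ only in how the distance between two clusters is derived from the pairwise distances of their members. The sketch below illustrates that difference with Euclidean distance; it is not CLUSFAVOR's code, and the function names and expression profiles are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Each linkage reduces the list of member-to-member distances to one number.
LINKAGES = {
    "nearest": min,                            # nearest-neighbor (single linkage)
    "furthest": max,                           # furthest-neighbor (complete linkage)
    "upgma": lambda ds: sum(ds) / len(ds),     # unweighted average of all pairs
}

def agglomerate(points, n_clusters, linkage="upgma"):
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    combine = LINKAGES[linkage]
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                ds = [euclidean(points[i], points[j])
                      for i in clusters[a] for j in clusters[b]]
                d = combine(ds)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]

# Hypothetical two-condition expression profiles for five genes
profiles = [(1.0, 2.0), (1.1, 2.1), (5.0, 5.0), (5.2, 4.9), (9.0, 0.5)]
print(agglomerate(profiles, 3, linkage="upgma"))
```

On well-separated data such as this, all three linkages give the same grouping; they diverge when clusters are elongated or overlap.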
Assessment of cluster yield components by image analysis.
Diago, Maria P; Tardaguila, Javier; Aleixos, Nuria; Millan, Borja; Prats-Montalban, Jose M; Cubero, Sergio; Blasco, Jose
2015-04-01
Berry weight, berry number and cluster weight are key parameters for yield estimation in the wine and table grape industries. Current yield prediction methods are destructive, labour-demanding and time-consuming. In this work, a new methodology based on image analysis was developed to determine cluster yield components in a fast and inexpensive way. Clusters of seven different red varieties of grapevine (Vitis vinifera L.) were photographed under laboratory conditions and their cluster yield components manually determined after image acquisition. Two algorithms based on the Canny and the logarithmic image processing approaches were tested to find the contours of the berries in the images prior to berry detection performed by means of the Hough Transform. Results were obtained in two ways: by analysing either a single image of the cluster or four images per cluster from different orientations. The best results (R² between 69% and 95% in berry detection and between 65% and 97% in cluster weight estimation) were achieved using four images and the Canny algorithm. The capability of the image-analysis model to predict berry weight was 84%. The new and low-cost methodology presented here enabled the assessment of cluster yield components, saving time and providing inexpensive information in comparison with current manual methods. © 2014 Society of Chemical Industry.
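The R² figures quoted in this abstract compare image-based estimates with manual measurements. As a rough sketch, assuming a simple linear relationship between the two, R² can be computed as the squared Pearson correlation; the berry counts below are invented for illustration and are not data from the study.

```python
def r_squared(estimated, measured):
    """Coefficient of determination of a simple linear fit (squared correlation)."""
    n = len(estimated)
    mean_e = sum(estimated) / n
    mean_m = sum(measured) / n
    s_em = sum((e - mean_e) * (m - mean_m) for e, m in zip(estimated, measured))
    s_ee = sum((e - mean_e) ** 2 for e in estimated)
    s_mm = sum((m - mean_m) ** 2 for m in measured)
    return s_em ** 2 / (s_ee * s_mm)

# Hypothetical berry counts: detected in images vs. counted by hand
image_counts = [52, 61, 75, 88, 97]
manual_counts = [60, 66, 85, 95, 110]
r2 = r_squared(image_counts, manual_counts)
print(round(100 * r2, 1), "% of variance explained")
```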
Gifford, Elizabeth V; Tavakoli, Sara; Weingardt, Kenneth R; Finney, John W; Pierson, Heather M; Rosen, Craig S; Hagedorn, Hildi J; Cook, Joan M; Curran, Geoff M
2012-01-01
Evidence-based psychological treatments (EBPTs) are clusters of interventions, but it is unclear how providers actually implement these clusters in practice. A disaggregated measure of EBPTs was developed to characterize clinicians' component-level evidence-based practices and to examine relationships among these practices. Survey items captured components of evidence-based treatments based on treatment integrity measures. The Web-based survey was conducted with 75 U.S. Department of Veterans Affairs (VA) substance use disorder (SUD) practitioners and 149 non-VA community-based SUD practitioners. Clinicians' self-designated treatment orientations were positively related to their endorsement of those EBPT components; however, clinicians used components from a variety of EBPTs. Hierarchical cluster analysis indicated that clinicians combined and organized interventions from cognitive-behavioral therapy, the community reinforcement approach, motivational interviewing, structured family and couples therapy, 12-step facilitation, and contingency management into clusters including empathy and support, treatment engagement and activation, abstinence initiation, and recovery maintenance. Understanding how clinicians use EBPT components may lead to improved evidence-based practice dissemination and implementation. Published by Elsevier Inc.
NASA Astrophysics Data System (ADS)
Ma, Mengli; Lei, En; Meng, Hengling; Wang, Tiantao; Xie, Linyan; Shen, Dong; Xianwang, Zhou; Lu, Bingyue
2017-08-01
Amomum tsao-ko is a commercial plant that is used for various purposes in the medicinal and food industries. For the present investigation, 44 germplasm samples were collected from Jinping County of Yunnan Province. Cluster analysis and 2-dimensional principal component analysis (PCA) were used to represent the genetic relationships among Amomum tsao-ko samples by using simple sequence repeat (SSR) markers. Cluster analysis clearly distinguished the sample groups. Two major clusters were formed: the first (Cluster I) consisted of 34 individuals and the second (Cluster II) consisted of 10 individuals; Cluster I, as the main group, contained multiple sub-clusters. PCA also showed 2 groups: PCA Group 1 included 29 individuals and PCA Group 2 included 12 individuals, consistent with the results of cluster analysis. The purpose of the present investigation was to provide information on the genetic relationships of Amomum tsao-ko germplasm resources in the main producing areas, and to provide a theoretical basis for the protection and utilization of Amomum tsao-ko resources.
Principal Component Clustering Approach to Teaching Quality Discriminant Analysis
ERIC Educational Resources Information Center
Xian, Sidong; Xia, Haibo; Yin, Yubo; Zhai, Zhansheng; Shang, Yan
2016-01-01
Teaching quality is the lifeline of higher education. Many universities have made effective progress in evaluating teaching quality. In this paper, we establish the students' evaluation of teaching (SET) discriminant analysis model and algorithm based on principal component clustering analysis. Additionally, we classify the SET…
Lin, Nan; Jiang, Junhai; Guo, Shicheng; Xiong, Momiao
2015-01-01
Due to advances in sensor technology, the growing volume of medical image data makes it possible to visualize anatomical changes in biological tissues. As a consequence, medical images have the potential to enhance the diagnosis of disease, the prediction of clinical outcomes and the characterization of disease progression. At the same time, the growing data dimensions pose great methodological and computational challenges for the representation and selection of features in image cluster analysis. To address these challenges, we first extend functional principal component analysis (FPCA) from one dimension to two dimensions to fully capture the spatial variation of the image signals. The image signals contain a large number of redundant features which provide no additional information for clustering analysis. The widely used methods for removing the irrelevant features are sparse clustering algorithms using a lasso-type penalty to select the features. However, the accuracy of clustering using a lasso-type penalty depends on the selection of the penalty parameters and the threshold value, which are difficult to determine in practice. Recently, randomized algorithms have received a great deal of attention in big data analysis. This paper presents a randomized algorithm for accurate feature selection in image clustering analysis. The proposed method is applied to both the liver and kidney cancer histology image data from the TCGA database. The results demonstrate that the randomized feature selection method coupled with functional principal component analysis substantially outperforms the current sparse clustering algorithms in image cluster analysis. PMID:26196383
Multivariate Statistical Analysis of MSL APXS Bulk Geochemical Data
NASA Astrophysics Data System (ADS)
Hamilton, V. E.; Edwards, C. S.; Thompson, L. M.; Schmidt, M. E.
2014-12-01
We apply cluster and factor analyses to bulk chemical data of 130 soil and rock samples measured by the Alpha Particle X-ray Spectrometer (APXS) on the Mars Science Laboratory (MSL) rover Curiosity through sol 650. Multivariate approaches such as principal components analysis (PCA), cluster analysis, and factor analysis complement more traditional approaches (e.g., Harker diagrams), with the advantage of simultaneously examining the relationships between multiple variables for large numbers of samples. Principal components analysis has been applied with success to APXS, Pancam, and Mössbauer data from the Mars Exploration Rovers. Factor analysis and cluster analysis have been applied with success to thermal infrared (TIR) spectral data of Mars. Cluster analyses group the input data by similarity, where there are a number of different methods for defining similarity (hierarchical, density, distribution, etc.). For example, without any assumptions about the chemical contributions of surface dust, preliminary hierarchical and K-means cluster analyses clearly distinguish the physically adjacent rock targets Windjana and Stephen as being distinctly different from lithologies observed prior to Curiosity's arrival at The Kimberley. In addition, they are separated from each other, consistent with chemical trends observed in variation diagrams but without requiring assumptions about chemical relationships. We will discuss the variation in cluster analysis results as a function of clustering method and pre-processing (e.g., log transformation, correction for dust cover) and implications for interpreting chemical data. Factor analysis shares some similarities with PCA, and examines the variability among observed components of a dataset so as to reveal variations attributable to unobserved components.
Factor analysis has been used to extract the TIR spectra of components that are typically observed in mixtures and only rarely in isolation; there is the potential for similar results with data from APXS. These techniques offer new ways to understand the chemical relationships between the materials interrogated by Curiosity, and potentially their relation to materials observed by APXS instruments on other landed missions.
Ramli, Saifullah; Ismail, Noryati; Alkarkhi, Abbas Fadhl Mubarek; Easa, Azhar Mat
2010-08-01
Banana peel flour (BPF) prepared from green or ripe Cavendish and Dream banana fruits was assessed for its total starch (TS), digestible starch (DS), resistant starch (RS), total dietary fibre (TDF), soluble dietary fibre (SDF) and insoluble dietary fibre (IDF). Principal component analysis (PCA) identified that only 1 component was responsible for 93.74% of the total variance in the starch and dietary fibre components that differentiated ripe and green banana flours. Cluster analysis (CA) applied to similar data yielded two statistically significant clusters (green and ripe bananas), indicating differences in behaviour according to the stage of ripeness based on starch and dietary fibre components. We concluded that the starch and dietary fibre components could be used to discriminate between flours prepared from peels obtained from fruits of different ripeness. The results were also suggestive of the potential of green and ripe BPF as functional ingredients in food.
Ramli, Saifullah; Ismail, Noryati; Alkarkhi, Abbas Fadhl Mubarek; Easa, Azhar Mat
2010-01-01
Banana peel flour (BPF) prepared from green or ripe Cavendish and Dream banana fruits was assessed for its total starch (TS), digestible starch (DS), resistant starch (RS), total dietary fibre (TDF), soluble dietary fibre (SDF) and insoluble dietary fibre (IDF). Principal component analysis (PCA) identified that only 1 component was responsible for 93.74% of the total variance in the starch and dietary fibre components that differentiated ripe and green banana flours. Cluster analysis (CA) applied to similar data yielded two statistically significant clusters (green and ripe bananas), indicating differences in behaviour according to the stage of ripeness based on starch and dietary fibre components. We concluded that the starch and dietary fibre components could be used to discriminate between flours prepared from peels obtained from fruits of different ripeness. The results were also suggestive of the potential of green and ripe BPF as functional ingredients in food. PMID:24575193
Motegi, Hiromi; Tsuboi, Yuuri; Saga, Ayako; Kagami, Tomoko; Inoue, Maki; Toki, Hideaki; Minowa, Osamu; Noda, Tetsuo; Kikuchi, Jun
2015-11-04
There is an increasing need to use multivariate statistical methods for understanding biological functions, identifying the mechanisms of diseases, and exploring biomarkers. In addition to classical analyses such as hierarchical cluster analysis, principal component analysis, and partial least squares discriminant analysis, various multivariate strategies, including independent component analysis, non-negative matrix factorization, and multivariate curve resolution, have recently been proposed. However, determining the number of components is problematic. Despite the proposal of several different methods, no satisfactory approach has yet been reported. To resolve this problem, we implemented a new idea: classifying a component as "reliable" or "unreliable" based on the reproducibility of its appearance, regardless of the number of components in the calculation. Using the clustering method for classification, we applied this idea to multivariate curve resolution-alternating least squares (MCR-ALS). Comparisons between conventional and modified methods applied to proton nuclear magnetic resonance (¹H-NMR) spectral datasets derived from known standard mixtures and biological mixtures (urine and feces of mice) revealed that more plausible results are obtained by the modified method. In particular, clusters containing little information were detected with reliability. This strategy, named "cluster-aided MCR-ALS," will facilitate the attainment of more reliable results from metabolomics datasets.
Yin, Yihang; Liu, Fengzheng; Zhou, Xiang; Li, Quanzhong
2015-08-07
Wireless sensor networks (WSNs) have been widely used to monitor the environment, and sensors in WSNs are usually power constrained. Because inner-node communication consumes most of the power, efficient data compression schemes are needed to reduce the data transmission to prolong the lifetime of WSNs. In this paper, we propose an efficient data compression model to aggregate data, which is based on spatial clustering and principal component analysis (PCA). First, sensors with a strong temporal-spatial correlation are grouped into one cluster for further processing with a novel similarity measure metric. Next, sensor data in one cluster are aggregated in the cluster head sensor node, and an efficient adaptive strategy is proposed for the selection of the cluster head to conserve energy. Finally, the proposed model applies principal component analysis with an error bound guarantee to compress the data and retain the definite variance at the same time. Computer simulations show that the proposed model can greatly reduce communication and obtain a lower mean square error than other PCA-based algorithms.
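One simple way to read the "retain the definite variance" step in this abstract is to keep the smallest number of principal components whose cumulative share of variance reaches a chosen target. The sketch below assumes the eigenvalues of the covariance matrix are already available; the function name, eigenvalues, and thresholds are hypothetical, and the paper's actual error-bound mechanism may differ.

```python
def components_for_variance(eigenvalues, target):
    """Smallest number of principal components whose cumulative variance share >= target."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for k, value in enumerate(ordered, start=1):
        running += value
        if running / total >= target:
            return k
    return len(ordered)

# Hypothetical covariance eigenvalues for the readings gathered at one cluster head
eigenvalues = [4.0, 2.0, 1.0, 0.5, 0.5]
print(components_for_variance(eigenvalues, 0.75))  # components needed for 75% of variance
print(components_for_variance(eigenvalues, 0.90))  # components needed for 90% of variance
```

Fewer retained components mean less data to transmit, at the cost of a larger reconstruction error, which is the trade-off the abstract describes.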
Use of multivariate statistics to identify unreliable data obtained using CASA.
Martínez, Luis Becerril; Crispín, Rubén Huerta; Mendoza, Maximino Méndez; Gallegos, Oswaldo Hernández; Martínez, Andrés Aragón
2013-06-01
In order to identify unreliable data in a dataset of motility parameters obtained from a pilot study acquired by a veterinarian with experience in boar semen handling, but without experience in the operation of a computer assisted sperm analysis (CASA) system, a multivariate graphical and statistical analysis was performed. Sixteen boar semen samples were aliquoted then incubated with varying concentrations of progesterone from 0 to 3.33 µg/ml and analyzed in a CASA system. After standardization of the data, Chernoff faces were pictured for each measurement, and a principal component analysis (PCA) was used to reduce the dimensionality and pre-process the data before hierarchical clustering. The first twelve individual measurements showed abnormal features when Chernoff faces were drawn. PCA revealed that principal components 1 and 2 explained 63.08% of the variance in the dataset. Values of principal components for each individual measurement of semen samples were mapped to identify differences among treatments or among boars. Twelve individual measurements presented low values of principal component 1. Confidence ellipses on the map of principal components showed no statistically significant effects for treatment or boar. Hierarchical clustering performed on the first two principal components produced three clusters. Cluster 1 contained evaluations of the first two samples in each treatment, each one from a different boar. With the exception of one individual measurement, all other measurements in cluster 1 were the same as those flagged by abnormal Chernoff faces. Unreliable data in cluster 1 are probably related to the operator's inexperience with a CASA system. These findings could be used to objectively evaluate the skill level of an operator of a CASA system. This may be particularly useful in the quality control of semen analysis using CASA systems.
The Productivity Analysis of Chennai Automotive Industry Cluster
NASA Astrophysics Data System (ADS)
Bhaskaran, E.
2014-07-01
Chennai, also called the Detroit of India, is India's second fastest growing auto market and exports auto components and vehicles to the US, Germany, Japan and Brazil. For inclusive growth and sustainable development, 250 auto component industries in the Ambattur, Thirumalisai and Thirumudivakkam Industrial Estates located in Chennai have adopted the Cluster Development Approach called Automotive Component Cluster. The objective is to study the Value Chain, Correlation and Data Envelopment Analysis by determining the technical efficiency, peer weights, and input and output slacks of 100 auto component industries in the three estates. The methodology adopted is Data Envelopment Analysis with the Output Oriented Banker Charnes Cooper model, taking net worth, fixed assets and employment as inputs and gross output as the output. Non-zero values represent the weights for efficient clusters. A higher slack reveals excess net worth, fixed assets or employment, or a shortage in gross output. To conclude, the variables are highly correlated, and the inefficient industries should increase their gross output or decrease their fixed assets or employment. Moreover, for sustainable development, the cluster should strengthen infrastructure, technology, procurement, production and marketing interrelationships to decrease costs and to increase productivity and efficiency to compete in the indigenous and export markets.
Orbit Clustering Based on Transfer Cost
NASA Technical Reports Server (NTRS)
Gustafson, Eric D.; Arrieta-Camacho, Juan J.; Petropoulos, Anastassios E.
2013-01-01
We propose using cluster analysis to perform quick screening for combinatorial global optimization problems. The key missing component currently preventing cluster analysis from use in this context is the lack of a useable metric function that defines the cost to transfer between two orbits. We study several proposed metrics and clustering algorithms, including k-means and the expectation maximization algorithm. We also show that proven heuristic methods such as the Q-law can be modified to work with cluster analysis.
Batch Computed Tomography Analysis of Projectiles
2016-05-01
error calculation. Projectiles are then grouped together according to the similarity of their components. Also discussed is graphical cluster analysis...ballistic, armor, grouping, clustering...Fig. 10 Graphical structure of 15 clusters of the jacket/core radii profiles with plots of the profiles contained within each cluster. The size of
Diao, K; Farmani, R; Fu, G; Astaraie-Imani, M; Ward, S; Butler, D
2014-01-01
Large water distribution systems (WDSs) are networks with both topological and behavioural complexity. Therefore, it is usually difficult to identify the key features of the properties of the system, and subsequently all the critical components within the system for a given purpose of design or control. One way to address this, however, is to visualize the network structure and the interactions between components more explicitly by dividing a WDS into a number of clusters (subsystems). Accordingly, this paper introduces a clustering strategy that decomposes WDSs into clusters with stronger internal connections than external connections. The detected cluster layout is very similar to the community structure of the served urban area. As WDSs may expand along with urban development in a community-by-community manner, the correspondingly formed distribution clusters may reveal some crucial configurations of WDSs. For verification, the method is applied to identify all the critical links during firefighting for the vulnerability analysis of a real-world WDS. Moreover, both the most critical pipes and clusters are addressed, given the consequences of pipe failure. Compared with the enumeration method, the method used in this study identifies the same group of the most critical components, and provides similar criticality prioritizations of them in a more computationally efficient manner.
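The idea of clusters with "stronger internal connections than external connections" can be quantified in several ways. One common choice in network analysis is modularity, sketched below for an unweighted pipe graph. The abstract does not state which measure the authors used, so this is an assumption for illustration only, and the toy network and names are hypothetical.

```python
def modularity(edges, membership):
    """Modularity of a partition: internal edge share minus its expected value."""
    m = len(edges)
    internal = {}   # number of edges with both ends in the same cluster
    degree = {}     # sum of node degrees per cluster
    for u, v in edges:
        cu, cv = membership[u], membership[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(internal.get(c, 0) / m - (degree[c] / (2 * m)) ** 2 for c in degree)

# Hypothetical network: two looped districts joined by a single trunk pipe (2, 3)
pipes = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
by_district = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
scrambled = {0: "A", 1: "B", 2: "A", 3: "B", 4: "A", 5: "B"}
print(modularity(pipes, by_district), modularity(pipes, scrambled))
```

A partition that follows the two districts scores higher than one that mixes them, which matches the intuition behind the decomposition described above.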
NASA Astrophysics Data System (ADS)
Ginanjar, Irlandia; Pasaribu, Udjianna S.; Indratno, Sapto W.
2017-03-01
This article presents the application of the principal component analysis (PCA) biplot to data mining. It aims to simplify and objectify the methods for clustering objects in a PCA biplot. The novelty of this paper is a measure that can be used to objectify the clustering of objects in a PCA biplot. Orthonormal eigenvectors are the coefficients of a principal component model and represent an association between the principal components and the initial variables. The existence of this association is a valid ground for clustering objects based on principal axis values; thus, if m principal axes are used in the PCA, then the objects can be classified into 2m clusters. Inter-city buses are clustered based on maintenance cost data by using a PCA biplot with two principal axes. The buses are clustered into four groups. The first group is the buses with high maintenance costs, especially for lube and brake canvass. The second group is the buses with high maintenance costs, especially for tire and filter. The third group is the buses with low maintenance costs, especially for lube and brake canvass. The fourth group is the buses with low maintenance costs, especially for tire and filter.
Hidaka, Tomoo; Hayakawa, Takehito; Kakamu, Takeyasu; Kumagai, Tomohiro; Hiruta, Yuhei; Hata, Junko; Tsuji, Masayoshi; Fukushima, Tetsuhito
2016-01-01
The present study was a cross-sectional study conducted to reveal the prevalence of metabolic syndrome and its components and describe the features of such prevalence among Japanese workers by clustered business category using big data. The data of approximately 120,000 workers were obtained from a national representative insurance organization, and the study analyzed the health checkup and questionnaire results according to the field of business of each subject. Abnormalities found during the checkups such as excessive waist circumference, hypertension or glucose intolerance, and metabolic syndrome, were recorded. All subjects were classified by business field into 18 categories based on The North American Industry Classification System. Based on the criteria of the Japanese Committee for the Diagnostic Criteria of Metabolic Syndrome, the standardized prevalence ratio (SPR) of metabolic syndrome and its components by business category was calculated, and the 95% confidence interval of the SPR was computed. Hierarchical cluster analysis was then performed based on the SPR of metabolic syndrome components, and the 18 business categories were classified into three clusters for both males and females. The following business categories were at significantly high risk of metabolic syndrome: among males, Construction, Transportation, Professional Services, and Cooperative Association; and among females, Health Care and Cooperative Association. The results of the cluster analysis indicated one cluster for each gender with a higher prevalence of metabolic syndrome components; among males, a cluster consisting of Manufacturing, Transportation, Finance, and Cooperative Association, and among females, a cluster consisting of Mining, Transportation, Finance, Accommodation, and Cooperative Association. 
These findings reveal that, when providing health guidance and support regarding metabolic syndrome, consideration must be given to its components and the variety of its prevalence rates by business category and gender. PMID:27082961
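The standardized prevalence ratio used in this abstract compares the observed number of cases in a business category with the number expected if reference prevalence rates applied. The sketch below assumes indirect standardization over age strata and a log-based approximate 95% confidence interval; the abstract does not specify the interval method, and all names and numbers here are hypothetical.

```python
import math

def standardized_prevalence_ratio(observed, stratum_sizes, reference_rates, z=1.96):
    """Return (SPR, lower, upper) using an approximate log-scale confidence interval."""
    # Expected cases if each stratum had the reference prevalence
    expected = sum(n * r for n, r in zip(stratum_sizes, reference_rates))
    spr = observed / expected
    # Assumed approximation: standard error of ln(SPR) is about 1 / sqrt(observed)
    half_width = z / math.sqrt(observed)
    return spr, spr * math.exp(-half_width), spr * math.exp(half_width)

# Hypothetical business category with two age strata
spr, lower, upper = standardized_prevalence_ratio(
    observed=120,
    stratum_sizes=[400, 600],
    reference_rates=[0.05, 0.10],
)
print(round(spr, 2), round(lower, 2), round(upper, 2))
```

A category would be flagged as significantly high risk when the lower limit of the interval exceeds 1.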
Groundwater quality assessment of urban Bengaluru using multivariate statistical techniques
NASA Astrophysics Data System (ADS)
Gulgundi, Mohammad Shahid; Shetty, Amba
2018-03-01
Groundwater quality deterioration due to anthropogenic activities has become a subject of prime concern. The objective of the study was to assess the spatial and temporal variations in groundwater quality and to identify the sources in the western half of Bengaluru city using multivariate statistical techniques. A water quality index rating was calculated for the pre- and post-monsoon seasons to quantify overall water quality for human consumption. The post-monsoon samples show signs of poor quality for drinking purposes compared to the pre-monsoon samples. Cluster analysis (CA), principal component analysis (PCA) and discriminant analysis (DA) were applied to the groundwater quality data measured on 14 parameters from 67 sites distributed across the city. Hierarchical cluster analysis (CA) grouped the 67 sampling stations into two groups, cluster 1 having high pollution and cluster 2 having lower pollution. Discriminant analysis (DA) was applied to delineate the most meaningful parameters accounting for temporal and spatial variations in groundwater quality of the study area. Temporal DA identified pH as the most important parameter, which discriminates between water quality in the pre-monsoon and post-monsoon seasons and accounts for 72% seasonal assignation of cases. Spatial DA identified Mg, Cl and NO3 as the three most important parameters discriminating between the two clusters and accounting for 89% spatial assignation of cases. Principal component analysis was applied to the dataset obtained from the two clusters, which evolved three factors in each cluster, explaining 85.4 and 84% of the total variance, respectively. Varifactors obtained from principal component analysis showed that groundwater quality variation is mainly explained by dissolution of minerals from rock-water interactions in the aquifer, the effect of anthropogenic activities and ion exchange processes in water.
NASA Technical Reports Server (NTRS)
Zhu, Dongming; Chen, Yuan L.; Miller, Robert A.
2003-01-01
Advanced oxide thermal barrier coatings have been developed by incorporating multi-component rare earth oxide dopants into zirconia-yttria to effectively promote the creation of the thermodynamically stable, immobile oxide defect clusters and/or nano-scale phases within the coating systems. The presence of these nano-sized defect clusters has been found to significantly reduce the coating intrinsic thermal conductivity, improve sintering resistance, and maintain long-term high temperature stability. In this paper, the defect clusters and nano-structured phases, which were created by the addition of multi-component rare earth dopants to the plasma-sprayed and electron-beam physical vapor deposited thermal barrier coatings, were characterized by high-resolution transmission electron microscopy (TEM). The defect cluster size, distribution, crystallographic and compositional information were investigated using high-resolution TEM lattice imaging, selected area diffraction (SAD), electron energy-loss spectroscopy (EELS) and energy dispersive spectroscopy (EDS) analysis techniques. The results showed that substantial defect clusters were formed in the advanced multi-component rare earth oxide doped zirconia-yttria systems. The size of the oxide defect clusters and the cluster dopant segregation typically ranged from 5 to 50 nm. These multi-component dopant induced defect clusters are an important factor for the coating long-term high temperature stability and excellent performance.
NASA Technical Reports Server (NTRS)
Zhu, Dongming; Chen, Yuan L.; Miller, Robert A.
1990-01-01
Advanced oxide thermal barrier coatings have been developed by incorporating multi-component rare earth oxide dopants into zirconia-yttria to effectively promote the creation of the thermodynamically stable, immobile oxide defect clusters and/or nano-scale phases within the coating systems. The presence of these nano-sized defect clusters has been found to significantly reduce the coating intrinsic thermal conductivity, improve sintering resistance, and maintain long-term high temperature stability. In this paper, the defect clusters and nano-structured phases, which were created by the addition of multi-component rare earth dopants to the plasma-sprayed and electron-beam physical vapor deposited thermal barrier coatings, were characterized by high-resolution transmission electron microscopy (TEM). The defect cluster size, distribution, crystallographic and compositional information were investigated using high-resolution TEM lattice imaging, selected area diffraction (SAD), and energy dispersive spectroscopy (EDS) analysis techniques. The results showed that substantial defect clusters were formed in the advanced multi-component rare earth oxide doped zirconia-yttria systems. The size of the oxide defect clusters and the cluster dopant segregation typically ranged from 5 to 50 nm. These multi-component dopant induced defect clusters are an important factor for the coating long-term high temperature stability and excellent performance.
Using Machine Learning Techniques in the Analysis of Oceanographic Data
NASA Astrophysics Data System (ADS)
Falcinelli, K. E.; Abuomar, S.
2017-12-01
Acoustic Doppler Current Profilers (ADCPs) are oceanographic tools capable of collecting large amounts of current profile data. Using unsupervised machine learning techniques such as principal component analysis, fuzzy c-means clustering, and self-organizing maps, patterns and trends in an ADCP dataset are found. Cluster validity algorithms such as visual assessment of cluster tendency and clustering index are used to determine the optimal number of clusters in the ADCP dataset. These techniques prove to be useful in analysis of ADCP data and demonstrate potential for future use in other oceanographic applications.
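The fuzzy c-means clustering named in the abstract above assigns each observation a graded membership in every cluster rather than a single label. The following is a minimal sketch of the standard membership update only, not the authors' code; the function name, the fuzzifier value m = 2 and the example vectors are illustrative assumptions.

```python
import math

def fcm_memberships(point, centers, m=2.0):
    """Standard fuzzy c-means membership of `point` in each cluster center.

    `m` is the fuzzifier (m > 1); m = 2 is a common default, assumed here.
    """
    dists = [math.dist(point, c) for c in centers]
    # A point lying exactly on one or more centers belongs fully to them.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    # u_i = 1 / sum_k (d_i / d_k)^(2 / (m - 1))
    return [1.0 / sum((d_i / d_k) ** exponent for d_k in dists) for d_i in dists]

# Hypothetical 2-D observation and two hypothetical cluster centers.
u = fcm_memberships((1.0, 0.0), [(0.0, 0.0), (4.0, 0.0)])
```

The memberships always sum to one, and the nearer center receives the larger share; a full algorithm would alternate this update with recomputing the centers.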
PCA based clustering for brain tumor segmentation of T1w MRI images.
Kaya, Irem Ersöz; Pehlivanlı, Ayça Çakmak; Sekizkardeş, Emine Gezmez; Ibrikci, Turgay
2017-03-01
Medical images are huge collections of information that are difficult to store and process, consuming extensive computing time. Therefore, reduction techniques are commonly used as a data pre-processing step to make the image data less complex so that high-dimensional data can be identified by an appropriate low-dimensional representation. PCA is one of the most popular multivariate methods for data reduction. This paper is focused on T1-weighted MRI images clustering for brain tumor segmentation with dimension reduction by different common Principal Component Analysis (PCA) algorithms. Our primary aim is to present a comparison between different variations of PCA algorithms on MRIs for two cluster methods. The five most common PCA algorithms, namely the conventional PCA, Probabilistic Principal Component Analysis (PPCA), Expectation Maximization Based Principal Component Analysis (EM-PCA), Generalized Hebbian Algorithm (GHA), and Adaptive Principal Component Extraction (APEX), were applied to reduce dimensionality in advance of two clustering algorithms, K-Means and Fuzzy C-Means. In the study, the T1-weighted MRI images of the human brain with brain tumor were used for clustering. In addition to the original size of 512 lines and 512 pixels per line, three more different sizes, 256 × 256, 128 × 128 and 64 × 64, were included in the study to examine their effect on the methods. The obtained results were compared in terms of both the reconstruction errors and the Euclidean distance errors among the clustered images containing the same number of principal components. According to the findings, the PPCA obtained the best results among all others. Furthermore, the EM-PCA and the PPCA helped the K-Means algorithm achieve the best clustering performance in the majority of cases, and achieved significant results with both clustering algorithms for all sizes of T1w MRI images. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Liu, Xiao-Fang; Xue, Chang-Hu; Wang, Yu-Ming; Li, Zhao-Jie; Xue, Yong; Xu, Jie
2011-11-01
The present study investigates the feasibility of multi-element analysis for determining the geographical origin of the sea cucumber Apostichopus japonicus, and selects effective tracers for assessing that origin. The contents of elements such as Al, V, Cr, Mn, Fe, Co, Ni, Cu, Zn, As, Se, Mo, Cd, Hg and Pb in sea cucumber Apostichopus japonicus samples from seven places of geographical origin were determined by means of ICP-MS. The results were used to develop an element database. Cluster analysis (CA) and principal component analysis (PCA) were applied to differentiate the geographical origin of the sea cucumber Apostichopus japonicus samples. Three principal components, which accounted for over 89% of the total variance, were extracted from the standardized data. The results of Q-type cluster analysis showed that the 26 samples could be clustered reasonably into five groups, and the classification results were significantly associated with the marine distribution of the sea cucumber Apostichopus japonicus samples. CA and PCA were effective methods for element analysis of sea cucumber Apostichopus japonicus samples. The contents of the mineral elements in the samples were good chemical descriptors for differentiating their geographical origins.
Matsen IV, Frederick A.; Evans, Steven N.
2013-01-01
Principal components analysis (PCA) and hierarchical clustering are two of the most heavily used techniques for analyzing the differences between nucleic acid sequence samples taken from a given environment. They have led to many insights regarding the structure of microbial communities. We have developed two new complementary methods that leverage how this microbial community data sits on a phylogenetic tree. Edge principal components analysis enables the detection of important differences between samples that contain closely related taxa. Each principal component axis is a collection of signed weights on the edges of the phylogenetic tree, and these weights are easily visualized by a suitable thickening and coloring of the edges. Squash clustering outputs a (rooted) clustering tree in which each internal node corresponds to an appropriate “average” of the original samples at the leaves below the node. Moreover, the length of an edge is a suitably defined distance between the averaged samples associated with the two incident nodes, rather than the less interpretable average of distances produced by UPGMA, the most widely used hierarchical clustering method in this context. We present these methods and illustrate their use with data from the human microbiome. PMID:23505415
ERIC Educational Resources Information Center
Steinley, Douglas; Brusco, Michael J.; Henson, Robert
2012-01-01
A measure of "clusterability" serves as the basis of a new methodology designed to preserve cluster structure in a reduced dimensional space. Similar to principal component analysis, which finds the direction of maximal variance in multivariate space, principal cluster axes find the direction of maximum clusterability in multivariate space.…
NASA Astrophysics Data System (ADS)
Annunziatella, M.; Bonamigo, M.; Grillo, C.; Mercurio, A.; Rosati, P.; Caminha, G.; Biviano, A.; Girardi, M.; Gobat, R.; Lombardi, M.; Munari, E.
2017-12-01
We present a high-resolution dissection of the two-dimensional total mass distribution in the core of the Hubble Frontier Fields galaxy cluster MACS J0416.1‑2403, at z = 0.396. We exploit HST/WFC3 near-IR (F160W) imaging, VLT/Multi Unit Spectroscopic Explorer spectroscopy, and Chandra data to separate the stellar, hot gas, and dark-matter mass components in the inner 300 kpc of the cluster. We combine the recent results of our refined strong lensing analysis, which includes the contribution of the intracluster gas, with the modeling of the surface brightness and stellar mass distributions of 193 cluster members, of which 144 are spectroscopically confirmed. We find that, moving from 10 to 300 kpc from the cluster center, the stellar to total mass fraction decreases from 12% to 1% and the hot gas to total mass fraction increases from 3% to 9%, resulting in a baryon fraction of approximately 10% at the outermost radius. We measure that the stellar component represents ∼30%, near the cluster center, and 15%, at larger clustercentric distances, of the total mass in the cluster substructures. We subtract the baryonic mass component from the total mass distribution and conclude that within 30 kpc (∼3 times the effective radius of the brightest cluster galaxy) from the cluster center the surface mass density profile of the total mass and global (cluster plus substructures) dark-matter are steeper and that of the diffuse (cluster) dark-matter is shallower than an NFW profile. Our current analysis does not point to a significant offset between the cluster stellar and dark-matter components. This detailed and robust reconstruction of the inner dark-matter distribution in a larger sample of galaxy clusters will set a new benchmark for different structure formation scenarios.
Song, Yuqiao; Liao, Jie; Dong, Junxing; Chen, Li
2015-09-01
The seeds of grapevine (Vitis vinifera) are a byproduct of wine production. To examine the potential value of grape seeds, grape seeds from seven sources were subjected to fingerprinting using direct analysis in real time coupled with time-of-flight mass spectrometry combined with chemometrics. Firstly, we listed all reported components (56 components) from grape seeds and calculated the precise m/z values of the deprotonated ions [M-H]⁻. Secondly, the experimental conditions were systematically optimized based on the peak areas of total ion chromatograms of the samples. Thirdly, the seven grape seed samples were examined using the optimized method. Information about 20 grape seed components was utilized to represent characteristic fingerprints. Finally, hierarchical clustering analysis and principal component analysis were performed to analyze the data. Grape seeds from seven different sources were classified into two clusters; hierarchical clustering analysis and principal component analysis yielded similar results. The results of this study lay the foundation for appropriate utilization and exploitation of grape seed samples. Due to the absence of complicated sample preparation methods and chromatographic separation, the method developed in this study represents one of the simplest and least time-consuming methods for grape seed fingerprinting. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Improving Cluster Analysis with Automatic Variable Selection Based on Trees
2014-12-01
regression trees Daisy DISsimilAritY PAM partitioning around medoids PMA penalized multivariate analysis SPC sparse principal components UPGMA unweighted...unweighted pair-group average method ( UPGMA ). This method measures dissimilarities between all objects in two clusters and takes the average value
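The excerpt above describes UPGMA as taking the average of the dissimilarities between all objects in two clusters. A minimal sketch of that between-cluster distance follows; the function name and the one-dimensional example values are illustrative assumptions and are not taken from the report.

```python
from itertools import product

def average_linkage(cluster_a, cluster_b, dissimilarity):
    """UPGMA-style distance: mean dissimilarity over all cross-cluster pairs."""
    pairs = list(product(cluster_a, cluster_b))
    return sum(dissimilarity(a, b) for a, b in pairs) / len(pairs)

# Hypothetical one-dimensional objects with absolute difference as dissimilarity.
d = average_linkage([0.0, 1.0], [4.0, 6.0], lambda a, b: abs(a - b))
```

For these values the four cross-cluster dissimilarities are 4, 6, 3 and 5, so the average linkage distance is 4.5. A hierarchical clustering would repeatedly merge the pair of clusters with the smallest such distance.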
Functional Connectivity Parcellation of the Human Thalamus by Independent Component Analysis.
Zhang, Sheng; Li, Chiang-Shan R
2017-11-01
As a key structure to relay and integrate information, the thalamus supports multiple cognitive and affective functions through the connectivity between its subnuclei and cortical and subcortical regions. Although extant studies have largely described thalamic regional functions in anatomical terms, evidence accumulates to suggest a more complex picture of subareal activities and connectivities of the thalamus. In this study, we aimed to parcellate the thalamus and examine whole-brain connectivity of its functional clusters. With resting state functional magnetic resonance imaging data from 96 adults, we used independent component analysis (ICA) to parcellate the thalamus into 10 components. On the basis of the independence assumption, ICA helps to identify how subclusters overlap spatially. Whole brain functional connectivity of each subdivision was computed for independent component's time course (ICtc), which is a unique time series to represent an IC. For comparison, we computed seed-region-based functional connectivity using the averaged time course across all voxels within a thalamic subdivision. The results showed that, at p < 10⁻⁶, corrected, 49% of voxels on average overlapped among subdivisions. Compared with seed-region analysis, ICtc analysis revealed patterns of connectivity that were more distinguished between thalamic clusters. ICtc analysis demonstrated thalamic connectivity to the primary motor cortex, which has eluded the analysis as well as previous studies based on averaged time series, and clarified thalamic connectivity to the hippocampus, caudate nucleus, and precuneus. The new findings elucidate functional organization of the thalamus and suggest that ICA clustering in combination with ICtc rather than seed-region analysis better distinguishes whole-brain connectivities among functional clusters of a brain region.
Preliminary Comparisons of the Information Content and Utility of TM Versus MSS Data
NASA Technical Reports Server (NTRS)
Markham, B. L.
1984-01-01
Comparisons were made between subscenes from the first TM scene acquired of the Washington, D.C. area and a MSS scene acquired approximately one year earlier. Three types of analyses were conducted to compare TM and MSS data: a water body analysis, a principal components analysis and a spectral clustering analysis. The water body analysis compared the capability of the TM to the MSS for detecting small uniform targets. Of the 59 ponds located on aerial photographs 34 (58%) were detected by the TM with six commission errors (15%) and 13 (22%) were detected by the MSS with three commission errors (19%). The smallest water body detected by the TM was 16 meters; the smallest detected by the MSS was 40 meters. For the principal components analysis, means and covariance matrices were calculated for each subscene, and principal components images generated and characterized. In the spectral clustering comparison each scene was independently clustered and the clusters were assigned to informational classes. The preliminary comparison indicated that TM data provides enhancements over MSS in terms of (1) small target detection and (2) data dimensionality (even with 4-band data). The extra dimension, partially resultant from TM band 1, appears useful for built-up/non-built-up area separation.
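The percentages quoted in the water body analysis above can be reproduced with simple arithmetic. The sketch below assumes that the detection rate is detected ponds divided by all ponds, and that the commission error rate is false detections divided by all detections (true plus false); these definitions are an assumption, chosen because they match the reported figures, and the function name is illustrative.

```python
def detection_summary(n_targets, n_detected, n_commission):
    """Return (detection rate, commission error rate) as fractions.

    Assumed definitions: detection rate = detected / targets;
    commission rate = false detections / (true + false detections).
    """
    detection_rate = n_detected / n_targets
    commission_rate = n_commission / (n_detected + n_commission)
    return detection_rate, commission_rate

tm = detection_summary(59, 34, 6)    # Thematic Mapper counts from the abstract
mss = detection_summary(59, 13, 3)   # Multispectral Scanner counts from the abstract
tm_percent = [round(100 * x) for x in tm]
mss_percent = [round(100 * x) for x in mss]
```

Under these assumptions the TM values round to 58% and 15%, and the MSS values to 22% and 19%, in agreement with the abstract.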
NASA Astrophysics Data System (ADS)
Okabe, Taizo; Nishimichi, Takahiro; Oguri, Masamune; Peirani, Sébastien; Kitayama, Tetsu; Sasaki, Shin; Suto, Yasushi
2018-04-01
While various observations measured ellipticities of galaxy clusters and alignments between orientations of the brightest cluster galaxies and their host clusters, there are only a handful of numerical simulations that implement realistic baryon physics to allow direct comparisons with those observations. Here we investigate ellipticities of galaxy clusters and alignments between various components of them and the central galaxies in the state-of-the-art cosmological hydrodynamical simulation Horizon-AGN, which contains dark matter, stellar, and gas components in a large simulation box of (100 h⁻¹ Mpc)³ with high spatial resolution (∼1 kpc). We estimate ellipticities of total matter, dark matter, stellar, gas surface mass density distributions, X-ray surface brightness, and the Compton y-parameter of the Sunyaev-Zel'dovich effect, as well as alignments between these components and the central galaxies for 120 projected images of galaxy clusters with masses M200 > 5 × 10¹³ M⊙. Our results indicate that the distributions of these components are well aligned with the major axes of the central galaxies, with the root mean square value of differences of their position angles of ∼20°, which vary little from inner to the outer regions. We also estimate alignments of these various components with total matter distributions, and find tighter alignments than those for central galaxies with the root mean square value of ∼15°. We compare our results with previous observations of ellipticities and position angle alignments and find reasonable agreements. The comprehensive analysis presented in this paper provides useful prior information for analyzing stacked lensing signals as well as designing future observations to study ellipticities and alignments of galaxy clusters.
NASA Astrophysics Data System (ADS)
Okabe, Taizo; Nishimichi, Takahiro; Oguri, Masamune; Peirani, Sébastien; Kitayama, Tetsu; Sasaki, Shin; Suto, Yasushi
2018-07-01
While various observations measured ellipticities of galaxy clusters and alignments between orientations of the brightest cluster galaxies and their host clusters, there are only a handful of numerical simulations that implement realistic baryon physics to allow direct comparisons with those observations. Here, we investigate ellipticities of galaxy clusters and alignments between various components of them and the central galaxies in the state-of-the-art cosmological hydrodynamical simulation Horizon-AGN, which contains dark matter, stellar, and gas components in a large simulation box of (100 h⁻¹ Mpc)³ with high spatial resolution (∼1 kpc). We estimate ellipticities of total matter, dark matter, stellar, gas surface mass density distributions, X-ray surface brightness, and the Compton y-parameter of the Sunyaev-Zel'dovich effect, as well as alignments between these components and the central galaxies for 120 projected images of galaxy clusters with masses M200 > 5 × 10¹³ M⊙. Our results indicate that the distributions of these components are well aligned with the major axes of the central galaxies, with the root-mean-square value of differences of their position angles of ∼20°, which vary little from inner to the outer regions. We also estimate alignments of these various components with total matter distributions, and find tighter alignments than those for central galaxies with the root-mean-square value of ∼15°. We compare our results with previous observations of ellipticities and position angle alignments and find reasonable agreements. The comprehensive analysis presented in this paper provides useful prior information for analysing stacked lensing signals as well as designing future observations to study ellipticities and alignments of galaxy clusters.
Ma, Li; Sun, Jing; Yang, Zhaoguang; Wang, Lin
2015-12-01
Heavy metal contamination has attracted widespread attention because of the strong toxicity and persistence of heavy metals. The Ganxi River, located in Chenzhou City, Southern China, has been severely polluted by lead/zinc ore mining activities. This work investigated the heavy metal pollution in agricultural soils around the Ganxi River. The total concentrations of heavy metals were determined by inductively coupled plasma-mass spectrometry. The potential risk associated with the heavy metals in soil was assessed by the Nemerow comprehensive index and the potential ecological risk index. In both methods, the study area was rated as very high risk. Multivariate statistical methods including Pearson's correlation analysis, hierarchical cluster analysis, and principal component analysis were employed to evaluate the relationships between heavy metals, as well as the correlation between heavy metals and pH, to identify the metal sources. Three distinct clusters were observed by hierarchical cluster analysis. In principal component analysis, a total of two components were extracted to explain over 90% of the total variance, both of which were associated with anthropogenic sources.
Dynamic competitive probabilistic principal components analysis.
López-Rubio, Ezequiel; Ortiz-de-Lazcano-Lobato, Juan Miguel
2009-04-01
We present a new neural model which extends the classical competitive learning (CL) by performing a Probabilistic Principal Components Analysis (PPCA) at each neuron. The model also has the ability to learn the number of basis vectors required to represent the principal directions of each cluster, so it overcomes a drawback of most local PCA models, where the dimensionality of a cluster must be fixed a priori. Experimental results are presented to show the performance of the network with multispectral image data.
NASA Astrophysics Data System (ADS)
Ueki, Kenta; Iwamori, Hikaru
2017-10-01
In this study, with a view of understanding the structure of high-dimensional geochemical data and discussing the chemical processes at work in the evolution of arc magmas, we employed principal component analysis (PCA) to evaluate the compositional variations of volcanic rocks from the Sengan volcanic cluster of the Northeastern Japan Arc. We analyzed the trace element compositions of various arc volcanic rocks, sampled from 17 different volcanoes in a volcanic cluster. The PCA results demonstrated that the first three principal components accounted for 86% of the geochemical variation in the magma of the Sengan region. Based on the relationships between the principal components and the major elements, the mass-balance relationships with respect to the contributions of minerals, the composition of plagioclase phenocrysts, geothermal gradient, and seismic velocity structure in the crust, the first, the second, and the third principal components appear to represent magma mixing, crystallizations of olivine/pyroxene, and crystallizations of plagioclase, respectively. These represented 59%, 20%, and 6%, respectively, of the variance in the entire compositional range, indicating that magma mixing accounted for the largest variance in the geochemical variation of the arc magma. Our result indicated that crustal processes dominate the geochemical variation of magma in the Sengan volcanic cluster.
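The variance shares reported in the abstract above can be accumulated to see how many principal components are needed to reach a chosen coverage. The sketch below is illustrative only; the function name and the 80% threshold are assumptions, while the per-component percentages are the rounded values quoted in the abstract.

```python
from itertools import accumulate

def components_needed(percent_variance, threshold):
    """Return (k, cumulative), where k is the smallest number of leading
    components whose cumulative variance share reaches `threshold`
    (None if it is never reached)."""
    cumulative = list(accumulate(percent_variance))
    for k, total in enumerate(cumulative, start=1):
        if total >= threshold:
            return k, cumulative
    return None, cumulative

# Rounded shares for PC1 to PC3 from the abstract; the threshold is hypothetical.
k, cumulative = components_needed([59, 20, 6], 80)
```

The rounded shares accumulate to 59, 79 and 85, one point below the 86% quoted for the first three components, which is presumably a rounding effect.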
DOE Office of Scientific and Technical Information (OSTI.GOV)
Steenbergen, K. G., E-mail: kgsteen@gmail.com; Gaston, N.
2014-02-14
Inspired by methods of remote sensing image analysis, we analyze structural variation in cluster molecular dynamics (MD) simulations through a unique application of the principal component analysis (PCA) and Pearson Correlation Coefficient (PCC). The PCA analysis characterizes the geometric shape of the cluster structure at each time step, yielding a detailed and quantitative measure of structural stability and variation at finite temperature. Our PCC analysis captures bond structure variation in MD, which can be used to both supplement the PCA analysis as well as compare bond patterns between different cluster sizes. Relying only on atomic position data, without requirement for a priori structural input, PCA and PCC can be used to analyze both classical and ab initio MD simulations for any cluster composition or electronic configuration. Taken together, these statistical tools represent powerful new techniques for quantitative structural characterization and isomer identification in cluster MD.
Steenbergen, K G; Gaston, N
2014-02-14
Inspired by methods of remote sensing image analysis, we analyze structural variation in cluster molecular dynamics (MD) simulations through a unique application of the principal component analysis (PCA) and Pearson Correlation Coefficient (PCC). The PCA analysis characterizes the geometric shape of the cluster structure at each time step, yielding a detailed and quantitative measure of structural stability and variation at finite temperature. Our PCC analysis captures bond structure variation in MD, which can be used to both supplement the PCA analysis as well as compare bond patterns between different cluster sizes. Relying only on atomic position data, without requirement for a priori structural input, PCA and PCC can be used to analyze both classical and ab initio MD simulations for any cluster composition or electronic configuration. Taken together, these statistical tools represent powerful new techniques for quantitative structural characterization and isomer identification in cluster MD.
Hakimzadeh, Neda; Parastar, Hadi; Fattahi, Mohammad
2014-01-24
In this study, multivariate curve resolution (MCR) and multivariate classification methods are proposed to develop a new chemometric strategy for comprehensive analysis of high-performance liquid chromatography-diode array absorbance detection (HPLC-DAD) fingerprints of sixty Salvia reuterana samples from five different geographical regions. Different chromatographic problems that occurred during HPLC-DAD analysis of S. reuterana samples, such as baseline/background contribution and noise, low signal-to-noise ratio (S/N), asymmetric peaks, elution time shifts, and peak overlap, are handled using the proposed strategy. In this way, chromatographic fingerprints of sixty samples are properly segmented into ten common chromatographic regions using local rank analysis, and then the corresponding segments are column-wise augmented for subsequent MCR analysis. Extended multivariate curve resolution-alternating least squares (MCR-ALS) is used to obtain pure component profiles in each segment. In general, thirty-one chemical components were resolved using MCR-ALS in sixty S. reuterana samples, and the lack of fit (LOF) values of MCR-ALS models were below 10.0% in all cases. Pure spectral profiles are considered for identification of chemical components by comparing their resolved spectra with the standard ones, and twenty-four out of thirty-one components were identified. Additionally, pure elution profiles are used to obtain relative concentrations of chemical components in different samples for multivariate classification analysis by principal component analysis (PCA) and k-nearest neighbors (kNN). Inspection of the PCA score plot (three PCs explaining 76.1% of the variance) showed that S. reuterana samples belong to four clusters. The degree of class separation (DCS), which quantifies the distance separating clusters in relation to the scatter within each cluster, is calculated for the four clusters and was in the range of 1.6-5.8. These results are then confirmed by kNN. In addition, according to the PCA loading plot and kNN dendrogram of thirty-one variables, five chemical constituents, luteolin-7-o-glucoside, salvianolic acid D, rosmarinic acid, lithospermic acid and trijuganone A, are identified as the most important variables (i.e., chemical markers) for cluster discrimination. Finally, the effect of different chemical markers on sample differentiation is investigated using the counter-propagation artificial neural network (CP-ANN) method. It is concluded that the proposed strategy can be successfully applied for comprehensive analysis of chromatographic fingerprints of complex natural samples. Copyright © 2013 Elsevier B.V. All rights reserved.
Liu, Xiang; Guo, Ling-Peng; Zhang, Fei-Yun; Ma, Jie; Mu, Shu-Yong; Zhao, Xin; Li, Lan-Hai
2015-02-01
Eight physical and chemical indicators related to water quality were monitored at nineteen sampling sites along the Kunes River at the end of the snowmelt season in spring. To investigate the spatial distribution characteristics of water physical and chemical properties, cluster analysis (CA), discriminant analysis (DA) and principal component analysis (PCA) were employed. The result of cluster analysis showed that the Kunes River could be divided into three reaches according to the similarities of water physical and chemical properties among sampling sites, representing the upstream, midstream and downstream of the river, respectively. The result of discriminant analysis demonstrated that the reliability of such a classification was high, and DO, Cl⁻ and BOD5 were the significant indexes leading to this classification. Three principal components were extracted on the basis of the principal component analysis, for which the cumulative variance contribution reached 86.90%. The result of principal component analysis also indicated that water physical and chemical properties were mostly affected by EC, ORP, NO3⁻-N, NH4⁺-N, Cl⁻ and BOD5. The sorted results of principal component scores at each sampling site showed that the water quality was mainly influenced by DO in the upstream, by pH in the midstream, and by the rest of the indicators in the downstream. The order of comprehensive scores for principal components revealed that the water quality degraded from upstream to downstream, i.e., the upstream had the best water quality, followed by the midstream, while the water quality downstream was the worst. This result corresponded exactly to the three reaches classified using cluster analysis. Anthropogenic activity and the accumulation of pollutants along the river were probably the main reasons for this spatial difference.
Cuthbertson, Daniel; Andrews, Preston K.; Reganold, John P.; Davies, Neal M.; Lange, B. Markus
2012-01-01
A gas chromatography–mass spectrometry approach was employed to evaluate the use of metabolite patterns to differentiate fruit from six commercially grown apple cultivars harvested in 2008. Principal component analysis (PCA) of apple fruit peel and flesh data indicated that individual cultivar replicates clustered together and were separated from all other cultivar samples. An independent metabolomics investigation with fruit harvested in 2003 confirmed the separate clustering of fruit from different cultivars. Further evidence for cultivar separation was obtained using a hierarchical clustering analysis. An evaluation of PCA component loadings revealed specific metabolite classes that contributed the most to each principal component, whereas a correlation analysis demonstrated that specific metabolites correlate directly with quality traits such as antioxidant activity, total phenolics, and total anthocyanins, which are important parameters in the selection of breeding germplasm. These data sets lay the foundation for elucidating the metabolic basis of commercially important fruit quality traits. PMID:22881116
Rosacea assessment by erythema index and principal component analysis segmentation maps
NASA Astrophysics Data System (ADS)
Kuzmina, Ilona; Rubins, Uldis; Saknite, Inga; Spigulis, Janis
2017-12-01
RGB images of rosacea were analyzed using segmentation maps of principal component analysis (PCA) and erythema index (EI). Areas of segmented clusters were compared to Clinician's Erythema Assessment (CEA) values given by two dermatologists. The results show that visible blood vessels are segmented more precisely on maps of the erythema index and the third principal component (PC3). In many cases, a distribution of clusters on EI and PC3 maps are very similar. Mean values of clusters' areas on these maps show a decrease of the area of blood vessels and erythema and an increase of lighter skin area after the therapy for the patients with diagnosis CEA = 2 on the first visit and CEA=1 on the second visit. This study shows that EI and PC3 maps are more useful than the maps of the first (PC1) and second (PC2) principal components for indicating vascular structures and erythema on the skin of rosacea patients and therapy monitoring.
Miyamoto, Yuki; Momose, Takamasa; Kanamori, Hideto
2012-11-21
Infrared absorption spectra of methyl fluoride with ortho-hydrogen (ortho-H2) clusters in a solid para-hydrogen (para-H2) crystal at 3.6 K were studied in the C-H stretching fundamental region (~3000 cm⁻¹) using an FTIR spectrometer. As shown previously, the ν3 C-F stretching fundamental band of CH3F-(ortho-H2)n (n = 0, 1, 2, ...) clusters at 1040 cm⁻¹ shows a series of n discrete absorption lines, which correspond to different-sized clusters. We observed three unresolved broad peaks in the C-H stretching region and applied this cluster model to them assuming the same intensity distribution function as the ν3 band. A fitting analysis successfully gave us the linewidth and lineshift of the components in each vibrational band. It was found that the separately determined linewidth, matrix shift of the band origin, and cluster shift are dependent on the vibrational mode. From the transition intensities of the monomer component derived from the fitting analysis, we discuss the mixing ratio of the vibrational modes due to Fermi resonance.
Chen, Shan; Li, Xiao-ning; Liang, Yi-zeng; Zhang, Zhi-min; Liu, Zhao-xia; Zhang, Qi-ming; Ding, Li-xia; Ye, Fei
2010-08-01
During Raman spectroscopy analysis, organic molecules and contaminants can obscure or swamp Raman signals. The present study starts from Raman spectra of prednisone acetate tablets and glibenclamide tablets, acquired with the BWTek i-Raman spectrometer. The background is corrected by the R package baselineWavelet. Then principal component analysis and random forests are used to perform clustering analysis. By analyzing the Raman spectra of the two medicines, the accuracy and validity of this background-correction algorithm are checked and the influence of fluorescence background on Raman spectra clustering analysis is discussed. It is concluded that correcting the fluorescence background is important for further analysis, and an effective background correction solution is provided for clustering or other analysis.
NASA Technical Reports Server (NTRS)
Stewart, L. J.; Murphy, E. D.; Mitchell, C. M.
1982-01-01
A human factors analysis addressed three related yet distinct issues within the area of workstation design for the Earth Radiation Budget Satellite (ERBS) mission operation room (MOR). The first issue, physical layout of the MOR, received the most intensive effort. It involved the positioning of clusters of equipment within the physical dimensions of the ERBS MOR. The second issue for analysis was comprised of several environmental concerns, such as lighting, furniture, and heating and ventilation systems. The third issue was component arrangement, involving the physical arrangement of individual components within clusters of consoles, e.g., a communications panel.
Molecular reclassification of Crohn's disease: a cautionary note on population stratification.
Maus, Bärbel; Jung, Camille; Mahachie John, Jestinah M; Hugot, Jean-Pierre; Génin, Emmanuelle; Van Steen, Kristel
2013-01-01
Complex human diseases commonly differ in their phenotypic characteristics, e.g., Crohn’s disease (CD) patients are heterogeneous with regard to disease location and disease extent. The genetic susceptibility to Crohn’s disease is widely acknowledged and has been demonstrated by identification of over 100 CD associated genetic loci. However, relating CD subphenotypes to disease susceptible loci has proven to be a difficult task. In this paper we discuss the use of cluster analysis on genetic markers to identify genetic-based subgroups while taking into account possible confounding by population stratification. We show that it is highly relevant to consider the confounding nature of population stratification in order to avoid that detected clusters are strongly related to population groups instead of disease-specific groups. Therefore, we explain the use of principal components to correct for population stratification while clustering affected individuals into genetic-based subgroups. The principal components are obtained using 30 ancestry informative markers (AIM), and the first two PCs are determined to discriminate between continental origins of the affected individuals. Genotypes on 51 CD associated single nucleotide polymorphisms (SNPs) are used to perform latent class analysis, hierarchical and Partitioning Around Medoids (PAM) cluster analysis within a sample of affected individuals with and without the use of principal components to adjust for population stratification. It is seen that without correction for population stratification clusters seem to be influenced by population stratification while with correction clusters are unrelated to continental origin of individuals. PMID:24147066
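One common way to use principal components as a correction for population stratification is to regress each marker on the leading ancestry PCs and cluster the residuals. The abstract does not say exactly how the adjustment was done, so the following is only a minimal sketch under that assumption. The data are synthetic stand-ins, and the function names `top_pcs` and `residualize` are hypothetical.

```python
import numpy as np

def top_pcs(X, k):
    """Scores of the first k principal components of column-centred X."""
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :k] * s[:k]

def residualize(G, pcs):
    """Remove the part of each genotype column that is linearly explained by the PCs."""
    design = np.column_stack([np.ones(len(pcs)), pcs])  # intercept + PCs
    beta, *_ = np.linalg.lstsq(design, G, rcond=None)
    return G - design @ beta

# Synthetic stand-ins (assumption): 200 individuals from two populations,
# 30 ancestry informative markers and 51 disease-associated SNPs coded 0/1/2.
rng = np.random.default_rng(0)
ancestry = rng.integers(0, 2, size=200)
is_pop1 = ancestry[:, None] == 1
aims = rng.binomial(2, np.where(is_pop1, 0.8, 0.2), size=(200, 30)).astype(float)
snps = rng.binomial(2, np.where(is_pop1, 0.6, 0.3), size=(200, 51)).astype(float)

pcs = top_pcs(aims, k=2)            # first two ancestry PCs
adjusted = residualize(snps, pcs)   # input for the subsequent cluster analysis
```

The residual matrix `adjusted` is uncorrelated with the two ancestry PCs by construction, so any clustering run on it cannot simply rediscover the structure those PCs capture.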
A Catalog of Galaxy Clusters Observed by XMM-Newton
NASA Technical Reports Server (NTRS)
Snowden, S. L.; Mushotzky, R. M.; Kuntz, K. D.; Davis, David S.
2007-01-01
Images and the radial profiles of the temperature, abundance, and brightness for 70 clusters of galaxies observed by XMM-Newton are presented along with a detailed discussion of the data reduction and analysis methods, including background modeling, which were used in the processing. Proper consideration of the various background components is vital to extend the reliable determination of cluster parameters to the largest possible cluster radii. The various components of the background including the quiescent particle background, cosmic diffuse emission, soft proton contamination, and solar wind charge exchange emission are discussed along with suggested means of their identification, filtering, and/or their modeling and subtraction. Every component is spectrally variable, sometimes significantly so, and all components except the cosmic background are temporally variable as well. The distributions of the events over the FOV vary between the components, and some distributions vary with energy. The scientific results from observations of low surface brightness objects and the diffuse background itself can be strongly affected by these background components and therefore great care should be taken in their consideration.
Shukla, Sudhir; Bhargava, Atul; Chatterjee, Avijeet; Pandey, Avinash Chandra; Mishra, Brij K
2010-01-15
Assessment of genetic diversity in a crop-breeding programme helps in the identification of diverse parental combinations to create segregating progenies with maximum genetic variability and facilitates introgression of desirable genes from diverse germplasm into the available genetic base. In the present study, 39 strains of vegetable amaranth (Amaranthus tricolor) were evaluated for eight morphological and seven quality traits for two test seasons to study the extent of genetic divergence among the strains. Multivariate analysis showed that the first four principal components contributed 67.55% of the variability. Cluster analysis grouped the strains into six clusters that displayed a wide range of diversity for most of the traits. Cluster analysis has proved to be an effective method in grouping strains that may facilitate effective management and utilisation in crop-breeding programmes. The diverse strains falling in different clusters were identified, which can be utilised in different hybridisation programmes to develop high-foliage-yielding varieties rich in nutritional components. Copyright (c) 2009 Society of Chemical Industry.
Self-aggregation in scaled principal component space
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ding, Chris H.Q.; He, Xiaofeng; Zha, Hongyuan
2001-10-05
Automatic grouping of voluminous data into meaningful structures is a challenging task frequently encountered in broad areas of science, engineering and information processing. These data clustering tasks are frequently performed in Euclidean space or a subspace chosen from principal component analysis (PCA). Here we describe a space obtained by a nonlinear scaling of PCA in which data objects self-aggregate automatically into clusters. Projection into this space gives sharp distinctions among clusters. Gene expression profiles of cancer tissue subtypes, Web hyperlink structure and Internet newsgroups are analyzed to illustrate interesting properties of the space.
Wolf, Antje; Kirschner, Karl N
2013-02-01
With improvements in computer speed and algorithm efficiency, MD simulations are sampling larger numbers of molecular and biomolecular conformations. Being able to qualitatively and quantitatively sift these conformations into meaningful groups is a difficult and important task, especially when considering the structure-activity paradigm. Here we present a study that combines two popular techniques, principal component (PC) analysis and clustering, for revealing major conformational changes that occur in molecular dynamics (MD) simulations. Specifically, we explored how clustering different PC subspaces affects the resulting clusters versus clustering the complete trajectory data. As a case example, we used the trajectory data from an explicitly solvated simulation of a bacterium's L11·23S ribosomal subdomain, which is a target of thiopeptide antibiotics. Clustering was performed, using K-means and average-linkage algorithms, on data involving the first two to the first five PC subspace dimensions. For the average-linkage algorithm we found that data-point membership, cluster shape, and cluster size depended on the selected PC subspace data. In contrast, K-means provided very consistent results regardless of the selected subspace. Since we present results on a single model system, generalization concerning the clustering of different PC subspaces of other molecular systems is currently premature. However, our hope is that this study illustrates a) the complexities in selecting the appropriate clustering algorithm, b) the complexities in interpreting and validating their results, and c) that by combining PC analysis with subsequent clustering, valuable dynamic and conformational information can be obtained.
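The idea of clustering in the first two to five PC dimensions can be sketched as follows. This is an illustrative toy and not the authors' workflow: the "trajectory" is synthetic, K-means is a plain Lloyd iteration with a deterministic farthest-point start, and the names `pca_scores` and `kmeans` are hypothetical.

```python
import numpy as np

def pca_scores(X, n_components):
    """Project column-centred X onto its first n_components principal axes."""
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * s[:n_components]

def kmeans(X, k, n_iter=100):
    """Lloyd's algorithm with deterministic farthest-point initialisation."""
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels

# Toy "trajectory" (assumption): 300 frames x 12 coordinates, two states
rng = np.random.default_rng(1)
state = np.repeat([0, 1], 150)
frames = rng.normal(size=(300, 12))
frames[:, :3] += 8.0 * state[:, None]   # state 1 is shifted in three coordinates

# Cluster in the subspaces spanned by the first 2, 3, 4 and 5 PCs
labels_by_dim = {d: kmeans(pca_scores(frames, d), k=2) for d in range(2, 6)}
```

Comparing the label vectors in `labels_by_dim` shows how sensitive a given algorithm is to the choice of subspace. On real trajectory data the states are rarely this well separated.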
USDA-ARS's Scientific Manuscript database
Genome wide analysis of orthologous clusters is an important component of comparative genomics studies. Identifying the overlap among orthologous clusters can enable us to elucidate the function and evolution of proteins across multiple species. Here, we report a web platform named OrthoVenn that i...
Chen, Yanxian; Chang, Billy Heung Wing; Ding, Xiaohu; He, Mingguang
2016-11-22
In the present study we attempt to use hypothesis-independent analysis in investigating the patterns in refraction growth in Chinese children, and to explore the possible risk factors affecting the different components of progression, as defined by Principal Component Analysis (PCA). A total of 637 first-born twins in Guangzhou Twin Eye Study with 6-year annual visits (baseline age 7–15 years) were available in the analysis. Cluster 1 to 3 were classified after a partitioning clustering, representing stable, slow and fast progressing groups of refraction respectively. Baseline age and refraction, paternal refraction, maternal refraction and proportion of two myopic parents showed significant differences across the three groups. Three major components of progression were extracted using PCA: “Average refraction”, “Acceleration” and the combination of “Myopia stabilization” and “Late onset of refraction progress”. In regression models, younger children with more severe myopia were associated with larger “Acceleration”. The risk factors of “Acceleration” included change of height and weight, near work, and parental myopia, while female gender, change of height and weight were associated with “Stabilization”, and increased outdoor time was related to “Late onset of refraction progress”. We therefore concluded that genetic and environmental risk factors have different impacts on patterns of refraction progression. PMID:27874105
The Quantitative Analysis of Chennai Automotive Industry Cluster
NASA Astrophysics Data System (ADS)
Bhaskaran, Ethirajan
2016-07-01
Chennai is also called the Detroit of India because its automotive industry produces over 40% of India's vehicles and components. During 2001-2002, the Automotive Component Industries (ACI) in the Ambattur, Thirumalizai and Thirumudivakkam Industrial Estates, Chennai, faced problems with infrastructure, technology, procurement, production and marketing. The objective is to study the quantitative performance of the Chennai Automotive Industry Cluster before (2001-2002) and after (2008-2009) the Cluster Development Approach (CDA). The methodology is the collection of primary data from 100 ACI using a quantitative questionnaire and analysis using Correlation Analysis (CA), Regression Analysis (RA), the Friedman Test (FMT), and the Kruskal-Wallis Test (KWT). The CA computed for the different sets of variables reveals a high degree of relationship between the variables studied. The RA models constructed establish a strong relationship between the dependent variable and a host of independent variables. The models proposed here reveal the approximate relationship in a closer form. The KWT shows no significant difference between the three location clusters with respect to net profit, production cost, marketing costs, procurement costs and gross output. This supports the view that each location has contributed uniformly to the development of the automobile component cluster. The FMT shows no significant difference between industrial units with respect to costs such as production, infrastructure, technology and marketing, and net profit. To conclude, the automotive industries have fully utilized the physical infrastructure and centralised facilities by adopting the CDA and now export their products to North America, South America, Europe, Australia, Africa and Asia. The value chain analysis models have been implemented in all the cluster units. This CDA model can be implemented in industries of underdeveloped and developing countries for cost reduction and productivity increase.
NASA Astrophysics Data System (ADS)
Jha, S. K.; Brockman, R. A.; Hoffman, R. M.; Sinha, V.; Pilchak, A. L.; Porter, W. J.; Buchanan, D. J.; Larsen, J. M.; John, R.
2018-05-01
Principal component analysis and fuzzy c-means clustering algorithms were applied to slip-induced strain and geometric metric data in an attempt to discover unique microstructural configurations and their frequencies of occurrence in statistically representative instantiations of a titanium alloy microstructure. Grain-averaged fatigue indicator parameters were calculated for the same instantiation. The fatigue indicator parameters strongly correlated with the spatial location of the microstructural configurations in the principal components space. The fuzzy c-means clustering method identified clusters of data that varied in terms of their average fatigue indicator parameters. Furthermore, the number of points in each cluster was inversely correlated to the average fatigue indicator parameter. This analysis demonstrates that data-driven methods have significant potential for providing unbiased determination of unique microstructural configurations and their frequencies of occurrence in a given volume from the point of view of strain localization and fatigue crack initiation.
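Fuzzy c-means differs from K-means in that every point receives a graded membership in every cluster. The sketch below uses the standard update rules with fuzzifier m = 2 on synthetic 2-D points. It is not the authors' implementation, and the evenly spaced initialisation is an assumption made only to keep the example deterministic.

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, n_iter=200, tol=1e-6):
    """Return (centers, U); U[i, j] is the membership of point i in cluster j."""
    # Assumption: start from c data points taken at evenly spaced indices
    idx = np.linspace(0, len(X) - 1, c).astype(int)
    centers = X[idx].astype(float)
    for _ in range(n_iter):
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)              # avoid division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)    # each row sums to 1
        W = U ** m
        new_centers = (W.T @ X) / W.sum(axis=0)[:, None]
        shift = np.linalg.norm(new_centers - centers)
        centers = new_centers
        if shift < tol:
            break
    return centers, U

# Synthetic stand-in for points in a principal-component space
rng = np.random.default_rng(2)
points = np.vstack([rng.normal(0.0, 0.5, size=(50, 2)),
                    rng.normal(6.0, 0.5, size=(50, 2))])
centers, U = fuzzy_c_means(points, c=2)
hard_labels = U.argmax(axis=1)   # optional hard assignment
```

The membership matrix `U` can be used directly, for example to weight a per-cluster average of some indicator, or reduced to hard labels as in the last line.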
An improved K-means clustering algorithm in agricultural image segmentation
NASA Astrophysics Data System (ADS)
Cheng, Huifeng; Peng, Hui; Liu, Shanmei
Image segmentation is the first important step in image analysis and image processing. In this paper, according to the characteristics of color crop images, we first transform the color space of the image from RGB to HIS, then select suitable initial cluster centers and the number of clusters using a mean-variance approach and rough set theory, and finally perform the clustering calculation. In this way the color components are segmented automatically and rapidly and target objects are extracted accurately from the background, which provides a reliable basis for the identification, analysis, follow-up calculation and processing of crop images. Experimental results demonstrate that the improved k-means clustering algorithm reduces the amount of computation and enhances the precision and accuracy of clustering.
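The color-space step can be illustrated with the textbook RGB-to-HSI conversion, assuming that "HIS" in the abstract denotes the hue/saturation/intensity model. The abstract does not give the formulas the authors used, so this is a sketch of the standard definition only, and `rgb_to_hsi` is a hypothetical name.

```python
from math import acos, degrees, sqrt

def rgb_to_hsi(r, g, b):
    """Convert RGB values in [0, 1] to (hue in degrees, saturation, intensity)."""
    intensity = (r + g + b) / 3.0
    if intensity == 0.0:
        return 0.0, 0.0, 0.0                     # black: hue and saturation set to 0
    saturation = 1.0 - min(r, g, b) / intensity
    num = 0.5 * ((r - g) + (r - b))
    den = sqrt(max(0.0, (r - g) ** 2 + (r - b) * (g - b)))
    if den == 0.0:
        hue = 0.0                                # grey pixel: hue undefined, 0 by convention
    else:
        hue = degrees(acos(max(-1.0, min(1.0, num / den))))
        if b > g:
            hue = 360.0 - hue                    # lower half of the hue circle
    return hue, saturation, intensity
```

Clustering on hue and saturation rather than raw RGB makes the segmentation less sensitive to brightness changes, which is one plausible motivation for the conversion.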
Description and typology of intensive Chios dairy sheep farms in Greece.
Gelasakis, A I; Valergakis, G E; Arsenos, G; Banos, G
2012-06-01
The aim was to assess the intensified dairy sheep farming systems of the Chios breed in Greece, establishing a typology that may properly describe and characterize them. The study included the total of the 66 farms of the Chios sheep breeders' cooperative Macedonia. Data were collected using a structured direct questionnaire for in-depth interviews, including questions properly selected to obtain a general description of farm characteristics and overall management practices. A multivariate statistical analysis was used on the data to obtain the most appropriate typology. Initially, principal component analysis was used to produce uncorrelated variables (principal components), which would be used for the consecutive cluster analysis. The number of clusters was decided using hierarchical cluster analysis, whereas, the farms were allocated in 4 clusters using k-means cluster analysis. The identified clusters were described and afterward compared using one-way ANOVA or a chi-squared test. The main differences were evident on land availability and use, facility and equipment availability and type, expansion rates, and application of preventive flock health programs. In general, cluster 1 included newly established, intensive, well-equipped, specialized farms and cluster 2 included well-established farms with balanced sheep and feed/crop production. In cluster 3 were assigned small flock farms focusing more on arable crops than on sheep farming with a tendency to evolve toward cluster 2, whereas cluster 4 included farms representing a rather conservative form of Chios sheep breeding with low/intermediate inputs and choosing not to focus on feed/crop production. 
In the studied set of farms, 4 different farmer attitudes were evident: 1) farming disrupts sheep breeding; feed should be purchased and economies of scale will decrease costs (mainly cluster 1), 2) only exercise/pasture land is necessary; at least part of the feed (pasture) must be home-grown to decrease costs (clusters 1 and 4), 3) providing pasture to sheep is essential; on-farm feed production decreases costs (mainly cluster 3), and 4) large-scale farming (feed production and cash crops) does not disrupt sheep breeding; all feed must be produced on-farm to decrease costs (mainly cluster 3). Conducting a profitability analysis among different clusters, exploring and discovering the most beneficial levels of intensified management and capital investment should now be considered. Copyright © 2012 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Rabey, Martin; Slater, Helen; OʼSullivan, Peter; Beales, Darren; Smith, Anne
2015-10-01
The objectives of this study were to explore the existence of subgroups in a cohort with chronic low back pain (n = 294) based on the results of multimodal sensory testing and profile subgroups on demographic, psychological, lifestyle, and general health factors. Bedside (2-point discrimination, brush, vibration and pinprick perception, temporal summation on repeated monofilament stimulation) and laboratory (mechanical detection threshold, pressure, heat and cold pain thresholds, conditioned pain modulation) sensory testing were examined at wrist and lumbar sites. Data were entered into principal component analysis, and 5 component scores were entered into latent class analysis. Three clusters, with different sensory characteristics, were derived. Cluster 1 (31.9%) was characterised by average to high temperature and pressure pain sensitivity. Cluster 2 (52.0%) was characterised by average to high pressure pain sensitivity. Cluster 3 (16.0%) was characterised by low temperature and pressure pain sensitivity. Temporal summation occurred significantly more frequently in cluster 1. Subgroups were profiled on pain intensity, disability, depression, anxiety, stress, life events, fear avoidance, catastrophizing, perception of the low back region, comorbidities, body mass index, multiple pain sites, sleep, and activity levels. Clusters 1 and 2 had a significantly greater proportion of female participants and higher depression and sleep disturbance scores than cluster 3. The proportion of participants undertaking <300 minutes per week of moderate activity was significantly greater in cluster 1 than in clusters 2 and 3. Low back pain, therefore, does not appear to be homogeneous. Pain mechanisms relating to presentations of each subgroup were postulated. Future research may investigate prognoses and interventions tailored towards these subgroups.
Characterization of spatial and temporal variability in hydrochemistry of Johor Straits, Malaysia.
Abdullah, Pauzi; Abdullah, Sharifah Mastura Syed; Jaafar, Othman; Mahmud, Mastura; Khalik, Wan Mohd Afiq Wan Mohd
2015-12-15
Hydrochemistry changes in the Johor Straits over 5 years of monitoring were characterized. Water quality data sets (27 stations and 19 parameters) collected in this area were interpreted using multivariate statistical analysis. Cluster analysis grouped all the stations into four clusters ((Dlink/Dmax) × 100<90) and two clusters ((Dlink/Dmax) × 100<80) for site and period similarities. Principal component analysis rendered six significant components (eigenvalue>1) that explained 82.6% of the total variance of the data set. The classification matrix of the discriminant analysis assigned 88.9-92.6% and 83.3-100% correctness in spatial and temporal variability, respectively. Time series analysis then confirmed that only four parameters did not change significantly over time. Therefore, it is imperative that the environmental impact of reclamation and dredging works, municipal or industrial discharge, marine aquaculture and shipping activities in this area be effectively controlled and managed. Copyright © 2015 Elsevier Ltd. All rights reserved.
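The "eigenvalue>1" rule used above (often called the Kaiser criterion) keeps the principal components of the correlation matrix whose eigenvalues exceed 1, and reports the share of total variance they explain. A minimal sketch on synthetic data follows. The data and the function name `kaiser_components` are illustrative assumptions, not taken from the study.

```python
import numpy as np

def kaiser_components(X):
    """Eigenvalues of the correlation matrix, count with eigenvalue > 1, and their variance share."""
    R = np.corrcoef(X, rowvar=False)               # variables are columns
    eigvals = np.sort(np.linalg.eigvalsh(R))[::-1]  # descending order
    keep = int(np.sum(eigvals > 1.0))
    explained = eigvals[:keep].sum() / eigvals.sum()
    return eigvals, keep, explained

# Synthetic example: four measured parameters driven by two independent factors
rng = np.random.default_rng(3)
a, b = rng.normal(size=(2, 500))
X = np.column_stack([a, a + 0.05 * rng.normal(size=500),
                     b, b + 0.05 * rng.normal(size=500)])
eigvals, keep, explained = kaiser_components(X)
```

Because the eigenvalues of a correlation matrix sum to the number of variables, an eigenvalue above 1 means the component carries more variance than any single standardized variable.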
Ofner, Johannes; Kamilli, Katharina A; Eitenberger, Elisabeth; Friedbacher, Gernot; Lendl, Bernhard; Held, Andreas; Lohninger, Hans
2015-09-15
The chemometric analysis of multisensor hyperspectral data allows a comprehensive image-based analysis of precipitated atmospheric particles. Atmospheric particulate matter was precipitated on aluminum foils and analyzed by Raman microspectroscopy and subsequently by electron microscopy and energy dispersive X-ray spectroscopy. All obtained images were of the same spot of an area of 100 × 100 μm(2). The two hyperspectral data sets and the high-resolution scanning electron microscope images were fused into a combined multisensor hyperspectral data set. This multisensor data cube was analyzed using principal component analysis, hierarchical cluster analysis, k-means clustering, and vertex component analysis. The detailed chemometric analysis of the multisensor data allowed an extensive chemical interpretation of the precipitated particles, and their structure and composition led to a comprehensive understanding of atmospheric particulate matter.
Clustering analysis strategies for electron energy loss spectroscopy (EELS).
Torruella, Pau; Estrader, Marta; López-Ortega, Alberto; Baró, Maria Dolors; Varela, Maria; Peiró, Francesca; Estradé, Sònia
2018-02-01
In this work, the use of cluster analysis algorithms, widely applied in the field of big data, is proposed to explore and analyze electron energy loss spectroscopy (EELS) data sets. Three different data clustering approaches have been tested with both simulated and experimental data from Fe3O4/Mn3O4 core/shell nanoparticles. The first method consists of applying data clustering directly to the acquired spectra. A second approach is to analyze spectral variance with principal component analysis (PCA) within a given data cluster. Lastly, data clustering on PCA score maps is discussed. The advantages and requirements of each approach are studied. Results demonstrate how clustering is able to recover compositional and oxidation state information from EELS data with minimal user input, giving great prospects for its usage in EEL spectroscopy. Copyright © 2017 Elsevier B.V. All rights reserved.
Chemometric expertise of the quality of groundwater sources for domestic use.
Spanos, Thomas; Ene, Antoaneta; Simeonova, Pavlina
2015-01-01
In the present study 49 representative sites have been selected for the collection of water samples from central water supplies with different geographical locations in the region of Kavala, Northern Greece. Ten physicochemical parameters (pH, electric conductivity, nitrate, chloride, sodium, potassium, total alkalinity, total hardness, bicarbonate and calcium) were analyzed monthly, in the period from January 2010 to December 2010. Chemometric methods were used for monitoring data mining and interpretation (cluster analysis, principal components analysis and source apportioning by principal components regression). The clustering of the chemical indicators delivers two major clusters related to the water hardness and the mineral components (impacted by sea, bedrock and acidity factors). The sampling locations are separated into three major clusters corresponding to the spatial distribution of the sites - coastal, lowland and semi-mountainous. The principal components analysis reveals two latent factors responsible for the data structures, which are also an indication for the sources determining the groundwater quality of the region (conditionally named "mineral" factor and "water hardness" factor). By the apportionment approach it is shown what the contribution is of each of the identified sources to the formation of the total concentration of each one of the chemical parameters. The mean values of the studied physicochemical parameters were found to be within the limits given in the 98/83/EC Directive. The water samples are appropriate for human consumption. The results of this study provide an overview of the hydrogeological profile of water supply system for the studied area.
Joining X-Ray to Lensing: An Accurate Combined Analysis of MACS J0416.1-2403
NASA Astrophysics Data System (ADS)
Bonamigo, M.; Grillo, C.; Ettori, S.; Caminha, G. B.; Rosati, P.; Mercurio, A.; Annunziatella, M.; Balestra, I.; Lombardi, M.
2017-06-01
We present a novel approach for a combined analysis of X-ray and gravitational lensing data and apply this technique to the merging galaxy cluster MACS J0416.1-2403. The method exploits the information on the intracluster gas distribution that comes from a fit of the X-ray surface brightness and then includes the hot gas as a fixed mass component in the strong-lensing analysis. With our new technique, we can separate the collisional from the collision-less diffuse mass components, thus obtaining a more accurate reconstruction of the dark matter distribution in the core of a cluster. We introduce an analytical description of the X-ray emission coming from a set of dual pseudo-isothermal elliptical mass distributions, which can be directly used in most lensing software packages. By combining Chandra observations with Hubble Frontier Fields imaging and Multi Unit Spectroscopic Explorer spectroscopy in MACS J0416.1-2403, we measure a projected gas-to-total mass fraction of approximately 10% at 350 kpc from the cluster center. Compared to the results of a more traditional cluster mass model (diffuse halos plus member galaxies), we find a significant difference in the cumulative projected mass profile of the dark matter component and that the dark matter over total mass fraction is almost constant, out to more than 350 kpc. In the coming era of large surveys, these results show the need for multiprobe analyses for detailed dark matter studies in galaxy clusters.
Looking Wider and Further: The Evolution of Galaxies Inside Galaxy Clusters
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhang, Yuanyuan
2016-01-01
Galaxy clusters are rare objects in the universe, but on-going wide field optical surveys are identifying many thousands of them to redshift 1.0 and beyond. Using early data from the Dark Energy Survey (DES) and publicly released data from the Sloan Digital Sky Survey (SDSS), this dissertation explores the evolution of cluster galaxies in the redshift range from 0 to 1.0. As it is common for deep wide field sky surveys like DES to struggle with galaxy detection efficiency in cluster cores, the first component of this dissertation describes an efficient package that helps resolve the issue. The second part focuses on the formation of cluster galaxies. The study quantifies the growth of cluster bright central galaxies (BCGs), and argues for the importance of merging and intra-cluster light production during BCG evolution. An analysis of the cluster red sequence galaxy luminosity function is also performed, demonstrating that the abundance of these galaxies is mildly dependent on cluster mass and redshift. The last component of the dissertation characterizes the properties of galaxy filaments to help understand cluster environments.
Zhang, Xianming; Lohmann, Rainer; Dassuncao, Clifton; Hu, Xindi C.; Weber, Andrea K.; Vecitis, Chad D.; Sunderland, Elsie M.
2017-01-01
Exposure to poly- and perfluoroalkyl substances (PFASs) has been associated with adverse health effects in humans and wildlife. Understanding pollution sources is essential for environmental regulation but source attribution for PFASs has been confounded by limited information on industrial releases and rapid changes in chemical production. Here we use principal component analysis (PCA), hierarchical clustering, and geospatial analysis to understand source contributions to 14 PFASs measured across 37 sites in the Northeastern United States in 2014. PFASs are significantly elevated in urban areas compared to rural sites except for perfluorobutane sulfonate (PFBS), N-methyl perfluorooctanesulfonamidoacetic acid (N-MeFOSAA), perfluoroundecanoate (PFUnDA) and perfluorododecanoate (PFDoDA). The highest PFAS concentrations across sites were for perfluorooctanoate (PFOA, 56 ng L−1) and perfluorohexane sulfonate (PFHxS, 43 ng L−1), and PFOS levels are lower than earlier measurements of U.S. surface waters. PCA and cluster analysis indicate three main statistical groupings of PFASs. Geospatial analysis of watersheds reveals the first component/cluster originates from a mixture of contemporary point sources such as airports and textile mills. Atmospheric sources from the waste sector are consistent with the second component, and the metal smelting industry plausibly explains the third component. We find this source-attribution technique is effective for better understanding PFAS sources in urban areas. PMID:28217711
NASA Astrophysics Data System (ADS)
Farsadnia, Farhad; Ghahreman, Bijan
2016-04-01
Identification of hydrologically homogeneous groups is considered both fundamental and applied research in hydrology. Clustering methods are among the conventional methods used to assess hydrologically homogeneous regions. Recently, the Self-Organizing feature Map (SOM) method has been applied in some studies. However, the main problem of this method is the interpretation of its output map. Therefore, the SOM output is used as input to other clustering algorithms. The aim of this study is to apply a two-level Self-Organizing feature Map and Ward hierarchical clustering method to determine the hydrologically homogeneous regions in North and Razavi Khorasan provinces. First, principal component analysis was used to reduce the dimension of the SOM input matrix; then the SOM was used to form a two-dimensional feature map. To determine homogeneous regions for flood frequency analysis, SOM output nodes were used as input into the Ward method. Generally, the regions identified by the clustering algorithms are not statistically homogeneous. Consequently, they have to be adjusted to improve their homogeneity. After adjustment of the homogeneous regions by L-moment tests, five hydrologically homogeneous regions were identified. Finally, adjusted regions were created by a two-level SOM and then the best regional distribution function and associated parameters were selected by the L-moment approach. The results showed that the combination of self-organizing maps and Ward hierarchical clustering with principal components as input is more effective for obtaining hydrologically homogeneous regions than the hierarchical method alone with either principal components or standardized inputs.
A Dimensionality Reduction-Based Multi-Step Clustering Method for Robust Vessel Trajectory Analysis
Liu, Jingxian; Wu, Kefeng
2017-01-01
The Shipboard Automatic Identification System (AIS) is crucial for navigation safety and maritime surveillance, and data mining and pattern analysis of AIS information have attracted considerable attention in terms of both basic research and practical applications. Clustering of spatio-temporal AIS trajectories can be used to identify abnormal patterns and mine customary route data for transportation safety. Thus, the capacities of navigation safety and maritime traffic monitoring could be enhanced correspondingly. However, trajectory clustering is often sensitive to undesirable outliers and is essentially more complex compared with traditional point clustering. To overcome this limitation, a multi-step trajectory clustering method is proposed in this paper for robust AIS trajectory clustering. In particular, Dynamic Time Warping (DTW), a similarity measurement method, is introduced in the first step to measure the distances between different trajectories. The calculated distances, inversely proportional to the similarities, constitute a distance matrix in the second step. Furthermore, as a widely used dimensionality reduction method, Principal Component Analysis (PCA) is exploited to decompose the obtained distance matrix. In particular, the top k principal components with above 95% accumulative contribution rate are extracted by PCA, and the number of the centers k is chosen. The k centers are found by the improved automatic center selection algorithm. In the last step, the improved center clustering algorithm with k clusters is implemented on the distance matrix to achieve the final AIS trajectory clustering results. In order to improve the accuracy of the proposed multi-step clustering algorithm, an automatic algorithm for choosing the k clusters is developed according to the similarity distance. 
Numerous experiments on realistic AIS trajectory datasets in the bridge area waterway and Mississippi River have been implemented to compare our proposed method with traditional spectral clustering and fast affinity propagation clustering. Experimental results have illustrated its superior performance in terms of quantitative and qualitative evaluations. PMID:28777353
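The first steps described in the abstract above (DTW distances, a pairwise distance matrix, and a cutoff on cumulative contribution rate) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the Euclidean point-to-point cost, the use of "at least 95%" as the cutoff, and the assumption that the eigenvalues of the decomposed matrix have already been computed elsewhere are all illustrative choices.

```python
from math import hypot, inf


def dtw_distance(a, b):
    """Textbook dynamic-programming DTW between two sequences of (x, y) points."""
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Assumed local cost: Euclidean distance between the two positions.
            d = hypot(a[i - 1][0] - b[j - 1][0], a[i - 1][1] - b[j - 1][1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def distance_matrix(trajectories):
    """Symmetric matrix of pairwise DTW distances."""
    size = len(trajectories)
    mat = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(i + 1, size):
            mat[i][j] = mat[j][i] = dtw_distance(trajectories[i], trajectories[j])
    return mat


def components_for_share(eigenvalues, share=0.95):
    """Smallest k whose largest eigenvalues reach the given cumulative share.

    Assumes positive eigenvalues that were obtained elsewhere (hypothetical input).
    """
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for k, value in enumerate(ordered, start=1):
        running += value
        if running / total >= share:
            return k
    return len(ordered)
```

Two trajectories that follow the same path at different sampling rates get a DTW distance of zero here, which is the property that makes DTW attractive for comparing tracks of unequal length.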
Lu, Chi-Jie; Chang, Chi-Chang
2014-01-01
Sales forecasting plays an important role in operating a business since it can be used to determine the required inventory level to meet consumer demand and avoid the problem of under/overstocking. Improving the accuracy of sales forecasting has become an important issue of operating a business. This study proposes a hybrid sales forecasting scheme by combining independent component analysis (ICA) with K-means clustering and support vector regression (SVR). The proposed scheme first uses the ICA to extract hidden information from the observed sales data. The extracted features are then applied to K-means algorithm for clustering the sales data into several disjoined clusters. Finally, the SVR forecasting models are applied to each group to generate final forecasting results. Experimental results from information technology (IT) product agent sales data reveal that the proposed sales forecasting scheme outperforms the three comparison models and hence provides an efficient alternative for sales forecasting. PMID:25045738
Zou, Ling; Guo, Qian; Xu, Yi; Yang, Biao; Jiao, Zhuqing; Xiang, Jianbo
2016-04-29
Functional magnetic resonance imaging (fMRI) is an important tool in neuroscience for assessing connectivity and interactions between distant areas of the brain. To find and characterize the coherent patterns of brain activity as a means of identifying brain systems for the cognitive reappraisal of the emotion task, both density-based k-means clustering and independent component analysis (ICA) methods can be applied to characterize the interactions between brain regions involved in cognitive reappraisal of emotion. Our results reveal that compared with the ICA method, the density-based k-means clustering method provides a higher sensitivity of polymerization. In addition, it is more sensitive to those relatively weak functional connection regions. Thus, the study concludes that in the process of receiving emotional stimuli, the relatively obvious activation areas are mainly distributed in the frontal lobe, cingulum and near the hypothalamus. Furthermore, density-based k-means clustering method creates a more reliable method for follow-up studies of brain functional connectivity.
Opara, Umezuruike Linus; Jacobson, Dan; Al-Saady, Nadiya Abubakar
2010-01-01
Banana is an important crop grown in Oman and there is a dearth of information on its genetic diversity to assist in crop breeding and improvement programs. This study employed amplified fragment length polymorphism (AFLP) to investigate the genetic variation in local banana cultivars from the southern region of Oman. Using 12 primer combinations, a total of 1094 bands were scored, of which 1012 were polymorphic. Eighty-two unique markers were identified, which revealed the distinct separation of the seven cultivars. The results obtained show that AFLP can be used to differentiate the banana cultivars. Further classification by phylogenetic, hierarchical clustering and principal component analyses showed significant differences between the clusters found with molecular markers and those clusters created by previous studies using morphological analysis. Based on the analytical results, a consensus dendrogram of the banana cultivars is presented. PMID:20443211
HPLC-DAD-ESI-MS Analysis of Flavonoids from Leaves of Different Cultivars of Sweet Osmanthus.
Wang, Yiguang; Fu, Jianxin; Zhang, Chao; Zhao, Hongbo
2016-09-14
Osmanthus fragrans Lour. has traditionally been a popular ornamental plant in China. In this study, ethanol extracts of the leaves of four cultivar groups of O. fragrans were analyzed by high-performance liquid chromatography coupled with diode array detection (HPLC-DAD) and high-performance liquid chromatography with electrospray ionization and mass spectrometry (HPLC-ESI-MS). The results suggest that variation in flavonoids among O. fragrans cultivars is quantitative, rather than qualitative. Fifteen components were detected and separated, among which, the structures of 11 flavonoids and two coumarins were identified or tentatively identified. According to principal component analysis (PCA) and hierarchical cluster analysis (HCA) based on the abundance of these components (expressed as rutin equivalents), 22 selected cultivars were classified into four clusters. The seven cultivars from Cluster III ('Xiaoye Sugui', 'Boye Jingui', 'Wuyi Dangui', 'Yingye Dangui', 'Danzhuang', 'Foding Zhu', and 'Tianxiang Taige'), which are enriched in rutin and total flavonoids, and 'Sijigui' from Cluster II which contained the highest amounts of kaempferol glycosides and apigenin 7-O-glucoside, could be selected as potential pharmaceutical resources. However, the chemotaxonomy in this paper does not correlate with the distribution of the existing cultivar groups, demonstrating that the distribution of flavonoids in O. fragrans leaves does not provide an effective means of classification for O. fragrans cultivars based on flower color.
NASA Astrophysics Data System (ADS)
Marchegiani, P.; Colafrancesco, S.
2017-08-01
A recent stacking analysis of Planck HFI data of galaxy clusters led to the derivation of the cluster temperatures using the relativistic corrections to the Sunyaev-Zel'dovich effect (SZE). However, the temperatures of high-temperature clusters, as derived from this analysis, were basically higher than the temperatures derived from X-ray measurements, at a moderate statistical significance of 1.5σ. This discrepancy has been attributed by Hurier to calibration issues. In this paper, we discuss an alternative explanation for this discrepancy in terms of a non-thermal SZE astrophysical component. We find that this explanation can work if non-thermal electrons in galaxy clusters have a low minimum momentum (p1 ~ 0.5-1), and if their pressure is of the order of 20-30 per cent of the thermal gas pressure. Both these conditions are hard to obtain if the non-thermal electrons are mixed with the hot gas in the intracluster medium, but can be possibly obtained if the non-thermal electrons are mainly confined in bubbles with a high amount of non-thermal plasma and a low amount of thermal plasma, or are in giant radio lobes/relics in the outskirts of the clusters. To derive more precise results on the properties of the non-thermal electrons in clusters, and in view of more solid detections of a discrepancy between X-ray- and SZE-derived cluster temperatures that cannot be explained in other ways, it would be necessary to reproduce the full analysis done by Hurier by systematically adding the non-thermal component of the SZE.
The Two-Component Virial Theorem and the Physical Properties of Stellar Systems.
Dantas; Ribeiro; Capelato; de Carvalho RR
2000-01-01
Motivated by present indirect evidence that galaxies are surrounded by dark matter halos, we investigate whether their physical properties can be described by a formulation of the virial theorem that explicitly takes into account the gravitational potential term representing the interaction of the dark halo with the baryonic or luminous component. Our analysis shows that the application of such a "two-component virial theorem" not only accounts for the scaling relations displayed by, in particular, elliptical galaxies, but also for the observed properties of all virialized stellar systems, ranging from globular clusters to galaxy clusters.
Analysis of the mutations induced by conazole fungicides in vivo.
Ross, Jeffrey A; Leavitt, Sharon A
2010-05-01
The mouse liver tumorigenic conazole fungicides triadimefon and propiconazole have previously been shown to be in vivo mouse liver mutagens in the Big Blue transgenic mutation assay when administered in feed at tumorigenic doses, whereas the non-tumorigenic conazole myclobutanil was not mutagenic. DNA sequencing of the mutants recovered from each treatment group as well as from animals receiving control diet was conducted to gain additional insight into the mode of action by which tumorigenic conazoles induce mutations. Relative dinucleotide mutabilities (RDMs) were calculated for each possible dinucleotide in each treatment group and then examined by multivariate statistical analysis techniques. Unsupervised hierarchical clustering analysis of RDM values segregated two independent control groups together, along with the non-tumorigen myclobutanil. The two tumorigenic conazoles clustered together in a distinct grouping. Partitioning around medoids of RDM values into two clusters also groups triadimefon and propiconazole together in one cluster and the two control groups and myclobutanil together in a second cluster. Principal component analysis of these results identifies two components that account for 88.3% of the variability in the points. Taken together, these results are consistent with the hypothesis that propiconazole- and triadimefon-induced mutations do not represent clonal expansion of background mutations and support the hypothesis that they arise from the accumulation of reactive electrophilic metabolic intermediates within the liver in vivo.
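For readers unfamiliar with partitioning around medoids, the idea mentioned in the abstract above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it searches all medoid combinations exhaustively (practical only for a handful of items, and not the swap-based PAM heuristic), it uses Euclidean distance, and all function names are hypothetical. It does not reproduce the study's RDM data or software.

```python
from itertools import combinations


def euclidean(p, q):
    """Assumed dissimilarity: plain Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(p, q)) ** 0.5


def total_cost(points, medoids, dist):
    """Sum over all points of the distance to the nearest medoid."""
    return sum(min(dist(p, points[m]) for m in medoids) for p in points)


def best_medoids(points, k, dist=euclidean):
    """Exhaustively choose the k point indices that minimise the total cost."""
    return min(combinations(range(len(points)), k),
               key=lambda meds: total_cost(points, meds, dist))


def assign(points, medoids, dist=euclidean):
    """Label each point with the index of its nearest medoid."""
    return [min(medoids, key=lambda m: dist(p, points[m])) for p in points]
```

Unlike k-means, the cluster centres here are always actual observations, which is why the method works with any dissimilarity measure and not only with vectors that can be averaged.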
NASA Astrophysics Data System (ADS)
Huang, W.; Campredon, R.; Abrao, J. J.; Bernat, M.; Latouche, C.
1994-06-01
In the last decade, the Atlantic coast of south-eastern Brazil has been affected by increasing deforestation and anthropogenic effluents. Sediments in the coastal lagoons have recorded the process of such environmental change. Thirty-seven sediment samples from three cores in Piratininga Lagoon, Rio de Janeiro, were analyzed for their major components and minor element concentrations in order to examine geochemical characteristics and the depositional environment and to investigate the variation of heavy metals of environmental concern. Two multivariate analysis methods, principal component analysis and cluster analysis, were performed on the analytical data set to help visualize the sample clusters and the element associations. On the whole, the sediment samples from each core are similar and the sample clusters corresponding to the three cores are clearly separated, as a result of the different conditions of sedimentation. Some changes in the depositional environment are recognized using the results of multivariate analysis. The enrichment of Pb, Cu, and Zn in the upper parts of cores is in agreement with increasing anthropogenic influx (pollution).
Fernández-Arjona, María del Mar; Grondona, Jesús M.; Granados-Durán, Pablo; Fernández-Llebrez, Pedro; López-Ávalos, María D.
2017-01-01
It is known that microglia morphology and function are closely related, but only a few studies have objectively described different morphological subtypes. To address this issue, morphological parameters of microglial cells were analyzed in a rat model of aseptic neuroinflammation. After the injection of a single dose of the enzyme neuraminidase (NA) within the lateral ventricle (LV), an acute inflammatory process occurs. Sections from NA-injected animals and sham controls were immunolabeled with the microglial marker IBA1, which highlights ramifications and features of the cell shape. Using images obtained by section scanning, individual microglial cells were sampled from various regions (septofimbrial nucleus, hippocampus and hypothalamus) at different times post-injection (2, 4 and 12 h). Each cell yielded a set of 15 morphological parameters by means of image analysis software. Five initial parameters (including fractal measures) were statistically different in cells from NA-injected rats (most of them IL-1β positive, i.e., M1-state) compared to those from control animals (none of them IL-1β positive, i.e., surveillant state). However, additional multimodal parameters proved more suitable for hierarchical cluster analysis (HCA). This method classified the microglia population into four clusters. Furthermore, a linear discriminant analysis (LDA) suggested three specific parameters to objectively classify any microglia by a decision tree. In addition, a principal components analysis (PCA) revealed two extra valuable variables that allowed microglia to be further classified into a total of eight sub-clusters or types. The spatio-temporal distribution of these different morphotypes in our rat inflammation model made it possible to relate specific morphotypes to microglial activation status and brain location. An objective method for microglia classification based on morphological parameters is proposed. 
Main points: Microglia undergo a quantifiable morphological change upon neuraminidase-induced inflammation. Hierarchical cluster and principal components analysis allow morphological classification of microglia. Brain location of microglia is a relevant factor. PMID:28848398
Unsupervised analysis of small animal dynamic Cerenkov luminescence imaging
NASA Astrophysics Data System (ADS)
Spinelli, Antonello E.; Boschi, Federico
2011-12-01
Clustering analysis (CA) and principal component analysis (PCA) were applied to dynamic Cerenkov luminescence images (dCLI). In order to investigate the performances of the proposed approaches, two distinct dynamic data sets obtained by injecting mice with 32P-ATP and 18F-FDG were acquired using the IVIS 200 optical imager. The k-means clustering algorithm has been applied to dCLI and was implemented using interactive data language 8.1. We show that cluster analysis allows us to obtain good agreement between the clustered and the corresponding emission regions like the bladder, the liver, and the tumor. We also show a good correspondence between the time activity curves of the different regions obtained by using CA and manual region of interest analysis on dCLIT and PCA images. We conclude that CA provides an automatic unsupervised method for the analysis of preclinical dynamic Cerenkov luminescence image data.
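As a rough illustration of how k-means can group time activity curves like those described in the abstract above, the following Python sketch treats each pixel's curve as a vector and clusters the vectors. It is an assumption-laden toy, not the authors' Interactive Data Language code: the naive "first k curves" initialisation, the iteration cap, and all names are hypothetical.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equally long curves."""
    return sum((x - y) ** 2 for x, y in zip(u, v))


def kmeans(curves, k, iterations=50):
    """Lloyd's algorithm on a list of curves; returns (labels, centers)."""
    # Naive, deterministic initialisation (an assumption): the first k curves.
    centers = [list(c) for c in curves[:k]]
    labels = [0] * len(curves)
    for _ in range(iterations):
        # Assignment step: each curve goes to its nearest center.
        labels = [min(range(k), key=lambda c: squared_distance(curve, centers[c]))
                  for curve in curves]
        # Update step: each center becomes the mean curve of its members.
        new_centers = []
        for c in range(k):
            members = [curve for curve, lab in zip(curves, labels) if lab == c]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[c])  # keep the center of an empty cluster
        if new_centers == centers:
            break  # converged: centers no longer move
        centers = new_centers
    return labels, centers
```

In an imaging setting the returned labels would be reshaped back onto the pixel grid to give a cluster map, and each center is the mean time activity curve of its region.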
NASA Technical Reports Server (NTRS)
Parada, N. D. J.; Novo, E. M. L. M.
1983-01-01
Two sets of MSS/LANDSAT data with solar elevation ranging from 22 deg to 41 deg were used at the Image-100 System to implement the Eliason et al. technique for extracting the topographic modulation component. An unsupervised cluster analysis was used to obtain an average brightness image for each channel. Analysis of the enhanced images shows that the technique for extracting the topographic modulation component is more appropriate for MSS data obtained under high sun elevation angles. Low sun elevation increases the variance of each cluster so that the average brightness doesn't represent its albedo properties. The topographic modulation component applied to low sun elevation angle data damages rather than enhances topographic information. Better results were produced for channels 4 and 5 than for channels 6 and 7.
Borri, Marco; Schmidt, Maria A; Powell, Ceri; Koh, Dow-Mu; Riddell, Angela M; Partridge, Mike; Bhide, Shreerang A; Nutting, Christopher M; Harrington, Kevin J; Newbold, Katie L; Leach, Martin O
2015-01-01
To describe a methodology, based on cluster analysis, to partition multi-parametric functional imaging data into groups (or clusters) of similar functional characteristics, with the aim of characterizing functional heterogeneity within head and neck tumour volumes. To evaluate the performance of the proposed approach on a set of longitudinal MRI data, analysing the evolution of the obtained sub-sets with treatment. The cluster analysis workflow was applied to a combination of dynamic contrast-enhanced and diffusion-weighted imaging MRI data from a cohort of squamous cell carcinoma of the head and neck patients. Cumulative distributions of voxels, containing pre and post-treatment data and including both primary tumours and lymph nodes, were partitioned into k clusters (k = 2, 3 or 4). Principal component analysis and cluster validation were employed to investigate data composition and to independently determine the optimal number of clusters. The evolution of the resulting sub-regions with induction chemotherapy treatment was assessed relative to the number of clusters. The clustering algorithm was able to separate clusters which significantly reduced in voxel number following induction chemotherapy from clusters with a non-significant reduction. Partitioning with the optimal number of clusters (k = 4), determined with cluster validation, produced the best separation between reducing and non-reducing clusters. The proposed methodology was able to identify tumour sub-regions with distinct functional properties, independently separating clusters which were affected differently by treatment. This work demonstrates that unsupervised cluster analysis, with no prior knowledge of the data, can be employed to provide a multi-parametric characterization of functional heterogeneity within tumour volumes.
Goldbaum, Michael H; Jang, Gil-Jin; Bowd, Chris; Hao, Jiucang; Zangwill, Linda M; Liebmann, Jeffrey; Girkin, Christopher; Jung, Tzyy-Ping; Weinreb, Robert N; Sample, Pamela A
2009-12-01
To determine if the patterns uncovered with variational Bayesian-independent component analysis-mixture model (VIM) applied to a large set of normal and glaucomatous fields obtained with the Swedish Interactive Thresholding Algorithm (SITA) are distinct, recognizable, and useful for modeling the severity of the field loss. SITA fields were obtained with the Humphrey Visual Field Analyzer (Carl Zeiss Meditec, Inc, Dublin, California) on 1,146 normal eyes and 939 glaucoma eyes from subjects followed by the Diagnostic Innovations in Glaucoma Study and the African Descent and Glaucoma Evaluation Study. VIM modifies independent component analysis (ICA) to develop separate sets of ICA axes in the cluster of normal fields and the 2 clusters of abnormal fields. Of 360 models, the model with the best separation of normal and glaucomatous fields was chosen for creating the maximally independent axes. Grayscale displays of fields generated by VIM on each axis were compared. SITA fields most closely associated with each axis and displayed in grayscale were evaluated for consistency of pattern at all severities. The best VIM model had 3 clusters. Cluster 1 (1,193) was mostly normal (1,089, 95% specificity) and had 2 axes. Cluster 2 (596) contained mildly abnormal fields (513) and 2 axes; cluster 3 (323) held mostly moderately to severely abnormal fields (322) and 5 axes. Sensitivity for clusters 2 and 3 combined was 88.9%. The VIM-generated field patterns differed from each other and resembled glaucomatous defects (eg, nasal step, arcuate, temporal wedge). SITA fields assigned to an axis resembled each other and the VIM-generated patterns for that axis. Pattern severity increased in the positive direction of each axis by expansion or deepening of the axis pattern. VIM worked well on SITA fields, separating them into distinctly different yet recognizable patterns of glaucomatous field defects. 
The axis and pattern properties make VIM a good candidate as a preliminary process for detecting progression.
Syazwan, AI; Rafee, B Mohd; Juahir, Hafizan; Azman, AZF; Nizar, AM; Izwyn, Z; Syahidatussyakirah, K; Muhaimin, AA; Yunos, MA Syafiq; Anita, AR; Hanafiah, J Muhamad; Shaharuddin, MS; Ibthisham, A Mohd; Hasmadi, I Mohd; Azhar, MN Mohamad; Azizan, HS; Zulfadhli, I; Othman, J; Rozalini, M; Kamarul, FT
2012-01-01
Purpose: To analyze and characterize a multidisciplinary, integrated indoor air quality checklist for evaluating the health risk of building occupants in a nonindustrial workplace setting. Design: A cross-sectional study based on a participatory occupational health program conducted by the National Institute of Occupational Safety and Health (Malaysia) and Universiti Putra Malaysia. Method: A modified version of the indoor environmental checklist published by the Department of Occupational Health and Safety, based on the literature and discussion with occupational health and safety professionals, was used in the evaluation process. Summated scores were given according to the cluster analysis and principal component analysis in the characterization of risk. Environmetric techniques were used to classify the risk of variables in the checklist. Identification of the possible source of item pollutants was also evaluated from a semiquantitative approach. Result: Hierarchical agglomerative cluster analysis resulted in the grouping of factorial components into three clusters (high complaint, moderate-high complaint, moderate complaint), which were further analyzed by discriminant analysis. From this, 15 major variables that influence indoor air quality were determined. Principal component analysis of each cluster revealed that the main factors influencing the high complaint group were fungal-related problems, chemical indoor dispersion, detergent, renovation, thermal comfort, and location of fresh air intake. The moderate-high complaint group showed significant high loading on ventilation, air filters, and smoking-related activities. The moderate complaint group showed high loading on dampness, odor, and thermal comfort. Conclusion: This semiquantitative assessment, which graded risk from low to high based on the intensity of the problem, shows promising and reliable results. 
It should be used as an important tool in the preliminary assessment of indoor air quality and as a categorizing method for further IAQ investigations and complaints procedures. PMID:23055779
Schultz, K K; Bennett, T B; Nordlund, K V; Döpfer, D; Cook, N B
2016-09-01
Transition cow management has been tracked via the Transition Cow Index (TCI; AgSource Cooperative Services, Verona, WI) since 2006. Transition Cow Index was developed to measure the difference between actual and predicted milk yield at first test day to evaluate the relative success of the transition period program. This project aimed to assess TCI in relation to all commonly used Dairy Herd Improvement (DHI) metrics available through AgSource Cooperative Services. Regression analysis was used to isolate variables that were relevant to TCI, and then principal components analysis and network analysis were used to determine the relative strength and relatedness among variables. Finally, cluster analysis was used to segregate herds based on similarity of relevant variables. The DHI data were obtained from 2,131 Wisconsin dairy herds with test-day mean ≥30 cows, which were tested ≥10 times throughout the 2014 calendar year. The original list of 940 DHI variables was reduced through expert-driven selection and regression analysis to 23 variables. The K-means cluster analysis produced 5 distinct clusters. Descriptive statistics were calculated for the 23 variables per cluster grouping. Using principal components analysis, cluster analysis, and network analysis, 4 parameters were isolated as most relevant to TCI; these were energy-corrected milk, 3 measures of intramammary infection (dry cow cure rate, linear somatic cell count score in primiparous cows, and new infection rate), peak ratio, and days in milk at peak milk production. These variables together with cow and newborn calf survival measures form a group of metrics that can be used to assist in the evaluation of overall transition period performance. Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Users matter : multi-agent systems model of high performance computing cluster users.
DOE Office of Scientific and Technical Information (OSTI.GOV)
North, M. J.; Hood, C. S.; Decision and Information Sciences
2005-01-01
High performance computing clusters have been a critical resource for computational science for over a decade and have more recently become integral to large-scale industrial analysis. Despite their well-specified components, the aggregate behavior of clusters is poorly understood. The difficulties arise from complicated interactions between cluster components during operation. These interactions have been studied by many researchers, some of whom have identified the need for holistic multi-scale modeling that simultaneously includes network level, operating system level, process level, and user level behaviors. Each of these levels presents its own modeling challenges, but the user level is the most complex due to the adaptability of human beings. In this vein, there are several major user modeling goals, namely descriptive modeling, predictive modeling and automated weakness discovery. This study shows how multi-agent techniques were used to simulate a large-scale computing cluster at each of these levels.
Factor Analysis and Counseling Research
ERIC Educational Resources Information Center
Weiss, David J.
1970-01-01
Topics discussed include factor analysis versus cluster analysis, analysis of Q correlation matrices, ipsativity and factor analysis, and tests for the significance of a correlation matrix prior to application of factor analytic techniques. Techniques for factor extraction discussed include principal components, canonical factor analysis, alpha…
NASA Astrophysics Data System (ADS)
Ye, M.; Pacheco Castro, R. B.; Pacheco Avila, J.; Cabrera Sansores, A.
2014-12-01
The karstic aquifer of Yucatan is a vulnerable and complex system. The first fifteen meters of this aquifer have been polluted; protecting this resource is therefore important because it is the only source of potable water for the entire State. Through the assessment of groundwater quality we can gain some knowledge about the main processes governing water chemistry as well as spatial patterns, which are important for establishing protection zones. In this work, multivariate statistical techniques are used to assess the groundwater quality of the supply wells (30 to 40 meters deep) in the hydrogeologic region of the Ring of Cenotes, located in Yucatan, Mexico. Cluster analysis and principal component analysis are applied to groundwater chemistry data of the study area. Results of principal component analysis show that the main sources of variation in the data are seawater intrusion, the interaction of the water with the carbonate rocks of the system, and some pollution processes. The cluster analysis shows that the data can be divided into four clusters. The spatial distribution of the clusters seems to be random, but is consistent with seawater intrusion and pollution with nitrates. The overall results show that multivariate statistical analysis can be successfully applied in the groundwater quality assessment of this karstic aquifer.
Common factor analysis versus principal component analysis: choice for symptom cluster research.
Kim, Hee-Ju
2008-03-01
The purpose of this paper is to examine differences between two factor analytical methods and their relevance for symptom cluster research: common factor analysis (CFA) versus principal component analysis (PCA). Literature was critically reviewed to elucidate the differences between CFA and PCA. A secondary analysis (N = 84) was utilized to show the actual result differences from the two methods. CFA analyzes only the reliable common variance of data, while PCA analyzes all the variance of data. An underlying hypothetical process or construct is involved in CFA but not in PCA. PCA tends to increase factor loadings especially in a study with a small number of variables and/or low estimated communality. Thus, PCA is not appropriate for examining the structure of data. If the study purpose is to explain correlations among variables and to examine the structure of the data (this is usual for most cases in symptom cluster research), CFA provides a more accurate result. If the purpose of a study is to summarize data with a smaller number of variables, PCA is the choice. PCA can also be used as an initial step in CFA because it provides information regarding the maximum number and nature of factors. In using factor analysis for symptom cluster research, several issues need to be considered, including subjectivity of solution, sample size, symptom selection, and level of measure.
NASA Astrophysics Data System (ADS)
Unglert, K.; Radić, V.; Jellinek, A. M.
2016-06-01
Variations in the spectral content of volcano seismicity related to changes in volcanic activity are commonly identified manually in spectrograms. However, long time series of monitoring data at volcano observatories require tools to facilitate automated and rapid processing. Techniques such as self-organizing maps (SOM) and principal component analysis (PCA) can help to quickly and automatically identify important patterns related to impending eruptions. For the first time, we evaluate the performance of SOM and PCA on synthetic volcano seismic spectra constructed from observations during two well-studied eruptions at Kīlauea Volcano, Hawai'i, that include features observed in many volcanic settings. In particular, our objective is to test which of the techniques can best retrieve a set of three spectral patterns that we used to compose a synthetic spectrogram. We find that, without a priori knowledge of the given set of patterns, neither SOM nor PCA can directly recover the spectra. We thus test hierarchical clustering, a commonly used method, to investigate whether clustering in the space of the principal components and on the SOM, respectively, can retrieve the known patterns. Our clustering method applied to the SOM fails to detect the correct number and shape of the known input spectra. In contrast, clustering of the data reconstructed by the first three PCA modes reproduces these patterns and their occurrence in time more consistently. This result suggests that PCA in combination with hierarchical clustering is a powerful practical tool for automated identification of characteristic patterns in volcano seismic spectra. Our results indicate that, in contrast to PCA, common clustering algorithms may not be ideal to group patterns on the SOM and that it is crucial to evaluate the performance of these tools on a control dataset prior to their application to real data.
Samsir, Sri A'jilah; Bunawan, Hamidun; Yen, Choong Chee; Noor, Normah Mohd
2016-09-01
In this dataset, we distinguish 15 accessions of Garcinia mangostana from Peninsular Malaysia using Fourier transform-infrared spectroscopy coupled with chemometric analysis. We found that the position and intensity of characteristic peaks at 3600-3100 cm⁻¹ in IR spectra allowed discrimination of G. mangostana from different locations. Further principal component analysis (PCA) of all the accessions suggests that two main clusters were formed: samples from Johor, Melaka, and Negeri Sembilan (South) were clustered together in one group, while samples from Perak, Kedah, Penang, Selangor, Kelantan, and Terengganu (North and East Coast) formed another cluster.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chow, Edward, E-mail: Edward.Chow@sunnybrook.c; James, Jennifer; Barsevick, Andrea
Purpose: To explore the relationships (clusters) among the functional interference items in the Brief Pain Inventory (BPI) in patients with bone metastases. Methods: Patients enrolled in the Radiation Therapy Oncology Group (RTOG) 9714 bone metastases study were eligible. Patients were assessed at baseline and 4, 8, and 12 weeks after randomization for the palliative radiotherapy with the BPI, which consists of seven functional items: general activity, mood, walking ability, normal work, relations with others, sleep, and enjoyment of life. Principal component analysis with varimax rotation was used to determine the clusters between the functional items at baseline and the follow-up. Cronbach's alpha was used to determine the consistency and reliability of each cluster at baseline and follow-up. Results: There were 448 male and 461 female patients, with a median age of 67 years. There were two functional interference clusters at baseline, which accounted for 71% of the total variance. The first cluster (physical interference) included normal work and walking ability, which accounted for 58% of the total variance. The second cluster (psychosocial interference) included relations with others and sleep, which accounted for 13% of the total variance. The Cronbach's alpha statistics were 0.83 and 0.80, respectively. The functional clusters changed at week 12 in responders but persisted through week 12 in nonresponders. Conclusion: Palliative radiotherapy is effective in reducing bone pain. Functional interference component clusters exist in patients treated for bone metastases. These clusters changed over time in this study, possibly attributable to treatment. Further research is needed to examine these effects.
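As a worked illustration of the reliability statistic used in the abstract above, Cronbach's alpha for a cluster of k items can be computed as k/(k-1) × (1 − Σ item variances / variance of the summed score). The sketch below uses only the Python standard library; the function name `cronbach_alpha` and the toy scores are assumptions for illustration and are not data from the study.

```python
from statistics import variance


def cronbach_alpha(items):
    """items: list of k item-score lists, each holding one score per respondent."""
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    # Summed scale score for each respondent.
    totals = [sum(scores) for scores in zip(*items)]
    # Sum of the individual item (sample) variances.
    item_var_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_var_sum / variance(totals))


# Two hypothetical items answered by four respondents (made-up scores).
identical = cronbach_alpha([[1, 2, 3, 4], [1, 2, 3, 4]])
partly_related = cronbach_alpha([[1, 2, 3, 4], [2, 1, 4, 3]])
```

Items that move together push alpha toward 1, while weakly related items pull it down; values around 0.8, as reported above, are conventionally read as good internal consistency.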
Analysis of RXTE data on Clusters of Galaxies
NASA Technical Reports Server (NTRS)
Petrosian, Vahe
2004-01-01
This grant provided support for the reduction, analysis and interpretation of hard X-ray (HXR, for short) observations of the cluster of galaxies RXJ0658-5557 scheduled for the week of August 23, 2002 under the RXTE Cycle 7 program (PI Vahe Petrosian, Obs. ID 70165). The goal of the observation was to search for and characterize the shape of the HXR component beyond the well established thermal soft X-ray (SXR) component. Such hard components have been detected in several nearby clusters. The planned observation of a relatively distant cluster would provide information on the characteristics of this radiation at a different epoch in the evolution of the universe and shed light on its origin. We (Petrosian, 2001) have argued that thermal bremsstrahlung, as proposed earlier, cannot be the mechanism for the production of the HXRs and that the most likely mechanism is Compton upscattering of the cosmic microwave radiation by relativistic electrons which are known to be present in the clusters and be responsible for the observed radio emission. Based on this picture we estimated that this cluster, in spite of its relatively large distance, will have HXR signal comparable to the other nearby ones. The proposed RXTE observations were carried out and the data have been analyzed. We detect a hard X-ray tail in the spectrum of this cluster with a flux very nearly equal to our predicted value. This has strengthened the case for the Compton scattering model. We intend the data obtained via this observation to be a part of a larger data set. We have identified other clusters of galaxies (in archival RXTE and other instrument data sets) with sufficiently high quality data where we can search for and measure (or at least put meaningful limits on) the strength of the hard component.
With these studies we expect to clarify the mechanism for acceleration of particles in the intercluster medium and provide guidance for future observations of this intriguing phenomenon by instruments on GLAST. The details of the nonthermal particle population have important implications for the theories of cluster formation, mergers and evolution. The results of this work were first presented at the High Energy Division meeting of the American Astronomical Society at Mt. Tremblant, Canada (Petrosian et al. 2003) and in an invited review talk at the General Assembly of the International Astronomical Union at Sydney, Australia (Petrosian, 2003). A paper describing the observations, the data analysis and its implications is being prepared for publication in the Astrophysical Journal.
Murphy, Matthew; MacCarthy, M Jayne; McAllister, Lynda; Gilbert, Robert
2014-12-05
Competency profiles for occupational clusters within Canada's substance abuse workforce (SAW) define the need for skill and knowledge in evidence-based practice (EBP) across all its members. Members of the Senior Management occupational cluster hold ultimate responsibility for decisions made within addiction services agencies and therefore must possess the highest level of proficiency in EBP. The objective of this study was to assess the knowledge of the principles of EBP, and use of the components of the evidence-based decision making (EBDM) process in members of this occupational cluster from selected addiction services agencies in Nova Scotia. A convenience sampling method was used to recruit participants from addiction services agencies. Semi-structured qualitative interviews were conducted with eighteen Senior Management. The interviews were audio-recorded, transcribed verbatim and checked by the participants. Interview transcripts were coded and analyzed for themes using content analysis and assisted by qualitative data analysis software (NVivo 9.0). Data analysis revealed four main themes: 1) Senior Management believe that addictions services agencies are evidence-based; 2) Consensus-based decision making is the norm; 3) Senior Management understand the principles of EBP and; 4) Senior Management do not themselves use all components of the EBDM process when making decisions, oftentimes delegating components of this process to decision support staff. Senior Management possess an understanding of the principles of EBP, however, when making decisions they often delegate components of the EBDM process to decision support staff. Decision support staff are not defined as an occupational cluster in Canada's SAW and have not been ascribed a competency profile. As such, there is no guarantee that this group possesses competency in EBDM. 
There is a need to advocate for the development of a defined occupational cluster and associated competency profile for this critical group.
Cluster Analysis to Identify Possible Subgroups in Tinnitus Patients.
van den Berge, Minke J C; Free, Rolien H; Arnold, Rosemarie; de Kleine, Emile; Hofman, Rutger; van Dijk, J Marc C; van Dijk, Pim
2017-01-01
In tinnitus treatment, there is a tendency to shift from a "one size fits all" to a more individual, patient-tailored approach. Insight into the heterogeneity of the tinnitus spectrum might improve the management of tinnitus patients in terms of choice of treatment and identification of patients with severe mental distress. The goal of this study was to identify subgroups in a large group of tinnitus patients. Data were collected from patients with severe tinnitus complaints visiting our tertiary referral tinnitus care group at the University Medical Center Groningen. Patient-reported and physician-reported variables were collected during their visit to our clinic. Cluster analyses were used to characterize subgroups. For the selection of the right variables to enter in the cluster analysis, two approaches were used: (1) variable reduction with principal component analysis and (2) variable selection based on expert opinion. Various variables of 1,783 tinnitus patients were included in the analyses. Cluster analysis (1) included 976 patients and resulted in a four-cluster solution. The effect of external influences was the most discriminative between the groups, or clusters, of patients. The "silhouette measure" of the cluster outcome was low (0.2), indicating a "no substantial" cluster structure. Cluster analysis (2) included 761 patients and resulted in a three-cluster solution, comparable to the first analysis. Again, a "no substantial" cluster structure was found (0.2). Two cluster analyses on a large database of tinnitus patients revealed that clusters of patients are mostly formed by a different response of external influences on their disease. However, both cluster outcomes based on this dataset showed poor stability, suggesting that our tinnitus population comprises a continuum rather than a number of clearly defined subgroups.
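The "silhouette measure" quoted in the abstract above can be illustrated with a short sketch. For each observation, a is the mean distance to the other members of its own cluster and b is the lowest mean distance to the members of any other cluster; the silhouette is (b − a) / max(a, b), averaged over all observations. The code below is a pure-Python illustration under assumed Euclidean distances, with a hypothetical function name and toy points, and is not the software used in the study.

```python
import math


def mean_silhouette(points, labels):
    """Average silhouette width over all points, using Euclidean distance."""
    scores = []
    for i, (p, lab) in enumerate(zip(points, labels)):
        # Collect distances from p to every other point, grouped by cluster label.
        dists = {}
        for j, (q, other) in enumerate(zip(points, labels)):
            if i != j:
                dists.setdefault(other, []).append(math.dist(p, q))
        own = dists.pop(lab, [])
        if not own or not dists:
            scores.append(0.0)  # singleton cluster or only one cluster: score 0 by convention
            continue
        a = sum(own) / len(own)                            # cohesion within own cluster
        b = min(sum(d) / len(d) for d in dists.values())   # separation from nearest other cluster
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy data: two tight, well-separated groups (made-up coordinates).
pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
well_separated = mean_silhouette(pts, [0, 0, 1, 1])
badly_assigned = mean_silhouette(pts, [0, 1, 0, 1])
```

Values near 1 indicate compact, well-separated clusters, values near 0 indicate overlapping groups, and negative values suggest misassigned observations, which is why a value of 0.2 is read above as showing no substantial structure.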
Amro, Amin; Waldum, Bård; von der Lippe, Nanna; Brekke, Fredrik Barth; Dammen, Toril; Miaskowski, Christine; Os, Ingrid
2015-01-01
Patients with end-stage renal disease on dialysis have reduced survival rates compared with the general population. Symptoms are frequent in dialysis patients, and a symptom cluster is defined as two or more related co-occurring symptoms. The aim of this study was to explore the associations between symptom clusters and mortality in dialysis patients. In a prospective observational cohort study of dialysis patients (n = 301), Kidney Disease and Quality of Life Short Form and Beck Depression Inventory questionnaires were administered. To generate symptom clusters, principal component analysis with varimax rotation was used on 11 kidney-specific self-reported physical symptoms. A Beck Depression Inventory score of 16 or greater was defined as clinically significant depressive symptoms. Physical and mental component summary scores were generated from Short Form-36. Multivariate Cox regression analysis was used for the survival analysis, Kaplan-Meier curves and log-rank statistics were applied to compare survival rates between the groups. Three different symptom clusters were identified; one included loading of several uremic symptoms. In multivariate analyses and after adjustment for health-related quality of life and depressive symptoms, the worst perceived quartile of the "uremic" symptom cluster independently predicted all-cause mortality (hazard ratio 2.47, 95% CI 1.44-4.22, P = 0.001) compared with the other quartiles during a follow-up period that ranged from four to 52 months. The two other symptom clusters ("neuromuscular" and "skin") or the individual symptoms did not predict mortality. Clustering of uremic symptoms predicted mortality. Assessing co-occurring symptoms rather than single symptoms may help to identify dialysis patients at high risk for mortality. Copyright © 2015 American Academy of Hospice and Palliative Medicine. Published by Elsevier Inc. All rights reserved.
Dong, Jianghu J; Wang, Liangliang; Gill, Jagbir; Cao, Jiguo
2017-01-01
This article is motivated by some longitudinal clinical data of kidney transplant recipients, where kidney function progression is recorded as the estimated glomerular filtration rates at multiple time points post kidney transplantation. We propose to use the functional principal component analysis method to explore the major source of variations of glomerular filtration rate curves. We find that the estimated functional principal component scores can be used to cluster glomerular filtration rate curves. Ordering functional principal component scores can detect abnormal glomerular filtration rate curves. Finally, functional principal component analysis can effectively estimate missing glomerular filtration rate values and predict future glomerular filtration rate values.
Borri, Marco; Schmidt, Maria A.; Powell, Ceri; Koh, Dow-Mu; Riddell, Angela M.; Partridge, Mike; Bhide, Shreerang A.; Nutting, Christopher M.; Harrington, Kevin J.; Newbold, Katie L.; Leach, Martin O.
2015-01-01
Purpose To describe a methodology, based on cluster analysis, to partition multi-parametric functional imaging data into groups (or clusters) of similar functional characteristics, with the aim of characterizing functional heterogeneity within head and neck tumour volumes. To evaluate the performance of the proposed approach on a set of longitudinal MRI data, analysing the evolution of the obtained sub-sets with treatment. Material and Methods The cluster analysis workflow was applied to a combination of dynamic contrast-enhanced and diffusion-weighted imaging MRI data from a cohort of squamous cell carcinoma of the head and neck patients. Cumulative distributions of voxels, containing pre and post-treatment data and including both primary tumours and lymph nodes, were partitioned into k clusters (k = 2, 3 or 4). Principal component analysis and cluster validation were employed to investigate data composition and to independently determine the optimal number of clusters. The evolution of the resulting sub-regions with induction chemotherapy treatment was assessed relative to the number of clusters. Results The clustering algorithm was able to separate clusters which significantly reduced in voxel number following induction chemotherapy from clusters with a non-significant reduction. Partitioning with the optimal number of clusters (k = 4), determined with cluster validation, produced the best separation between reducing and non-reducing clusters. Conclusion The proposed methodology was able to identify tumour sub-regions with distinct functional properties, independently separating clusters which were affected differently by treatment. This work demonstrates that unsupervised cluster analysis, with no prior knowledge of the data, can be employed to provide a multi-parametric characterization of functional heterogeneity within tumour volumes. PMID:26398888
NASA Astrophysics Data System (ADS)
Liu, Jiangang; Tian, Jie
2007-03-01
The present study combined the Independent Component Analysis (ICA) and low-resolution brain electromagnetic tomography (LORETA) algorithms to identify the spatial distribution and time course of single-trial EEG record differences between neural responses to emotional stimuli vs. the neutral. Single-trial multichannel (129-sensor) EEG records were collected from 21 healthy, right-handed subjects viewing the emotional (pleasant/unpleasant) and neutral pictures selected from the International Affective Picture System (IAPS). For each subject, the single-trial EEG records for each emotional picture category were concatenated with those for the neutral pictures, and a three-step analysis was applied to each of them in the same way. First, the ICA was performed to decompose each set of concatenated single-trial EEG records into temporally independent and spatially fixed components, namely independent components (ICs). The ICs associated with artifacts were isolated. Second, the clustering analysis classified, across subjects, the temporally and spatially similar ICs into the same clusters, in which a nonparametric permutation test for Global Field Power (GFP) of IC projection scalp maps identified significantly different temporal segments of each emotional condition vs. neutral. Third, the brain regions accounting for those significant segments were localized spatially with LORETA analysis. In each cluster, a voxel-by-voxel randomization test identified significantly different brain regions between each emotional condition vs. the neutral. Compared to the neutral, both emotional pictures elicited activation in the visual, temporal, ventromedial and dorsomedial prefrontal cortex and anterior cingulate gyrus. In addition, the pleasant pictures activated the left middle prefrontal cortex and the posterior precuneus, while the unpleasant pictures activated the right orbitofrontal cortex, posterior cingulate gyrus and somatosensory region.
Our results were consistent with other functional imaging studies, while revealing the temporal dynamics of emotional processing in specific brain structures with high temporal resolution.
Allanson, Emma R; Tunçalp, Özge; Vogel, Joshua P; Khan, Dina N; Oladapo, Olufemi T; Long, Qian; Gülmezoglu, Ahmet Metin
2017-01-01
The capacity for health systems to support the translation of research into clinical practice may be limited. The cluster randomised controlled trial (cluster RCT) design is often employed in evaluating the effectiveness of implementation of evidence-based practices. We aimed to systematically review available evidence to identify and evaluate the components in the implementation process at the facility level using cluster RCT designs. All cluster RCTs where the healthcare facility was the unit of randomisation, published or written from 1990 to 2014, were assessed. Included studies were analysed for the components of implementation interventions employed in each. Through iterative mapping and analysis, we synthesised a master list of components used and summarised the effects of different combinations of interventions on practices. Forty-six studies met the inclusion criteria and covered the specialty groups of obstetrics and gynaecology (n=9), paediatrics and neonatology (n=4), intensive care (n=4), internal medicine (n=20), and anaesthetics and surgery (n=3). Six studies included interventions that were delivered across specialties. Nine components of multifaceted implementation interventions were identified: leadership, barrier identification, tailoring to the context, patient involvement, communication, education, supportive supervision, provision of resources, and audit and feedback. The four main components that were most commonly used were education (n=42, 91%), audit and feedback (n=26, 57%), provision of resources (n=23, 50%) and leadership (n=21, 46%). Future implementation research should focus on better reporting of multifaceted approaches, incorporating sets of components that facilitate the translation of research into practice, and should employ rigorous monitoring and evaluation.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bonamigo, M.; Grillo, C.; Ettori, S.
We present a novel approach for a combined analysis of X-ray and gravitational lensing data and apply this technique to the merging galaxy cluster MACS J0416.1–2403. The method exploits the information on the intracluster gas distribution that comes from a fit of the X-ray surface brightness and then includes the hot gas as a fixed mass component in the strong-lensing analysis. With our new technique, we can separate the collisional from the collision-less diffuse mass components, thus obtaining a more accurate reconstruction of the dark matter distribution in the core of a cluster. We introduce an analytical description of the X-ray emission coming from a set of dual pseudo-isothermal elliptical mass distributions, which can be directly used in most lensing software. By combining Chandra observations with Hubble Frontier Fields imaging and Multi Unit Spectroscopic Explorer spectroscopy in MACS J0416.1–2403, we measure a projected gas-to-total mass fraction of approximately 10% at 350 kpc from the cluster center. Compared to the results of a more traditional cluster mass model (diffuse halos plus member galaxies), we find a significant difference in the cumulative projected mass profile of the dark matter component and that the dark matter over total mass fraction is almost constant, out to more than 350 kpc. In the coming era of large surveys, these results show the need of multiprobe analyses for detailed dark matter studies in galaxy clusters.
Method for exploratory cluster analysis and visualisation of single-trial ERP ensembles.
Williams, N J; Nasuto, S J; Saddy, J D
2015-07-30
The validity of ensemble averaging on event-related potential (ERP) data has been questioned, due to its assumption that the ERP is identical across trials. Thus, there is a need for preliminary testing for cluster structure in the data. We propose a complete pipeline for the cluster analysis of ERP data. To increase the signal-to-noise ratio (SNR) of the raw single-trials, we used a denoising method based on Empirical Mode Decomposition (EMD). Next, we used a bootstrap-based method to determine the number of clusters, through a measure called the Stability Index (SI). We then used a clustering algorithm based on a Genetic Algorithm (GA) to define initial cluster centroids for subsequent k-means clustering. Finally, we visualised the clustering results through a scheme based on Principal Component Analysis (PCA). After validating the pipeline on simulated data, we tested it on data from two experiments - a P300 speller paradigm on a single subject and a language processing study on 25 subjects. Results revealed evidence for the existence of 6 clusters in one experimental condition from the language processing study. Further, a two-way chi-square test revealed an influence of subject on cluster membership. Our analysis operates on denoised single-trials, the number of clusters is determined in a principled manner and the results are presented through an intuitive visualisation. Given the cluster structure in some experimental conditions, we suggest application of cluster analysis as a preliminary step before ensemble averaging. Copyright © 2015 Elsevier B.V. All rights reserved.
Technical Efficiency of Automotive Industry Cluster in Chennai
NASA Astrophysics Data System (ADS)
Bhaskaran, E.
2012-07-01
Chennai is also called the Detroit of India because its automotive industry produces over 40% of India's vehicles and components. During 2001-2002, a diagnostic study was conducted on the Automotive Component Industries (ACI) in Ambattur Industrial Estate, Chennai, and a SWOT analysis found that they faced problems with infrastructure, technology, procurement, production and marketing. In 2004-2005, under the cluster development approach (CDA), they formed the Chennai auto cluster under a public-private partnership concept, with funding from grants from the Government of India, the Government of Tamil Nadu and Ambattur Municipality, bank loans and stakeholders. This resulted in the development of infrastructure, technology, procurement, production and marketing interrelationships among ACI. The objective is to determine the correlation coefficient, regression equation, technical efficiency, peer weights, slack variables and returns to scale of the cluster before and after the CDA. The methodology adopted is the collection of primary data from ACI and analysis using data envelopment analysis (DEA) with the input-oriented Banker-Charnes-Cooper model. There is a significant increase in the correlation coefficient, and the regression analysis reveals that for a one percent increase in employment and net worth, the gross output increases significantly after the CDA. The DEA solver gives the technical efficiency of ACI by taking shift, employment and net worth as input data and quality, gross output and export ratio as output data. From the technical efficiency scores and ranking of ACI, it is found that there is a significant increase in the technical efficiency of ACI after the CDA compared with before it. The slack variables obtained clearly reveal excess employment and net worth and no shortage of gross output. To conclude, there is an increase in technical efficiency not only of the Chennai auto cluster in general but also of the Chennai auto component industries in particular.
ERIC Educational Resources Information Center
Rhee, Eunjeong; Lee, Bo Hyun; Kim, Boyoung; Ha, Gyuyoung; Lee, Sang Min
2016-01-01
The current study investigated how the five components of planned happenstance skills are related to vocational identity statuses. For determination of relationships, cluster and discriminant analyses were conducted sequentially on a sample of 515 university students in South Korea. Cluster analysis revealed vocational identity statuses to be…
Groundwater Quality: Analysis of Its Temporal and Spatial Variability in a Karst Aquifer.
Pacheco Castro, Roger; Pacheco Ávila, Julia; Ye, Ming; Cabrera Sansores, Armando
2018-01-01
This study develops an approach based on hierarchical cluster analysis for investigating the spatial and temporal variation of water quality governing processes. The water quality data used in this study were collected in the karst aquifer of Yucatan, Mexico, the only source of drinking water for a population of nearly two million people. Hierarchical cluster analysis was applied to the quality data of all the sampling periods lumped together. This was motivated by the observation that, if water quality does not vary significantly in time, two samples from the same sampling site will belong to the same cluster. The resulting distribution maps of clusters and box-plots of the major chemical components reveal the spatial and temporal variability of groundwater quality. Principal component analysis was used to verify the results of cluster analysis and to derive the variables that explained most of the variation of the groundwater quality data. Results of this work increase the knowledge about how precipitation and human contamination impact groundwater quality in Yucatan. Spatial variability of groundwater quality in the study area is caused by: a) seawater intrusion and groundwater rich in sulfates at the west and in the coast, b) water rock interactions and the average annual precipitation at the middle and east zones respectively, and c) human contamination present in two localized zones. Changes in the amount and distribution of precipitation cause temporal variation by diluting groundwater in the aquifer. This approach allows the variation of the processes controlling groundwater quality to be analyzed efficiently and simultaneously. © 2017, National Ground Water Association.
Pirkle, Catherine M; Wu, Yan Yan; Zunzunegui, Maria-Victoria; Gómez, José Fernando
2018-01-01
Objective: Conceptual models underpinning much epidemiological research on ageing acknowledge that environmental, social and biological systems interact to influence health outcomes. Recursive partitioning is a data-driven approach that allows for concurrent exploration of distinct mixtures, or clusters, of individuals that have a particular outcome. Our aim is to use recursive partitioning to examine risk clusters for metabolic syndrome (MetS) and its components, in order to identify vulnerable populations. Study design: Cross-sectional analysis of baseline data from a prospective longitudinal cohort called the International Mobility in Aging Study (IMIAS). Setting: IMIAS includes sites from three middle-income countries, Tirana (Albania), Natal (Brazil) and Manizales (Colombia), and two from Canada, Kingston (Ontario) and Saint-Hyacinthe (Quebec). Participants: Community-dwelling male and female adults, aged 64–75 years (n=2002). Primary and secondary outcome measures: We apply recursive partitioning to investigate social and behavioural risk factors for MetS and its components. Model-based recursive partitioning (MOB) was used to cluster participants into age-adjusted risk groups based on variabilities in: study site, sex, education, living arrangements, childhood adversities, adult occupation, current employment status, income, perceived income sufficiency, smoking status and weekly minutes of physical activity. Results: 43% of participants had MetS. Using MOB, the primary partitioning variable was participant sex. Among women from middle-income sites, the predicted proportion with MetS ranged from 58% to 68%. Canadian women with limited physical activity had elevated predicted proportions of MetS (49%, 95% CI 39% to 58%). Among men, MetS ranged from 26% to 41% depending on childhood social adversity and education. Clustering for MetS components differed from the syndrome and across components. 
Study site was a primary partitioning variable for all components except HDL cholesterol. Sex was important for most components. Conclusion: MOB is a promising technique for identifying disease risk clusters (eg, vulnerable populations) in modestly sized samples. PMID:29500203
Matsuura, Tomoaki; Tanimura, Naoki; Hosoda, Kazufumi; Yomo, Tetsuya; Shimizu, Yoshihiro
2017-01-01
To elucidate the dynamic features of a biologically relevant large-scale reaction network, we constructed a computational model of minimal protein synthesis consisting of 241 components and 968 reactions that synthesize the Met-Gly-Gly (MGG) peptide based on an Escherichia coli-based reconstituted in vitro protein synthesis system. We performed a simulation using parameters collected primarily from the literature and found that the rate of MGG peptide synthesis becomes nearly constant in minutes, thus achieving a steady state similar to experimental observations. In addition, concentration changes to 70% of the components, including intermediates, reached a plateau in a few minutes. However, the concentration change of each component exhibits several temporal plateaus, or a quasi-stationary state (QSS), before reaching the final plateau. To understand these complex dynamics, we focused on whether the components reached a QSS, mapped the arrangement of components in a QSS in the entire reaction network structure, and investigated time-dependent changes. We found that components in a QSS form clusters that grow over time but not in a linear fashion, and that this process involves the collapse and regrowth of clusters before the formation of a final large single cluster. These observations might commonly occur in other large-scale biological reaction networks. This developed analysis might be useful for understanding large-scale biological reactions by visualizing complex dynamics, thereby extracting the characteristics of the reaction network, including phase transitions. PMID:28167777
NASA Astrophysics Data System (ADS)
Wheeler, K. I.; Levia, D. F., Jr.; Hudson, J. E.
2017-12-01
As trees undergo autumnal processes such as resorption, senescence, and leaf abscission, the dissolved organic matter (DOM) contribution of leaf litter leachate to streams changes. However, little research has investigated how the fluorescent DOM (FDOM) changes throughout the autumn and how this differs inter- and intraspecifically. Two of the major impacts of global climate change on forested ecosystems include altering phenology and causing forest community species and subspecies composition restructuring. We examined changes in FDOM in leachate from American beech (Fagus grandifolia Ehrh.) leaves in Maryland, Rhode Island, Vermont, and North Carolina and yellow poplar (Liriodendron tulipifera L.) leaves from Maryland throughout three different phenophases: green, senescing, and freshly abscissed. Beech leaves from Maryland and Rhode Island have previously been identified as belonging to the same distinct genetic cluster and beech trees from Vermont and the study site in North Carolina from the other. FDOM in samples was characterized using excitation-emission matrices (EEMs) and a six-component parallel factor analysis (PARAFAC) model was created to identify components. Self-organizing maps (SOMs) were used to visualize variation and patterns in the PARAFAC component proportions of the leachate samples. Phenophase and species had the greatest influence on determining where a sample mapped on the SOM when compared to genetic clusters and geographic origin. Throughout senescence, FDOM from all the trees transitioned from more protein-like components to more humic-like ones. Percent greenness of the sampled leaves and the proportion of the tyrosine-like component 1 were found to significantly differ between the two genetic beech clusters. This suggests possible differences in photosynthesis and resorption between the two genetic clusters of beech. 
The use of SOMs to visualize differences in patterns of senescence between the different species and genetic populations proved to be useful in ways that other multivariate analysis techniques lack.
Alexander, Nathan; Woetzel, Nils; Meiler, Jens
2011-02-01
Clustering algorithms are used as data analysis tools in a wide variety of applications in Biology. Clustering has become especially important in protein structure prediction and virtual high throughput screening methods. In protein structure prediction, clustering is used to structure the conformational space of thousands of protein models. In virtual high throughput screening, databases with millions of drug-like molecules are organized by structural similarity, e.g. common scaffolds. The tree-like dendrogram structure obtained from hierarchical clustering can provide a qualitative overview of the results, which is important for focusing detailed analysis. However, in practice it is difficult to relate specific components of the dendrogram directly back to the objects of which it is comprised and to display all desired information within the two dimensions of the dendrogram. The current work presents a hierarchical agglomerative clustering method termed bcl::Cluster. bcl::Cluster utilizes the Pymol Molecular Graphics System to graphically depict dendrograms in three dimensions. This allows simultaneous display of relevant biological molecules as well as additional information about the clusters and the members comprising them.
Computational gene expression profiling under salt stress reveals patterns of co-expression
Sanchita; Sharma, Ashok
2016-01-01
Plants respond differently to environmental conditions. Among various abiotic stresses, salt stress is a condition where excess salt in soil causes inhibition of plant growth. To understand the response of plants to the stress conditions, identification of the responsible genes is required. Clustering is a data mining technique used to group the genes with similar expression. The genes of a cluster show similar expression and function. We applied clustering algorithms on gene expression data of Solanum tuberosum showing differential expression in Capsicum annuum under salt stress. The clusters that were common to multiple algorithms were taken forward for analysis. Principal component analysis (PCA) further validated the findings of other cluster algorithms by visualizing their clusters in three-dimensional space. Functional annotation results revealed that most of the genes were involved in stress-related responses. Our findings suggest that these algorithms may be helpful in the prediction of the function of co-expressed genes. PMID:26981411
Binarity and Variable Stars in the Open Cluster NGC 2126
NASA Astrophysics Data System (ADS)
Chehlaeh, Nareemas; Mkrtichian, David; Kim, Seung-Lee; Lampens, Patricia; Komonjinda, Siramas; Kusakin, Anatoly; Glazunova, Ljudmila
2018-04-01
We present the results of an analysis of photometric time-series observations for NGC 2126 acquired at the Thai National Observatory (TNO) in Thailand and the Mount Lemmon Optical Astronomy Observatory (LOAO) in USA during the years 2004, 2013 and 2015. The main purpose is to search for new variable stars and to study the light curves of binary systems as well as the oscillation spectra of pulsating stars. NGC 2126 is an intermediate-age open cluster which has a population of stars inside the δ Scuti instability strip. Several variable stars are reported including three eclipsing binary stars, one of which is an eclipsing binary star with a pulsating component (V551 Aur). The Wilson-Devinney technique was used to analyze its light curves and to determine a new set of the system’s parameters. A frequency analysis of the eclipse-subtracted light curve was also performed. Eclipsing binaries which are members of open clusters are capable of delivering strong constraints on the cluster’s properties which are in turn useful for a pulsational analysis of their pulsating components. Therefore, high-resolution, high-quality spectra will be needed to derive accurate component radial velocities of the faint eclipsing binaries which are located in the field of NGC 2126. The new Devasthal Optical Telescope, suitably equipped, could in principle do this.
Analyzing coastal environments by means of functional data analysis
NASA Astrophysics Data System (ADS)
Sierra, Carlos; Flor-Blanco, Germán; Ordoñez, Celestino; Flor, Germán; Gallego, José R.
2017-07-01
Here we used Functional Data Analysis (FDA) to examine particle-size distributions (PSDs) in a beach/shallow marine sedimentary environment in Gijón Bay (NW Spain). The work involved both Functional Principal Components Analysis (FPCA) and Functional Cluster Analysis (FCA). The grain size of the sand samples was characterized by means of laser dispersion spectroscopy. Within this framework, FPCA was used as a dimension reduction technique to explore and uncover patterns in grain-size frequency curves. This procedure proved useful to describe variability in the structure of the data set. Moreover, an alternative approach, FCA, was applied to identify clusters and to interpret their spatial distribution. Results obtained with this latter technique were compared with those obtained by means of two vector approaches that combine PCA with CA (Cluster Analysis). The first method, the point density function (PDF), was employed after fitting a log-normal distribution to each PSD and summarizing each of the density functions by its mean, sorting, skewness and kurtosis. The second applied a centered-log-ratio (clr) transformation to the original data. PCA was then applied to the transformed data, and finally CA to the retained principal component scores. The study revealed functional data analysis, specifically FPCA and FCA, as a suitable alternative with considerable advantages over traditional vector analysis techniques in sedimentary geology studies.
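The centered-log-ratio step mentioned in this abstract can be illustrated with a minimal sketch. It assumes each PSD is given as a vector of strictly positive class fractions; the function name and the sample values are hypothetical and are not taken from the study.

```python
import math


def clr(composition):
    """Centered log-ratio transform of one strictly positive composition.

    Each part is replaced by its log minus the mean of all logs, which is
    the same as taking the log of each part over the geometric mean.
    """
    logs = [math.log(x) for x in composition]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


# Hypothetical three-class grain-size fractions (illustrative values only).
sample = [0.2, 0.3, 0.5]
transformed = clr(sample)

# The transform ignores the overall scale, so raw counts give the same result.
rescaled = clr([2.0, 3.0, 5.0])
```

The transformed values sum to zero, so PCA on clr data works in a space one dimension smaller than the number of size classes. Zero fractions would need separate handling before the logarithm, which this sketch does not attempt.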
Verma, Priyanka; Kumar, Manoj; Mishra, Girish; Sahoo, Dinabandhu
2017-02-01
In the present study, thirty seaweeds from Indian coasts were bioprospected and analyzed for their biochemical components, including pigments, fatty acids and ash content. Multivariate analysis of biochemical components and fatty acids was done using Principal Component Analysis (PCA) and Agglomerative hierarchical clustering (AHC) to reveal chemotaxonomic relationships among the seaweeds. The overall analysis suggests that these seaweeds have multi-functional properties and can be utilized as a promising bioresource for proteins, lipids, pigments and carbohydrates for the food/feed and biofuel industry. Copyright © 2016. Published by Elsevier Ltd.
Kumar, Raj G; Rubin, Jonathan E; Berger, Rachel P; Kochanek, Patrick M; Wagner, Amy K
2016-03-01
Studies have characterized absolute levels of multiple inflammatory markers as significant risk factors for poor outcomes after traumatic brain injury (TBI). However, inflammatory marker concentrations are highly inter-related, and production of one may result in the production or regulation of another. Therefore, a more comprehensive characterization of the inflammatory response post-TBI should consider relative levels of markers in the inflammatory pathway. We used principal component analysis (PCA) as a dimension-reduction technique to characterize the sets of markers that contribute independently to variability in cerebrospinal (CSF) inflammatory profiles after TBI. Using PCA results, we defined groups (or clusters) of individuals (n=111) with similar patterns of acute CSF inflammation that were then evaluated in the context of outcome and other relevant CSF and serum biomarkers collected days 0-3 and 4-5 post-injury. We identified four significant principal components (PC1-PC4) for CSF inflammation from days 0-3, and PC1 accounted for the greatest (31%) percentage of variance. PC1 was characterized by relatively higher CSF sICAM-1, sFAS, IL-10, IL-6, sVCAM-1, IL-5, and IL-8 levels. Cluster analysis then defined two distinct clusters, such that individuals in cluster 1 had highly positive PC1 scores and relatively higher levels of CSF cortisol, progesterone, estradiol, testosterone, brain derived neurotrophic factor (BDNF), and S100b; this group also had higher serum cortisol and lower serum BDNF. Multinomial logistic regression analyses showed that individuals in cluster 1 had a 10.9 times increased likelihood of GOS scores of 2/3 vs. 4/5 at 6 months compared to cluster 2, after controlling for covariates. Cluster group did not discriminate between mortality compared to GOS scores of 4/5 after controlling for age and other covariates. Cluster groupings also did not discriminate mortality or 12 month outcomes in multivariate models. 
PCA and cluster analysis establish that a subset of CSF inflammatory markers measured in days 0-3 post-TBI may distinguish individuals with poor 6-month outcome, and future studies should prospectively validate these findings. PCA of inflammatory mediators after TBI could aid in prognostication and in identifying patient subgroups for therapeutic interventions. Copyright © 2015 Elsevier Inc. All rights reserved.
Vidigal, Pedrina Gonçalves; Mosel, Frank; Koehling, Hedda Luise; Mueller, Karl Dieter; Buer, Jan; Rath, Peter Michael; Steinmann, Joerg
2014-12-01
Stenotrophomonas maltophilia is an opportunist multidrug-resistant pathogen that causes a wide range of nosocomial infections. Various cystic fibrosis (CF) centres have reported an increasing prevalence of S. maltophilia colonization/infection among patients with this disease. The purpose of this study was to assess specific fingerprints of S. maltophilia isolates from CF patients (n = 71) by investigating fatty acid methyl esters (FAMEs) through gas chromatography (GC) and highly abundant proteins by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS), and to compare them with isolates obtained from intensive care unit (ICU) patients (n = 20) and the environment (n = 11). Principal component analysis (PCA) of GC-FAME patterns did not reveal a clustering corresponding to distinct CF, ICU or environmental types. Based on the peak area index, it was observed that S. maltophilia isolates from CF patients produced significantly higher amounts of fatty acids in comparison with ICU patients and the environmental isolates. Hierarchical cluster analysis (HCA) based on the MALDI-TOF MS peak profiles of S. maltophilia revealed the presence of five large clusters, suggesting a high phenotypic diversity. Although HCA of MALDI-TOF mass spectra did not result in distinct clusters predominantly composed of CF isolates, PCA revealed the presence of a distinct cluster composed of S. maltophilia isolates from CF patients. Our data suggest that S. maltophilia colonizing CF patients tend to modify not only their fatty acid patterns but also their protein patterns as a response to adaptation in the unfavourable environment of the CF lung. © 2014 The Authors.
Origins of chemoreceptor curvature sorting in Escherichia coli
Draper, Will; Liphardt, Jan
2017-01-01
Bacterial chemoreceptors organize into large clusters at the cell poles. Despite a wealth of structural and biochemical information on the system's components, it is not clear how chemoreceptor clusters are reliably targeted to the cell pole. Here, we quantify the curvature-dependent localization of chemoreceptors in live cells by artificially deforming growing cells of Escherichia coli in curved agar microchambers, and find that chemoreceptor cluster localization is highly sensitive to membrane curvature. Through analysis of multiple mutants, we conclude that curvature sensitivity is intrinsic to chemoreceptor trimers-of-dimers, and results from conformational entropy within the trimer-of-dimers geometry. We use the principles of the conformational entropy model to engineer curvature sensitivity into a series of multi-component synthetic protein complexes. When expressed in E. coli, the synthetic complexes form large polar clusters, and a complex with inverted geometry avoids the cell poles. This demonstrates the successful rational design of both polar and anti-polar clustering, and provides a synthetic platform on which to build new systems. PMID:28322223
Using Interactive Graphics to Teach Multivariate Data Analysis to Psychology Students
ERIC Educational Resources Information Center
Valero-Mora, Pedro M.; Ledesma, Ruben D.
2011-01-01
This paper discusses the use of interactive graphics to teach multivariate data analysis to Psychology students. Three techniques are explored through separate activities: parallel coordinates/boxplots; principal components/exploratory factor analysis; and cluster analysis. With interactive graphics, students may perform important parts of the…
Application of multivariable statistical techniques in plant-wide WWTP control strategies analysis.
Flores, X; Comas, J; Roda, I R; Jiménez, L; Gernaey, K V
2007-01-01
The main objective of this paper is to present the application of selected multivariable statistical techniques in plant-wide wastewater treatment plant (WWTP) control strategies analysis. In this study, cluster analysis (CA), principal component analysis/factor analysis (PCA/FA) and discriminant analysis (DA) are applied to the evaluation matrix data set obtained by simulation of several control strategies applied to the plant-wide IWA Benchmark Simulation Model No 2 (BSM2). These techniques make it possible to i) determine natural groups or clusters of control strategies with similar behaviour, ii) find and interpret hidden, complex and causal relation features in the data set and iii) identify important discriminant variables within the groups found by the cluster analysis. This study illustrates the usefulness of multivariable statistical techniques for both analysis and interpretation of complex multicriteria data sets and allows an improved use of information for effective evaluation of control strategies.
Baraybar, Jose Pablo
2015-09-01
The distribution of gunshot injuries in a sample of 777 sets of human remains of proven human rights abuse from Somaliland, the Balkans and Peru is compared to frequencies of injuries sustained by combatants in contemporary conflicts reported in the literature. Principal Component Analysis (PCA) reduced the data to three components accounting for 82.94% of the variance. The first component, with 38.31% of variance, shows the segments arms and thorax/abdomen to be positively correlated (0.887 and 0.662, respectively); the segment head/neck is strongly correlated (0.951) to the second component, while the segment thorax/abdomen shows a low, negative correlation (-0.388). Finally, in the third component only the legs are strongly correlated (0.991). The data were further subjected to a K-means cluster analysis to determine the likely groupings combining the four types of injuries. Each of the three clusters reproduced patterns similar to those observed in the PCA: Cluster 1 shows the prevalence of injuries to the thorax/abdomen and extremities in addition to injuries to the head/neck; Cluster 2 shows injuries to the head/neck; and Cluster 3 shows injuries to the thorax/abdomen and a lower representation of the arms and legs. Most of the cases (70.5%), irrespective of geography and type of site (attack or detention), were grouped into Cluster 2. This comparison shows that in human rights abuse, irrespective of geography, gunshot injuries tend to follow a pattern favouring the head/neck and thorax/abdomen areas over the extremities, the reverse of the pattern observed in contemporary combat operations. In those settings gunshot wound trauma is the second cause of mortality/morbidity (after fragmenting ammunition) and its distribution concentrates on the extremities, thorax/abdomen and head, following the pattern of protective armour when it is used. 
Considering that human rights abuses are often presented as encounters between two armed groups in the context of counter-insurgency operations, a careful analysis of gunshot injury patterns could serve as an indicator that in fact murder, rather than combat, took place and the intention was to kill rather than to maim or render people unfit for battle. To compare the variation of gunshot injury patterns between mortality associated with human rights abuses and armed conflict in selected samples from different countries. Literature review and case analysis. Original statistical analysis of gunshot injuries on human remains (n=777) recovered from mass or clandestine graves associated with human rights abuses in countries in Somaliland, the Balkans and Peru (1983-1995) and literature review of mortality caused by armed conflicts. Mechanism of gunshot injury and wound distribution pattern in geographically diverse samples of human rights abuse. Copyright © 2015 The Chartered Society of Forensic Sciences. Published by Elsevier Ireland Ltd. All rights reserved.
Towards the identification of plant and animal binders on Australian stone knives.
Blee, Alisa J; Walshe, Keryn; Pring, Allan; Quinton, Jamie S; Lenehan, Claire E
2010-07-15
There is limited information regarding the nature of plant and animal residues used as adhesives, fixatives and pigments found on Australian Aboriginal artefacts. This paper reports the use of FTIR in combination with the chemometric tools principal component analysis (PCA) and hierarchical clustering (HC) for the analysis and identification of Australian plant and animal fixatives on Australian stone artefacts. Ten different plant and animal residues were able to be discriminated from each other at a species level by combining FTIR spectroscopy with the chemometric data analysis methods, principal component analysis (PCA) and hierarchical clustering (HC). Application of this method to residues from three broken stone knives from the collections of the South Australian Museum indicated that two of the handles of knives were likely to have contained beeswax as the fixative whilst Spinifex resin was the probable binder on the third. Copyright 2010 Elsevier B.V. All rights reserved.
Lifestyle and accidents among young drivers.
Gregersen, N P; Berg, H Y
1994-06-01
This study covers the lifestyle component of the problems related to young drivers' accident risk. The purpose of the study is to measure the relationship between lifestyle and accident risk, and to identify specific high-risk and low-risk groups. Lifestyle is measured through a questionnaire, where 20-year-olds describe themselves and how often they deal with a large number of different activities, like sports, music, movies, reading, cars and driving, political engagement, etc. They also report their involvement in traffic accidents. With a principal component analysis followed by a cluster analysis, lifestyle profiles are defined. These profiles are finally correlated to accidents, which makes it possible to define high-risk and low-risk groups. The cluster analysis defined 15 clusters including four high-risk groups with an average overrisk of 150% and two low-risk groups with an average underrisk of 75%. The results are discussed from two perspectives. The first is the importance of theoretical understanding of the contribution of lifestyle factors to young drivers' high accident risk. The second is how the findings could be used in practical road safety measures, like education, campaigns, etc.
Correlation and network analysis of global financial indices
NASA Astrophysics Data System (ADS)
Kumar, Sunil; Deo, Nivedita
2012-08-01
Random matrix theory (RMT) and network methods are applied to investigate the correlation and network properties of 20 financial indices. The results are compared before and during the financial crisis of 2008. In the RMT method, the components of eigenvectors corresponding to the second largest eigenvalue form two clusters of indices in the positive and negative directions. The components of these two clusters switch in opposite directions during the crisis. The network analysis uses the Fruchterman-Reingold layout to find clusters in the network of indices at different thresholds. At a threshold of 0.6, before the crisis, financial indices corresponding to the Americas, Europe, and Asia-Pacific form separate clusters. On the other hand, during the crisis at the same threshold, the American and European indices combine together to form a strongly linked cluster while the Asia-Pacific indices form a separate weakly linked cluster. If the value of the threshold is further increased to 0.9 then the European indices (France, Germany, and the United Kingdom) are found to be the most tightly linked indices. The structure of the minimum spanning tree of financial indices is more starlike before the crisis and it changes to become more chainlike during the crisis. The average linkage hierarchical clustering algorithm is used to find a clearer cluster structure in the network of financial indices. The cophenetic correlation coefficients are calculated and found to increase significantly, which indicates that the hierarchy increases during the financial crisis. These results show that there is substantial change in the structure of the organization of financial indices during a financial crisis.
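The idea of reading clusters off a correlation network at a given threshold can be sketched as follows. This is only an illustration under stated assumptions: an edge is kept when the pairwise correlation is at least the threshold, clusters are taken to be the connected components of the resulting graph (a stand-in for the layout-based inspection described in the abstract), and the labels and correlation values are made up rather than taken from the paper.

```python
def threshold_clusters(labels, corr, threshold):
    """Return the connected components of the graph whose edges join
    pairs of items with correlation >= threshold."""
    n = len(labels)
    seen = set()
    clusters = []
    for start in range(n):
        if start in seen:
            continue
        # Depth-first search from an unvisited node collects one component.
        stack = [start]
        seen.add(start)
        component = []
        while stack:
            i = stack.pop()
            component.append(labels[i])
            for j in range(n):
                if j not in seen and corr[i][j] >= threshold:
                    seen.add(j)
                    stack.append(j)
        clusters.append(sorted(component))
    return clusters


# Hypothetical symmetric correlation matrix for four indices A-D.
labels = ["A", "B", "C", "D"]
corr = [
    [1.0, 0.8, 0.3, 0.2],
    [0.8, 1.0, 0.4, 0.1],
    [0.3, 0.4, 1.0, 0.7],
    [0.2, 0.1, 0.7, 1.0],
]

clusters_06 = threshold_clusters(labels, corr, 0.6)  # two pairs remain linked
clusters_09 = threshold_clusters(labels, corr, 0.9)  # every index is isolated
```

Raising the threshold removes weaker edges, so clusters can only split or stay the same, which matches the way the abstract reports tighter groups at 0.9 than at 0.6.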
Vavougios, George D; George D, George; Pastaka, Chaido; Zarogiannis, Sotirios G; Gourgoulianis, Konstantinos I
2016-02-01
Phenotyping obstructive sleep apnea syndrome's comorbidity has been attempted for the first time only recently. The aim of our study was to determine phenotypes of comorbidity in obstructive sleep apnea syndrome patients employing a data-driven approach. Data from 1472 consecutive patient records were recovered from our hospital's database. Categorical principal component analysis and two-step clustering were employed to detect distinct clusters in the data. Univariate comparisons between clusters included one-way analysis of variance with Bonferroni correction and chi-square tests. Predictors of pairwise cluster membership were determined via a binary logistic regression model. The analyses revealed six distinct clusters: A, 'healthy, reporting sleeping related symptoms'; B, 'mild obstructive sleep apnea syndrome without significant comorbidities'; C1: 'moderate obstructive sleep apnea syndrome, obesity, without significant comorbidities'; C2: 'moderate obstructive sleep apnea syndrome with severe comorbidity, obesity and the exclusive inclusion of stroke'; D1: 'severe obstructive sleep apnea syndrome and obesity without comorbidity and a 33.8% prevalence of hypertension'; and D2: 'severe obstructive sleep apnea syndrome with severe comorbidities, along with the highest Epworth Sleepiness Scale score and highest body mass index'. Clusters differed significantly in apnea-hypopnea index, oxygen desaturation index; arousal index; age, body mass index, minimum oxygen saturation and daytime oxygen saturation (one-way analysis of variance P < 0.0001). Binary logistic regression indicated that older age, greater body mass index, lower daytime oxygen saturation and hypertension were associated independently with an increased risk of belonging in a comorbid cluster. Six distinct phenotypes of obstructive sleep apnea syndrome and its comorbidities were identified. 
Mapping the heterogeneity of the obstructive sleep apnea syndrome may help the early identification of at-risk groups. Finally, determining predictors of comorbidity for the moderate and severe strata of these phenotypes implies a need to take these factors into account when considering obstructive sleep apnea syndrome treatment options. © 2015 The Authors. Journal of Sleep Research published by John Wiley & Sons Ltd on behalf of European Sleep Research Society.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jin, Ling; Harley, Robert A.; Brown, Nancy J.
Cluster analysis was applied to daily 8 h ozone maxima modeled for a summer season to characterize meteorology-induced variations in the spatial distribution of ozone. Principal component analysis is employed to form a reduced dimension set to describe and interpret ozone spatial patterns. The first three principal components (PCs) capture approximately 85% of total variance, with PC1 describing a general spatial trend, and PC2 and PC3 each describing a spatial contrast. Six clusters were identified for California's San Joaquin Valley (SJV) with two low, three moderate, and one high-ozone cluster. The moderate ozone clusters are distinguished by elevated ozone levels in different parts of the valley: northern, western, and eastern, respectively. The SJV ozone clusters have stronger coupling with the San Francisco Bay area (SFB) than with the Sacramento Valley (SV). Variations in ozone spatial distributions induced by anthropogenic emission changes are small relative to the overall variations in ozone anomalies observed for the whole summer. Ozone regimes identified here are mostly determined by the direct and indirect meteorological effects. Existing measurement sites are sufficiently representative to capture ozone spatial patterns in the SFB and SV, but the western side of the SJV is under-sampled.
QCS : a system for querying, clustering, and summarizing documents.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dunlavy, Daniel M.
2006-08-01
Information retrieval systems consist of many complicated components. Research and development of such systems is often hampered by the difficulty in evaluating how each particular component would behave across multiple systems. We present a novel hybrid information retrieval system--the Query, Cluster, Summarize (QCS) system--which is portable, modular, and permits experimentation with different instantiations of each of the constituent text analysis components. Most importantly, the combination of the three types of components in the QCS design improves retrievals by providing users more focused information organized by topic. We demonstrate the improved performance by a series of experiments using standard test sets from the Document Understanding Conferences (DUC) along with the best known automatic metric for summarization system evaluation, ROUGE. Although the DUC data and evaluations were originally designed to test multidocument summarization, we developed a framework to extend it to the task of evaluation for each of the three components: query, clustering, and summarization. Under this framework, we then demonstrate that the QCS system (end-to-end) achieves performance as good as or better than the best summarization engines. Given a query, QCS retrieves relevant documents, separates the retrieved documents into topic clusters, and creates a single summary for each cluster. In the current implementation, Latent Semantic Indexing is used for retrieval, generalized spherical k-means is used for the document clustering, and a method coupling sentence "trimming" and a hidden Markov model, followed by a pivoted QR decomposition, is used to create a single extract summary for each cluster. The user interface is designed to provide access to detailed information in a compact and useful format. 
Our system demonstrates the feasibility of assembling an effective IR system from existing software libraries, the usefulness of the modularity of the design, and the value of this particular combination of modules.
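The clustering step named in the abstract can be illustrated with a minimal sketch of plain spherical k-means, which groups unit-length document vectors by cosine similarity. This is not the QCS implementation and not the generalized variant the authors use; the function names, the toy term-count vectors, and the fixed iteration count are all assumptions made for illustration.

```python
import math
import random


def normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def dot(a, b):
    """Inner product; for unit vectors this equals the cosine similarity."""
    return sum(x * y for x, y in zip(a, b))


def spherical_kmeans(docs, k, iterations=20, seed=0):
    """Cluster document vectors by cosine similarity. Returns (labels, centroids)."""
    rng = random.Random(seed)
    points = [normalize(d) for d in docs]
    # Initialise centroids from k distinct documents (an assumed, simple choice).
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: pick the centroid with the highest cosine similarity.
        labels = [max(range(k), key=lambda j: dot(p, centroids[j])) for p in points]
        # Update step: sum the members of each cluster and re-normalise.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties out
                centroids[j] = normalize([sum(col) for col in zip(*members)])
    return labels, centroids


# Hypothetical term-count vectors: two "topics" over four terms.
docs = [
    [5, 4, 0, 0],
    [3, 5, 1, 0],
    [0, 0, 4, 6],
    [0, 1, 5, 5],
]
labels, centroids = spherical_kmeans(docs, k=2)
print(labels)
```

Normalising both the documents and the centroids is what distinguishes this from ordinary k-means: document length stops mattering, and only the direction of the term vector drives the assignment.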
QCS: a system for querying, clustering and summarizing documents.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dunlavy, Daniel M.; Schlesinger, Judith D.; O'Leary, Dianne P.
2006-10-01
Information retrieval systems consist of many complicated components. Research and development of such systems is often hampered by the difficulty in evaluating how each particular component would behave across multiple systems. We present a novel hybrid information retrieval system--the Query, Cluster, Summarize (QCS) system--which is portable, modular, and permits experimentation with different instantiations of each of the constituent text analysis components. Most importantly, the combination of the three types of components in the QCS design improves retrievals by providing users more focused information organized by topic. We demonstrate the improved performance by a series of experiments using standard test sets from the Document Understanding Conferences (DUC) along with the best known automatic metric for summarization system evaluation, ROUGE. Although the DUC data and evaluations were originally designed to test multidocument summarization, we developed a framework to extend it to the task of evaluation for each of the three components: query, clustering, and summarization. Under this framework, we then demonstrate that the QCS system (end-to-end) achieves performance as good as or better than the best summarization engines. Given a query, QCS retrieves relevant documents, separates the retrieved documents into topic clusters, and creates a single summary for each cluster. In the current implementation, Latent Semantic Indexing is used for retrieval, generalized spherical k-means is used for the document clustering, and a method coupling sentence 'trimming' and a hidden Markov model, followed by a pivoted QR decomposition, is used to create a single extract summary for each cluster. The user interface is designed to provide access to detailed information in a compact and useful format. 
Our system demonstrates the feasibility of assembling an effective IR system from existing software libraries, the usefulness of the modularity of the design, and the value of this particular combination of modules.
Spadone, Sara; de Pasquale, Francesco; Mantini, Dante; Della Penna, Stefania
2012-09-01
Independent component analysis (ICA) is typically applied on functional magnetic resonance imaging, electroencephalographic and magnetoencephalographic (MEG) data due to its data-driven nature. In these applications, ICA needs to be extended from single to multi-session and multi-subject studies for interpreting and assigning a statistical significance at the group level. Here a novel strategy for analyzing MEG independent components (ICs) is presented, Multivariate Algorithm for Grouping MEG Independent Components K-means based (MAGMICK). The proposed approach is able to capture spatio-temporal dynamics of brain activity in MEG studies by running ICA at subject level and then clustering the ICs across sessions and subjects. Distinctive features of MAGMICK are: i) the implementation of an efficient set of "MEG fingerprints" designed to summarize properties of MEG ICs as they are built on spatial, temporal and spectral parameters; ii) the implementation of a modified version of the standard K-means procedure to improve its data-driven character. This algorithm groups the obtained ICs automatically estimating the number of clusters through an adaptive weighting of the parameters and a constraint on the ICs independence, i.e. components coming from the same session (at subject level) or subject (at group level) cannot be grouped together. The performances of MAGMICK are illustrated by analyzing two sets of MEG data obtained during a finger tapping task and median nerve stimulation. The results demonstrate that the method can extract consistent patterns of spatial topography and spectral properties across sessions and subjects that are in good agreement with the literature. In addition, these results are compared to those from a modified version of affinity propagation clustering method. The comparison, evaluated in terms of different clustering validity indices, shows that our methodology often outperforms the clustering algorithm. 
Finally, these results are confirmed by a comparison with a MEG-tailored version of the self-organizing group ICA, which is widely used for fMRI IC clustering. Copyright © 2012 Elsevier Inc. All rights reserved.
Joint fMRI analysis and subject clustering using sparse dictionary learning
NASA Astrophysics Data System (ADS)
Kim, Seung-Jun; Dontaraju, Krishna K.
2017-08-01
Multi-subject fMRI data analysis methods based on sparse dictionary learning are proposed. In addition to identifying the component spatial maps by exploiting the sparsity of the maps, clusters of the subjects are learned by postulating that the fMRI volumes admit a subspace clustering structure. Furthermore, in order to tune the associated hyper-parameters systematically, a cross-validation strategy is developed based on entry-wise sampling of the fMRI dataset. Efficient algorithms for solving the proposed constrained dictionary learning formulations are developed. Numerical tests performed on synthetic fMRI data show promising results and provide insights into the proposed technique.
Sustainable Development in Indian Automotive Component Clusters
NASA Astrophysics Data System (ADS)
Bhaskaran, E.
2013-01-01
India is the world's second fastest growing auto market and boasts the sixth largest automobile industry after China, the US, Germany, Japan and Brazil. The Indian auto component industry recorded its highest year-on-year growth of 34.2% in 2010-2011, raking in revenue of US$39.9 billion, with major contributions coming from exports at US$5 billion and fresh investment from the US at around US$2 billion. For inclusive growth and sustainable development, most auto component manufacturers have adopted the cluster development approach. The objective is to study the technical efficiency (θ), peer weights (λ_i), input slacks (S-) and output slacks (S+) of four Auto Component Clusters (ACC) in India. The methodology adopted is Data Envelopment Analysis using the input-oriented Banker-Charnes-Cooper (BCC) model, taking the number of units and the number of employees as inputs and sales and exports in crores as outputs. The non-zero λ_i values represent the weights for efficient clusters. The non-zero slacks (S > 0) obtained for one ACC reveal an excess number of units (S-) and employment (S-) and a shortfall in sales (S+) and exports (S+). Returns to scale are increasing for three clusters and constant for the fourth; none shows decreasing returns. To conclude, for inclusive growth and sustainable development, the inefficient ACC should increase its turnover and exports, as a decrease in the number of enterprises and in employment is not practically possible. Moreover, for sustainable development, the ACC should strengthen infrastructure interrelationships, technology interrelationships, procurement interrelationships, production interrelationships and marketing interrelationships to increase productivity and efficiency and to compete in the world market.
Sequential analysis of hydrochemical data for watershed characterization.
Thyne, Geoffrey; Güler, Cüneyt; Poeter, Eileen
2004-01-01
A methodology for characterizing the hydrogeology of watersheds using hydrochemical data that combine statistical, geochemical, and spatial techniques is presented. Surface water and ground water base flow and spring runoff samples (180 total) from a single watershed are first classified using hierarchical cluster analysis. The statistical clusters are analyzed for spatial coherence confirming that the clusters have a geological basis corresponding to topographic flowpaths and showing that the fractured rock aquifer behaves as an equivalent porous medium on the watershed scale. Then principal component analysis (PCA) is used to determine the sources of variation between parameters. PCA analysis shows that the variations within the dataset are related to variations in calcium, magnesium, SO4, and HCO3, which are derived from natural weathering reactions, and pH, NO3, and chlorine, which indicate anthropogenic impact. PHREEQC modeling is used to quantitatively describe the natural hydrochemical evolution for the watershed and aid in discrimination of samples that have an anthropogenic component. Finally, the seasonal changes in the water chemistry of individual sites were analyzed to better characterize the spatial variability of vertical hydraulic conductivity. The integrated result provides a method to characterize the hydrogeology of the watershed that fully utilizes traditional data.
The Potential of Multivariate Analysis in Assessing Students' Attitude to Curriculum Subjects
ERIC Educational Resources Information Center
Gaotlhobogwe, Michael; Laugharne, Janet; Durance, Isabelle
2011-01-01
Background: Understanding student attitudes to curriculum subjects is central to providing evidence-based options to policy makers in education. Purpose: We illustrate how quantitative approaches used in the social sciences and based on multivariate analysis (categorical Principal Components Analysis, Clustering Analysis and General Linear…
NASA Astrophysics Data System (ADS)
Ojima, Nobutoshi; Fujiwara, Izumi; Inoue, Yayoi; Tsumura, Norimichi; Nakaguchi, Toshiya; Iwata, Kayoko
2011-03-01
Uneven distribution of skin color is one of the biggest concerns about facial skin appearance. Recently several techniques to analyze skin color have been introduced by separating skin color information into chromophore components, such as melanin and hemoglobin. However, there are not many reports on quantitative analysis of unevenness of skin color that consider the type of chromophore, clusters of different sizes and the concentration of each chromophore. We propose a new image analysis and simulation method based on chromophore analysis and spatial frequency analysis. This method is mainly composed of three techniques: independent component analysis (ICA) to extract hemoglobin and melanin chromophores from a single skin color image; an image pyramid technique which decomposes each chromophore into multi-resolution images, which can be used for identifying different sizes of clusters or spatial frequencies; and analysis of the histogram obtained from each multi-resolution image to extract unevenness parameters. As an application of the method, we also introduce an image processing technique to change the unevenness of the melanin component. As a result, the method showed high capabilities to analyze unevenness of each skin chromophore: 1) Vague unevenness on skin could be discriminated from noticeable pigmentation such as freckles or acne. 2) By analyzing the unevenness parameters obtained from each multi-resolution image for Japanese women, age-related changes were observed in the parameters of middle spatial frequency. 3) An image processing system modulating the parameters was proposed to change unevenness of skin images along the axis of the obtained age-related change in real time.
Going beyond Clustering in MD Trajectory Analysis: An Application to Villin Headpiece Folding
Rajan, Aruna; Freddolino, Peter L.; Schulten, Klaus
2010-01-01
Recent advances in computing technology have enabled microsecond long all-atom molecular dynamics (MD) simulations of biological systems. Methods that can distill the salient features of such large trajectories are now urgently needed. Conventional clustering methods used to analyze MD trajectories suffer from various setbacks, namely (i) they are not data driven, (ii) they are unstable to noise and changes in cut-off parameters such as cluster radius and cluster number, and (iii) they do not reduce the dimensionality of the trajectories, and hence are unsuitable for finding collective coordinates. We advocate the application of principal component analysis (PCA) and a non-metric multidimensional scaling (nMDS) method to reduce MD trajectories and overcome the drawbacks of clustering. To illustrate the superiority of nMDS over other methods in reducing data and reproducing salient features, we analyze three complete villin headpiece folding trajectories. Our analysis suggests that the folding process of the villin headpiece is structurally heterogeneous. PMID:20419160
Going beyond clustering in MD trajectory analysis: an application to villin headpiece folding.
Rajan, Aruna; Freddolino, Peter L; Schulten, Klaus
2010-04-15
Recent advances in computing technology have enabled microsecond long all-atom molecular dynamics (MD) simulations of biological systems. Methods that can distill the salient features of such large trajectories are now urgently needed. Conventional clustering methods used to analyze MD trajectories suffer from various setbacks, namely (i) they are not data driven, (ii) they are unstable to noise and changes in cut-off parameters such as cluster radius and cluster number, and (iii) they do not reduce the dimensionality of the trajectories, and hence are unsuitable for finding collective coordinates. We advocate the application of principal component analysis (PCA) and a non-metric multidimensional scaling (nMDS) method to reduce MD trajectories and overcome the drawbacks of clustering. To illustrate the superiority of nMDS over other methods in reducing data and reproducing salient features, we analyze three complete villin headpiece folding trajectories. Our analysis suggests that the folding process of the villin headpiece is structurally heterogeneous.
Henry, David; Dymnicki, Allison B.; Mohatt, Nathaniel; Allen, James; Kelly, James G.
2016-01-01
Qualitative methods potentially add depth to prevention research, but can produce large amounts of complex data even with small samples. Studies conducted with culturally distinct samples often produce voluminous qualitative data, but may lack sufficient sample sizes for sophisticated quantitative analysis. Currently lacking in mixed methods research are methods allowing for more fully integrating qualitative and quantitative analysis techniques. Cluster analysis can be applied to coded qualitative data to clarify the findings of prevention studies by aiding efforts to reveal such things as the motives of participants for their actions and the reasons behind counterintuitive findings. By clustering groups of participants with similar profiles of codes in a quantitative analysis, cluster analysis can serve as a key component in mixed methods research. This article reports two studies. In the first study, we conduct simulations to test the accuracy of cluster assignment using three different clustering methods with binary data as produced when coding qualitative interviews. Results indicated that hierarchical clustering, K-Means clustering, and latent class analysis produced similar levels of accuracy with binary data, and that the accuracy of these methods did not decrease with samples as small as 50. Whereas the first study explores the feasibility of using common clustering methods with binary data, the second study provides a “real-world” example using data from a qualitative study of community leadership connected with a drug abuse prevention project. We discuss the implications of this approach for conducting prevention research, especially with small samples and culturally distinct communities. PMID:25946969
Henry, David; Dymnicki, Allison B; Mohatt, Nathaniel; Allen, James; Kelly, James G
2015-10-01
Qualitative methods potentially add depth to prevention research but can produce large amounts of complex data even with small samples. Studies conducted with culturally distinct samples often produce voluminous qualitative data but may lack sufficient sample sizes for sophisticated quantitative analysis. Currently lacking in mixed-methods research are methods allowing for more fully integrating qualitative and quantitative analysis techniques. Cluster analysis can be applied to coded qualitative data to clarify the findings of prevention studies by aiding efforts to reveal such things as the motives of participants for their actions and the reasons behind counterintuitive findings. By clustering groups of participants with similar profiles of codes in a quantitative analysis, cluster analysis can serve as a key component in mixed-methods research. This article reports two studies. In the first study, we conduct simulations to test the accuracy of cluster assignment using three different clustering methods with binary data as produced when coding qualitative interviews. Results indicated that hierarchical clustering, K-means clustering, and latent class analysis produced similar levels of accuracy with binary data and that the accuracy of these methods did not decrease with samples as small as 50. Whereas the first study explores the feasibility of using common clustering methods with binary data, the second study provides a "real-world" example using data from a qualitative study of community leadership connected with a drug abuse prevention project. We discuss the implications of this approach for conducting prevention research, especially with small samples and culturally distinct communities.
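To make the idea of clustering coded qualitative data concrete, here is a minimal sketch of one of the three method families the abstract compares: hierarchical (agglomerative) clustering applied to binary code profiles. The choice of Hamming distance and average linkage, the function names, and the six example profiles are assumptions for illustration; the abstract does not state which distance or linkage the authors used.

```python
def hamming(a, b):
    """Number of codes on which two binary profiles disagree."""
    return sum(x != y for x, y in zip(a, b))


def average_linkage(profiles, n_clusters):
    """Agglomerative clustering of binary profiles; returns one label per profile."""
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average pairwise distance between the members of the two clusters.
                total = sum(hamming(profiles[i], profiles[j])
                            for i in clusters[a] for j in clusters[b])
                dist = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or dist < best[0]:
                    best = (dist, a, b)
        # Merge the closest pair and drop the absorbed cluster.
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    labels = [0] * len(profiles)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


# Hypothetical coded interviews: rows are participants, columns are codes (1 = code present).
profiles = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [0, 0, 0, 1, 1, 0],
]
labels = average_linkage(profiles, n_clusters=2)
print(labels)
```

In a simulation like the one the abstract describes, the recovered labels would be compared against the known generating groups to score assignment accuracy; that scoring step is omitted here.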
Water quality analysis of the Rapur area, Andhra Pradesh, South India using multivariate techniques
NASA Astrophysics Data System (ADS)
Nagaraju, A.; Sreedhar, Y.; Thejaswi, A.; Sayadi, Mohammad Hossein
2017-10-01
The groundwater samples from Rapur area were collected from different sites to evaluate the major ion chemistry. The large volume of data can lead to difficulties in the integration, interpretation, and representation of the results. Two multivariate statistical methods, hierarchical cluster analysis (HCA) and factor analysis (FA), were applied to evaluate their usefulness in classifying and identifying the geochemical processes controlling groundwater geochemistry. Four statistically significant clusters were obtained from 30 sampling stations. This resulted in two important clusters, viz. cluster 1 (pH, Si, CO3, Mg, SO4, Ca, K, HCO3, alkalinity, Na, Na + K, Cl, and hardness) and cluster 2 (EC and TDS), which are released to the study area from different sources. The application of different multivariate statistical techniques, such as principal component analysis (PCA), assists in the interpretation of complex data matrices for a better understanding of the water quality of a study area. From PCA, it is clear that the first factor (factor 1), which accounted for 36.2% of the total variance, had high positive loadings on EC, Mg, Cl, TDS, and hardness. Based on the PCA scores, four significant cluster groups of sampling locations were detected on the basis of similarity of their water quality.
Multivariate analysis of molecular and morphological diversity in fig (Ficus carica L.)
USDA-ARS's Scientific Manuscript database
Genetic polymorphism across 15 microsatellite loci among 194 fig accessions including Common, Smyrna, San Pedro, and Caprifig were analyzed using a cluster analysis (CA) and the principal components analysis (PCA). The collection was moderately variable with observed number of alleles per locus rang...
A graph-Laplacian-based feature extraction algorithm for neural spike sorting.
Ghanbari, Yasser; Spence, Larry; Papamichalis, Panos
2009-01-01
Analysis of extracellular neural spike recordings is highly dependent upon the accuracy of neural waveform classification, commonly referred to as spike sorting. Feature extraction is an important stage of this process because it can limit the quality of clustering which is performed in the feature space. This paper proposes a new feature extraction method (which we call Graph Laplacian Features, GLF) based on minimizing the graph Laplacian and maximizing the weighted variance. The algorithm is compared with Principal Components Analysis (PCA, the most commonly-used feature extraction method) using simulated neural data. The results show that the proposed algorithm produces more compact and well-separated clusters compared to PCA. As an added benefit, tentative cluster centers are output which can be used to initialize a subsequent clustering stage.
A spatial, kinematical, and dynamical analysis of Abell 400
NASA Technical Reports Server (NTRS)
Beers, Timothy C.; Gebhardt, Karl; Huchra, John P.; Forman, William; Jones, Christine; Bothun, Gregory D.
1992-01-01
The paper presents a detailed spatial, kinematical, and dynamical analysis for the cluster A400, based on a nearly complete redshift survey of bright galaxies within 1 Mpc of the cluster center. A dispersed component with a high fraction of spiral galaxies at a velocity of 8200 km/s, and a background group with a mean velocity of 13,400 km/s are identified. It is proposed that the main body of A400 is composed of at least two individual subclusters. If subclustering is ignored, the derived dispersion of the 88 galaxies with measured velocities within 4000 km/s of the bright dumbbell galaxy near the cluster center is 702 km/s. When kinematic information is used to split A400 into likely subclusters, the velocity dispersions of the individual units which make up this cluster are on the order of 200-300 km/s. If A400 is considered a single entity, the inferred blue mass-to-light ratio is 1210 solar masses/solar luminosities. It is argued that A400 is an example of a presently occurring merger, and that the individual components of the dumbbell galaxy were once individual D galaxies within the premerger subclusters.
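The dispersion figures quoted above come from galaxy velocities selected inside a velocity window around a reference galaxy. The sketch below shows that selection followed by a plain sample standard deviation. It is an illustration only: the velocities are made up (not the A400 survey data), the function name is hypothetical, and published analyses may apply redshift corrections or more robust estimators that are not reproduced here.

```python
import math


def velocity_dispersion(velocities, center, window):
    """Mean, sample standard deviation and count of velocities within +/- window of center (km/s)."""
    selected = [v for v in velocities if abs(v - center) <= window]
    n = len(selected)
    if n < 2:
        raise ValueError("need at least two galaxies inside the velocity window")
    mean = sum(selected) / n
    # Unbiased sample variance (divide by n - 1).
    variance = sum((v - mean) ** 2 for v in selected) / (n - 1)
    return mean, math.sqrt(variance), n


# Hypothetical line-of-sight velocities in km/s; the last one mimics a background group.
velocities = [6400, 6900, 7100, 7300, 7600, 8200, 13400]
mean, sigma, n = velocity_dispersion(velocities, center=7100, window=4000)
print(n, mean, round(sigma, 1))
```

Running the same function separately on each candidate subcluster's members is the simplest way to see how splitting a sample lowers the per-unit dispersion, which is the effect the abstract reports.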
Papaleo, Elena; Mereghetti, Paolo; Fantucci, Piercarlo; Grandori, Rita; De Gioia, Luca
2009-01-01
Several molecular dynamics (MD) simulations were used to sample conformations in the neighborhood of the native structure of holo-myoglobin (holo-Mb), collecting trajectories spanning 0.22 μs at 300 K. Principal component (PCA) and free-energy landscape (FEL) analyses, integrated by cluster analysis, which was performed considering the position and structures of the individual helices of the globin fold, were carried out. The coherence between the different structural clusters and the basins of the FEL, together with the convergence of parameters derived by PCA, indicates that an accurate description of the Mb conformational space around the native state was achieved by multiple MD trajectories spanning at least 0.14 μs. The integration of FEL, PCA, and structural clustering was shown to be a very useful approach to gain an overall view of the conformational landscape accessible to a protein and to identify representative protein substates. This method could also be used to investigate the conformational and dynamical properties of Mb apo-, mutant, or delete versions, in which greater conformational variability is expected and, therefore, identification of representative substates from the simulations is relevant to disclose structure-function relationships.
Optimal wavelength band clustering for multispectral iris recognition.
Gong, Yazhuo; Zhang, David; Shi, Pengfei; Yan, Jingqi
2012-07-01
This work explores the possibility of clustering spectral wavelengths based on the maximum dissimilarity of iris textures. The eventual goal is to determine how many bands of spectral wavelengths will be enough for iris multispectral fusion and to find these bands that will provide higher performance of iris multispectral recognition. A multispectral acquisition system was first designed for imaging the iris at narrow spectral bands in the range of 420 to 940 nm. Next, a set of 60 human iris images that correspond to the right and left eyes of 30 different subjects were acquired for an analysis. Finally, we determined that 3 clusters were enough to represent the 10 feature bands of spectral wavelengths using the agglomerative clustering based on two-dimensional principal component analysis. The experimental results suggest (1) the number, center, and composition of clusters of spectral wavelengths and (2) the higher performance of iris multispectral recognition based on a three wavelengths-bands fusion.
Clustering of Variables for Mixed Data
NASA Astrophysics Data System (ADS)
Saracco, J.; Chavent, M.
2016-05-01
This chapter presents clustering of variables, whose aim is to group together strongly related variables. The proposed approach works on a mixed data set, i.e. on a data set which contains numerical variables and categorical variables. Two algorithms for clustering of variables are described: a hierarchical clustering and a k-means type clustering. A brief description of the PCAmix method (a principal component analysis for mixed data) is provided, since the calculation of the synthetic variables summarizing the obtained clusters of variables is based on this multivariate method. Finally, the R packages ClustOfVar and PCAmixdata are illustrated on real mixed data. The PCAmix and ClustOfVar approaches are first used for dimension reduction (step 1) before applying in step 2 a standard clustering method to obtain groups of individuals.
Sample size calculation for stepped wedge and other longitudinal cluster randomised trials.
Hooper, Richard; Teerenstra, Steven; de Hoop, Esther; Eldridge, Sandra
2016-11-20
The sample size required for a cluster randomised trial is inflated compared with an individually randomised trial because outcomes of participants from the same cluster are correlated. Sample size calculations for longitudinal cluster randomised trials (including stepped wedge trials) need to take account of at least two levels of clustering: the clusters themselves and times within clusters. We derive formulae for sample size for repeated cross-section and closed cohort cluster randomised trials with normally distributed outcome measures, under a multilevel model allowing for variation between clusters and between times within clusters. Our formulae agree with those previously described for special cases such as crossover and analysis of covariance designs, although simulation suggests that the formulae could underestimate required sample size when the number of clusters is small. Whether using a formula or simulation, a sample size calculation requires estimates of nuisance parameters, which in our model include the intracluster correlation, cluster autocorrelation, and individual autocorrelation. A cluster autocorrelation less than 1 reflects a situation where individuals sampled from the same cluster at different times have less correlated outcomes than individuals sampled from the same cluster at the same time. Nuisance parameters could be estimated from time series obtained in similarly clustered settings with the same outcome measure, using analysis of variance to estimate variance components. Copyright © 2016 John Wiley & Sons, Ltd.
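The inflation described in the first sentence of this abstract can be illustrated with the standard design effect for a simple parallel cluster randomised trial with equal cluster sizes, 1 + (m - 1)ρ, where m is the cluster size and ρ the intracluster correlation. This sketch covers only that textbook case; the longitudinal and stepped wedge formulae derived in the paper, which also involve the cluster and individual autocorrelations, are not reproduced. The function names and input values are assumptions.

```python
import math


def design_effect(cluster_size, icc):
    """Inflation factor for a parallel cluster randomised trial with equal cluster sizes."""
    return 1 + (cluster_size - 1) * icc


def clusters_needed(n_individual, cluster_size, icc):
    """Total clusters required, given the sample size of an individually randomised trial."""
    inflated = n_individual * design_effect(cluster_size, icc)
    # Round up so the trial is never under-powered by a fractional cluster.
    return math.ceil(inflated / cluster_size)


# Hypothetical inputs: 200 participants would suffice under individual randomisation,
# each cluster contributes 20 participants, and the intracluster correlation is 0.05.
de = design_effect(cluster_size=20, icc=0.05)
total_clusters = clusters_needed(n_individual=200, cluster_size=20, icc=0.05)
print(de, total_clusters)
```

Even a small intracluster correlation nearly doubles the required sample here, which is why the nuisance parameters mentioned in the abstract need credible estimates before a trial is sized.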
Dong, Skye T; Costa, Daniel S J; Butow, Phyllis N; Lovell, Melanie R; Agar, Meera; Velikova, Galina; Teckle, Paulos; Tong, Allison; Tebbutt, Niall C; Clarke, Stephen J; van der Hoek, Kim; King, Madeleine T; Fayers, Peter M
2016-01-01
Symptom clusters in advanced cancer can influence patient outcomes. There is large heterogeneity in the methods used to identify symptom clusters. To investigate the consistency of symptom cluster composition in advanced cancer patients using different statistical methodologies for all patients across five primary cancer sites, and to examine which clusters predict functional status, a global assessment of health and global quality of life. Principal component analysis and exploratory factor analysis (with different rotation and factor selection methods) and hierarchical cluster analysis (with different linkage and similarity measures) were used on a data set of 1562 advanced cancer patients who completed the European Organization for the Research and Treatment of Cancer Quality of Life Questionnaire-Core 30. Four clusters consistently formed for many of the methods and cancer sites: tense-worry-irritable-depressed (emotional cluster), fatigue-pain, nausea-vomiting, and concentration-memory (cognitive cluster). The emotional cluster was a stronger predictor of overall quality of life than the other clusters. Fatigue-pain was a stronger predictor of overall health than the other clusters. The cognitive cluster and fatigue-pain predicted physical functioning, role functioning, and social functioning. The four identified symptom clusters were consistent across statistical methods and cancer types, although there were some noteworthy differences. Statistical derivation of symptom clusters is in need of greater methodological guidance. A psychosocial pathway in the management of symptom clusters may improve quality of life. Biological mechanisms underpinning symptom clusters need to be delineated by future research. A framework for evidence-based screening, assessment, treatment, and follow-up of symptom clusters in advanced cancer is essential. Copyright © 2016 American Academy of Hospice and Palliative Medicine. Published by Elsevier Inc. All rights reserved.
Analysis of candidates for interacting galaxy clusters. I. A1204 and A2029/A2033
NASA Astrophysics Data System (ADS)
Gonzalez, Elizabeth Johana; de los Rios, Martín; Oio, Gabriel A.; Lang, Daniel Hernández; Tagliaferro, Tania Aguirre; Domínguez R., Mariano J.; Castellón, José Luis Nilo; Cuevas L., Héctor; Valotto, Carlos A.
2018-04-01
Context. Merging galaxy clusters allow for the study of different mass components, dark and baryonic, separately. Also, their occurrence makes it possible to test the ΛCDM scenario, which can be used to put constraints on the self-interacting cross-section of the dark-matter particle. Aims. It is necessary to perform a homogeneous analysis of these systems. Hence, based on a recently presented sample of candidates for interacting galaxy clusters, we present the analysis of two of these cataloged systems. Methods. In this work, the first of a series devoted to characterizing galaxy clusters in merger processes, we perform a weak lensing analysis of clusters A1204 and A2029/A2033 to derive the total masses of each identified interacting structure together with a dynamical study based on a two-body model. We also describe the gas and the mass distributions in the field through a lensing and an X-ray analysis. This is the first of a series of works which will analyze this type of system in order to characterize them. Results. Neither merging cluster candidate shows evidence of having had a recent merger event. Nevertheless, there is dynamical evidence that these systems could be interacting or could interact in the future. Conclusions. It is necessary to include more constraints in order to improve the methodology of classifying merging galaxy clusters. Characterization of these clusters is important in order to properly understand the nature of these systems and their connection with dynamical studies.
Ruzik, L; Obarski, N; Papierz, A; Mojski, M
2015-06-01
High-performance liquid chromatography (HPLC) with UV/VIS spectrophotometric detection combined with the chemometric method of cluster analysis (CA) was used for the assessment of repeatability of composition of nine types of perfumed waters. In addition, the chromatographic method of separating components of the perfume waters under analysis was subjected to an optimization procedure. The chromatograms thus obtained were used as sources of data for the chemometric method of cluster analysis (CA). The result was a classification of a set comprising 39 perfumed water samples with a similar composition at a specified level of probability (level of agglomeration). A comparison of the classification with the manufacturer's declarations reveals a good degree of consistency and demonstrates similarity between samples in different classes. A combination of the chromatographic method with cluster analysis (HPLC UV/VIS - CA) makes it possible to quickly assess the repeatability of composition of perfumed waters at selected levels of probability. © 2014 Society of Cosmetic Scientists and the Société Française de Cosmétologie.
Ocké, Marga C
2013-05-01
This paper aims to describe different approaches for studying the overall diet with advantages and limitations. Studies of the overall diet have emerged because the relationship between dietary intake and health is very complex with all kinds of interactions. These cannot be captured well by studying single dietary components. Three main approaches to study the overall diet can be distinguished. The first method is researcher-defined scores or indices of diet quality. These are usually based on guidelines for a healthy diet or on diets known to be healthy. The second approach, using principal component or cluster analysis, is driven by the underlying dietary data. In principal component analysis, scales are derived based on the underlying relationships between food groups, whereas in cluster analysis, subgroups of the population are created with people that cluster together based on their dietary intake. A third approach includes methods that are driven by a combination of biological pathways and the underlying dietary data. Reduced rank regression defines linear combinations of food intakes that maximally explain nutrient intakes or intermediate markers of disease. Decision tree analysis identifies subgroups of a population whose members share dietary characteristics that influence (intermediate markers of) disease. It is concluded that all approaches have advantages and limitations and essentially answer different questions. The third approach is still more in an exploration phase, but seems to have great potential with complementary value. More insight into the utility of conducting studies on the overall diet can be gained if more attention is given to methodological issues.
REGIONAL-SCALE WIND FIELD CLASSIFICATION EMPLOYING CLUSTER ANALYSIS
DOE Office of Scientific and Technical Information (OSTI.GOV)
Glascoe, L G; Glaser, R E; Chin, H S
2004-06-17
The classification of time-varying multivariate regional-scale wind fields at a specific location can assist event planning as well as consequence and risk analysis. Further, wind field classification involves data transformation and inference techniques that effectively characterize stochastic wind field variation. Such a classification scheme is potentially useful for addressing overall atmospheric transport uncertainty and meteorological parameter sensitivity issues. Different methods to classify wind fields over a location include the principal component analysis of wind data (e.g., Hardy and Walton, 1978) and the use of cluster analysis for wind data (e.g., Green et al., 1992; Kaufmann and Weber, 1996). The goal of this study is to use a clustering method to classify the winds of a gridded data set, i.e., from meteorological simulations generated by a forecast model.
VanderKnyff, Jeremy; Friedman, Daniela B; Tanner, Andrea
2015-01-01
Using a sample of YouTube videos posted on the YouTube channels of organ procurement organizations, a content analysis was conducted to identify the frames used to strategically communicate prodonation messages. A total of 377 videos were coded for general characteristics, format, speaker characteristics, organs discussed, structure, problem definition, and treatment. Principal components analysis identified message frames, and k-means cluster analysis established distinct groupings of videos on the basis of the strength of their relationship to message frames. Analysis of these frames and clusters found that organ procurement organizations present multiple, and sometimes competing, video types and message frames on YouTube. This study serves as important formative research that will inform future studies to measure the effectiveness of the distinct message frames and clusters identified.
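As an illustration of the k-means step described above, the following is a minimal sketch of Lloyd's algorithm applied to hypothetical two-dimensional component scores. The scores, the starting centers, and the function names are assumptions made for illustration; the abstract does not state which software or settings the study used.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, centers, n_iter=100):
    """Plain Lloyd iterations starting from user-supplied centers (n_iter >= 1)."""
    centers = [tuple(c) for c in centers]
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(len(centers)), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break  # converged: assignments can no longer change
        centers = new_centers
    return labels, centers


# Hypothetical component scores for four videos on two message frames.
scores = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
labels, centers = kmeans(scores, centers=[(0.0, 0.0), (5.0, 5.0)])
```

In practice the result depends on the starting centers, so several random starts are usually compared.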
COVARIATE-ADAPTIVE CLUSTERING OF EXPOSURES FOR AIR POLLUTION EPIDEMIOLOGY COHORTS*
Keller, Joshua P.; Drton, Mathias; Larson, Timothy; Kaufman, Joel D.; Sandler, Dale P.; Szpiro, Adam A.
2017-01-01
Cohort studies in air pollution epidemiology aim to establish associations between health outcomes and air pollution exposures. Statistical analysis of such associations is complicated by the multivariate nature of the pollutant exposure data as well as the spatial misalignment that arises from the fact that exposure data are collected at regulatory monitoring network locations distinct from cohort locations. We present a novel clustering approach for addressing this challenge. Specifically, we present a method that uses geographic covariate information to cluster multi-pollutant observations and predict cluster membership at cohort locations. Our predictive k-means procedure identifies centers using a mixture model and is followed by multi-class spatial prediction. In simulations, we demonstrate that predictive k-means can reduce misclassification error by over 50% compared to ordinary k-means, with minimal loss in cluster representativeness. The improved prediction accuracy results in large gains of 30% or more in power for detecting effect modification by cluster in a simulated health analysis. In an analysis of the NIEHS Sister Study cohort using predictive k-means, we find that the association between systolic blood pressure (SBP) and long-term fine particulate matter (PM2.5) exposure varies significantly between different clusters of PM2.5 component profiles. Our cluster-based analysis shows that for subjects assigned to a cluster located in the Midwestern U.S., a 10 μg/m3 difference in exposure is associated with 4.37 mmHg (95% CI, 2.38, 6.35) higher SBP. PMID:28572869
Combining Mixture Components for Clustering*
Baudry, Jean-Patrick; Raftery, Adrian E.; Celeux, Gilles; Lo, Kenneth; Gottardo, Raphaël
2010-01-01
Model-based clustering consists of fitting a mixture model to data and identifying each cluster with one of its components. Multivariate normal distributions are typically used. The number of clusters is usually determined from the data, often using BIC. In practice, however, individual clusters can be poorly fitted by Gaussian distributions, and in that case model-based clustering tends to represent one non-Gaussian cluster by a mixture of two or more Gaussian distributions. If the number of mixture components is interpreted as the number of clusters, this can lead to overestimation of the number of clusters. This is because BIC selects the number of mixture components needed to provide a good approximation to the density, rather than the number of clusters as such. We propose first selecting the total number of Gaussian mixture components, K, using BIC and then combining them hierarchically according to an entropy criterion. This yields a unique soft clustering for each number of clusters less than or equal to K. These clusterings can be compared on substantive grounds, and we also describe an automatic way of selecting the number of clusters via a piecewise linear regression fit to the rescaled entropy plot. We illustrate the method with simulated data and a flow cytometry dataset. Supplemental Materials are available on the journal Web site and described at the end of the paper. PMID:20953302
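The merging step can be sketched as follows, assuming the entropy of a soft clustering is computed from the matrix of posterior membership probabilities and that, at each stage, the two components whose merge reduces this entropy the most are combined. This is one reading of the entropy criterion in the abstract; the function names and the probability matrix are hypothetical.

```python
import math
from itertools import combinations


def entropy(tau):
    """Entropy of a soft clustering: -sum_i sum_k tau[i][k] * log(tau[i][k])."""
    return -sum(t * math.log(t) for row in tau for t in row if t > 0.0)


def merge_columns(tau, j, k):
    """Return a new matrix in which components j and k are summed into one."""
    merged = []
    for row in tau:
        kept = [t for idx, t in enumerate(row) if idx not in (j, k)]
        merged.append(kept + [row[j] + row[k]])
    return merged


def best_merge(tau):
    """Pick the pair of components whose merge lowers the entropy the most."""
    base = entropy(tau)
    pair = max(combinations(range(len(tau[0])), 2),
               key=lambda p: base - entropy(merge_columns(tau, *p)))
    return pair, merge_columns(tau, *pair)


# Hypothetical posterior probabilities: 4 observations, 3 Gaussian components.
# Components 0 and 1 overlap heavily, as if they fit one non-Gaussian cluster.
tau = [
    [0.50, 0.48, 0.02],
    [0.45, 0.53, 0.02],
    [0.55, 0.43, 0.02],
    [0.01, 0.02, 0.97],
]
pair, merged = best_merge(tau)
```

Applying `best_merge` repeatedly yields one soft clustering for each number of clusters from K down to 1.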
Morphological features (defects) in fuel cell membrane electrode assemblies
NASA Astrophysics Data System (ADS)
Kundu, S.; Fowler, M. W.; Simon, L. C.; Grot, S.
Reliability and durability issues in fuel cells are becoming more important as the technology and the industry mature. Although research in this area has increased, systematic failure analyses, such as failure modes and effects analysis (FMEA), are very limited in the literature. This paper presents a categorization scheme of causes, modes, and effects related to fuel cell degradation and failure, with particular focus on the role of component quality, that can be used in FMEAs for polymer electrolyte membrane (PEM) fuel cells. The work also identifies component defects imparted on catalyst-coated membranes (CCM) by manufacturing and proposes mechanisms by which they can influence overall degradation and reliability. Six major defects have been identified on fresh CCM materials, i.e., cracks, orientation, delamination, electrolyte clusters, platinum clusters, and thickness variations.
Wilderness ecology: virgin plant communities of the Boundary Waters Canoe Area.
Lewis F. Ohmann; Robert R. Ream
1971-01-01
Describes virgin plant communities in the Boundary Waters Canoe Area. Data from all vegetative components of 106 virgin upland stands were used to construct a community classification through a combination of agglomerative clustering and principal components analysis. Discusses the relation of communities to their environment and to past wildfires.
Callings in Career: A Typological Approach to Essential and Optional Components
ERIC Educational Resources Information Center
Hirschi, Andreas
2011-01-01
A sense of calling in career is supposed to have positive implications for individuals and organizations but current theoretical development is plagued with incongruent conceptualizations of what does or does not constitute a calling. The present study used cluster analysis to identify essential and optional components of a presence of calling…
Cao, Zhen; Wang, Zhenjie; Shang, Zhonglin; Zhao, Jiancheng
2017-01-01
Fourier-transform infrared spectroscopy (FTIR) with the attenuated total reflectance technique was used to identify Rhodobryum roseum from its four adulterants. The FTIR spectra of six samples in the range from 4000 cm-1 to 600 cm-1 were obtained. The second-derivative transformation test was used to identify the small and nearby absorption peaks. A cluster analysis was performed to classify the spectra in a dendrogram based on the spectral similarity. Principal component analysis (PCA) was used to classify the species of six moss samples. A cluster analysis with PCA was used to identify different genera. However, some species of the same genus exhibited highly similar chemical components and FTIR spectra. Fourier self-deconvolution and discrete wavelet transform (DWT) were used to enhance the differences among the species with similar chemical components and FTIR spectra. Three scales were selected as the feature-extracting space in the DWT domain. The results show that FTIR spectroscopy with chemometrics is suitable for identifying Rhodobryum roseum and its adulterants.
Wang, Yi; Coleman-Derr, Devin; Chen, Guoping; Gu, Yong Q
2015-07-01
Genome wide analysis of orthologous clusters is an important component of comparative genomics studies. Identifying the overlap among orthologous clusters can enable us to elucidate the function and evolution of proteins across multiple species. Here, we report a web platform named OrthoVenn that is useful for genome wide comparisons and visualization of orthologous clusters. OrthoVenn provides coverage of vertebrates, metazoa, protists, fungi, plants and bacteria for the comparison of orthologous clusters and also supports uploading of customized protein sequences from user-defined species. An interactive Venn diagram, summary counts, and functional summaries of the disjunction and intersection of clusters shared between species are displayed as part of the OrthoVenn result. OrthoVenn also includes in-depth views of the clusters using various sequence analysis tools. Furthermore, OrthoVenn identifies orthologous clusters of single copy genes and allows for a customized search of clusters of specific genes through key words or BLAST. OrthoVenn is an efficient and user-friendly web server freely accessible at http://probes.pw.usda.gov/OrthoVenn or http://aegilops.wheat.ucdavis.edu/OrthoVenn. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Khamis, Fathiya M.; Masiga, Daniel K.; Mohamed, Samira A.; Salifu, Daisy; de Meyer, Marc; Ekesi, Sunday
2012-01-01
In 2003, a new fruit fly pest species was recorded for the first time in Kenya and has subsequently been found in 28 countries across tropical Africa. The insect was described as Bactrocera invadens, due to its rapid invasion of the African continent. In this study, the morphometry and DNA Barcoding of different populations of B. invadens distributed across the species range of tropical Africa and a sample from the pest's putative aboriginal home of Sri Lanka was investigated. Morphometry using wing veins and tibia length was used to separate B. invadens populations from other closely related Bactrocera species. The Principal component analysis yielded 15 components which correspond to the 15 morphometric measurements. The first two principal axes contributed to 90.7% of the total variance and showed partial separation of these populations. Canonical discriminant analysis indicated that only the first five canonical variates were statistically significant. The first two canonical variates contributed a total of 80.9% of the total variance clustering B. invadens with other members of the B. dorsalis complex while distinctly separating B. correcta, B. cucurbitae, B. oleae and B. zonata. The largest Mahalanobis squared distance (D2 = 122.9) was found to be between B. cucurbitae and B. zonata, while the lowest was observed between B. invadens populations against B. kandiensis (8.1) and against B. dorsalis s.s (11.4). Evolutionary history inferred by the Neighbor-Joining method clustered the Bactrocera species populations into four clusters. First cluster consisted of the B. dorsalis complex (B. invadens, B. kandiensis and B. dorsalis s. s.), branching from the same node while the second group was paraphyletic clades of B. correcta and B. zonata. The last two are monophyletic clades, consisting of B. cucurbitae and B. oleae, respectively. Principal component analysis using the genetic distances confirmed the clustering inferred by the NJ tree. PMID:23028649
Associations Between Adiposity and Metabolic Syndrome Over Time: The Healthy Twin Study.
Song, Yun-Mi; Sung, Joohon; Lee, Kayoung
2017-04-01
We evaluated the association between changes in adiposity traits including anthropometric and fat mass indicators and changes in metabolic syndrome traits including metabolic syndrome clustering and individual components over time. We also assessed the shared genetic and environmental correlations between the two traits. Participants were 284 South Korean twin individuals and 279 nontwin family members of the Healthy Twin study who had complete data for changes in adiposity traits and metabolic syndrome traits. Mixed linear model and bivariate variance-component analysis were applied. Over a period of 3.1 ± 0.6 years of study, changes in adiposity traits [body mass index (BMI), waist circumference, total fat mass, and fat mass to lean mass ratio] had significant associations with changes in metabolic syndrome clustering [high blood pressure, high serum glucose, high triglycerides (TG), and low high-density lipoprotein cholesterol] after adjusting for intra-familial and sibling correlations, age, sex, baseline metabolic syndrome clustering, and socioeconomic factors and health behaviors at follow-up. Change in BMI associated significantly with changes in individual metabolic syndrome components compared to other adiposity traits. Change in metabolic syndrome component TG was a better predictor of changes in adiposity traits compared to changes in other metabolic components. These associations were explained by significant environmental correlations but not by genetic correlations. Changes in anthropometric and fat mass indicators were positively associated with changes in metabolic syndrome clustering and those associations appeared to be regulated by environmental influences.
Carvalho, Carolina Abreu de; Fonsêca, Poliana Cristina de Almeida; Nobre, Luciana Neri; Priore, Silvia Eloiza; Franceschini, Sylvia do Carmo Castro
2016-01-01
The objective of this study is to provide guidance for identifying dietary patterns using the a posteriori approach, and analyze the methodological aspects of the studies conducted in Brazil that identified the dietary patterns of children. Articles were selected from the Latin American and Caribbean Literature on Health Sciences, Scientific Electronic Library Online and Pubmed databases. The key words were: Dietary pattern; Food pattern; Principal Components Analysis; Factor analysis; Cluster analysis; Reduced rank regression. We included studies that identified dietary patterns of children using the a posteriori approach. Seven studies published between 2007 and 2014 were selected, six of which were cross-sectional and one a cohort study. Five studies used the food frequency questionnaire for dietary assessment; one used a 24-hour dietary recall and the other a food list. The method of exploratory approach used in most publications was principal components factor analysis, followed by cluster analysis. The sample size of the studies ranged from 232 to 4231, the values of the Kaiser-Meyer-Olkin test from 0.524 to 0.873, and Cronbach's alpha from 0.51 to 0.69. Few Brazilian studies identified dietary patterns of children using the a posteriori approach and principal components factor analysis was the technique most used.
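For readers unfamiliar with the Cronbach's alpha values quoted above, the following is a minimal sketch of the usual formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name and the example scores are hypothetical; the reviewed studies' data are not reproduced here.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of items.

    `items` holds one list of scores per item, with one entry per respondent.
    """
    k = len(items)
    if k < 2:
        raise ValueError("at least two items are required")
    # Total score of each respondent across all items.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical intake scores of four children on two food groups.
perfectly_consistent = cronbach_alpha([[1, 2, 3, 4], [1, 2, 3, 4]])
```

Two items that rank respondents identically give an alpha of 1; weaker agreement between items gives lower values.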
X-ray spectral observations of clusters of galaxies undergoing merger events
NASA Astrophysics Data System (ADS)
Henriksen, Mark J.
1993-09-01
We have analyzed the HEAO 1 A2 observations of two clusters whose optical and X-ray isophotes are suggestive of merging subclusters, A119 and A754, and find evidence of nonisothermal X-ray emission from both clusters. The X-ray spectrum of both clusters, when fitted with a single isothermal model, shows residual soft X-ray emission. There is a statistically significant reduction in chi-squared (98 percent probability based on the F-test) when a second temperature component is added. If the asymmetric isophotes seen in the soft X-ray image are indicative of merging subclusters, then our analysis of the Einstein IPC spectra and Solid State Spectrometer observations of A754, which provide some spatial and spectral resolution, suggests that the two temperature components seen in the HEAO 1 A2 spectra are associated with gas trapped in the subcluster potential wells. The implied subcluster isothermal masses suggest that a more massive cluster is accreting a less massive companion in A754. The present observations cannot rule out the alternative possibility that the cooler gas is associated with the outer cluster atmosphere rather than individual subclusters, as appears to be the case for A119. Astro D observations will be necessary to distinguish between these two possibilities for both clusters.
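The F-test mentioned above compares a simpler fit with a nested, more complex one. A minimal sketch of the test statistic, assuming the standard form for nested chi-squared fits; the function name and the numbers are hypothetical and are not values from the paper.

```python
def f_statistic(chi2_simple, dof_simple, chi2_complex, dof_complex):
    """F statistic for adding parameters to a nested chi-squared fit.

    F = [(chi2_simple - chi2_complex) / (dof_simple - dof_complex)]
        / (chi2_complex / dof_complex)
    """
    if dof_simple <= dof_complex:
        raise ValueError("the complex model must have fewer degrees of freedom")
    delta_chi2 = chi2_simple - chi2_complex
    delta_dof = dof_simple - dof_complex
    return (delta_chi2 / delta_dof) / (chi2_complex / dof_complex)


# Hypothetical fit results: a single-temperature model versus a model with a
# second temperature component that adds two free parameters.
f_value = f_statistic(chi2_simple=120.0, dof_simple=50,
                      chi2_complex=100.0, dof_complex=48)
```

The statistic is then compared with an F distribution with (dof_simple - dof_complex, dof_complex) degrees of freedom to obtain a probability.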
2010-01-01
Background Cluster analysis, and in particular hierarchical clustering, is widely used to extract information from gene expression data. The aim is to discover new classes, or sub-classes, of either individuals or genes. Performing a cluster analysis commonly involves decisions on how to handle missing values, standardize the data and select genes. In addition, pre-processing, involving various types of filtration and normalization procedures, can have an effect on the ability to discover biologically relevant classes. Here we consider cluster analysis in a broad sense and perform a comprehensive evaluation that covers several aspects of cluster analyses, including normalization. Results We evaluated 2780 cluster analysis methods on seven publicly available 2-channel microarray data sets with common reference designs. Each cluster analysis method differed in data normalization (5 normalizations were considered), missing value imputation (2), standardization of data (2), gene selection (19) or clustering method (11). The cluster analyses are evaluated using known classes, such as cancer types, and the adjusted Rand index. The performances of the different analyses vary between the data sets and it is difficult to give general recommendations. However, normalization, gene selection and clustering method are all variables that have a significant impact on the performance. In particular, gene selection is important and it is generally necessary to include a relatively large number of genes in order to get good performance. Selecting genes with high standard deviation or using principal component analysis are shown to be the preferred gene selection methods. Hierarchical clustering using Ward's method, k-means clustering and Mclust are the clustering methods considered in this paper that achieve the highest adjusted Rand index.
Normalization can have a significant positive impact on the ability to cluster individuals, and there are indications that background correction is preferable, in particular if the gene selection is successful. However, this is an area that needs to be studied further in order to draw any general conclusions. Conclusions The choice of cluster analysis, and in particular gene selection, has a large impact on the ability to cluster individuals correctly based on expression profiles. Normalization has a positive effect, but the relative performance of different normalizations is an area that needs more research. In summary, although clustering, gene selection and normalization are considered standard methods in bioinformatics, our comprehensive analysis shows that selecting the right methods, and the right combinations of methods, is far from trivial and that much is still unexplored in what is considered to be the most basic analysis of genomic data. PMID:20937082
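The adjusted Rand index used above to score a clustering against known classes can be computed from pair counts. A minimal sketch, assuming the standard pair-counting formula with chance correction; the function name and toy labels are illustrative and are not taken from the paper.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same observations."""
    n = len(labels_true)
    if n != len(labels_pred):
        raise ValueError("both labelings must cover the same observations")
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two observations: nothing to disagree on
    # Pairs of observations placed together in both, in the first, in the second.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    together_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    together_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = together_true * together_pred / total_pairs
    maximum = 0.5 * (together_true + together_pred)
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both labelings form a single cluster
    return (together_both - expected) / (maximum - expected)


# Hypothetical example: four samples, two known classes, three found clusters.
ari = adjusted_rand_index([0, 0, 1, 1], [0, 0, 1, 2])
```

The index is 1 for identical partitions regardless of how the clusters are named, and close to 0 for a random assignment.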
Analysis of ground-motion simulation big data
NASA Astrophysics Data System (ADS)
Maeda, T.; Fujiwara, H.
2016-12-01
We developed a parallel distributed processing system which applies big data analysis to large-scale ground-motion simulation data. The system uses ground-motion index values and earthquake scenario parameters as input. We used peak ground velocity values and velocity response spectra as the ground-motion index. The ground-motion index values are calculated from our simulation data. We used simulated long-period ground motion waveforms at about 80,000 meshes calculated by a three dimensional finite difference method based on 369 earthquake scenarios of a great earthquake in the Nankai Trough. These scenarios were constructed by considering the uncertainty of source model parameters such as source area, rupture starting point, asperity location, rupture velocity, fmax and slip function. We used these parameters as the earthquake scenario parameters. The system first clusters the earthquake scenarios in each mesh by the k-means method. The number of clusters is determined in advance using hierarchical clustering with Ward's method. The scenario clustering results are converted to a 1-D feature vector. The dimension of the feature vector is the number of scenario combinations. If two scenarios belong to the same cluster the component of the feature vector is 1, and otherwise the component is 0. The feature vector shows a 'response' of the mesh to the assumed earthquake scenario group. Next, the system performs the clustering of the meshes by the k-means method using the feature vector of each mesh previously obtained. Here the number of clusters is arbitrarily given. The clustering of scenarios and the clustering of meshes are performed by parallel distributed processing with Hadoop and Spark, respectively. In this study, we divided the meshes into 20 clusters. The meshes in each cluster are geometrically concentrated. Thus this system can extract regions, in which the meshes have similar 'response', as clusters.
For each cluster, it is possible to determine particular scenario parameters which characterize the cluster. In other words, by utilizing this system, we can obtain critical scenario parameters of the ground-motion simulation for each evaluation point objectively. This research was supported by CREST, JST.
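The conversion of a per-mesh scenario clustering into a binary feature vector can be sketched as follows. The sketch assumes that the "scenario combinations" in the abstract are unordered pairs of scenarios, which is an interpretation rather than something the abstract states; the function name and the toy labels are hypothetical.

```python
from itertools import combinations


def comembership_vector(labels):
    """Binary vector over all unordered scenario pairs.

    A component is 1 when the two scenarios of the pair fall in the same
    cluster for this mesh, and 0 otherwise.
    """
    return [1 if labels[i] == labels[j] else 0
            for i, j in combinations(range(len(labels)), 2)]


# Hypothetical cluster labels of four scenarios at three meshes.
mesh_a = comembership_vector([0, 0, 1, 1])
mesh_b = comembership_vector([1, 1, 0, 0])  # same grouping, different label names
mesh_c = comembership_vector([0, 1, 0, 1])  # a different grouping
```

Because the vector depends only on which scenarios are grouped together, meshes with the same grouping get identical vectors even when the arbitrary k-means label numbers differ, which is what makes the vectors comparable across meshes in the second clustering step. Under the pair interpretation, 369 scenarios give a vector of 67,896 components.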
Sensory characteristics and consumer preference for chicken meat in Guinea.
Sow, T M A; Grongnet, J F
2010-10-01
This study identified the sensory characteristics and consumer preference for chicken meat in Guinea. Five chicken samples [live village chicken, live broiler, live spent laying hen, ready-to-cook broiler, and ready-to-cook broiler (imported)] bought from different locations were assessed by 10 trained panelists using 19 sensory attributes. The ANOVA results showed that 3 chicken appearance attributes (brown, yellow, and white), 5 chicken odor attributes (oily, intense, medicine smell, roasted, and mouth persistent), 3 chicken flavor attributes (sweet, bitter, and astringent), and 8 chicken texture attributes (firm, tender, juicy, chew, smooth, springy, hard, and fibrous) were significantly discriminating between the chicken samples (P<0.05). Principal component analysis of the sensory data showed that the first 2 principal components explained 84% of the sensory data variance. The principal component analysis results showed that the live village chicken, the live spent laying hen, and the ready-to-cook broiler (imported) were very well represented and clearly distinguished from the live broiler and the ready-to-cook broiler. One hundred twenty consumers expressed their preferences for the chicken samples using a 5-point Likert scale. The hierarchical cluster analysis of the preference data identified 4 homogenous consumer clusters. The hierarchical cluster analysis results showed that the live village chicken was the most preferred chicken sample, whereas the ready-to-cook broiler was the least preferred one. The partial least squares regression (PLSR) type 1 showed that 72% of the sensory data for the first 2 principal components explained 83% of the chicken preference. The PLSR1 identified that the sensory characteristics juicy, oily, sweet, hard, mouth persistent, and yellow were the most relevant sensory drivers of the Guinean chicken preference. 
The PLSR2 (with multiple responses) identified the relationship between the chicken samples, their sensory attributes, and the consumer clusters. Our results showed that there was not a chicken category that was exclusively preferred from the other chicken samples and therefore highlight the existence of place for development of all chicken categories in the local market.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fusco-Femiano, R.; Lapi, A., E-mail: roberto.fuscofemiano@iaps.inaf.it
2015-02-10
We present an analysis of high-quality X-ray data out to the virial radius for the two galaxy clusters A1246 and GMBCG J255.34805+64.23661 (J255) by means of our entropy-based SuperModel. For A1246 we find that the spherically averaged entropy profile of the intracluster medium (ICM) progressively flattens outward, and that a nonthermal pressure component amounting to ≈20% of the total is required to support hydrostatic equilibrium in the outskirts; there we also estimate a modest value C ≈ 1.6 of the ICM clumping factor. These findings agree with previous analyses on other cool-core, relaxed clusters, and lend further support to the picture by Lapi et al. that relates the entropy flattening, the development of the nonthermal pressure component, and the azimuthal variation of ICM properties to weakening boundary shocks. In this scenario clusters are born in a high-entropy state throughout, and are expected to develop on similar timescales a low-entropy state both at the center due to cooling, and in the outskirts due to weakening shocks. However, the analysis of J255 testifies how such a typical evolutionary course can be interrupted or even reversed by merging especially at intermediate redshift, as predicted by Cavaliere et al. In fact, a merger has rejuvenated the ICM of this cluster at z ≈ 0.45 by reestablishing a high-entropy state in the outskirts, while leaving intact or erasing only partially the low-entropy, cool core at the center.
Modeling of intracerebral interictal epileptic discharges: Evidence for network interactions.
Meesters, Stephan; Ossenblok, Pauly; Colon, Albert; Wagner, Louis; Schijns, Olaf; Boon, Paul; Florack, Luc; Fuster, Andrea
2018-06-01
The interictal epileptic discharges (IEDs) occurring in stereotactic EEG (SEEG) recordings are in general abundant compared to ictal discharges, but difficult to interpret due to complex underlying network interactions. A framework is developed to model these network interactions. To identify the synchronized neuronal activity underlying the IEDs, the variation in correlation over time of the SEEG signals is related to the occurrence of IEDs using the general linear model. The interdependency is assessed of the brain areas that reflect highly synchronized neural activity by applying independent component analysis, followed by cluster analysis of the spatial distributions of the independent components. The spatiotemporal interactions of the spike clusters reveal the leading or lagging of brain areas. The analysis framework was evaluated for five successfully operated patients, showing that the spike cluster that was related to the MRI-visible brain lesions coincided with the seizure onset zone. The additional value of the framework was demonstrated for two more patients, who were MRI-negative and for whom surgery was not successful. A network approach is promising in case of complex epilepsies. Analysis of IEDs is considered a valuable addition to routine review of SEEG recordings, with the potential to increase the success rate of epilepsy surgery. Copyright © 2018 International Federation of Clinical Neurophysiology. Published by Elsevier B.V. All rights reserved.
Yücel, Yasin; Sultanoğlu, Pınar
2013-09-01
Chemical characterisation has been carried out on 45 honey samples collected from Hatay region of Turkey. The concentrations of 17 elements were determined by inductively coupled plasma optical emission spectrometry (ICP-OES). Ca, K, Mg and Na were the most abundant elements, with mean contents of 219.38, 446.93, 49.06 and 95.91 mg kg(-1) respectively. The trace element mean contents ranged between 0.03 and 15.07 mg kg(-1). Chemometric methods such as principal component analysis (PCA) and cluster analysis (CA) techniques were applied to classify honey according to mineral content. The first most important principal component (PC) was strongly associated with the value of Al, B, Cd and Co. CA showed eight clusters corresponding to the eight botanical origins of honey. PCA explained 75.69% of the variance with the first six PC variables. Chemometric analysis of the analytical data allowed the accurate classification of the honey samples according to origin. Copyright © 2013 Elsevier Ltd. All rights reserved.
Pulley, Simon; Foster, Ian; Collins, Adrian L
2017-06-01
The objective classification of sediment source groups is at present an under-investigated aspect of source tracing studies, which has the potential to statistically improve discrimination between sediment sources and reduce uncertainty. This paper investigates this potential using three different source group classification schemes. The first classification scheme was simple surface and subsurface groupings (Scheme 1). The tracer signatures were then used in a two-step cluster analysis to identify the sediment source groupings naturally defined by the tracer signatures (Scheme 2). The cluster source groups were then modified by splitting each one into a surface and subsurface component to suit catchment management goals (Scheme 3). The schemes were tested using artificial mixtures of sediment source samples. Controlled corruptions were made to some of the mixtures to mimic the potential causes of tracer non-conservatism present when using tracers in natural fluvial environments. It was determined how accurately the known proportions of sediment sources in the mixtures were identified after unmixing modelling using the three classification schemes. The cluster analysis derived source groups (2) significantly increased tracer variability ratios (inter-/intra-source group variability) (up to 2122%, median 194%) compared to the surface and subsurface groupings (1). As a result, the composition of the artificial mixtures was identified an average of 9.8% more accurately on the 0-100% contribution scale. It was found that the cluster groups could be reclassified into a surface and subsurface component (3) with no significant increase in composite uncertainty (a 0.1% increase over Scheme 2). The far smaller effects of simulated tracer non-conservatism for the cluster analysis based schemes (2 and 3) was primarily attributed to the increased inter-group variability producing a far larger sediment source signal than the non-conservatism noise (1).
Modified cluster analysis based classification methods have the potential to reduce composite uncertainty significantly in future source tracing studies. Copyright © 2016 Elsevier Ltd. All rights reserved.
A Dimensionally Reduced Clustering Methodology for Heterogeneous Occupational Medicine Data Mining.
Saâdaoui, Foued; Bertrand, Pierre R; Boudet, Gil; Rouffiac, Karine; Dutheil, Frédéric; Chamoux, Alain
2015-10-01
Clustering is a set of statistical learning techniques aimed at partitioning heterogeneous data into groups of homogeneous observations called clusters. Clustering has been successfully applied in several fields, such as medicine, biology, finance, and economics. In this paper, we introduce the notion of clustering in multifactorial data analysis problems. A case study is conducted for an occupational medicine problem with the purpose of analyzing patterns in a population of 813 individuals. To reduce the data set dimensionality, we base our approach on Principal Component Analysis (PCA), which is the statistical tool most commonly used in factorial analysis. However, problems in nature, especially in medicine, are often based on heterogeneous qualitative-quantitative measurements, whereas PCA only processes quantitative ones. Besides, qualitative data are originally unobservable quantitative responses that are usually binary-coded. Hence, we propose a new set of strategies that handle quantitative and qualitative data simultaneously. The principle of this approach is to perform a projection of the qualitative variables on the subspaces spanned by quantitative ones. Subsequently, an optimal model is allocated to the resulting PCA-regressed subspaces.
Theory of anomalous critical-cluster content in high-pressure binary nucleation.
Kalikmanov, V I; Labetski, D G
2007-02-23
Nucleation experiments in binary (a-b) mixtures, when component a is supersaturated and b (carrier gas) is undersaturated, reveal that for some mixtures at high pressures the a content of the critical cluster dramatically decreases with pressure contrary to expectations based on classical nucleation theory. We show that this phenomenon is a manifestation of the dominant role of the unlike interactions at high pressures resulting in the negative partial molar volume of component a in the vapor phase beyond the compensation pressure. The analysis is based on the pressure nucleation theorem for multicomponent systems which is invariant to a nucleation model.
Implementation of the force decomposition machine for molecular dynamics simulations.
Borštnik, Urban; Miller, Benjamin T; Brooks, Bernard R; Janežič, Dušanka
2012-09-01
We present the design and implementation of the force decomposition machine (FDM), a cluster of personal computers (PCs) that is tailored to running molecular dynamics (MD) simulations using the distributed diagonal force decomposition (DDFD) parallelization method. The cluster interconnect architecture is optimized for the communication pattern of the DDFD method. Our implementation of the FDM relies on standard commodity components even for networking. Although the cluster is meant for DDFD MD simulations, it remains general enough for other parallel computations. An analysis of several MD simulation runs on both the FDM and a standard PC cluster demonstrates that the FDM's interconnect architecture provides a greater performance compared to a more general cluster interconnect. Copyright © 2012 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Cautun, Marius; van de Weygaert, Rien; Jones, Bernard J. T.; Frenk, Carlos S.
2014-07-01
The cosmic web is the largest scale manifestation of the anisotropic gravitational collapse of matter. It represents the transitional stage between linear and non-linear structures and contains easily accessible information about the early phases of structure formation processes. Here we investigate the characteristics and the time evolution of morphological components. Our analysis involves the application of the NEXUS Multiscale Morphology Filter technique, predominantly its NEXUS+ version, to high resolution and large volume cosmological simulations. We quantify the cosmic web components in terms of their mass and volume content, their density distribution and halo populations. We employ new analysis techniques to determine the spatial extent of filaments and sheets, like their total length and local width. This analysis identifies clusters and filaments as the most prominent components of the web. In contrast, while voids and sheets take most of the volume, they correspond to underdense environments and are devoid of group-sized and more massive haloes. At early times the cosmos is dominated by tenuous filaments and sheets, which, during subsequent evolution, merge together, such that the present-day web is dominated by fewer, but much more massive, structures. The analysis of the mass transport between environments clearly shows how matter flows from voids into walls, and then via filaments into cluster regions, which form the nodes of the cosmic web. We also study the properties of individual filamentary branches, to find long, almost straight, filaments extending to distances larger than 100 h^-1 Mpc. These constitute the bridges between massive clusters, which seem to form along approximately straight lines.
B. subtilis as a Model for Studying the Assembly of Fe-S Clusters in Gram-Positive Bacteria.
Dos Santos, Patricia C
2017-01-01
Complexes of iron and sulfur (Fe-S clusters) are widely distributed in nature and participate in essential biochemical reactions. The biological formation of Fe-S clusters involves dedicated pathways responsible for the mobilization of sulfur, the assembly of Fe-S clusters, and the transfer of these clusters to target proteins. Genomic analysis of Bacillus subtilis and other Gram-positive bacteria indicated the presence of only one Fe-S cluster biosynthesis pathway, which is distinct in number of components and organization from previously studied systems. B. subtilis has been used as a model system for the characterization of cysteine desulfurases responsible for sulfur mobilization reactions in the biogenesis of Fe-S clusters and other sulfur-containing cofactors. Cysteine desulfurases catalyze the cleavage of the C-S bond from the amino acid cysteine and subsequent transfer of sulfur to acceptor molecules. These reactions can be monitored by the rate of alanine formation, the first product in the reaction, and sulfide formation, a byproduct of reactions performed under reducing conditions. The assembly of Fe-S clusters on protein scaffolds and the transfer of these clusters to target acceptors are determined through a combination of spectroscopic methods probing the rate of cluster assembly and transfer. This chapter provides a description of reactions promoting the assembly of Fe-S clusters in bacteria as well as methods used to study functions of each biosynthetic component and identify mechanistic differences employed by these enzymes across different pathways. © 2017 Elsevier Inc. All rights reserved.
ERIC Educational Resources Information Center
Heiser, Willem J.; And Others
1997-01-01
The least squares loss function of cluster differences scaling, originally defined only on residuals of pairs allocated to different clusters, is extended with a loss component for pairs allocated to the same cluster. Findings show that this makes the method equivalent to multidimensional scaling with cluster constraints on the coordinates. (SLD)
Savary, Serge; Delbac, Lionel; Rochas, Amélie; Taisant, Guillaume; Willocquet, Laetitia
2009-08-01
Dual epidemics are defined as epidemics developing on two or several plant organs in the course of a cropping season. Agricultural pathosystems where such epidemics develop are often very important, because the harvestable part is one of the organs affected. These epidemics also are often difficult to manage, because the linkage between epidemiological components occurring on different organs is poorly understood, and because prediction of the risk toward the harvestable organs is difficult. In the case of downy mildew (DM) and powdery mildew (PM) of grapevine, nonlinear modeling and logistic regression indicated nonlinearity in the foliage-cluster relationships. Nonlinear modeling enabled the parameterization of a transmission coefficient that numerically links the two components, leaves and clusters, in DM and PM epidemics. Logistic regression analysis yielded a series of probabilistic models that enabled predicting preset levels of cluster infection risks based on DM and PM severities on the foliage at successive crop stages. The usefulness of this framework for tactical decision-making for disease control is discussed.
Collision-induced evaporation of water clusters and contribution of momentum transfer
NASA Astrophysics Data System (ADS)
Calvo, Florent; Berthias, Francis; Feketeová, Linda; Abdoul-Carime, Hassan; Farizon, Bernadette; Farizon, Michel
2017-05-01
The evaporation of water molecules from high-velocity argon atoms impinging on protonated water clusters has been computationally investigated using molecular dynamics simulations with the reactive OSS2 potential to model water clusters and the ZBL pair potential to represent their interaction with the projectile. Swarms of trajectories and an event-by-event analysis reveal the conditions under which a specific number of molecular evaporation events is found one nanosecond after impact, thereby excluding direct knockout events from the analysis. These simulations provide velocity distributions that exhibit two main features, with a major statistical component arising from a global redistribution of the collision energy into intermolecular degrees of freedom, and another minor but non-ergodic feature at high velocities. The latter feature is produced by direct impacts on the peripheral water molecules and reflects a more complete momentum transfer. These two components are consistent with recent experimental measurements and confirm that electronic processes are not explicitly needed to explain the observed non-ergodic behavior. Contribution to the Topical Issue "Dynamics of Systems at the Nanoscale", edited by Andrey Solov'yov and Andrei Korol.
Type 2 diabetes mellitus: distribution of genetic markers in Kazakh population.
Sikhayeva, Nurgul; Talzhanov, Yerkebulan; Iskakova, Aisha; Dzharmukhanov, Jarkyn; Nugmanova, Raushan; Zholdybaeva, Elena; Ramanculov, Erlan
2018-01-01
Ethnic differences exist in the frequencies of genetic variations that contribute to the risk of common disease. This study aimed to analyse the distribution of several genes, previously associated with susceptibility to type 2 diabetes and obesity-related phenotypes, in a Kazakh population. A total of 966 individuals belonging to the Kazakh ethnicity were recruited from an outpatient clinic. We genotyped 41 common single nucleotide polymorphisms (SNPs) previously associated with type 2 diabetes in other ethnic groups and 31 of these were in Hardy-Weinberg equilibrium. The obtained allele frequencies were further compared to publicly available data from other ethnic populations. Allele frequencies for other (compared) populations were pooled from the haplotype map (HapMap) database. Principal component analysis (PCA), cluster analysis, and multidimensional scaling (MDS) were used for the analysis of genetic relationship between the populations. Comparative analysis of allele frequencies of the studied SNPs showed significant differentiation among the studied populations. The Kazakh population was grouped with Asian populations according to the cluster analysis and with the Caucasian populations according to PCA. According to MDS, results of the current study show that the Kazakh population holds an intermediate position between Caucasian and Asian populations. A high percentage of population differentiation was observed between Kazakh and world populations. The Kazakh population was clustered with Caucasian populations, and this result may indicate a significant Caucasian component in the Kazakh gene pool.
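The Hardy-Weinberg screening step mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not say which test the authors used; a one-degree-of-freedom chi-square goodness-of-fit test is assumed here, and the function names and genotype counts are hypothetical.

```python
# Sketch only: chi-square test of Hardy-Weinberg equilibrium for one
# bi-allelic SNP. The choice of test and all counts are assumptions,
# not details taken from the study.

CHI2_CRITICAL_1DF_05 = 3.841  # chi-square critical value, 1 df, alpha = 0.05


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Return the chi-square statistic comparing observed genotype counts
    with the counts expected under Hardy-Weinberg proportions."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotype must be observed")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    # Skip genotype classes with zero expectation (monomorphic marker).
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


def in_hwe(n_aa: int, n_ab: int, n_bb: int) -> bool:
    """True if the deviation from HWE is not significant at the 5% level."""
    return hwe_chi_square(n_aa, n_ab, n_bb) < CHI2_CRITICAL_1DF_05


if __name__ == "__main__":
    for counts in [(25, 50, 25), (40, 20, 40)]:  # hypothetical counts
        print(counts, round(hwe_chi_square(*counts), 3), in_hwe(*counts))
```

Markers that fail such a test are commonly set aside before allele frequencies are compared across populations, which is consistent with the abstract reporting 31 of 41 SNPs in equilibrium.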
UFVA, A Combined Linear and Nonlinear Factor Analysis Program Package for Chemical Data Evaluation.
1980-11-01
that one cluster consists of the monoterpenes and Isoprene; the second is of the sesquiterpenes. Compound 8 (Caryophyllene) should therefore belong to...two clusters very clearly (Fig. 6). Figure 6 The very similar fragmentation pattern of Isoprene and the monoterpenes is reflected by their close...13 of another set of 13 terpene components. These are Isoprene, four monoterpenes (Myrcene, Menthol, Camphene, Umbellulone), four sesquiterpenes
Cerón-Muñoz, M F; Tonhati, H; Costa, C N; Rojas-Sarmiento, D; Echeverri Echeverri, D M
2004-08-01
Descriptive herd variables (DVHE) were used to explain genotype by environment interactions (G x E) for milk yield (MY) in Brazilian and Colombian production environments and to develop a herd-cluster model to estimate covariance components and genetic parameters for each herd environment group. Data consisted of 180,522 lactation records of 94,558 Holstein cows from 937 Brazilian and 400 Colombian herds. Herds in both countries were jointly grouped in thirds according to 8 DVHE: production level, phenotypic variability, age at first calving, calving interval, percentage of imported semen, lactation length, and herd size. For each DVHE, REML bivariate animal model analyses were used to estimate genetic correlations for MY between upper and lower thirds of the data. Based on estimates of genetic correlations, weights were assigned to each DVHE to group herds in a cluster analysis using the FASTCLUS procedure in SAS. Three clusters were defined, and genetic and residual variance components were heterogeneous among herd clusters. Estimates of heritability in clusters 1 and 3 were 0.28 and 0.29, respectively, but the estimate was larger (0.39) in Cluster 2. The genetic correlations of MY from different clusters ranged from 0.89 to 0.97. The herd-cluster model based on DVHE properly takes into account G x E by grouping similar environments accordingly and seems to be an alternative to simply considering country borders to distinguish between environments.
Yang, Guang; Raschke, Felix; Barrick, Thomas R; Howe, Franklyn A
2015-09-01
To investigate whether nonlinear dimensionality reduction improves unsupervised classification of ¹H MRS brain tumor data compared with a linear method. In vivo single-voxel ¹H magnetic resonance spectroscopy (55 patients) and ¹H magnetic resonance spectroscopy imaging (MRSI) (29 patients) data were acquired from histopathologically diagnosed gliomas. Data reduction using Laplacian eigenmaps (LE) or independent component analysis (ICA) was followed by k-means clustering or agglomerative hierarchical clustering (AHC) for unsupervised learning to assess tumor grade and for tissue type segmentation of MRSI data. An accuracy of 93% in classification of glioma grade II and grade IV, with 100% accuracy in distinguishing tumor and normal spectra, was obtained by LE with unsupervised clustering, but not with the combination of k-means and ICA. With ¹H MRSI data, LE provided a more linear distribution of data for cluster analysis and better cluster stability than ICA. LE combined with k-means or AHC provided 91% accuracy for classifying tumor grade and 100% accuracy for identifying normal tissue voxels. Color-coded visualization of normal brain, tumor core, and infiltration regions was achieved with LE combined with AHC. The LE method is promising for unsupervised clustering to separate brain and tumor tissue with automated color-coding for visualization of ¹H MRSI data after cluster analysis. © 2014 Wiley Periodicals, Inc.
Alerts Analysis and Visualization in Network-based Intrusion Detection Systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yang, Dr. Li
2010-08-01
The alerts produced by network-based intrusion detection systems, e.g. Snort, can be difficult for network administrators to efficiently review and respond to due to the enormous number of alerts generated in a short time frame. This work describes how the visualization of raw IDS alert data assists network administrators in understanding the current state of a network and quickens the process of reviewing and responding to intrusion attempts. The project presented in this work consists of three primary components. The first component provides a visual mapping of the network topology that allows the end-user to easily browse clustered alerts. The second component is based on the flocking behavior of birds such that birds tend to follow other birds with similar behaviors. This component allows the end-user to see the clustering process and provides an efficient means for reviewing alert data. The third component discovers and visualizes patterns of multistage attacks by profiling the attacker's behaviors.
NASA Astrophysics Data System (ADS)
Atherton, Daniel
Early detection of disease and insect infestation within crops and precise application of pesticides can help reduce potential production losses, reduce environmental risk, and reduce the cost of farming. The goal of this study was the advanced detection of early blight (Alternaria solani) in potato (Solanum tuberosum) plants using hyperspectral remote sensing data captured with a handheld spectroradiometer. Hyperspectral reflectance spectra were captured 10 times over five weeks from plants grown to the vegetative and tuber bulking growth stages. The spectra were analyzed using principal component analysis (PCA), spectral change (ratio) analysis, partial least squares (PLS), cluster analysis, and vegetative indices. PCA successfully distinguished more heavily diseased plants from healthy and minimally diseased plants using two principal components. Spectral change (ratio) analysis provided wavelengths (490-510, 640, 665-670, 690, 740-750, and 935 nm) most sensitive to early blight infection followed by ANOVA results indicating a highly significant difference (p < 0.0001) between disease rating group means. In the majority of the experiments, comparisons of diseased plants with healthy plants using Fisher's LSD revealed more heavily diseased plants were significantly different from healthy plants. PLS analysis demonstrated the feasibility of detecting early blight infected plants, finding four optimal factors for raw spectra with the predictor variation explained ranging from 93.4% to 94.6% and the response variation explained ranging from 42.7% to 64.7%. Cluster analysis successfully distinguished healthy plants from all diseased plants except for the most mildly diseased plants, showing clustering analysis was an effective method for detection of early blight. 
Analysis of the reflectance spectra using the simple ratio (SR) and the normalized difference vegetative index (NDVI) was effective at differentiating all diseased plants from healthy plants, except for the most mildly diseased plants. Of the analysis methods attempted, cluster analysis and vegetative indices were the most promising. The results show the potential of hyperspectral remote sensing for the detection of early blight in potato plants.
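The two vegetative indices named in the abstract above have standard definitions, sketched below for a single pair of band reflectances. The abstract does not state which wavelengths were used for the red and near-infrared bands, so the function names and the reflectance values in the demo are hypothetical.

```python
# Sketch only: standard definitions of the simple ratio (SR) and the
# normalized difference vegetative index (NDVI). Band selection and the
# sample reflectances are assumptions, not values from the study.


def simple_ratio(nir: float, red: float) -> float:
    """SR = NIR / Red, using reflectances on a common scale."""
    if red == 0:
        raise ValueError("red reflectance must be non-zero")
    return nir / red


def ndvi(nir: float, red: float) -> float:
    """NDVI = (NIR - Red) / (NIR + Red); ranges from -1 to 1."""
    total = nir + red
    if total == 0:
        raise ValueError("NIR + red reflectance must be non-zero")
    return (nir - red) / total


if __name__ == "__main__":
    # Hypothetical reflectances: a healthy canopy reflects strongly in the
    # near infrared and absorbs in the red; stress narrows that gap.
    for label, nir, red in [("healthy", 0.50, 0.05), ("stressed", 0.35, 0.12)]:
        print(label, round(simple_ratio(nir, red), 2), round(ndvi(nir, red), 3))
```

The two indices carry the same information, since NDVI = (SR - 1) / (SR + 1); NDVI is simply bounded, which can make thresholds easier to compare across measurement dates.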
NASA Technical Reports Server (NTRS)
Kalton, G.
1983-01-01
A number of surveys were conducted to study the relationship between the level of aircraft or traffic noise exposure experienced by people living in a particular area and their annoyance with it. These surveys generally employ a clustered sample design which affects the precision of the survey estimates. Regression analysis of annoyance on noise measures and other variables is often an important component of the survey analysis. Formulae are presented for estimating the standard errors of regression coefficients and ratio of regression coefficients that are applicable with a two- or three-stage clustered sample design. Using a simple cost function, they also determine the optimum allocation of the sample across the stages of the sample design for the estimation of a regression coefficient.
GALAXY INFALL BY INTERACTING WITH ITS ENVIRONMENT: A COMPREHENSIVE STUDY OF 340 GALAXY CLUSTERS
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gu, Liyi; Wen, Zhonglue; Gandhi, Poshak
To study systematically the evolution of the angular extents of the galaxy, intracluster medium (ICM), and dark matter components in galaxy clusters, we compiled the optical and X-ray properties of a sample of 340 clusters with redshifts <0.5, based on all the available data from the Sloan Digital Sky Survey and Chandra/XMM-Newton. For each cluster, the member galaxies were determined primarily with photometric redshift measurements. The radial ICM mass distribution, as well as the total gravitational mass distribution, was derived from a spatially resolved spectral analysis of the X-ray data. When normalizing the radial profile of galaxy number to that of the ICM mass, the relative curve was found to depend significantly on the cluster redshift; it drops more steeply toward the outside in lower-redshift subsamples. The same evolution is found in the galaxy-to-total mass profile, while the ICM-to-total mass profile varies in an opposite way. The behavior of the galaxy-to-ICM distribution does not depend on the cluster mass, suggesting that the detected redshift dependence is not due to mass-related effects, such as sample selection bias. Also, it cannot be ascribed to various redshift-dependent systematic errors. We interpret that the galaxies, the ICM, and the dark matter components had similar angular distributions when a cluster was formed, while the galaxies traveling in the interior of the cluster have continuously fallen toward the center relative to the other components, and the ICM has slightly expanded relative to the dark matter although it suffers strong radiative loss. This cosmological galaxy infall, accompanied by an ICM expansion, can be explained by considering that the galaxies interact strongly with the ICM while they are moving through it.
The interaction is considered to create a large energy flow of 10^44-45 erg s^-1 per cluster from the member galaxies to their environment, which is expected to continue over cosmological timescales.
Galaxy Infall by Interacting with Its Environment: A Comprehensive Study of 340 Galaxy Clusters
NASA Astrophysics Data System (ADS)
Gu, Liyi; Wen, Zhonglue; Gandhi, Poshak; Inada, Naohisa; Kawaharada, Madoka; Kodama, Tadayuki; Konami, Saori; Nakazawa, Kazuhiro; Xu, Haiguang; Makishima, Kazuo
2016-07-01
To study systematically the evolution of the angular extents of the galaxy, intracluster medium (ICM), and dark matter components in galaxy clusters, we compiled the optical and X-ray properties of a sample of 340 clusters with redshifts <0.5, based on all the available data from the Sloan Digital Sky Survey and Chandra/XMM-Newton. For each cluster, the member galaxies were determined primarily with photometric redshift measurements. The radial ICM mass distribution, as well as the total gravitational mass distribution, was derived from a spatially resolved spectral analysis of the X-ray data. When normalizing the radial profile of galaxy number to that of the ICM mass, the relative curve was found to depend significantly on the cluster redshift; it drops more steeply toward the outside in lower-redshift subsamples. The same evolution is found in the galaxy-to-total mass profile, while the ICM-to-total mass profile varies in an opposite way. The behavior of the galaxy-to-ICM distribution does not depend on the cluster mass, suggesting that the detected redshift dependence is not due to mass-related effects, such as sample selection bias. Also, it cannot be ascribed to various redshift-dependent systematic errors. We interpret that the galaxies, the ICM, and the dark matter components had similar angular distributions when a cluster was formed, while the galaxies traveling in the interior of the cluster have continuously fallen toward the center relative to the other components, and the ICM has slightly expanded relative to the dark matter although it suffers strong radiative loss. This cosmological galaxy infall, accompanied by an ICM expansion, can be explained by considering that the galaxies interact strongly with the ICM while they are moving through it. The interaction is considered to create a large energy flow of 10^44-45 erg s^-1 per cluster from the member galaxies to their environment, which is expected to continue over cosmological timescales.
NASA Astrophysics Data System (ADS)
Serrano, Francisco; Guerra-Merchán, Antonio; Lozano-Francisco, Carmen; Vera-Peláez, José Luis
1997-09-01
Nerja Cave is a karstic cavity used by humans from Late Paleolithic to post-Chalcolithic times. Remains of molluscan foods in the uppermost Pleistocene and Holocene sediments were studied with cluster analysis and principal components analysis, in both Q and R modes. The results from cluster analysis distinguished interval groups mainly in accordance with chronology and distinguished assemblages of species mainly according to habitat. Significant changes in the shellfish diet through time were revealed. In the Late Magdalenian, most molluscs consumed consisted of pulmonate gastropods and species from sandy sea bottoms. The Epipaleolithic diet was more varied and included species from rocky shorelines. From the Neolithic onward most molluscs consumed were from rocky shorelines. From the principal components analysis in Q mode, the first factor reflected mainly changes in the predominant capture environment, probably because of major paleogeographic changes. The second factor may reflect selective capture along rocky coastlines during certain times. The third factor correlated well with the sea-surface temperature curve in the western Mediterranean (Alboran Sea) during the late Quaternary.
Automated Classification and Analysis of Non-metallic Inclusion Data Sets
NASA Astrophysics Data System (ADS)
Abdulsalam, Mohammad; Zhang, Tongsheng; Tan, Jia; Webler, Bryan A.
2018-05-01
The aim of this study is to utilize principal component analysis (PCA), clustering methods, and correlation analysis to condense and examine large, multivariate data sets produced from automated analysis of non-metallic inclusions. Non-metallic inclusions play a major role in defining the properties of steel and their examination has been greatly aided by automated analysis in scanning electron microscopes equipped with energy dispersive X-ray spectroscopy. The methods were applied to analyze inclusions on two sets of samples: two laboratory-scale samples and four industrial samples from near-finished 4140 alloy steel components with varying machinability. The laboratory samples had well-defined inclusion chemistries, composed of MgO-Al2O3-CaO, spinel (MgO-Al2O3), and calcium aluminate inclusions. The industrial samples contained MnS inclusions as well as (Ca,Mn)S + calcium aluminate oxide inclusions. PCA could be used to reduce inclusion chemistry variables to a 2D plot, which revealed inclusion chemistry groupings in the samples. Clustering methods were used to automatically classify inclusion chemistry measurements into groups, i.e., no user-defined rules were required.
Sun, Liping; Luo, Yonglong; Ding, Xintao; Zhang, Ji
2014-01-01
An important component of a spatial clustering algorithm is the distance measure between sample points in object space. In this paper, the traditional Euclidean distance measure is replaced with an innovative obstacle distance measure for spatial clustering under obstacle constraints. Firstly, we present a path searching algorithm to approximate the obstacle distance between two points for dealing with obstacles and facilitators. Taking obstacle distance as the similarity metric, we subsequently propose the artificial immune clustering with obstacle entity (AICOE) algorithm for clustering spatial point data in the presence of obstacles and facilitators. Finally, the paper presents a comparative analysis of the AICOE algorithm and the classical clustering algorithms. Our clustering model based on artificial immune system is also applied to the case of public facility location problem in order to establish the practical applicability of our approach. By using the clone selection principle and updating the cluster centers based on the elite antibodies, the AICOE algorithm is able to achieve the global optimum and better clustering effect.
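The idea of an obstacle distance can be illustrated with a minimal sketch. This is not the path searching algorithm of the paper, whose details the abstract does not give; it is a plain breadth-first search on a hypothetical occupancy grid, it handles obstacles only (not facilitators), and the function name and grid encoding are assumptions.

```python
# Sketch only: shortest obstacle-avoiding path length on a grid, as a
# stand-in for an "obstacle distance". Grid encoding (0 = free cell,
# 1 = obstacle) and 4-connected moves are illustrative assumptions.
from collections import deque
from typing import List, Optional, Tuple

Cell = Tuple[int, int]


def obstacle_distance(grid: List[List[int]], start: Cell, goal: Cell) -> Optional[int]:
    """Return the number of 4-connected steps on the shortest path from
    start to goal that avoids obstacle cells, or None if unreachable."""
    rows, cols = len(grid), len(grid[0])
    if grid[start[0]][start[1]] or grid[goal[0]][goal[1]]:
        return None  # an endpoint lies inside an obstacle
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        (r, c), dist = queue.popleft()
        if (r, c) == goal:
            return dist
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and not grid[nr][nc] and (nr, nc) not in seen):
                seen.add((nr, nc))
                queue.append(((nr, nc), dist + 1))
    return None


if __name__ == "__main__":
    # A wall separates two points that are close in straight-line terms.
    grid = [
        [0, 0, 0],
        [1, 1, 0],
        [0, 0, 0],
    ]
    print(obstacle_distance(grid, (0, 0), (2, 0)))
```

In the demo the two points are two cells apart in a straight line but much farther apart once the wall is respected, which is why a clustering method driven by such a distance may assign them to different clusters.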
A Census of Baryons in Galaxy Clusters and Groups
NASA Astrophysics Data System (ADS)
Gonzalez, Anthony H.; Zaritsky, Dennis; Zabludoff, Ann I.
2007-09-01
We determine the contribution of stars in galaxies, intracluster stars, and the intracluster medium to the total baryon budget in nearby galaxy clusters and groups. We find that the baryon mass fraction (fb ≡ Ωb/Ωm) within r500 is constant for systems with M500 between 6×10^13 and 1×10^15 solar masses. Although fb is lower than the WMAP value, the shortfall is on the order of both the observational systematic uncertainties and the depletion of baryons within r500 that is predicted by simulations. The data therefore provide no compelling evidence for undetected baryonic components, particularly any that would be expected to vary in importance with cluster mass. A unique feature of the current analysis is direct inclusion of the contribution of intracluster light (ICL) in the baryon budget. With the addition of the ICL to the stellar mass in galaxies, the increase in X-ray gas mass fraction with increasing total mass is entirely accounted for by a decrease in the total stellar mass fraction, supporting the argument that the behavior of both the stellar and X-ray gas components is dominated by a decrease in star formation efficiency in more massive environments. Within just the stellar component, the fraction of the total stellar luminosity in the central, giant brightest cluster galaxy (BCG) and ICL (hereafter the BCG+ICL component) decreases as velocity dispersion (σ) increases for systems with 145 km s^-1 ≤ σ ≤ 1026 km s^-1, suggesting that the BCG+ICL component, and in particular the dominant ICL component, grows less efficiently in higher mass environments. The degree to which this behavior arises from our sample selection, which favored systems with central, giant elliptical galaxies, remains unclear. A more robust result is the identification of low-mass groups with large BCG+ICL components, demonstrating that the creation of "intracluster" stars does not require a massive cluster environment.
Within r500 and r200, the BCG+ICL contributes on average 40% and 33% of the total stellar light, respectively, for the clusters and groups in our sample. Because these fractions are functions of both enclosed radius and system mass, care should be exercised when comparing these values with other studies and simulations.
Intra-class correlation estimates for assessment of vitamin A intake in children.
Agarwal, Girdhar G; Awasthi, Shally; Walter, Stephen D
2005-03-01
In many community-based surveys, multi-level sampling is inherent in the design. In the design of these studies, especially to calculate the appropriate sample size, investigators need good estimates of intra-class correlation coefficient (ICC), along with the cluster size, to adjust for variation inflation due to clustering at each level. The present study used data on the assessment of clinical vitamin A deficiency and intake of vitamin A-rich food in children in a district in India. For the survey, 16 households were sampled from 200 villages nested within eight randomly-selected blocks of the district. ICCs and components of variances were estimated from a three-level hierarchical random effects analysis of variance model. Estimates of ICCs and variance components were obtained at village and block levels. Between-cluster variation was evident at each level of clustering. In these estimates, ICCs were inversely related to cluster size, but the design effect could be substantial for large clusters. At the block level, most ICC estimates were below 0.07. At the village level, many ICC estimates ranged from 0.014 to 0.45. These estimates may provide useful information for the design of epidemiological studies in which the sampled (or allocated) units range in size from households to large administrative zones.
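The link between ICC, cluster size, and the design effect noted in the abstract above can be made concrete with a minimal sketch. It assumes the standard single-level design effect for equal-sized clusters, DEFF = 1 + (m - 1) × ICC, which may be simpler than the three-level model the authors fitted; the function names and the pairings of cluster size and ICC in the demo are hypothetical.

```python
# Sketch only: design effect for equal-sized clusters and the resulting
# effective sample size. The single-level formula and the example inputs
# are illustrative assumptions, not the study's own calculations.


def design_effect(cluster_size: float, icc: float) -> float:
    """Design effect DEFF = 1 + (m - 1) * ICC for clusters of size m."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie in [0, 1]")
    return 1.0 + (cluster_size - 1.0) * icc


def effective_sample_size(n_total: float, cluster_size: float, icc: float) -> float:
    """Number of independent observations the clustered sample is worth."""
    return n_total / design_effect(cluster_size, icc)


if __name__ == "__main__":
    # Hypothetical pairings: a small cluster with a moderate ICC and a
    # large cluster with a small ICC.
    for m, icc in [(16, 0.07), (400, 0.014)]:
        print(m, icc, round(design_effect(m, icc), 3))
```

Because the ICC is multiplied by the cluster size minus one, even a small ICC can produce a large design effect when clusters are large, which matches the abstract's remark about large clusters.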
Potential of SNP markers for the characterization of Brazilian cassava germplasm.
de Oliveira, Eder Jorge; Ferreira, Cláudia Fortes; da Silva Santos, Vanderlei; de Jesus, Onildo Nunes; Oliveira, Gilmara Alvarenga Fachardo; da Silva, Maiane Suzarte
2014-06-01
High-throughput markers, such as SNPs, along with different methodologies were used to evaluate the applicability of the Bayesian approach and the multivariate analysis in structuring the genetic diversity in cassavas. The objective of the present work was to evaluate the diversity and genetic structure of the largest cassava germplasm bank in Brazil. Complementary methodological approaches such as discriminant analysis of principal components (DAPC), Bayesian analysis and molecular analysis of variance (AMOVA) were used to understand the structure and diversity of 1,280 accessions genotyped using 402 single nucleotide polymorphism markers. The genetic diversity (0.327) and the average observed heterozygosity (0.322) were high considering the bi-allelic markers. In terms of population, the presence of a complex genetic structure was observed, indicating the formation of 30 clusters by DAPC and 34 clusters by Bayesian analysis. Both methodologies presented difficulties and controversies in terms of the allocation of some accessions to specific clusters. However, the clusters suggested by the DAPC analysis seemed to be more consistent, presenting a higher probability of allocation of the accessions within the clusters. Prior information related to breeding patterns and geographic origins of the accessions was not sufficient to provide clear differentiation between the clusters according to the AMOVA analysis. In contrast, the F_ST was maximized when considering the clusters suggested by the Bayesian and DAPC analyses. The high frequency of germplasm exchange between producers and the subsequent alteration of the name of the same material may be one of the causes of the low association between genetic diversity and geographic origin. The results of this study may benefit cassava germplasm conservation programs, and contribute to the maximization of genetic gains in breeding programs.
[Effects of bkdAB interruption on avermectin biosynthesis].
Zhu, Hao-Jun; Liang, Yun-Xiang; Zhou, Jun-Chu; Zheng, Ying-Hua
2004-03-01
In this study, Streptomyces avermitilis Bjbm0006, which produces four avermectin B components, was used as the original test strain. A replacement plasmid containing the gene cluster bkdAB (branched-chain alpha-keto acid dehydrogenase gene), which is involved in the biosynthesis of avermectin B in S. avermitilis Bjbm0006, was constructed by PCR and named pHJ5821 (pHZ1358::bkdAB&erm). A recombinant strain, Bjbm5821, was obtained after the gene cluster was interrupted by double crossover. This strain was tested under laboratory conditions and analysed by PCR using total DNA as template. HPLC analysis showed that strain Bjbm5821 synthesized the same 'a' components, B1a and B2a, as the original strain. However, it lost the ability to produce the 'b' components, for example B1b and B2b. A novel compound was detected in the fermentation products. The results of the present study suggest that the product of the gene cluster bkdAB may play a major role, similar to that of alpha-ketoisovaleric acid dehydrogenase, in the pathway of avermectin synthesis.
Observations and analysis of the contact binary H 235 in the open cluster NGC 752
NASA Astrophysics Data System (ADS)
Milone, E. F.; Stagg, C. R.; Sugars, B. A.; McVean, J. R.; Schiller, S. J.; Kallrath, J.; Bradstreet, D. H.
1995-01-01
The short-period variable star Heinemann 235 in the open cluster NGC 752 has been identified as a contact binary with a variable period of about 0.4118 d. BVRI light curves and radial velocity curves have been obtained and analyzed with enhanced versions of the Wilson-Devinney light curve program. We find that the system is best modeled as an A-type W UMa system, with a contact parameter of 0.21 +/- 0.11. The masses of the components are found to be 1.18 +/- 0.17 and 0.24 +/- 0.04 solar mass, with bolometric magnitudes of 3.60 +/- 0.10 and 5.21 +/- 0.13, for the hotter (6500 K, assumed) and cooler (6421 K) components, respectively, with Delta T=79 +/- 25 K. The distance to the binary is established at 381 +/- 17 pc. H235 becomes one of a relatively small number of open-cluster contact systems with detailed light curve analysis for which an age may be estimated. If it is coeval with the cluster, and with the detached eclipsing and double-lined spectroscopic binary H219 (DS And), H235 is approximately 1.8 Gyr old, and may provide a fiducial point for the evolution of contact systems. There is, however, evidence for dynamical evolution of the cluster, and the likelihood of weak interactions over the age of the binary precludes the determination of its initial state with certainty.
Nepusz, Tamás; Sasidharan, Rajkumar; Paccanaro, Alberto
2010-03-09
An important problem in genomics is the automatic inference of groups of homologous proteins from pairwise sequence similarities. Several approaches have been proposed for this task which are "local" in the sense that they assign a protein to a cluster based only on the distances between that protein and the other proteins in the set. It was shown recently that global methods such as spectral clustering have better performance on a wide variety of datasets. However, currently available implementations of spectral clustering methods mostly consist of a few loosely coupled Matlab scripts that assume a fair amount of familiarity with Matlab programming and hence they are inaccessible for large parts of the research community. SCPS (Spectral Clustering of Protein Sequences) is an efficient and user-friendly implementation of a spectral method for inferring protein families. The method uses only pairwise sequence similarities, and is therefore practical when only sequence information is available. SCPS was tested on difficult sets of proteins whose relationships were extracted from the SCOP database, and its results were extensively compared with those obtained using other popular protein clustering algorithms such as TribeMCL, hierarchical clustering and connected component analysis. We show that SCPS is able to identify many of the family/superfamily relationships correctly and that the quality of the obtained clusters as indicated by their F-scores is consistently better than all the other methods we compared it with. We also demonstrate the scalability of SCPS by clustering the entire SCOP database (14,183 sequences) and the complete genome of the yeast Saccharomyces cerevisiae (6,690 sequences). 
Besides the spectral method, SCPS also implements connected component analysis and hierarchical clustering, it integrates TribeMCL, it provides different cluster quality tools, it can extract human-readable protein descriptions using GI numbers from NCBI, it interfaces with external tools such as BLAST and Cytoscape, and it can produce publication-quality graphical representations of the clusters obtained, thus constituting a comprehensive and effective tool for practical research in computational biology. Source code and precompiled executables for Windows, Linux and Mac OS X are freely available at http://www.paccanarolab.org/software/scps.
Fogel, Paul; Gaston-Mathé, Yann; Hawkins, Douglas; Fogel, Fajwel; Luta, George; Young, S. Stanley
2016-01-01
Often data can be represented as a matrix, e.g., observations as rows and variables as columns, or as a doubly classified contingency table. Researchers may be interested in clustering the observations, the variables, or both. If the data is non-negative, then Non-negative Matrix Factorization (NMF) can be used to perform the clustering. By its nature, NMF-based clustering is focused on the large values. If the data is normalized by subtracting the row/column means, it becomes of mixed signs and the original NMF cannot be used. Our idea is to split and then concatenate the positive and negative parts of the matrix, after taking the absolute value of the negative elements. NMF applied to the concatenated data, which we call PosNegNMF, offers the advantages of the original NMF approach, while giving equal weight to large and small values. We use two public health datasets to illustrate the new method and compare it with alternative clustering methods, such as K-means and clustering methods based on the Singular Value Decomposition (SVD) or Principal Component Analysis (PCA). With the exception of situations where a reasonably accurate factorization can be achieved using the first SVD component, we recommend that the epidemiologists and environmental scientists use the new method to obtain clusters with improved quality and interpretability. PMID:27213413
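The split-and-concatenate step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names are hypothetical, column centering is used as one example of the normalization, the two parts are concatenated column-wise (the abstract does not say along which axis), and the NMF step itself is left out.

```python
from typing import List

Matrix = List[List[float]]


def center_columns(x: Matrix) -> Matrix:
    """Subtract each column mean, which generally yields mixed-sign data."""
    n_rows, n_cols = len(x), len(x[0])
    means = [sum(row[j] for row in x) / n_rows for j in range(n_cols)]
    return [[row[j] - means[j] for j in range(n_cols)] for row in x]


def split_pos_neg(x: Matrix) -> Matrix:
    """Return [positive part | absolute value of negative part], concatenated column-wise.

    Every entry of the result is non-negative, so a standard NMF routine
    could be applied to it (assumption: column-wise concatenation).
    """
    pos = [[max(v, 0.0) for v in row] for row in x]
    neg = [[max(-v, 0.0) for v in row] for row in x]
    return [p + n for p, n in zip(pos, neg)]


if __name__ == "__main__":
    data = [[1.0, 4.0], [3.0, 0.0]]
    centered = center_columns(data)
    print(split_pos_neg(centered))
```

Because the centered matrix equals the positive block minus the negative block, no information is lost by the split; large negative deviations simply become large positive entries in their own columns.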
Sensory over responsivity and obsessive compulsive symptoms: A cluster analysis.
Ben-Sasson, Ayelet; Podoly, Tamar Yonit
2017-02-01
Several studies have examined the sensory component in Obsessive Compulsive Disorder (OCD) and described an OCD subtype that has a unique profile, in which Sensory Phenomena (SP) is a significant component. SP has some commonalities with Sensory Over Responsivity (SOR) and might be in part a characteristic of this subtype. Although some studies have examined SOR and its relation to Obsessive Compulsive Symptoms (OCS), the literature lacks sufficient data on this interplay. The first aim was to further examine the correlations between OCS and SOR, and to explore the correlations between SOR modalities (e.g. smell, touch) and OCS subscales (e.g. washing, ordering). The second was to apply cluster analysis to SOR and OCS dimensions in adults, that is, to classify the sample using the sensory scores to find whether a sensory OCD subtype can be specified. The third was to explore the psychometric features of a new sensory questionnaire, the Sensory Perception Quotient (SPQ). A sample of non-clinical adults (n=350) was recruited via e-mail, social media and social networks. Participants completed questionnaires measuring SOR, OCS, and anxiety. SOR and OCI-F scores were moderately and significantly correlated (n=274). Significant correlations were found between all SOR modalities and OCS subscales, with no specifically higher correlation between any one modality and any one OCS subscale. Cluster analysis revealed four distinct clusters: (1) No OC and SOR symptoms (NONE; n=100), (2) High OC and SOR symptoms (BOTH; n=28), (3) Moderate OC symptoms (OCS; n=63), (4) Moderate SOR symptoms (SOR; n=83). The BOTH cluster had significantly higher anxiety levels than the other clusters, and shared OC subscale scores with the OCS cluster. The BOTH cluster also reported higher SOR scores across tactile, vision, taste and olfactory modalities. The SPQ was found to be reliable and suitable for detecting SOR, and the sample SPQ scores were normally distributed (n=350).
SOR is a dimensional feature that can influence the severity of OCS and may characterize a unique sensory OCD subtype. Copyright © 2016 Elsevier Inc. All rights reserved.
Decomposing the Apoptosis Pathway Into Biologically Interpretable Principal Components
Wang, Min; Kornblau, Steven M; Coombes, Kevin R
2018-01-01
Principal component analysis (PCA) is one of the most common techniques in the analysis of biological data sets, but applying PCA raises 2 challenges. First, one must determine the number of significant principal components (PCs). Second, because each PC is a linear combination of genes, it rarely has a biological interpretation. Existing methods to determine the number of PCs are either subjective or computationally extensive. We review several methods and describe a new R package, PCDimension, that implements additional methods, the most important being an algorithm that extends and automates a graphical Bayesian method. Using simulations, we compared the methods. Our newly automated procedure is competitive with the best methods when considering both accuracy and speed and is the most accurate when the number of objects is small compared with the number of attributes. We applied the method to a proteomics data set from patients with acute myeloid leukemia. Proteins in the apoptosis pathway could be explained using 6 PCs. By clustering the proteins in PC space, we were able to replace the PCs by 6 “biological components,” 3 of which could be immediately interpreted from the current literature. We expect this approach combining PCA with clustering to be widely applicable. PMID:29881252
Statistical indicators of collective behavior and functional clusters in gene networks of yeast
NASA Astrophysics Data System (ADS)
Živković, J.; Tadić, B.; Wick, N.; Thurner, S.
2006-03-01
We analyze gene expression time-series data of yeast (S. cerevisiae) measured along two full cell-cycles. We quantify these data by using q-exponentials, gene expression ranking and a temporal mean-variance analysis. We construct gene interaction networks based on correlation coefficients and study the formation of the corresponding giant components and minimum spanning trees. By coloring genes according to their cell function we find functional clusters in the correlation networks and functional branches in the associated trees. Our results suggest that a percolation point of functional clusters can be identified on these gene expression correlation networks.
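The construction of a correlation network and the emergence of its giant component can be sketched as follows. This is a rough illustration, not the authors' procedure: the function names are hypothetical, and linking two genes when the absolute Pearson correlation of their time series reaches a threshold is an assumption (the abstract does not state the exact linking rule). Connected components are tracked with a union-find structure.

```python
from itertools import combinations
from math import sqrt
from typing import Dict, List


def pearson(a: List[float], b: List[float]) -> float:
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def giant_component_size(series: Dict[str, List[float]], threshold: float) -> int:
    """Size of the largest connected component when genes are linked if |r| >= threshold."""
    parent = {gene: gene for gene in series}

    def find(gene: str) -> str:
        # Follow parent links to the root, halving the path along the way.
        while parent[gene] != gene:
            parent[gene] = parent[parent[gene]]
            gene = parent[gene]
        return gene

    for g1, g2 in combinations(series, 2):
        if abs(pearson(series[g1], series[g2])) >= threshold:
            parent[find(g1)] = find(g2)

    sizes: Dict[str, int] = {}
    for gene in series:
        root = find(gene)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values(), default=0)
```

Sweeping the threshold from high to low and recording the returned size gives a simple view of where a giant component starts to form, which is the kind of percolation-like behavior the abstract refers to.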
Bayesian multivariate hierarchical transformation models for ROC analysis.
O'Malley, A James; Zou, Kelly H
2006-02-15
A Bayesian multivariate hierarchical transformation model (BMHTM) is developed for receiver operating characteristic (ROC) curve analysis based on clustered continuous diagnostic outcome data with covariates. Two special features of this model are that it incorporates non-linear monotone transformations of the outcomes and that multiple correlated outcomes may be analysed. The mean, variance, and transformation components are all modelled parametrically, enabling a wide range of inferences. The general framework is illustrated by focusing on two problems: (1) analysis of the diagnostic accuracy of a covariate-dependent univariate test outcome requiring a Box-Cox transformation within each cluster to map the test outcomes to a common family of distributions; (2) development of an optimal composite diagnostic test using multivariate clustered outcome data. In the second problem, the composite test is estimated using discriminant function analysis and compared to the test derived from logistic regression analysis where the gold standard is a binary outcome. The proposed methodology is illustrated on prostate cancer biopsy data from a multi-centre clinical trial.
Puma (Puma concolor) epididymal sperm morphometry
Cucho, Hernán; Alarcón, Virgilio; Ordóñez, César; Ampuero, Enrique; Meza, Aydee; Soler, Carles
2016-01-01
The Andean puma (Puma concolor) has not been widely studied, particularly in reference to its semen characteristics. The aim of the present study was to define the morphometry of puma sperm heads and classify their subpopulations by cluster analysis. Samples were recovered postmortem from two epididymides from one animal and prepared for morphological observation after staining with the Hemacolor kit. Morphometric data were obtained from 581 spermatozoa using a CASA-Morph system, rendering 13 morphometric parameters. Principal component (PC) analysis was performed, followed by cluster analysis to establish subpopulations. Two principal components were obtained, the first related to size and the second to shape. Three subpopulations were observed, corresponding to elongated and intermediate-size sperm heads and acrosomes, to large heads with large acrosomes, and to small heads with short acrosomes. In conclusion, puma spermatozoa showed no uniform sperm morphology but three clear subpopulations. These results should be used for future work in the establishment of an adequate germplasm bank of this species. PMID:27678466
Oberle, Michael; Wohlwend, Nadia; Jonas, Daniel; Maurer, Florian P; Jost, Geraldine; Tschudin-Sutter, Sarah; Vranckx, Katleen; Egli, Adrian
2016-01-01
The technical, biological, and inter-center reproducibility of matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF MS) typing data has not yet been explored. The aim of this study is to compare typing data from multiple centers employing bioinformatics using bacterial strains from two past outbreaks and non-related strains. Participants received twelve extended-spectrum beta-lactamase-producing E. coli isolates and followed the same standard operating procedure (SOP) including a full-protein extraction protocol. All laboratories provided visually read spectra via flexAnalysis (Bruker, Germany). Raw data from each laboratory allowed calculating the technical and biological reproducibility between centers using BioNumerics (Applied Maths NV, Belgium). Technical and biological reproducibility ranged from 96.8 to 99.4% and from 47.6 to 94.4%, respectively. The inter-center reproducibility showed a comparable clustering among identical isolates. Principal component analysis indicated a higher tendency to cluster within the same center. Therefore, we used a discriminant analysis, which completely separated the clusters. Next, we defined a reference center and performed a statistical analysis to identify specific peaks that distinguish the outbreak clusters. Finally, we used a linear support vector machine as a classifier on the determined peaks. A validation showed that within the set of the reference center, the identification of the cluster was 100% correct, with a large contrast between the score of the correct cluster and that of the next best scoring cluster. Based on the sufficient technical and biological reproducibility of MALDI-TOF MS based spectra, detection of specific clusters is possible from spectra obtained from different centers. However, we believe that a shared SOP and a bioinformatics approach are required to make the analysis robust and reliable.
Traiperm, Paweena; Chow, Janene; Nopun, Possathorn; Staples, G; Swangpol, Sasivimon C
2017-12-01
The genus Argyreia Lour. is one of the species-rich Asian genera in the family Convolvulaceae. Several species complexes were recognized in which taxon delimitation was imprecise, especially when examining herbarium materials without fully developed open flowers. The main goal of this study is to investigate and describe leaf anatomy for some morphologically similar Argyreia using epidermal peeling, leaf and petiole transverse sections, and scanning electron microscopy. Phenetic analyses including cluster analysis and principal component analysis were used to investigate the similarity of these morpho-types. Anatomical differences observed between the morpho-types include epidermal cell walls and the trichome types on the leaf epidermis. Additional differences in the leaf and petiole transverse sections include the epidermal cell shape of the adaxial leaf blade, the leaf margins, and the petiole transverse sectional outline. The phenogram from cluster analysis using the UPGMA method represented four groups with an R value of 0.87. Moreover, the important quantitative and qualitative leaf anatomical traits of the four groups were confirmed by the principal component analysis of the first two components. The results from phenetic analyses confirmed the anatomical differentiation between the morpho-types. Leaf anatomical features regarded as particularly informative for morpho-type differentiation can be used to supplement macro morphological identification.
[A spatial adaptive algorithm for endmember extraction on multispectral remote sensing image].
Zhu, Chang-Ming; Luo, Jian-Cheng; Shen, Zhan-Feng; Li, Jun-Li; Hu, Xiao-Dong
2011-10-01
Because the convex cone analysis (CCA) method can extract only a limited number of endmembers from multispectral imagery, this paper proposes a new endmember extraction method for multispectral remote sensing images, using spatially adaptive spectral feature analysis based on spatial clustering and image slicing. First, to remove spatial and spectral redundancy, the principal component analysis (PCA) algorithm was used to reduce the dimensionality of the multispectral data. Second, the iterative self-organizing data analysis technique algorithm (ISODATA) was used to cluster the image according to the spectral similarity of pixels. Then, through post-processing of the clustering result and merging of small clusters, the whole image was divided into several blocks (tiles). Lastly, the number of endmembers was determined according to the landscape complexity of each image block and the analysis of its scatter diagrams, and the endmembers were extracted using the hourglass algorithm. An endmember extraction experiment on TM multispectral imagery showed that the method can effectively extract endmember spectra from multispectral imagery. Moreover, the method overcomes the limit on the number of endmembers and improves the accuracy of endmember extraction. The method provides a new approach to endmember extraction from multispectral images.
Sun, Li-Li; Wang, Meng; Zhang, Hui-Jie; Liu, Ya-Nan; Ren, Xiao-Liang; Deng, Yan-Ru; Qi, Ai-Di
2018-01-01
Polygoni Multiflori Radix (PMR) is increasingly being used not just as a traditional herbal medicine but also as a popular functional food. In this study, multivariate chemometric methods and mass spectrometry were combined to analyze the ultra-high-performance liquid chromatograph (UPLC) fingerprints of PMR from six different geographical origins. A chemometric strategy based on multivariate curve resolution-alternating least squares (MCR-ALS) and three classification methods is proposed to analyze the UPLC fingerprints obtained. Common chromatographic problems, including the background contribution, baseline contribution, and peak overlap, were handled by the established MCR-ALS model. A total of 22 components were resolved. Moreover, relative species concentrations were obtained from the MCR-ALS model, which was used for multivariate classification analysis. Principal component analysis (PCA) and Ward's method have been applied to classify 72 PMR samples from six different geographical regions. The PCA score plot showed that the PMR samples fell into four clusters, which related to the geographical location and climate of the source areas. The results were then corroborated by Ward's method. In addition, according to the variance-weighted distance between cluster centers obtained from Ward's method, five components were identified as the most significant variables (chemical markers) for cluster discrimination. A counter-propagation artificial neural network has been applied to confirm and predict the effects of chemical markers on different samples. Finally, the five chemical markers were identified by UPLC-quadrupole time-of-flight mass spectrometer. Components 3, 12, 16, 18, and 19 were identified as 2,3,5,4'-tetrahydroxy-stilbene-2-O-β-d-glucoside, emodin-8-O-β-d-glucopyranoside, emodin-8-O-(6'-O-acetyl)-β-d-glucopyranoside, emodin, and physcion, respectively. In conclusion, the proposed method can be applied for the comprehensive analysis of natural samples. 
Copyright © 2016. Published by Elsevier B.V.
Kinoti, Wycliff M; Constable, Fiona E; Nancarrow, Narelle; Plummer, Kim M; Rodoni, Brendan
2017-01-01
PCR amplicon next generation sequencing (NGS) analysis offers a broadly applicable and targeted approach to detect populations of both high- and low-frequency virus variants in one or more plant samples. In this study, amplicon NGS was used to explore the diversity of the tripartite genome virus, Prunus necrotic ringspot virus (PNRSV) from 53 PNRSV-infected trees using amplicons from conserved gene regions of each of PNRSV RNA1, RNA2 and RNA3. Sequencing of the amplicons from 53 PNRSV-infected trees revealed differing levels of polymorphism across the three different components of the PNRSV genome, with a total number of 5040, 2083 and 5486 sequence variants observed for RNA1, RNA2 and RNA3 respectively. The RNA2 had the lowest diversity of sequences compared to RNA1 and RNA3, reflecting the lack of flexibility tolerated by the replicase gene that is encoded by this RNA component. Distinct PNRSV phylo-groups, consisting of closely related clusters of sequence variants, were observed in each of PNRSV RNA1, RNA2 and RNA3. Most plant samples had a single phylo-group for each RNA component. Haplotype network analysis showed that smaller clusters of PNRSV sequence variants were genetically connected to the largest sequence variant cluster within a phylo-group of each RNA component. Some plant samples had sequence variants occurring in multiple PNRSV phylo-groups in at least one of each RNA and these phylo-groups formed distinct clades that represent PNRSV genetic strains. Variants within the same phylo-group of each Prunus plant sample had ≥97% similarity and phylo-groups within a Prunus plant sample and between samples had ≤97% similarity. Based on the analysis of diversity, a definition of a PNRSV genetic strain was proposed. The proposed definition was applied to determine the number of PNRSV genetic strains in each of the plant samples and the complexity in defining genetic strains in multipartite genome viruses was explored.
Non-thermal pressure in the outskirts of Abell 2142
NASA Astrophysics Data System (ADS)
Fusco-Femiano, Roberto; Lapi, Andrea
2018-03-01
Clumping and turbulence are expected to affect the matter accreted on to the outskirts of galaxy clusters. To determine their impact on the thermodynamic properties of Abell 2142, we perform an analysis of the X-ray temperature data from XMM-Newton via our SuperModel, a state-of-the-art tool for investigating the astrophysics of the intracluster medium already tested on many individual clusters (since Cavaliere, Lapi & Fusco-Femiano 2009). Using the gas density profile corrected for clumpiness derived by Tchernin et al. (2016), we find evidence for the presence of a non-thermal pressure component required to sustain gravity in the cluster outskirts of Abell 2142, which amounts to about 30 per cent of the total pressure at the virial radius. The presence of the non-thermal component implies that the gas fraction is consistent with the universal value at the virial radius and that the electron thermal pressure profile is in good agreement with that inferred from the SZ data. Our results indicate that the presence of gas clumping and of a non-thermal pressure component are both necessary to recover the observed physical properties in the cluster outskirts. Moreover, we stress that an alternative method often exploited in the literature (including for Abell 2142) to determine the temperature profile, k_B T = P_e/n_e, based on a combination of the Sunyaev-Zel'dovich (SZ) pressure P_e and of the X-ray electron density n_e, does not allow us to highlight the presence of non-thermal pressure support in the cluster outskirts.
Benson, Nsikak U.; Asuquo, Francis E.; Williams, Akan B.; Essien, Joseph P.; Ekong, Cyril I.; Akpabio, Otobong; Olajire, Abaas A.
2016-01-01
Trace metals (Cd, Cr, Cu, Ni and Pb) concentrations in benthic sediments were analyzed through multi-step fractionation scheme to assess the levels and sources of contamination in estuarine, riverine and freshwater ecosystems in Niger Delta (Nigeria). The degree of contamination was assessed using the individual contamination factors (ICF) and global contamination factor (GCF). Multivariate statistical approaches including principal component analysis (PCA), cluster analysis and correlation test were employed to evaluate the interrelationships and associated sources of contamination. The spatial distribution of metal concentrations followed the pattern Pb>Cu>Cr>Cd>Ni. Ecological risk index by ICF showed significant potential mobility and bioavailability for Cu, Cu and Ni. The ICF contamination trend in the benthic sediments at all studied sites was Cu>Cr>Ni>Cd>Pb. The principal component and agglomerative clustering analyses indicate that trace metals contamination in the ecosystems was influenced by multiple pollution sources. PMID:27257934
Beebe, D W; Holmbeck, G N; Albright, J S; Noga, K; DeCastro, B
1995-01-01
This study investigated the escape model of binge eating through a cluster analysis using standardized measures. A sample of 126 undergraduate women underwent a manipulation of their level of cognition and were asked to "taste-test" several flavors of ice cream. Questionnaire data from these women were entered into a cluster analysis. Two groups emerged: women in the "binge-prone" group were significantly more depressed, had lower self-esteem, had more chaotic and extreme eating patterns, and were more self-conscious than those in the control group. In validation work, binge-prone women were shown to report elevated levels of bulimic symptomatology and, when in the presence of a food they enjoyed, to respond to increases in level of cognition by eating more. These results were consistent with some, but not all, of the components of the escape model.
EMPCA and Cluster Analysis of Quasar Spectra: Construction and Application to Simulated Spectra
NASA Astrophysics Data System (ADS)
Marrs, Adam; Leighly, Karen; Wagner, Cassidy; Macinnis, Francis
2017-01-01
Quasars have complex spectra with emission lines influenced by many factors. Therefore, fully describing the spectrum requires specification of a large number of parameters, such as line equivalent width, blueshift, and ratios. Principal Component Analysis (PCA) aims to construct eigenvectors, or principal components, from the data with the goal of finding a few key parameters that can be used to predict the rest of the spectrum fairly well. Analysis of simulated quasar spectra was used to verify and justify our modified application of PCA. We used a variant of PCA called Weighted Expectation Maximization PCA (EMPCA; Bailey 2012) along with k-means cluster analysis to analyze simulated quasar spectra. Our approach combines both analytical methods to address two known problems with classical PCA. EMPCA uses weights to account for uncertainty and missing points in the spectra. K-means groups similar spectra together to address the nonlinearity of quasar spectra, specifically variance in blueshifts and widths of the emission lines. In producing and analyzing simulations, we first tested the effects of varying equivalent widths and blueshifts on the derived principal components, and explored the differences between standard PCA and EMPCA. We also tested the effects of varying signal-to-noise ratio. Next we used the results of fits to composite quasar spectra (see accompanying poster by Wagner et al.) to construct a set of realistic simulated spectra, and subjected those spectra to the EMPCA/k-means analysis. We concluded that our approach was validated when we found that the mean spectra from our k-means clusters derived from PCA projection coefficients reproduced the trends observed in the composite spectra. Furthermore, our method needed only two eigenvectors to identify both sets of correlations used to construct the simulations, as well as indicating the linear and nonlinear segments.
Comparing this to regular PCA, which can require a dozen or more components, or to direct spectral analysis that may need measurement of 20 fit parameters, shows why the dual application of these two techniques is such a powerful tool.
HICOSMO - X-ray analysis of a complete sample of galaxy clusters
NASA Astrophysics Data System (ADS)
Schellenberger, G.; Reiprich, T.
2017-10-01
Galaxy clusters are known to be the largest virialized objects in the Universe. Based on the theory of structure formation one can use them as cosmological probes, since they originate from collapsed overdensities in the early Universe and witness its history. The X-ray regime provides the unique possibility to measure in detail the most massive visible component, the intracluster medium. Using Chandra observations of a local sample of 64 bright clusters (HIFLUGCS) we provide total (hydrostatic) and gas mass estimates of each cluster individually. Making use of the completeness of the sample we quantify two interesting cosmological parameters by a Bayesian cosmological likelihood analysis. We find Ω_{M}=0.3±0.01 and σ_{8}=0.79±0.03 (statistical uncertainties) using our default analysis strategy, which combines a mass function analysis with the gas mass fraction results. The main sources of bias that we discuss and correct here are (1) the influence of galaxy groups (higher incompleteness in parent samples and a differing behavior of the L_{x} - M relation), (2) the hydrostatic mass bias (as determined by recent hydrodynamical simulations), (3) the extrapolation of the total mass (comparing various methods), (4) the theoretical halo mass function and (5) other cosmological (non-negligible neutrino mass) and instrumental (calibration) effects.
[Identification of two varieties of Citri Fructus by fingerprint and chemometrics].
Su, Jing-hua; Zhang, Chao; Sun, Lei; Gu, Bing-ren; Ma, Shuang-cheng
2015-06-01
Citri Fructus identification by fingerprint and chemometrics was investigated in this paper. Twenty-three Citri Fructus samples were collected, belonging to the two varieties, Citrus wilsonii and C. medica, recorded in the Chinese Pharmacopoeia. HPLC chromatograms were obtained. The components were partly identified by reference substances, and then a common pattern was established for chemometric analysis. Similarity analysis, principal component analysis (PCA), partial least squares-discriminant analysis (PLS-DA) and hierarchical cluster analysis heatmap were applied. The results indicated that C. wilsonii and C. medica could be ideally classified with a common pattern containing twenty-five characteristic peaks. Besides, preliminary pattern recognition verified the chemometric analytical results. Absolute peak area (APA) was used for the relevant quantitative analysis; the results showed differences between the two varieties, which are valuable for further quality control in the selection of characteristic components.
Stojanovic, Gordana S; Jovanović, Snežana C; Zlatković, Bojan K
2015-06-01
The present study examines the chemical composition of methanol extracts of Sedum taxa from the central part of the Balkan Peninsula, and of representatives from other genera of Crassulaceae (Crassula, Echeveria and Kalanchoe) considered as out-groups. The chemical composition of extracts was determined by HPLC analysis, according to retention time of standards and characteristic absorption spectra of components. Identified components were considered as original variables with possible chemotaxonomic significance. Relationships of examined plant samples were investigated by agglomerative hierarchical cluster analysis (AHC). The obtained results showed how the distribution of methanol extract components (mostly phenolics) affected grouping of the examined samples. The obtained clustering showed satisfactory grouping of the examined samples, among which some representatives of the Sedum series, Rupestria and Magellensia, are the most remote. The out-group samples were not as clearly separated from the Sedum samples as expected; this especially applies to samples of Crassula ovata and Echeveria lilacina, while Kalanchoe daigremontiana was more separated from most of the Sedum samples.
Roushangar, Kiyoumars; Alizadeh, Farhad; Adamowski, Jan
2018-08-01
Understanding precipitation on a regional basis is an important component of water resources planning and management. The present study outlines a methodology based on continuous wavelet transform (CWT) and multiscale entropy (CWME), combined with self-organizing map (SOM) and k-means clustering techniques, to measure and analyze the complexity of precipitation. Historical monthly precipitation data from 1960 to 2010 at 31 rain gauges across Iran were preprocessed by CWT. The multi-resolution CWT approach segregated the major features of the original precipitation series by unfolding the structure of the time series which was often ambiguous. The entropy concept was then applied to components obtained from CWT to measure dispersion, uncertainty, disorder, and diversification of subcomponents. Based on different validity indices, k-means clustering captured homogenous areas more accurately, and additional analysis was performed based on the outcome of this approach. The 31 rain gauges in this study were clustered into 6 groups, each one having a unique CWME pattern across different time scales. The results of clustering showed that hydrologic similarity (multiscale variation of precipitation) was not based on geographic contiguity. According to the pattern of entropy across the scales, each cluster was assigned an entropy signature that provided an estimation of the entropy pattern of precipitation data in each cluster. Based on the pattern of mean CWME for each cluster, a characteristic signature was assigned, which provided an estimation of the CWME of a cluster across scales of 1-2, 3-8, and 9-13 months relative to other stations. The validity of the homogeneous clusters demonstrated the usefulness of the proposed approach to regionalize precipitation. 
Further analysis based on wavelet coherence (WTC) was performed by selecting central rain gauges in each cluster and analyzing them against temperature, wind, and the Multivariate ENSO Index (MEI), East Atlantic (EA) and North Atlantic Oscillation (NAO) indices. The results revealed that all climatic features except NAO influenced precipitation in Iran during the 1960-2010 period. Copyright © 2018 Elsevier Inc. All rights reserved.
Symptom clustering and quality of life in patients with ovarian cancer undergoing chemotherapy.
Nho, Ju-Hee; Reul Kim, Sung; Nam, Joo-Hyun
2017-10-01
The symptom clusters in patients with ovarian cancer undergoing chemotherapy have not been well evaluated. We investigated the symptom clusters and effects of symptom clusters on the quality of life of patients with ovarian cancer. We recruited 210 ovarian cancer patients being treated with chemotherapy and used a descriptive cross-sectional study design to collect information on their symptoms. To determine inter-relationships among symptoms, a principal component analysis with varimax rotation was performed based on the patient's symptoms (fatigue, pain, sleep disturbance, chemotherapy-induced peripheral neuropathy, anxiety, depression, and sexual dysfunction). All patients had experienced at least two domains of concurrent symptoms, and there were two types of symptom clusters. The first symptom cluster consisted of anxiety, depression, fatigue, and sleep disturbance symptoms, while the second symptom cluster consisted of pain and chemotherapy-induced peripheral neuropathy symptoms. Our subgroup cluster analysis showed that ovarian cancer patients with higher-scoring symptoms had significantly poorer quality of life in both symptom cluster 1 and 2 subgroups, with subgroup-specific patterns. The symptom clusters were different depending on age, age at disease onset, disease duration, recurrence, and performance status of patients with ovarian cancer. In addition, ovarian cancer patients experienced different symptom clusters according to cancer stage. The current study demonstrated that there is a specific pattern of symptom clusters, and symptom clusters negatively influence the quality of life in patients with ovarian cancer. Identifying symptom clusters of ovarian cancer patients may have clinical implications in improving symptom management. Copyright © 2017 Elsevier Ltd. All rights reserved.
Kharroubi, Adel; Gargouri, Dorra; Baati, Houda; Azri, Chafai
2012-06-01
Concentrations of selected heavy metals (Cd, Pb, Zn, Cu, Mn, and Fe) in surface sediments from 66 sites in both northern and eastern Mediterranean Sea-Boughrara lagoon exchange areas (southeastern Tunisia) were studied in order to understand current metal contamination due to the urbanization and economic development of nearby several coastal regions of the Gulf of Gabès. Multiple approaches were applied for the sediment quality assessment. These approaches were based on GIS coupled with chemometric methods (enrichment factors, geoaccumulation index, principal component analysis, and cluster analysis). Enrichment factors and principal component analysis revealed two distinct groups of metals. The first group corresponded to Fe and Mn derived from natural sources, and the second group contained Cd, Pb, Zn, and Cu originated from man-made sources. For these latter metals, cluster analysis showed two distinct distributions in the selected areas. They were attributed to temporal and spatial variations of contaminant sources input. The geoaccumulation index (I (geo)) values explained that only Cd, Pb, and Cu can be considered as moderate to extreme pollutants in the studied sediments.
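The abstract above names the enrichment factor and the geoaccumulation index without giving formulas. The sketch below uses the standard textbook forms, I_geo = log2(C / (1.5 B)) and EF = (C_metal / C_ref)_sample / (C_metal / C_ref)_background. The choice of reference element, the background values, and the class boundary convention are assumptions, and the concentrations are hypothetical.

```python
import math


def geoaccumulation_index(c_sample, c_background):
    """I_geo = log2(C / (1.5 * B)); the 1.5 factor allows for natural variation."""
    return math.log2(c_sample / (1.5 * c_background))


def igeo_class(igeo):
    """Seven-grade scale: 0 (I_geo <= 0) up to 6 (I_geo > 5)."""
    return min(6, max(0, math.ceil(igeo)))


def enrichment_factor(c_metal, c_ref, bg_metal, bg_ref):
    """EF = (metal / reference element) in sample over the same ratio in background."""
    return (c_metal / c_ref) / (bg_metal / bg_ref)


# Hypothetical concentrations in mg/kg; the reference element could be Fe or Al.
igeo_cd = geoaccumulation_index(c_sample=3.0, c_background=1.0)
ef_cd = enrichment_factor(c_metal=10.0, c_ref=2.0, bg_metal=5.0, bg_ref=2.0)
```

An EF close to 1 is usually read as a mainly natural origin, while values well above 1 point to an anthropogenic contribution; the exact cut-off varies between studies.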
Tchabo, William; Ma, Yongkun; Kwaw, Emmanuel; Zhang, Haining; Xiao, Lulu; Apaliya, Maurice T
2018-01-15
The four different methods of color measurement of wine proposed by Boulton, Giusti, Glories and the Commission Internationale de l'Eclairage (CIE) were applied to assess the statistical relationship between the phytochemical profile and chromatic characteristics of sulfur dioxide-free mulberry (Morus nigra) wine submitted to non-thermal maturation processes. The alterations in chromatic properties and phenolic composition of non-thermally aged mulberry wine were examined, aided by the use of Pearson correlation, cluster and principal component analysis. The results revealed a positive effect of non-thermal processes on phytochemical families of wines. From Pearson correlation analysis, relationships between chromatic indexes and flavonols as well as anthocyanins were established. Cluster analysis highlighted similarities between Boulton and Giusti parameters, as well as Glories and CIE parameters, in the assessment of chromatic properties of wines. Finally, principal component analysis was able to discriminate wines subjected to different maturation techniques on the basis of their chromatic and phenolic characteristics. Copyright © 2017. Published by Elsevier Ltd.
Visual target modulation of functional connectivity networks revealed by self-organizing group ICA.
van de Ven, Vincent; Bledowski, Christoph; Prvulovic, David; Goebel, Rainer; Formisano, Elia; Di Salle, Francesco; Linden, David E J; Esposito, Fabrizio
2008-12-01
We applied a data-driven analysis based on self-organizing group independent component analysis (sogICA) to fMRI data from a three-stimulus visual oddball task. SogICA is particularly suited to the investigation of the underlying functional connectivity and does not rely on a predefined model of the experiment, which overcomes some of the limitations of hypothesis-driven analysis. Unlike most previous applications of ICA in functional imaging, our approach allows the analysis of the data at the group level, which is of particular interest in high order cognitive studies. SogICA is based on the hierarchical clustering of spatially similar independent components, derived from single subject decompositions. We identified four main clusters of components, centered on the posterior cingulate, bilateral insula, bilateral prefrontal cortex, and right posterior parietal and prefrontal cortex, consistently across all participants. Post hoc comparison of time courses revealed that insula, prefrontal cortex and right fronto-parietal components showed higher activity for targets than for distractors. Activation for distractors was higher in the posterior cingulate cortex, where deactivation was observed for targets. While our results conform to previous neuroimaging studies, they also complement conventional results by showing functional connectivity networks with unique contributions to the task that were consistent across subjects. SogICA can thus be used to probe functional networks of active cognitive tasks at the group-level and can provide additional insights to generate new hypotheses for further study. Copyright 2007 Wiley-Liss, Inc.
Type 2 diabetes mellitus: distribution of genetic markers in Kazakh population
Sikhayeva, Nurgul; Talzhanov, Yerkebulan; Iskakova, Aisha; Dzharmukhanov, Jarkyn; Nugmanova, Raushan; Zholdybaeva, Elena; Ramanculov, Erlan
2018-01-01
Background: Ethnic differences exist in the frequencies of genetic variations that contribute to the risk of common disease. This study aimed to analyse the distribution of several genes, previously associated with susceptibility to type 2 diabetes and obesity-related phenotypes, in a Kazakh population. Methods: A total of 966 individuals belonging to the Kazakh ethnicity were recruited from an outpatient clinic. We genotyped 41 common single nucleotide polymorphisms (SNPs) previously associated with type 2 diabetes in other ethnic groups and 31 of these were in Hardy–Weinberg equilibrium. The obtained allele frequencies were further compared to publicly available data from other ethnic populations. Allele frequencies for other (compared) populations were pooled from the haplotype map (HapMap) database. Principal component analysis (PCA), cluster analysis, and multidimensional scaling (MDS) were used for the analysis of genetic relationship between the populations. Results: Comparative analysis of allele frequencies of the studied SNPs showed significant differentiation among the studied populations. The Kazakh population was grouped with Asian populations according to the cluster analysis and with the Caucasian populations according to PCA. According to MDS, results of the current study show that the Kazakh population holds an intermediate position between Caucasian and Asian populations. Conclusion: A high percentage of population differentiation was observed between Kazakh and world populations. The Kazakh population was clustered with Caucasian populations, and this result may indicate a significant Caucasian component in the Kazakh gene pool. PMID:29551892
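The abstract above reports how many SNPs were in Hardy–Weinberg equilibrium but not how this was tested. One common approach, assumed here, is a chi-square goodness-of-fit test on the three genotype counts of a biallelic SNP, compared against the 1-degree-of-freedom critical value of 3.841 at the 0.05 level. The genotype counts below are hypothetical.

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic comparing observed genotype counts with HWE expectations."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        raise ValueError("monomorphic locus: the test is undefined")
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


CRITICAL_1DF_005 = 3.841  # chi-square critical value, 1 degree of freedom, alpha = 0.05

# Hypothetical counts: the first SNP matches HWE exactly, the second has no heterozygotes.
chi_ok = hwe_chi_square(25, 50, 25)
chi_bad = hwe_chi_square(50, 0, 50)
in_equilibrium = chi_ok < CRITICAL_1DF_005
```

For small samples or rare alleles an exact test is often preferred over the chi-square approximation.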
Performance evaluation of PCA-based spike sorting algorithms.
Adamos, Dimitrios A; Kosmidis, Efstratios K; Theophilidis, George
2008-09-01
Deciphering the electrical activity of individual neurons from multi-unit noisy recordings is critical for understanding complex neural systems. A widely used spike sorting algorithm is being evaluated for single-electrode nerve trunk recordings. The algorithm is based on principal component analysis (PCA) for spike feature extraction. In the neuroscience literature it is generally assumed that the use of the first two or most commonly three principal components is sufficient. We estimate the optimum PCA-based feature space by evaluating the algorithm's performance on simulated series of action potentials. A number of modifications are made to the open source nev2lkit software to enable systematic investigation of the parameter space. We introduce a new metric to define clustering error considering over-clustering more favorable than under-clustering as proposed by experimentalists for our data. Both the program patch and the metric are available online. Correlated and white Gaussian noise processes are superimposed to account for biological and artificial jitter in the recordings. We report that the employment of more than three principal components is in general beneficial for all noise cases considered. Finally, we apply our results to experimental data and verify that the sorting process with four principal components is in agreement with a panel of electrophysiology experts.
Genetic diversity and relationship analysis of Gossypium arboreum accessions.
Liu, F; Zhou, Z L; Wang, C Y; Wang, Y H; Cai, X Y; Wang, X X; Zhang, Z S; Wang, K B
2015-11-19
Simple sequence repeat techniques were used to identify the genetic diversity of 101 Gossypium arboreum accessions collected from India, Vietnam, and the southwest of China (Guizhou, Guangxi, and Yunnan provinces). Twenty-six pairs of SSR primers produced a total of 103 polymorphic loci with an average of 3.96 polymorphic loci per primer. The average of the effective number of alleles, Nei's gene diversity, and Shannon's information index were 0.59, 0.2835, and 0.4361, respectively. The diversity varied among different geographic regions. The result of principal component analysis was consistent with that of unweighted pair group method with arithmetic mean clustering analysis. The 101 G. arboreum accessions were clustered into 2 groups.
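The three diversity measures named in the abstract above have standard per-locus definitions in terms of allele frequencies p_i: effective number of alleles 1/Σp_i², Nei's gene diversity 1 − Σp_i², and Shannon's information index −Σp_i ln p_i. The sketch below computes them for one locus; the frequencies are hypothetical and the averaging over loci used in the study is not reproduced.

```python
import math


def effective_alleles(freqs):
    """Effective number of alleles: 1 / sum(p_i^2)."""
    return 1.0 / sum(p * p for p in freqs)


def nei_gene_diversity(freqs):
    """Nei's gene diversity (expected heterozygosity): 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)


def shannon_index(freqs):
    """Shannon's information index: -sum(p_i * ln p_i), skipping absent alleles."""
    return -sum(p * math.log(p) for p in freqs if p > 0)


# Hypothetical locus with two equally frequent alleles.
freqs = [0.5, 0.5]
ne = effective_alleles(freqs)
h = nei_gene_diversity(freqs)
i = shannon_index(freqs)
```

With these definitions the effective number of alleles ranges from 1 (a single fixed allele) up to the number of alleles observed at the locus.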
Carlsen, Lars; Bruggemann, Rainer
2018-06-03
In chemistry there is a long tradition in classification. Usually methods are adopted from the wide field of cluster analysis. Here, based on the example of 21 alkyl anilines, we show that concepts taken from the mathematical discipline of partially ordered sets may also be applied. The chemical compounds are described by a multi-indicator system. For the present study four indicators, mainly taken from the field of environmental chemistry, were applied and a Hasse diagram was constructed. A Hasse diagram is an acyclic, transitively reduced, triangle-free graph that may have several components. The crucial question is whether or not the Hasse diagram can be interpreted from a structural chemical point of view. This is indeed the case, but it must be clearly stated that a guarantee for meaningful results in general cannot be given. For that, further theoretical work is needed. Two cluster analysis methods are applied (K-means and a hierarchical cluster method). In both cases the partitioning of the set of 21 compounds by the component structure of the Hasse diagram appears to be better interpretable. Copyright© Bentham Science Publishers.
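The construction described in the abstract above can be sketched as follows, assuming the usual product order (one object lies below another if it is less than or equal in every indicator). The sketch computes the cover relations, that is, the transitively reduced edges of the Hasse diagram, and then the connected components that partition the objects. The object names and indicator values are hypothetical, and objects with identical indicator tuples are simply left unrelated here rather than merged into equivalence classes.

```python
from itertools import permutations


def below(a, b):
    """True if a <= b in every indicator and a != b (strict product order)."""
    return a != b and all(x <= y for x, y in zip(a, b))


def hasse_edges(objects):
    """Return cover pairs (lower, upper): order relations with nothing in between."""
    names = list(objects)
    less = {(u, v) for u, v in permutations(names, 2)
            if below(objects[u], objects[v])}
    return {(u, v) for (u, v) in less
            if not any((u, w) in less and (w, v) in less for w in names)}


def components(objects, edges):
    """Connected components of the diagram, found with a small union-find."""
    parent = {n: n for n in objects}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for u, v in edges:
        parent[find(u)] = find(v)
    groups = {}
    for n in objects:
        groups.setdefault(find(n), set()).add(n)
    return sorted(groups.values(), key=sorted)


# Hypothetical compounds described by two indicators each.
compounds = {"a": (1, 1), "b": (2, 2), "c": (3, 3), "d": (0, 5)}
edges = hasse_edges(compounds)
parts = components(compounds, edges)
```

In this toy example "d" is incomparable with every other object, so it forms a component of its own, which is the kind of partition the abstract compares with the clustering results.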
Evaluation of Low-Voltage Distribution Network Index Based on Improved Principal Component Analysis
NASA Astrophysics Data System (ADS)
Fan, Hanlu; Gao, Suzhou; Fan, Wenjie; Zhong, Yinfeng; Zhu, Lei
2018-01-01
In order to evaluate the development level of the low-voltage distribution network objectively and scientifically, chromatography analysis method is utilized to construct evaluation index model of low-voltage distribution network. Based on the analysis of principal component and the characteristic of logarithmic distribution of the index data, a logarithmic centralization method is adopted to improve the principal component analysis algorithm. The algorithm can decorrelate and reduce the dimensions of the evaluation model and the comprehensive score has a better dispersion degree. The clustering method is adopted to analyse the comprehensive score because the comprehensive score of the courts is concentrated. Then the stratification evaluation of the courts is realized. An example is given to verify the objectivity and scientificity of the evaluation method.
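The abstract above does not spell out its "logarithmic centralization" step. One plausible reading, assumed here and not confirmed by the source, is to take the logarithm of each strictly positive index and then subtract the column mean of the logs before running PCA. The index values are hypothetical.

```python
import math


def log_center(column):
    """Log-transform a strictly positive column, then subtract the mean of the logs."""
    if any(x <= 0 for x in column):
        raise ValueError("logarithmic centering needs strictly positive values")
    logs = [math.log(x) for x in column]
    mean = sum(logs) / len(logs)
    return [v - mean for v in logs]


# Hypothetical index values spanning two orders of magnitude.
centered = log_center([1.0, 10.0, 100.0])
```

After this step the column has zero mean on the log scale, so values that differ by a constant ratio end up equally spaced.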
Recuerda, Maximilien; Périé, Delphine; Gilbert, Guillaume; Beaudoin, Gilles
2012-10-12
The treatment planning of spine pathologies requires information on the rigidity and permeability of the intervertebral discs (IVDs). Magnetic resonance imaging (MRI) offers great potential as a sensitive and non-invasive technique for describing the mechanical properties of IVDs. However, the literature reported small correlation coefficients between mechanical properties and MRI parameters. Our hypothesis is that the compressive modulus and the permeability of the IVD can be predicted by a linear combination of MRI parameters. Sixty IVDs were harvested from bovine tails, and randomly separated into four groups (in-situ, digested-6h, digested-18h, digested-24h). Multi-parametric MRI acquisitions were used to quantify the relaxation times T1 and T2, the magnetization transfer ratio MTR, the apparent diffusion coefficient ADC and the fractional anisotropy FA. Unconfined compression, confined compression and direct permeability measurements were performed to quantify the compressive moduli and the hydraulic permeabilities. Differences between groups were evaluated from a one-way ANOVA. Multilinear regressions were performed between dependent mechanical properties and independent MRI parameters to verify our hypothesis. A principal component analysis was used to convert the set of possibly correlated variables into a set of linearly uncorrelated variables. Agglomerative Hierarchical Clustering was performed on the 3 principal components. Multilinear regressions showed that 45 to 80% of the Young's modulus E, the aggregate modulus in absence of deformation HA0, the radial permeability kr and the axial permeability in absence of deformation k0 can be explained by the MRI parameters within both the nucleus pulposus and the annulus fibrosus. The principal component analysis reduced our variables to two principal components with a cumulative variability of 52-65%, which increased to 70-82% when considering the third principal component.
The dendrograms showed a natural division into four clusters for the nucleus pulposus and into three or four clusters for the annulus fibrosus. The compressive moduli and the permeabilities of isolated IVDs can be assessed mostly by MT and diffusion sequences. However, the relationships have to be improved with the inclusion of MRI parameters more sensitive to IVD degeneration. Before this technique is used to quantify the mechanical properties of IVDs in vivo in patients suffering from various diseases, the relationships have to be defined for each degeneration state of the tissue that mimics the pathology. Our MRI protocol, combined with principal component analysis and agglomerative hierarchical clustering, is a promising tool to classify degenerated intervertebral discs and to find biomarkers and predictive factors of the evolution of the pathologies.
CLUSTERnGO: a user-defined modelling platform for two-stage clustering of time-series data.
Fidaner, Işık Barış; Cankorur-Cetinkaya, Ayca; Dikicioglu, Duygu; Kirdar, Betul; Cemgil, Ali Taylan; Oliver, Stephen G
2016-02-01
Simple bioinformatic tools are frequently used to analyse time-series datasets regardless of their ability to deal with transient phenomena, limiting the meaningful information that may be extracted from them. This situation requires the development and exploitation of tailor-made, easy-to-use and flexible tools designed specifically for the analysis of time-series datasets. We present a novel statistical application called CLUSTERnGO, which uses a model-based clustering algorithm that fulfils this need. This algorithm involves two components of operation. Component 1 constructs a Bayesian non-parametric model (Infinite Mixture of Piecewise Linear Sequences) and Component 2 applies a novel clustering methodology (Two-Stage Clustering). The software can also assign biological meaning to the identified clusters using an appropriate ontology. It applies multiple hypothesis testing to report the significance of these enrichments. The algorithm has a four-phase pipeline. The application can be executed using either command-line tools or a user-friendly Graphical User Interface. The latter has been developed to address the needs of both specialist and non-specialist users. We use three diverse test cases to demonstrate the flexibility of the proposed strategy. In all cases, CLUSTERnGO not only outperformed existing algorithms in assigning unique GO term enrichments to the identified clusters, but also revealed novel insights regarding the biological systems examined, which were not uncovered in the original publications. The C++ and QT source codes, the GUI applications for Windows, OS X and Linux operating systems and user manual are freely available for download under the GNU GPL v3 license at http://www.cmpe.boun.edu.tr/content/CnG. Contact: sgo24@cam.ac.uk. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.
Chemometrics-based Approach in Analysis of Arnicae flos
Zheleva-Dimitrova, Dimitrina Zh.; Balabanova, Vessela; Gevrenova, Reneta; Doichinova, Irini; Vitkova, Antonina
2015-01-01
Introduction: Arnica montana flowers have a long history as herbal medicines for external use on injuries and rheumatic complaints. Objective: To investigate Arnicae flos of cultivated accessions from Bulgaria, Poland, Germany, Finland, and a pharmacy store for phenolic derivatives and sesquiterpene lactones (STLs). Materials and Methods: Samples of Arnica from nine origins were prepared by ultrasound-assisted extraction with 80% methanol for phenolic compounds analysis. Subsequent reverse-phase high-performance liquid chromatography (HPLC) separation of the analytes was performed using gradient elution and ultraviolet detection at 280 and 310 nm (phenolic acids), and 360 nm (flavonoids). Total STLs were determined in chloroform extracts by solid-phase extraction-HPLC at 225 nm. The HPLC-generated chromatographic data were analyzed using principal component analysis (PCA) and hierarchical clustering (HC). Results: The highest total amount of phenolic acids was found in the sample from the Botanical Garden at Joensuu University, Finland (2.36 mg/g dw). Astragalin, isoquercitrin, and isorhamnetin 3-glucoside were the main flavonol glycosides, present at up to 3.37 mg/g (astragalin). Three well-defined clusters were distinguished by PCA and HC. Cluster C1 comprised the German and Finnish accessions, characterized by the highest content of flavonols. Cluster C2 included the Bulgarian and Polish samples, presenting a low content of flavonoids. Cluster C3 consisted of only one sample, from a pharmacy store. Conclusion: A validated HPLC method for simultaneous determination of phenolic acids, flavonoid glycosides, and aglycones in A. montana flowers was developed. The PCA loading plot showed that quercetin, kaempferol, and isorhamnetin can be used to distinguish different Arnica accessions.
SUMMARY A principal component analysis (PCA) on 13 phenolic compounds and the total amount of sesquiterpene lactones in the Arnicae flos collection tended to cluster the 9 studied accessions into three main groups. The profiles obtained demonstrated that the samples from Germany and Finland are characterized by greater amounts of phenolic derivatives than the Bulgarian and Polish ones. The PCA loading plot showed that quercetin, kaempferol and isorhamnetin can be used to distinguish different Arnica accessions. PMID:27013791
Network visualization of conformational sampling during molecular dynamics simulation.
Ahlstrom, Logan S; Baker, Joseph Lee; Ehrlich, Kent; Campbell, Zachary T; Patel, Sunita; Vorontsov, Ivan I; Tama, Florence; Miyashita, Osamu
2013-11-01
Effective data reduction methods are necessary for uncovering the inherent conformational relationships present in large molecular dynamics (MD) trajectories. Clustering algorithms provide a means to interpret the conformational sampling of molecules during simulation by grouping trajectory snapshots into a few subgroups, or clusters, but the relationships between the individual clusters may not be readily understood. Here we show that network analysis can be used to visualize the dominant conformational states explored during simulation as well as the connectivity between them, providing a more coherent description of conformational space than traditional clustering techniques alone. We compare the results of network visualization against 11 clustering algorithms and principal component conformer plots. Several MD simulations of proteins undergoing different conformational changes demonstrate the effectiveness of networks in reaching functional conclusions. Copyright © 2013 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Fučkar, Neven-Stjepan; Guemas, Virginie; Massonnet, François; Doblas-Reyes, Francisco
2015-04-01
Over the modern observational era, the northern hemisphere sea ice concentration, age and thickness have experienced a sharp long-term decline superimposed with strong internal variability. Hence, there is a crucial need to identify robust patterns of Arctic sea ice variability on interannual timescales and disentangle them from the long-term trend in noisy datasets. The principal component analysis (PCA) is a versatile and broadly used method for the study of climate variability. However, the PCA has several limiting aspects because it assumes that all modes of variability have symmetry between positive and negative phases, and suppresses nonlinearities by using a linear covariance matrix. Clustering methods offer an alternative set of dimension reduction tools that are more robust and capable of taking into account possible nonlinear characteristics of a climate field. Cluster analysis aggregates data into groups or clusters based on their distance, to simultaneously minimize the distance between data points in a given cluster and maximize the distance between the centers of the clusters. We extract modes of Arctic interannual sea-ice variability with nonhierarchical K-means cluster analysis and investigate the mechanisms leading to these modes. Our focus is on the sea ice thickness (SIT) as the base variable for clustering because SIT holds most of the climate memory for variability and predictability on interannual timescales. We primarily use global reconstructions of sea ice fields with a state-of-the-art ocean-sea-ice model, but we also verify the robustness of determined clusters in other Arctic sea ice datasets. Applied cluster analysis over the 1958-2013 period shows that the optimal number of detrended SIT clusters is K=3. Determined SIT cluster patterns and their time series of occurrence are rather similar between different seasons and months. 
Two opposite thermodynamic modes are characterized with prevailing negative or positive SIT anomalies over the Arctic basin. The intermediate mode, with negative anomalies centered on the East Siberian shelf and positive anomalies along the North American side of the basin, has predominately dynamic characteristics. The associated sea ice concentration (SIC) clusters vary more between different seasons and months, but the SIC patterns are physically framed by the SIT cluster patterns.
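The clustering step described in this abstract can be pictured with a small sketch: remove a linear trend at every grid point, then group the yearly anomaly maps with plain K-means. Everything here is an assumption made for illustration (the synthetic four-point "field", the helper names `detrend`, `farthest_first` and `kmeans`, and the deterministic farthest-first initialisation); it is not the authors' code or data.

```python
def detrend(series):
    """Remove the least-squares linear trend (and mean) from one time series."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    slope = sxy / sxx
    return [y - (y_mean + slope * (t - t_mean)) for t, y in enumerate(series)]

def sqdist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_first(points, k):
    """Deterministic initial centers: start at points[0], then add the farthest point."""
    centers = [list(points[0])]
    while len(centers) < k:
        nxt = max(points, key=lambda p: min(sqdist(p, c) for c in centers))
        centers.append(list(nxt))
    return centers

def kmeans(points, k, n_iter=100):
    """Lloyd iterations: assign each point to its nearest center, then update centers."""
    centers = farthest_first(points, k)
    labels = None
    for _ in range(n_iter):
        new_labels = [min(range(k), key=lambda c: sqdist(p, centers[c])) for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Synthetic data (hypothetical): 30 "years" on a 4-point grid, three recurring
# anomaly patterns (all negative, all positive, dipole) plus a shared downward trend.
patterns = [[-1.0, -1.0, -1.0, -1.0], [1.0, 1.0, 1.0, 1.0], [-1.0, -1.0, 1.0, 1.0]]
n_years, n_grid = 30, 4
raw = [[patterns[t % 3][g] - 0.05 * t for g in range(n_grid)] for t in range(n_years)]

# Detrend each grid point over time, then rebuild one anomaly map per year.
columns = [detrend([raw[t][g] for t in range(n_years)]) for g in range(n_grid)]
anomalies = [[columns[g][t] for g in range(n_grid)] for t in range(n_years)]

labels, centers = kmeans(anomalies, k=3)
print(labels)
```

On real fields the initialisation and the choice of K would need more care than this toy suggests; the sketch only shows the order of operations (detrend first, cluster second).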
Stuckey, Bronwyn G A; Opie, Nicole; Cussons, Andrea J; Watts, Gerald F; Burke, Valerie
2014-08-01
Polycystic ovary syndrome (PCOS) is a prevalent condition with heterogeneity of clinical features and cardiovascular risk factors that implies multiple aetiological factors and possible outcomes. To reduce a set of correlated variables to a smaller number of uncorrelated and interpretable factors that may delineate subgroups within PCOS or suggest pathogenetic mechanisms. We used principal component analysis (PCA) to examine the endocrine and cardiometabolic variables associated with PCOS defined by the National Institutes of Health (NIH) criteria. Data were retrieved from the database of a single clinical endocrinologist. We included women with PCOS (N = 378) who were not taking the oral contraceptive pill or other sex hormones, lipid lowering medication, metformin or other medication that could influence the variables of interest. PCA was performed retaining those factors with eigenvalues of at least 1.0. Varimax rotation was used to produce interpretable factors. We identified three principal components. In component 1, the dominant variables were homeostatic model assessment (HOMA) index, body mass index (BMI), high density lipoprotein (HDL) cholesterol and sex hormone binding globulin (SHBG); in component 2, systolic blood pressure, low density lipoprotein (LDL) cholesterol and triglycerides; in component 3, total testosterone and LH/FSH ratio. These components explained 37%, 13% and 11% of the variance in the PCOS cohort respectively. Multiple correlated variables from patients with PCOS can be reduced to three uncorrelated components characterised by insulin resistance, dyslipidaemia/hypertension or hyperandrogenaemia. Clustering of risk factors is consistent with different pathogenetic pathways within PCOS and/or differing cardiometabolic outcomes. Copyright © 2014 Elsevier Inc. All rights reserved.
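The retention rule mentioned in this abstract (keep components whose eigenvalue is at least 1.0) can be illustrated on a small correlation matrix. The sketch below is an assumption-laden toy: the 3x3 equicorrelated matrix is invented, the function name `jacobi_eigenvalues` is hypothetical, and a basic Jacobi rotation scheme stands in for whatever statistics package the study used. Varimax rotation is not shown.

```python
import math

def jacobi_eigenvalues(matrix, tol=1e-12, max_sweeps=100):
    """Eigenvalues of a symmetric matrix via cyclic Jacobi rotations, largest first."""
    n = len(matrix)
    a = [row[:] for row in matrix]
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                diff = a[q][q] - a[p][p]
                # Rotation angle that zeroes a[p][q]; kept within [-pi/4, pi/4].
                theta = math.pi / 4 if diff == 0.0 else 0.5 * math.atan(2 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # update columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # update rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)

# Hypothetical correlation matrix: three variables, pairwise correlation 0.5.
corr = [[1.0, 0.5, 0.5],
        [0.5, 1.0, 0.5],
        [0.5, 0.5, 1.0]]

eigvals = jacobi_eigenvalues(corr)
kept = [v for v in eigvals if v >= 1.0]       # retention rule: eigenvalue >= 1.0
explained = [v / len(corr) for v in kept]     # share of total variance per kept component
print(eigvals, kept, explained)
```

For a correlation matrix the eigenvalues sum to the number of variables, which is why dividing by `len(corr)` gives the proportion of variance explained.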
Finding reproducible cluster partitions for the k-means algorithm
Lisboa, Paulo J G; Etchells, Terence A; Jarman, Ian H; Chambers, Simon J
2013-01-01
K-means clustering is widely used for exploratory data analysis. While its dependence on initialisation is well-known, it is common practice to assume that the partition with lowest sum-of-squares (SSQ) total i.e. within cluster variance, is both reproducible under repeated initialisations and also the closest that k-means can provide to true structure, when applied to synthetic data. We show that this is generally the case for small numbers of clusters, but for values of k that are still of theoretical and practical interest, similar values of SSQ can correspond to markedly different cluster partitions. This paper extends stability measures previously presented in the context of finding optimal values of cluster number, into a component of a 2-d map of the local minima found by the k-means algorithm, from which not only can values of k be identified for further analysis but, more importantly, it is made clear whether the best SSQ is a suitable solution or whether obtaining a consistently good partition requires further application of the stability index. The proposed method is illustrated by application to five synthetic datasets replicating a real world breast cancer dataset with varying data density, and a large bioinformatics dataset. PMID:23369085
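The point that similar SSQ values can hide very different partitions can be seen in a four-point toy case. The sketch below compares two partitions by their within-cluster sum of squares and by the Rand index. The Rand index is a stand-in chosen for illustration and is not claimed to be the stability index used in the paper; the data and function names are hypothetical.

```python
from itertools import combinations

def ssq(points, labels):
    """Total within-cluster sum of squared distances to the cluster centroids."""
    total = 0.0
    for c in set(labels):
        members = [p for p, lab in zip(points, labels) if lab == c]
        centroid = [sum(col) / len(members) for col in zip(*members)]
        total += sum(sum((x - m) ** 2 for x, m in zip(p, centroid)) for p in members)
    return total

def rand_index(labels_a, labels_b):
    """Fraction of point pairs on which two partitions agree (same/different cluster)."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j]) for i, j in pairs
    )
    return agree / len(pairs)

# Hypothetical data: the four corners of a unit square.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
left_right = [0, 0, 1, 1]   # split by x coordinate
bottom_top = [0, 1, 0, 1]   # split by y coordinate

ssq_a = ssq(points, left_right)
ssq_b = ssq(points, bottom_top)
agreement = rand_index(left_right, bottom_top)
print(ssq_a, ssq_b, agreement)
```

Both partitions are equally good by SSQ, yet they agree on only a third of the point pairs, which is the kind of situation where ranking solutions by SSQ alone says little about reproducibility.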
Mantle, Peter; Modalca, Mirela; Nicholls, Andrew; Tatu, Calin; Tatu, Diana; Toncheva, Draga
2011-01-01
1H NMR spectroscopy of urine has been applied to exploring metabolomic differences between people diagnosed with Balkan endemic nephropathy (BEN), and treated by haemodialysis, and those without overt renal disease in Romania and Bulgaria. Convenience sampling was made from patients receiving haemodialysis in hospital and healthy controls in their village. Principal component analysis clustered healthy controls from both countries together. Bulgarian BEN patients clustered separately from controls, though in the same space. However, Romanian BEN patients not only also clustered away from controls but also clustered separately from the BEN patients in Bulgaria. Notably, the urinary metabolomic data of two people sampled as Romanian controls clustered within the Romanian BEN group. One of these had been suspected of incipient symptoms of BEN at the time of selection as a ‘healthy’ control. This implies, at first sight, that metabolomic analysis can be predictive of impending morbidity before conventional criteria can diagnose BEN. Separate clustering of BEN patients from Romania and Bulgaria could indicate difference in aetiology of this particular silent renal atrophy in different geographic foci across the Balkans. PMID:22069742
Žák, A; Burda, M; Vecka, M; Zeman, M; Tvrzická, E; Staňková, B
2014-01-01
Dietary composition and metabolism of fatty acids (FA) influence insulin resistance, atherogenic dyslipidemia and other components of the metabolic syndrome (MS). It is known that patients with MS exhibit a heterogeneous phenotype; however, the relationships of individual FA to MS components have not yet been consistently studied. We examined the plasma phosphatidylcholine FA composition of 166 individuals (68F/98M) with MS and of 188 (87F/101M) controls. Cluster analysis of FA divided the groups into two clusters. In cluster 1, there were 65.7 % of MS patients and 37.8 % of controls, cluster 2 contained 34.3 % of patients and 62.2 % of controls (P<0.001). Those with MS within cluster 1 (MS1) differed from individuals with MS in cluster 2 (MS2) by concentrations of glucose (P<0.05), NEFA (P<0.001), HOMA-IR (P<0.05), and levels of conjugated dienes in LDL (P<0.05). The FA composition in MS1 group differed from MS2 by higher contents of palmitoleic (+30 %), gamma-linolenic (+22 %), dihomo-gamma-linolenic (+9 %) acids and by a lower content of linoleic acid (-25 %) (all P<0.01). These FA patterns are supposed to be connected with the progression and/or impaired biochemical measures of MS (lipolysis, oxidative stress, dysglycidemia, and insulin resistance).
Ecological characteristics of Simulium breeding sites in West Africa.
Cheke, Robert A; Young, Stephen; Garms, Rolf
2017-03-01
Twenty-nine taxa of Simulium were identified amongst 527 collections of larvae and pupae from untreated rivers and streams in Liberia (362 collections in 1967-71 & 1989), Togo (125 in 1979-81), Benin (35 in 1979-81) and Ghana (5 in 1980-81). Presence or absence of associations between different taxa were used to group them into six clusters using Ward agglomerative hierarchical cluster analysis. Environmental data associated with the pre-imaginal habitats were then analysed in relation to the six clusters by one way ANOVA. The results revealed significant effects in determining the clusters of maximum river width (all P<0.001 unless stated otherwise), water temperature, dry bulb air temperature, relative humidity, altitude, type of water (on a range from trickle to large river), water level, slope, current, vegetation, light conditions, discharge, length of breeding area, environs, terrain, river bed type (P<0.01), and the supports to which the insects were attached (P<0.01). When four non-significant contributors (wet bulb temperature, river features, height of waterfall and depth) were excluded and the reduced data-set analysed by principal components analysis (PCA), the first two principal components (PCs) accounted for 87% of the variance, with geographical features dominant in PC1 and hydrological characteristics in PC2. The analyses also revealed the ecological characteristics of each taxon's pre-imaginal habitats, which are discussed with particular reference to members of the Simulium damnosum species complex, whose breeding site distributions were further analysed by canonical correspondence analysis (CCA), a method also applied to the data on non-vector species. Copyright © 2016 Elsevier B.V. All rights reserved.
Phung, Dung; Huang, Cunrui; Rutherford, Shannon; Dwirahmadi, Febi; Chu, Cordia; Wang, Xiaoming; Nguyen, Minh; Nguyen, Nga Huy; Do, Cuong Manh; Nguyen, Trung Hieu; Dinh, Tuan Anh Diep
2015-05-01
The present study is an evaluation of temporal/spatial variations of surface water quality using multivariate statistical techniques, comprising cluster analysis (CA), principal component analysis (PCA), factor analysis (FA) and discriminant analysis (DA). Eleven water quality parameters were monitored at 38 different sites in Can Tho City, a Mekong Delta area of Vietnam from 2008 to 2012. Hierarchical cluster analysis grouped the 38 sampling sites into three clusters, representing mixed urban-rural areas, agricultural areas and industrial zone. FA/PCA resulted in three latent factors for the entire research location, three for cluster 1, four for cluster 2, and four for cluster 3 explaining 60, 60.2, 80.9, and 70% of the total variance in the respective water quality. The varifactors from FA indicated that the parameters responsible for water quality variations are related to erosion from disturbed land or inflow of effluent from sewage plants and industry, discharges from wastewater treatment plants and domestic wastewater, agricultural activities and industrial effluents, and contamination by sewage waste with faecal coliform bacteria through sewer and septic systems. Discriminant analysis (DA) revealed that nephelometric turbidity units (NTU), chemical oxygen demand (COD) and NH₃ are the discriminating parameters in space, affording 67% correct assignation in spatial analysis; pH and NO₂ are the discriminating parameters according to season, assigning approximately 60% of cases correctly. The findings suggest a possible revised sampling strategy that can reduce the number of sampling sites and the indicator parameters responsible for large variations in water quality. This study demonstrates the usefulness of multivariate statistical techniques for evaluation of temporal/spatial variations in water quality assessment and management.
Cluster tool solution for fabrication and qualification of advanced photomasks
NASA Astrophysics Data System (ADS)
Schaetz, Thomas; Hartmann, Hans; Peter, Kai; Lalanne, Frederic P.; Maurin, Olivier; Baracchi, Emanuele; Miramond, Corinne; Brueck, Hans-Juergen; Scheuring, Gerd; Engel, Thomas; Eran, Yair; Sommer, Karl
2000-07-01
The reduction of wavelength in optical lithography, phase shift technology and optical proximity correction (OPC) require a rapid increase in cost-effective qualification of photomasks. Knowledge about CD variation, loss of pattern fidelity (especially for OPC patterns) and mask defects, and about their impact at wafer level, is becoming a key issue for mask quality assessment. As part of the European Community supported ESPRIT project 'Q-CAP', a new cluster concept has been developed, which allows the combination of hardware tools as well as software tools via network communication. It is designed to be open for any tool manufacturer and mask house. The bi-directional network access allows the exchange of all relevant mask data including grayscale images, measurement results, lithography parameters, defect coordinates, layout data, process data etc. and its storage in a SQL database. The system uses SEMI format descriptions as well as standard network hardware and software components for the client-server communication. Each tool is used mainly to perform its specific application without using expensive time to perform optional analysis, but the availability of the database allows each component to share the full data set gathered by all components. Therefore, the cluster can be considered as one single virtual tool. The paper shows the advantage of the cluster approach, the benefits of the tools linked together already, and a vision of a mask house in the near future.
Goekoop, Rutger; Goekoop, Jaap G
2014-01-01
The vast number of psychopathological syndromes that can be observed in clinical practice can be described in terms of a limited number of elementary syndromes that are differentially expressed. Previous attempts to identify elementary syndromes have shown limitations that have slowed progress in the taxonomy of psychiatric disorders. The aim was to examine the ability of network community detection (NCD) to identify elementary syndromes of psychopathology and move beyond the limitations of current classification methods in psychiatry. 192 patients with unselected mental disorders were tested on the Comprehensive Psychopathological Rating Scale (CPRS). Principal component analysis (PCA) was performed on the bootstrapped correlation matrix of symptom scores to extract the principal component structure (PCS). An undirected and weighted network graph was constructed from the same matrix. Network community structure (NCS) was optimized using a previously published technique. In the optimal network structure, network clusters showed an 89% match with principal components of psychopathology. Six network clusters were found: "Depression", "Mania", "Anxiety", "Psychosis", "Retardation", and "Behavioral Disorganization". Network metrics were used to quantify the continuities between the elementary syndromes. We present the first comprehensive network graph of psychopathology that is free from the biases of previous classifications: a 'Psychopathology Web'. Clusters within this network represent elementary syndromes that are connected via a limited number of bridge symptoms. Many problems of previous classifications can be overcome by using a network approach to psychopathology.
Keshtkaran, Mohammad Reza; Yang, Zhi
2017-06-01
Spike sorting is a fundamental preprocessing step for many neuroscience studies which rely on the analysis of spike trains. Most of the feature extraction and dimensionality reduction techniques that have been used for spike sorting give a projection subspace which is not necessarily the most discriminative one. Therefore, clusters which appear inherently separable in some discriminative subspace may overlap if projected using conventional feature extraction approaches, leading to poor sorting accuracy, especially when the noise level is high. In this paper, we propose a noise-robust and unsupervised spike sorting algorithm based on learning discriminative spike features for clustering. The proposed algorithm uses discriminative subspace learning to extract low-dimensional, highly discriminative features from the spike waveforms and performs clustering with automatic detection of the number of clusters. The core part of the algorithm involves iterative subspace selection using linear discriminant analysis and clustering using a Gaussian mixture model with outlier detection. A statistical test in the discriminative subspace is proposed to automatically detect the number of clusters. Comparative results on publicly available simulated and real in vivo datasets demonstrate that our algorithm achieves substantially improved cluster distinction, leading to higher sorting accuracy and more reliable detection of clusters which are highly overlapping and not detectable using conventional feature extraction techniques such as principal component analysis or wavelets. By providing more accurate information about the activity of a larger number of individual neurons with high robustness to neural noise and outliers, the proposed unsupervised spike sorting algorithm facilitates more detailed and accurate analysis of single- and multi-unit activities in neuroscience and brain machine interface studies.
Hierarchical clustering using mutual information
NASA Astrophysics Data System (ADS)
Kraskov, A.; Stögbauer, H.; Andrzejak, R. G.; Grassberger, P.
2005-04-01
We present a conceptually simple method for hierarchical clustering of data called mutual information clustering (MIC) algorithm. It uses mutual information (MI) as a similarity measure and exploits its grouping property: The MI between three objects X, Y, and Z is equal to the sum of the MI between X and Y, plus the MI between Z and the combined object (XY). We use this both in the Shannon (probabilistic) version of information theory and in the Kolmogorov (algorithmic) version. We apply our method to the construction of phylogenetic trees from mitochondrial DNA sequences and to the output of independent components analysis (ICA) as illustrated with the ECG of a pregnant woman.
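The grouping property quoted in this abstract can be checked in a few lines, under the assumption (not spelled out in the abstract) that the mutual information of several objects is defined as the sum of their individual entropies minus their joint entropy:

```latex
\begin{align*}
I(X,Y)   &= H(X) + H(Y) - H(X,Y), \\
I(X,Y,Z) &= H(X) + H(Y) + H(Z) - H(X,Y,Z), \\
I\big((X,Y),Z\big) &= H(X,Y) + H(Z) - H(X,Y,Z), \\[4pt]
I(X,Y) + I\big((X,Y),Z\big)
  &= H(X) + H(Y) + H(Z) - H(X,Y,Z) = I(X,Y,Z).
\end{align*}
```

The joint entropy H(X,Y) cancels between the two terms, which is what allows X and Y to be merged into a single object (XY) at one level of the hierarchy without losing track of the total.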
Calculating the Motion and Direction of Flux Transfer Events with Cluster
NASA Technical Reports Server (NTRS)
Collado-Vega, Yaireska M.; Sibeck, David Gary
2011-01-01
We use multi-point timing analysis to determine the orientation and motion of flux transfer events (FTEs) detected by the four Cluster spacecraft on the high-latitude dayside and flank magnetopause during 2002 and 2003. During these years, the distances between the Cluster spacecraft were greater than 1000 km, providing the tetrahedral configuration needed to select events and determine velocities. Each velocity and location will be examined in detail and compared to the velocities and locations determined by the predictions of the component and antiparallel reconnection models for event formation, orientation, motion, and acceleration for a wide range of spacecraft locations and solar wind conditions.
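A common way to set up four-spacecraft timing analysis is to treat the structure as a plane moving at constant speed V along its normal n, so that (r_i - r_0) · n = V (t_i - t_0) for each spacecraft i. The sketch below solves the resulting 3x3 linear system for m = n / V. The planar, constant-velocity assumption, the function names, and the positions and crossing times are all illustrative assumptions and are not taken from the study.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def solve3(a, b):
    """Solve a 3x3 linear system a x = b with Cramer's rule."""
    d = det3(a)
    if abs(d) < 1e-12:
        raise ValueError("spacecraft are (nearly) coplanar; system is singular")
    solution = []
    for col in range(3):
        modified = [row[:] for row in a]
        for r in range(3):
            modified[r][col] = b[r]
        solution.append(det3(modified) / d)
    return solution

def timing_analysis(positions, times):
    """Return (unit normal, speed) of a planar structure from 4 positions and crossing times."""
    r0, t0 = positions[0], times[0]
    separations = [[p[k] - r0[k] for k in range(3)] for p in positions[1:]]
    delays = [t - t0 for t in times[1:]]
    m = solve3(separations, delays)           # m = n / V
    norm = math.sqrt(sum(x * x for x in m))
    speed = 1.0 / norm
    normal = [x * speed for x in m]
    return normal, speed

# Hypothetical tetrahedron (km) and a structure moving at 100 km/s along (0.6, 0.8, 0).
positions = [(0.0, 0.0, 0.0), (1500.0, 0.0, 0.0), (0.0, 1500.0, 0.0), (0.0, 0.0, 1500.0)]
true_normal, true_speed = (0.6, 0.8, 0.0), 100.0
times = [sum(p[k] * true_normal[k] for k in range(3)) / true_speed for p in positions]

normal, speed = timing_analysis(positions, times)
print(normal, speed)
```

The singularity check reflects why a genuinely three-dimensional (tetrahedral) configuration matters: if the four spacecraft were coplanar, the separation matrix could not be inverted.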
Burnett, Andrew D; Fan, Wenhui; Upadhya, Prashanth C; Cunningham, John E; Hargreaves, Michael D; Munshi, Tasnim; Edwards, Howell G M; Linfield, Edmund H; Davies, A Giles
2009-08-01
Terahertz frequency time-domain spectroscopy has been used to analyse a wide range of samples containing cocaine hydrochloride, heroin and ecstasy--common drugs-of-abuse. We investigated real-world samples seized by law enforcement agencies, together with pure drugs-of-abuse, and pure drugs-of-abuse systematically adulterated in the laboratory to emulate real-world samples. In order to investigate the feasibility of automatic spectral recognition of such illicit materials by terahertz spectroscopy, principal component analysis was employed to cluster spectra of similar compounds.
Jiang, Shun-Yuan; Sun, Hong-Bing; Sun, Hui; Ma, Yu-Ying; Chen, Hong-Yu; Zhu, Wen-Tao; Zhou, Yi
2016-03-01
This paper explores a comprehensive assessment method that combines traditional Chinese medicinal material specifications with quantitative quality indicators. Seventy-six samples of Notopterygii Rhizoma et Radix were collected from markets and producing areas. Traditional commercial specifications were described and assigned, and 10 chemical components and volatile oils were determined for each sample. Cluster analysis, Fisher discriminant analysis and correspondence analysis were used to establish the relationship between the traditional qualitative commercial specifications and the quantitative chemical indices, for comprehensive evaluation of the quality of the medicinal materials and quantitative classification of commercial grade and quality grade. A herb quality index (HQI) incorporating traditional commercial specifications and chemical components was established for quantitative grade classification, and the corresponding discriminant functions were derived for precise determination of the quality grade and sub-grade of Notopterygii Rhizoma et Radix. The results showed that notopterol, isoimperatorin and volatile oil were the major components for determining chemical quality, and their cut-off values were specified for every grade and sub-grade of the commercial materials of Notopterygii Rhizoma et Radix. According to these results, the relationship between traditional medicinal indicators, qualitative commercial specifications, and quantitative chemical composition indicators can be examined by K-means clustering, Fisher discriminant analysis and correspondence analysis, which provides a new method for comprehensive quantitative evaluation of traditional Chinese medicine quality that integrates traditional commodity specifications with modern quantitative chemical indices. Copyright© by the Chinese Pharmaceutical Association.
NASA Astrophysics Data System (ADS)
Jee, M. James; Hughes, John P.; Menanteau, Felipe; Sifón, Cristóbal; Mandelbaum, Rachel; Barrientos, L. Felipe; Infante, Leopoldo; Ng, Karen Y.
2014-04-01
We present a Hubble Space Telescope weak-lensing study of the merging galaxy cluster "El Gordo" (ACT-CL J0102-4915) at z = 0.87 discovered by the Atacama Cosmology Telescope (ACT) collaboration as the strongest Sunyaev-Zel'dovich decrement in its ~1000 deg² survey. Our weak-lensing analysis confirms that ACT-CL J0102-4915 is indeed an extreme system consisting of two massive ($\gtrsim 10^{15}\,M_{\odot}$ each) subclusters with a projected separation of $\sim 0.7\,h_{70}^{-1}$ Mpc. This binary mass structure revealed by our lensing study is consistent with the cluster galaxy distribution and the dynamical study carried out with 89 spectroscopic members. We estimate the mass of ACT-CL J0102-4915 by simultaneously fitting two axisymmetric Navarro-Frenk-White (NFW) profiles allowing their centers to vary. We use only a single parameter for the NFW mass profile by enforcing the mass-concentration relation from numerical simulations. Our Markov-Chain-Monte-Carlo analysis shows that the masses of the northwestern (NW) and the southeastern (SE) components are $M_{200c}=(1.38\pm 0.22)\times 10^{15}\,h_{70}^{-1}\,M_{\odot}$ and $(0.78\pm 0.20)\times 10^{15}\,h_{70}^{-1}\,M_{\odot}$, respectively, where the quoted errors include only 1σ statistical uncertainties determined by the finite number of source galaxies. These mass estimates are subject to additional uncertainties (20%-30%) due to the possible presence of triaxiality, correlated/uncorrelated large scale structure, and departure of the cluster profile from the NFW model. The lensing-based velocity dispersions are $1133_{-61}^{+58}$ km s$^{-1}$ and $1064_{-66}^{+62}$ km s$^{-1}$ for the NW and SE components, respectively, which are consistent with their spectroscopic measurements ($1290 \pm 134$ km s$^{-1}$ and $1089 \pm 200$ km s$^{-1}$, respectively). The centroids of both components are tightly constrained (~4'') and close to the optical luminosity centers.
The X-ray and mass peaks are spatially offset by ~8'' ($\sim 62\,h_{70}^{-1}$ kpc), which is significant at the ~2σ confidence level. The mass peak, however, does not lead the gas peak in the direction expected if we are viewing the cluster soon after first core passage during a high speed merger. Under the assumption that the merger is happening in the plane of the sky, extrapolation of the two NFW halos to a radius $r_{200a}=2.4\,h_{70}^{-1}$ Mpc yields a combined mass of $M_{200a}=(3.13\pm 0.56)\times 10^{15}\,h_{70}^{-1}\,M_{\odot}$. This extrapolated total mass is consistent with our two-component-based dynamical analysis and previous X-ray measurements, projecting ACT-CL J0102-4915 to be the most massive cluster at z > 0.6 known to date.
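For readers unfamiliar with the quantities in this abstract, the standard NFW relations are collected below. These are textbook definitions, not expressions copied from the paper; the symbols $\rho_s$, $r_s$ and $c$ (characteristic density, scale radius, concentration) do not appear in the abstract and are introduced here only for illustration.

```latex
\begin{align*}
\rho(r) &= \frac{\rho_s}{(r/r_s)\,(1 + r/r_s)^{2}}, \\
M(<r)   &= 4\pi \rho_s r_s^{3}
           \left[ \ln(1+x) - \frac{x}{1+x} \right], \qquad x = r/r_s, \\
M_{200c} &= \frac{4}{3}\pi\, r_{200c}^{3} \cdot 200\,\rho_{c}(z),
           \qquad c = r_{200c}/r_s .
\end{align*}
```

An NFW halo has two free parameters, for example $M_{200c}$ and $c$. Tying $c$ to the mass through a mass-concentration relation $c(M)$ leaves a single free parameter per halo, which is the sense in which the fit above uses "only a single parameter". $M_{200a}$ is defined in the same way but relative to 200 times the mean matter density rather than the critical density.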
Pes, Giovanni Mario; Delitala, Alessandro Palmerio; Errigo, Alessandra; Delitala, Giuseppe; Dore, Maria Pina
2016-06-01
Latent autoimmune diabetes in adults (LADA) which accounts for more than 10 % of all cases of diabetes is characterized by onset after age 30, absence of ketoacidosis, insulin independence for at least 6 months, and presence of circulating islet-cell antibodies. Its marked heterogeneity in clinical features and immunological markers suggests the existence of multiple mechanisms underlying its pathogenesis. The principal component (PC) analysis is a statistical approach used for finding patterns in data of high dimension. In this study the PC analysis was applied to a set of variables from a cohort of Sardinian LADA patients to identify a smaller number of latent patterns. A list of 11 variables including clinical (gender, BMI, lipid profile, systolic and diastolic blood pressure and insulin-free time period), immunological (anti-GAD65, anti-IA-2 and anti-TPO antibody titers) and genetic features (predisposing gene variants previously identified as risk factors for autoimmune diabetes) retrieved from clinical records of 238 LADA patients referred to the Internal Medicine Unit of University of Sassari, Italy, were analyzed by PC analysis. The predictive value of each PC on the further development of insulin dependence was evaluated using Kaplan-Meier curves. Overall 4 clusters were identified by PC analysis. In component PC-1, the dominant variables were: BMI, triglycerides, systolic and diastolic blood pressure and duration of insulin-free time period; in PC-2: genetic variables such as Class II HLA, CTLA-4 as well as anti-GAD65, anti-IA-2 and anti-TPO antibody titers, and the insulin-free time period predominated; in PC-3: gender and triglycerides; and in PC-4: total cholesterol. These components explained 18, 15, 12, and 12 %, respectively, of the total variance in the LADA cohort. The predictive power of insulin dependence of the four components was different. 
PC-2 (characterized mostly by high antibody titers and the presence of predisposing genetic markers) showed faster beta-cell failure, whereas PC-3 (characterized mostly by gender and high triglycerides) and PC-4 (high cholesterol) showed slower beta-cell failure. PC-1 (including dyslipidemia and other metabolic dysfunctions) showed mild beta-cell failure. In conclusion, variable clustering might be consistent with different pathogenic pathways and/or distinct immune mechanisms in LADA and could potentially help physicians improve the clinical management of these patients.
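The Kaplan-Meier curves mentioned in this abstract estimate, at each observed event time, the probability of remaining event-free (here, insulin-free). A minimal sketch of the product-limit estimator follows. The durations and censoring flags are invented for illustration and the function name `kaplan_meier` is hypothetical; none of it comes from the study.

```python
def kaplan_meier(durations, events):
    """Product-limit survival estimate.

    durations: follow-up time for each patient.
    events: 1 if the event (e.g. start of insulin therapy) was observed, 0 if censored.
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(durations, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Gather everyone whose follow-up ends at time t (events and censorings).
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_removed
    return curve

# Hypothetical follow-up in months; 0 marks patients still insulin-free at last visit.
durations = [6, 12, 12, 24, 36]
events = [1, 1, 0, 1, 0]

curve = kaplan_meier(durations, events)
for time, prob in curve:
    print(time, round(prob, 3))
```

Comparing such curves between groups of patients (for example, those scoring high versus low on a given component) is one way the "predictive value" of a component could be visualised.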
ERIC Educational Resources Information Center
Meulman, Jacqueline J.; Verboon, Peter
1993-01-01
Points of view analysis, as a way to deal with individual differences in multidimensional scaling, was largely supplanted by the weighted Euclidean model. It is argued that the approach deserves new attention, especially as a technique to analyze group differences. A streamlined and integrated process is proposed. (SLD)
A semi-supervised classification algorithm using the TAD-derived background as training data
NASA Astrophysics Data System (ADS)
Fan, Lei; Ambeau, Brittany; Messinger, David W.
2013-05-01
In general, spectral image classification algorithms fall into one of two categories: supervised and unsupervised. In unsupervised approaches, the algorithm automatically identifies clusters in the data without a priori information about those clusters (except perhaps the expected number of them). Supervised approaches require an analyst to identify training data to learn the characteristics of the clusters such that they can then classify all other pixels into one of the pre-defined groups. The classification algorithm presented here is a semi-supervised approach based on the Topological Anomaly Detection (TAD) algorithm. The TAD algorithm defines background components based on a mutual k-Nearest Neighbor graph model of the data, along with a spectral connected components analysis. Here, the largest components produced by TAD are used as regions of interest (ROIs), or training data, for a supervised classification scheme. By combining those ROIs with a Gaussian Maximum Likelihood (GML) or a Minimum Distance to the Mean (MDM) algorithm, we are able to achieve a semi-supervised classification method. We test this classification algorithm against data collected by the HyMAP sensor over the Cooke City, MT area and the University of Pavia scene.
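The Minimum Distance to the Mean step can be sketched in a few lines: average the spectra in each training region, then assign every pixel to the class with the nearest mean. In the sketch below the ROIs are simply given. In the method described above they would come from the largest TAD background components, which is not reproduced here. The class names, band values and function names are hypothetical.

```python
def class_means(training):
    """Mean spectrum per class. training maps a label to a list of spectra."""
    return {
        label: [sum(band) / len(spectra) for band in zip(*spectra)]
        for label, spectra in training.items()
    }

def classify_mdm(pixel, means):
    """Assign a pixel to the class whose mean spectrum is nearest (Euclidean)."""
    return min(
        means,
        key=lambda label: sum((x - m) ** 2 for x, m in zip(pixel, means[label])),
    )

# Hypothetical three-band ROIs standing in for TAD-derived background components.
rois = {
    "vegetation": [[0.10, 0.50, 0.40], [0.12, 0.55, 0.42]],
    "soil": [[0.30, 0.35, 0.40], [0.32, 0.33, 0.38]],
}
means = class_means(rois)

pixels = [[0.11, 0.52, 0.41], [0.29, 0.36, 0.41]]
predicted = [classify_mdm(p, means) for p in pixels]
print(predicted)
```

A Gaussian Maximum Likelihood classifier would additionally estimate a covariance matrix per class, so that distance is measured relative to each class's spread rather than in plain Euclidean terms.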
NASA Astrophysics Data System (ADS)
Forbes, Angus; Villegas, Javier; Almryde, Kyle R.; Plante, Elena
2014-03-01
In this paper, we present a novel application, 3D+Time Brain View, for the stereoscopic visualization of functional Magnetic Resonance Imaging (fMRI) data gathered from participants exposed to unfamiliar spoken languages. An analysis technique based on Independent Component Analysis (ICA) is used to identify statistically significant clusters of brain activity and their changes over time during different testing sessions. That is, our system illustrates the temporal evolution of participants' brain activity as they are introduced to a foreign language through displaying these clusters as they change over time. The raw fMRI data is presented as a stereoscopic pair in an immersive environment utilizing passive stereo rendering. The clusters are presented using a ray casting technique for volume rendering. Our system incorporates the temporal information and the results of the ICA into the stereoscopic 3D rendering, making it easier for domain experts to explore and analyze the data.
Unsupervised spike sorting based on discriminative subspace learning.
Keshtkaran, Mohammad Reza; Yang, Zhi
2014-01-01
Spike sorting is a fundamental preprocessing step for many neuroscience studies which rely on the analysis of spike trains. In this paper, we present two unsupervised spike sorting algorithms based on discriminative subspace learning. The first algorithm simultaneously learns the discriminative feature subspace and performs clustering. It uses a histogram of features in the most discriminative projection to detect the number of neurons. The second algorithm performs hierarchical divisive clustering that learns a discriminative 1-dimensional subspace for clustering at each level of the hierarchy until an almost unimodal distribution is achieved in the subspace. The algorithms are tested on synthetic and in-vivo data, and are compared against two widely used spike sorting methods. The comparative results demonstrate that our spike sorting methods can achieve substantially higher accuracy in a lower-dimensional feature space, and they are highly robust to noise. Moreover, they provide significantly better cluster separability in the learned subspace than in the subspace obtained by principal component analysis or wavelet transform.
Custelcean, Radu; Williams, Neil J.; Seipp, Charles A.; ...
2015-12-18
Quantitative removal of sulfate from seawater was achieved by selective crystallization of the anion with a bis(guanidinium) ligand self-assembled in situ through imine condensation of simple components. The resulting crystalline salt has an exceptionally low aqueous solubility, on a par with BaSO4. Single-crystal X-ray diffraction analysis revealed pairs of sulfate anions clustered together with four water molecules within the crystals.
Lindsey, Cary R.; Neupane, Ghanashym; Spycher, Nicolas; ...
2018-01-03
Although many Known Geothermal Resource Areas in Oregon and Idaho were identified during the 1970s and 1980s, few were subsequently developed commercially. Because of advances in power plant design and energy conversion efficiency since the 1980s, some previously identified KGRAs may now be economically viable prospects. Unfortunately, available characterization data vary widely in accuracy, precision, and granularity, making assessments problematic. In this paper, we suggest a procedure for comparing test areas against proven resources using Principal Component Analysis and cluster identification. The result is a low-cost tool for evaluating potential exploration targets using uncertain or incomplete data.
NASA Astrophysics Data System (ADS)
Zhou, Jiangying; Lopresti, Daniel P.; Tasdizen, Tolga
1998-04-01
In this paper, we consider the problem of locating and extracting text from WWW images. A previous algorithm based on color clustering and connected components analysis works well as long as the color of each character is relatively uniform and the typography is fairly simple. It breaks down quickly, however, when these assumptions are violated. In this paper, we describe more robust techniques for dealing with this challenging problem. We present an improved color clustering algorithm that measures similarity based on both RGB and spatial proximity. Layout analysis is also incorporated to handle more complex typography. These changes significantly enhance the performance of our text detection procedure.
Fleming, Brandon J.; LaMotte, Andrew E.; Sekellick, Andrew J.
2013-01-01
Hydrogeologic regions in the fractured rock area of Maryland were classified using geographic information system tools with principal components and cluster analyses. A study area consisting of the 8-digit Hydrologic Unit Code (HUC) watersheds with rivers that flow through the fractured rock area of Maryland and bounded by the Fall Line was further subdivided into 21,431 catchments from the National Hydrography Dataset Plus. The catchments were then used as a common hydrologic unit to compile relevant climatic, topographic, and geologic variables. A principal components analysis was performed on 10 input variables, and 4 principal components that accounted for 83 percent of the variability in the original data were identified. A subsequent cluster analysis grouped the catchments based on four principal component scores into six hydrogeologic regions. Two crystalline rock hydrogeologic regions, including large parts of the Washington, D.C. and Baltimore metropolitan regions that represent over 50 percent of the fractured rock area of Maryland, are distinguished by differences in recharge, Precipitation minus Potential Evapotranspiration, sand content in soils, and groundwater contributions to streams. This classification system will provide a georeferenced digital hydrogeologic framework for future investigations of groundwater availability in the fractured rock area of Maryland.
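The choice of how many principal components to retain, as in the abstract above, is commonly made from the cumulative share of variance explained. The sketch below shows that arithmetic only. The eigenvalues are hypothetical (they are not the study's values) and were picked so that four components reach 83 percent; the function name `components_needed` and the 0.80 threshold are likewise assumptions.

```python
def components_needed(eigenvalues, threshold):
    """Smallest number of leading components whose cumulative share of
    total variance reaches `threshold` (a fraction between 0 and 1)."""
    total = sum(eigenvalues)
    cumulative = 0.0
    for count, value in enumerate(sorted(eigenvalues, reverse=True), start=1):
        cumulative += value
        if cumulative / total >= threshold:
            return count, cumulative / total
    return len(eigenvalues), 1.0

# Hypothetical eigenvalues for 10 standardized input variables (sum = 10).
eigenvalues = [3.9, 2.1, 1.3, 1.0, 0.6, 0.4, 0.3, 0.2, 0.1, 0.1]
count, share = components_needed(eigenvalues, threshold=0.80)
```

With these made-up eigenvalues, three components explain 73 percent and four explain 83 percent, so the function returns four. The scores on the retained components would then be the input to the cluster analysis.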
NASA Astrophysics Data System (ADS)
Mantini, D.; Hild, K. E., II; Alleva, G.; Comani, S.
2006-02-01
Independent component analysis (ICA) algorithms have been successfully used for signal extraction tasks in the field of biomedical signal processing. We studied the performances of six algorithms (FastICA, CubICA, JADE, Infomax, TDSEP and MRMI-SIG) for fetal magnetocardiography (fMCG). Synthetic datasets were used to check the quality of the separated components against the original traces. Real fMCG recordings were simulated with linear combinations of typical fMCG source signals: maternal and fetal cardiac activity, ambient noise, maternal respiration, sensor spikes and thermal noise. Clusters of different dimensions (19, 36 and 55 sensors) were prepared to represent different MCG systems. Two types of signal-to-interference ratios (SIR) were measured. The first involves averaging over all estimated components and the second is based solely on the fetal trace. The computation time to reach a minimum of 20 dB SIR was measured for all six algorithms. No significant dependency on gestational age or cluster dimension was observed. Infomax performed poorly when a sub-Gaussian source was included; TDSEP and MRMI-SIG were sensitive to additive noise, whereas FastICA, CubICA and JADE showed the best performances. Of all six methods considered, FastICA had the best overall performance in terms of both separation quality and computation times.
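The abstract reports separation quality as a signal-to-interference ratio in decibels but does not give the exact definitions of its two SIR measures. As a rough illustration only, the sketch below uses the generic power-ratio form, SIR = 10 log10(P_signal / P_interference); the function name `sir_db` and the toy sequences are assumptions.

```python
import math

def sir_db(signal, interference):
    """Signal-to-interference ratio in decibels from two sample sequences."""
    signal_power = sum(x * x for x in signal) / len(signal)
    interference_power = sum(x * x for x in interference) / len(interference)
    return 10.0 * math.log10(signal_power / interference_power)

# Toy example: the residual interference has one tenth of the signal amplitude,
# i.e. one hundredth of its power.
signal = [1.0, -1.0, 1.0, -1.0]
interference = [0.1, -0.1, 0.1, -0.1]
value = sir_db(signal, interference)
```

A power ratio of 100 corresponds to 20 dB, the level used in the abstract as the target when timing the algorithms.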
NASA Astrophysics Data System (ADS)
Yamashita, S.; Nakajo, T.; Naruse, H.
2009-12-01
In this study, we statistically classified the grain size distribution of the bottom surface sediment on a microtidal sand flat to analyze the depositional processes of the sediment. Multiple classification analysis revealed that two types of sediment populations exist in the bottom surface sediment. Then, we employed the sediment trend model developed by Gao and Collins (1992) for the estimation of sediment transport pathways. As a result, we found that statistical discrimination of the bottom surface sediment provides useful information for the sediment trend model while dealing with various types of sediment transport processes. The microtidal sand flat along the Kushida River estuary, Ise Bay, central Japan, was investigated, and 102 bottom surface sediment samples were obtained. Then, their grain size distribution patterns were measured by the settling tube method, and each grain size distribution parameter (mud and gravel contents, mean grain size, coefficient of variation (CV), skewness, kurtosis, and the 5th, 25th, 50th, 75th, and 95th percentiles) was calculated. Here, CV is the sorting value normalized by the mean grain size. Two classical statistical methods, principal component analysis (PCA) and fuzzy cluster analysis, were applied. The results of PCA showed that the bottom surface sediment of the study area is mainly characterized by grain size (mean grain size and 5-95 percentile) and the CV value, indicating predominantly large absolute values of factor loadings in principal component (PC) 1. PC1 is interpreted as being indicative of the grain-size trend, in which a finer grain-size distribution indicates better size sorting. The frequency distribution of PC1 has a bimodal shape and suggests the existence of two types of sediment populations. Therefore, we applied fuzzy cluster analysis, the results of which revealed two groupings of the sediment (Cluster 1 and Cluster 2). Cluster 1 shows a lower value of PC1, indicating coarse and poorly sorted sediments.
Cluster 1 sediments are distributed around the branched channel from Kushida River and show an expanding distribution from the river mouth toward the northeast direction. Cluster 2 shows a higher value of PC1, indicating fine and well-sorted sediments; this cluster is distributed in a distant area from the river mouth, including the offshore region. Therefore, Cluster 1 and Cluster 2 are interpreted as being deposited by fluvial and wave processes, respectively. Finally, on the basis of this distribution pattern, the sediment trend model was applied in areas dominated separately by fluvial and wave processes. Resultant sediment transport patterns showed good agreement with those obtained by field observations. The results of this study provide an important insight into the numerical models of sediment transport.
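The fuzzy clustering of a bimodal PC1 distribution described above can be sketched with a plain fuzzy c-means on scalar scores. This is a generic textbook version, not the authors' implementation: the PC1 scores are hypothetical, and the fuzzifier `m=2.0`, the starting centres, and the iteration count are assumptions.

```python
def fuzzy_c_means_1d(values, centers, m=2.0, iterations=50):
    """Fuzzy c-means on scalar data; returns (centers, memberships)."""
    centers = list(centers)
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(iterations):
        memberships = []
        for x in values:
            distances = [abs(x - c) for c in centers]
            if 0.0 in distances:
                # a point sitting exactly on a centre belongs fully to it
                row = [1.0 if d == 0.0 else 0.0 for d in distances]
                total = sum(row)
                row = [r / total for r in row]
            else:
                row = [1.0 / sum((d_i / d_k) ** exponent for d_k in distances)
                       for d_i in distances]
            memberships.append(row)
        # centres are means weighted by membership raised to the power m
        centers = [
            sum((u[j] ** m) * x for u, x in zip(memberships, values))
            / sum(u[j] ** m for u in memberships)
            for j in range(len(centers))
        ]
    return centers, memberships

# Hypothetical PC1 scores with two modes (low = coarse, high = fine sediment).
pc1_scores = [-2.1, -1.9, -2.0, -1.8, 1.9, 2.0, 2.2, 2.1]
centers, memberships = fuzzy_c_means_1d(pc1_scores, centers=[-1.0, 1.0])
```

Each sample receives a membership in both clusters that sums to one, so samples near the boundary between the fluvial and wave-dominated areas would show intermediate memberships instead of a hard label.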
Colour image segmentation using unsupervised clustering technique for acute leukemia images
NASA Astrophysics Data System (ADS)
Halim, N. H. Abd; Mashor, M. Y.; Nasir, A. S. Abdul; Mustafa, N.; Hassan, R.
2015-05-01
Colour image segmentation has become increasingly popular in computer vision because it is an important step in most medical image analysis tasks. This paper compares different colour components of the RGB (red, green, blue) and HSI (hue, saturation, intensity) colour models for segmenting acute leukemia images. First, partial contrast stretching is applied to the leukemia images to enhance the visibility of the blast cells. Then, an unsupervised moving k-means clustering algorithm is applied to the various colour components of the RGB and HSI colour models to segment the blast cells from the red blood cells and background regions in each leukemia image. The colour components of both models are analyzed to identify the one that gives the best segmentation performance. The segmented images are then processed using a median filter and a region growing technique to reduce noise and smooth the images. The results show that segmentation using the saturation component of the HSI colour model is the best at segmenting the nuclei of the blast cells in acute leukemia images compared with the other colour components of the RGB and HSI colour models.
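Since the saturation component of HSI is the one singled out above, a short sketch of how it is derived from RGB may help. It uses the standard HSI definitions, S = 1 - 3 min(R, G, B) / (R + G + B) and I = (R + G + B) / 3, with channel values scaled to [0, 1]. The function names and the sample pixel values are assumptions, and the moving k-means step from the abstract is not reproduced here.

```python
def hsi_saturation(r, g, b):
    """Saturation of the HSI model for RGB values in [0, 1]."""
    total = r + g + b
    if total == 0:
        return 0.0  # black: saturation is undefined, use 0 by convention
    return 1.0 - 3.0 * min(r, g, b) / total

def hsi_intensity(r, g, b):
    """Intensity of the HSI model: the mean of the three channels."""
    return (r + g + b) / 3.0

# Hypothetical pixels: a strongly coloured one and a grey background one.
coloured = hsi_saturation(0.4, 0.1, 0.6)
grey = hsi_saturation(0.5, 0.5, 0.5)
```

Grey pixels have zero saturation regardless of brightness, while strongly coloured pixels score high, which is one plausible reason a saturation image separates stained nuclei from a pale background well.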
Cross-correlating the γ-ray Sky with Catalogs of Galaxy Clusters
NASA Astrophysics Data System (ADS)
Branchini, Enzo; Camera, Stefano; Cuoco, Alessandro; Fornengo, Nicolao; Regis, Marco; Viel, Matteo; Xia, Jun-Qing
2017-01-01
We report the detection of a cross-correlation signal between Fermi Large Area Telescope diffuse γ-ray maps and catalogs of clusters. In our analysis, we considered three different catalogs: WHL12, redMaPPer, and PlanckSZ. They all show a positive correlation with different amplitudes, related to the average mass of the objects in each catalog, which also sets the catalog bias. The signal detection is confirmed by the results of a stacking analysis. The cross-correlation signal extends to rather large angular scales, around 1°, that correspond, at the typical redshift of the clusters in these catalogs, to a few to tens of megaparsecs, i.e., the typical scale-length of the large-scale structures in the universe. Most likely this signal is contributed by the cumulative emission from active galactic nuclei (AGNs) associated with the filamentary structures that converge toward the high peaks of the matter density field in which galaxy clusters reside. In addition, our analysis reveals the presence of a second component, more compact in size and compatible with a point-like emission from within individual clusters. At present, we cannot distinguish between the two most likely interpretations for such a signal, i.e., whether it is produced by AGNs inside clusters or if it is a diffuse γ-ray emission from the intracluster medium. We argue that this latter, intriguing, hypothesis might be tested by applying this technique to a low-redshift large-mass cluster sample.
Metsalu, Tauno; Vilo, Jaak
2015-01-01
Principal Component Analysis (PCA) is a widely used method of reducing the dimensionality of high-dimensional data, often followed by visualizing two of the components on a scatterplot. Although widely used, the method lacks an easy-to-use web interface that scientists with limited programming skills could use to make plots of their own data. The same applies to creating heatmaps: it is possible to add conditional formatting for Excel cells to show colored heatmaps, but for more advanced features such as clustering and experimental annotations, more sophisticated analysis tools have to be used. We present a web tool called ClustVis that aims to have an intuitive user interface. Users can upload data from a simple delimited text file that can be created in a spreadsheet program. It is possible to modify data processing methods and the final appearance of the PCA and heatmap plots by using drop-down menus, text boxes, sliders etc. Appropriate defaults are given to reduce the time needed by the user to specify input parameters. As an output, users can download the PCA plot and heatmap in one of the preferred file formats. This web server is freely available at http://biit.cs.ut.ee/clustvis/. PMID:25969447
NASA Astrophysics Data System (ADS)
Kholodov, V. A.; Yaroslavtseva, N. V.; Lazarev, V. I.; Frid, A. S.
2016-09-01
Cluster analysis and principal component analysis (PCA) have been used for the interpretation of dry sieving data. Chernozems from the treatments of long-term field experiments with different land-use patterns (annually mowed steppe, continuous potato culture, permanent black fallow, and untilled fallow since 1998 after permanent black fallow) have been used. Analysis of dry sieving data by PCA has shown that the treatments of untilled fallow after black fallow and annually mowed steppe differ most in the series considered; the content of dry aggregates of 10-7 mm makes the largest contribution to the distribution of objects along the first principal component. This fraction has been sieved in water and analyzed by PCA. In contrast to dry sieving data, the wet sieving data showed the closest mathematical distance between the treatment of untilled fallow after black fallow and the undisturbed treatment of annually mowed steppe, while the untilled fallow after black fallow and the permanent black fallow were the most distant treatments. Thus, it may be suggested that the water stability of structure is first restored after the removal of destructive anthropogenic load. However, the restoration of the distribution of structural separates to the parameters characteristic of native soils is a significantly longer process.
Clinical Study of the 3D-Master Color System among the Spanish Population.
Gómez-Polo, Cristina; Gómez-Polo, Miguel; Martínez Vázquez de Parga, Juan Antonio; Celemín-Viñuela, Alicia
2017-01-12
To study whether the shades of the 3D-Master System were grouped and represented in the chromatic space according to the three-color coordinates of value, chroma, and hue. Maxillary central incisor color was measured on tooth surfaces through the Easyshade Compact spectrophotometer using 1361 participants aged between 16 and 89. The natural (not bleached teeth) color of the middle thirds was registered in the 3D-Master System nomenclature and in the CIELCh system. Principal component analysis and cluster analysis were applied. 75 colors of the 3D-Master System were found. The statistical analysis revealed the existence of 5 cluster groups. The centroid, the average of the 75 samples, in relation to lightness (L*) was 74.64, 22.87 for chroma (C*), and 88.85 for hue (h*). All of the clusters, except cluster 3, showed significant statistical differences with the centroid for the three-color coordinates (p <0.001). The results of this study indicated that 75 shades in the 3D-Master System were grouped into 5 clusters following coordinates L*, C*, and h* resulting from the dental spectrophotometer Vita Easyshade compact. The shades that composed each cluster did not belong to the same lightness color dimension groups. There was no special uniform chromatic distribution among the colors of the 3D-Master System. © 2017 by the American College of Prosthodontists.
Zhang, Xiujun; Parry, Ronald J.
2007-01-01
The pyrrolomycins are a family of polyketide antibiotics, some of which contain a nitro group. To gain insight into the nitration mechanism associated with the formation of these antibiotics, the pyrrolomycin biosynthetic gene cluster from Actinosporangium vitaminophilum was cloned. Sequencing of ca. 56 kb of A. vitaminophilum DNA revealed 35 open reading frames (ORFs). Sequence analysis revealed a clear relationship between some of these ORFs and the biosynthetic gene cluster for pyoluteorin, a structurally related antibiotic. Since a gene transfer system could not be devised for A. vitaminophilum, additional proof for the identity of the cloned gene cluster was sought by cloning the pyrrolomycin gene cluster from Streptomyces sp. strain UC 11065, a transformable pyrrolomycin producer. Sequencing of ca. 26 kb of UC 11065 DNA revealed the presence of 17 ORFs, 15 of which exhibit strong similarity to ORFs in the A. vitaminophilum cluster as well as a nearly identical organization. Single-crossover disruption of two genes in the UC 11065 cluster abolished pyrrolomycin production in both cases. These results confirm that the genetic locus cloned from UC 11065 is essential for pyrrolomycin production, and they also confirm that the highly similar locus in A. vitaminophilum encodes pyrrolomycin biosynthetic genes. Sequence analysis revealed that both clusters contain genes encoding the two components of an assimilatory nitrate reductase. This finding suggests that nitrite is required for the formation of the nitrated pyrrolomycins. However, sequence analysis did not provide additional insights into the nitration process, suggesting the operation of a novel nitration mechanism. PMID:17158935
Wang, Xihua; Zhang, Guangxin; Xu, Y Jun; Sun, Guangzhi
2015-11-01
Assessment of the interaction between groundwater and surface water (GW-SW) can generate information that is critical to regional water resource management, especially for regions that are highly dependent on groundwater resources for irrigation. This study investigated such interaction on China's Sanjiang Plain (10.9 × 10⁴ km²) and produced results to assist sustainable regional water management for intensive agricultural activities. Methods of hierarchical cluster analysis (HCA), principal component analysis (PCA), and statistical analysis were used in this study. One hundred two water samples (60 from shallow groundwater, 7 from deep groundwater, and 35 from surface water) were collected and grouped into three clusters and seven sub-clusters during the analyses. The PCA identified four principal components of the interaction, which explained 85.9% of the total variance in the dataset and were attributed to the dissolution and evolution of gypsum, feldspar, and other natural minerals in the region, as affected by anthropic and geological (sedimentary rock mineral) activities. The analyses showed that surface water in the upper region of the Sanjiang Plain gained water from local shallow groundwater, indicating that the surface water in the upper region was relatively more resilient to withdrawal for usage, whereas in the middle region, there was only a weak interaction between shallow groundwater and surface water. In the lower region of the Sanjiang Plain, surface water lost water to shallow groundwater, indicating that the groundwater was vulnerable to pollution by pesticides and fertilizers from terrestrial sources.
An analysis of cluster headache information provided on internet websites.
Peterlin, B Lee; Gambini-Suarez, Eduardo; Lidicker, Jeffrey; Levin, Morris
2008-03-01
To evaluate the quality of websites providing cluster headache information for patients and healthcare providers. The Internet has become an increasingly important source of healthcare information. However, limited data exist regarding the quality of websites providing headache information. This was a cross-sectional study conducted in February 2007. Websites providing cluster headache information were determined on the search engine MetaCrawler and classified as either patient oriented or healthcare provider oriented. The overall quality of each site was evaluated using a score system. Readability was evaluated using the Flesch-Kincaid Grade Level Readability Score (FKRS). Website quality was analyzed based on ownership, purpose, authorship, author qualifications, attribution, interactivity, and currency. The technical quality of the cluster headache information was analyzed based on content specific to cluster headache. The final ranking, based on the sum of the ranks of all 3 categories, was determined and then contrasted between the patient-oriented and healthcare professional-oriented websites using 2-sample t-tests. Of the first 40 websites found on MetaCrawler, 72.5% were advertisements, unrelated to headache, or repeated websites. Although the standard US writing averages are at a seventh to eighth grade level, the mean FKRS of all sites was at a 12th grade level of difficulty, with no significant difference between the patient-oriented or healthcare provider-oriented websites (P = .54). Of a total possible 14 points, the overall mean quality component score was 9.9 for all sites; and of a total possible 23 points, the overall mean technical component score was 13.9. There was no significant difference for either the quality or technical component scores between patient-oriented or healthcare provider-oriented websites (P = .45 and P = .80, respectively). There are numerous cluster headache websites that can be found on the Internet. 
The quality of most of the websites dedicated to cluster headache is mediocre, and although there are some excellent cluster headache websites, these sites may be challenging for many users to locate. There was no significant difference in the overall quality of websites oriented for patients or healthcare providers providing cluster headache information evaluated in this study. In addition, websites providing high-quality cluster headache information are written at an educational level too high for a significant portion of the general population to fully utilize. Physicians should strongly consider providing lists of quality websites on cluster headache for their patients.
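The readability measure used in this study, the Flesch-Kincaid Grade Level, is a fixed formula over word, sentence, and syllable counts: 0.39 × (words / sentences) + 11.8 × (syllables / words) − 15.59. The sketch below applies it to counts supplied by the caller. The function name and the sample counts are assumptions, and counting syllables in real text needs a separate heuristic or dictionary that is not shown.

```python
def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid Grade Level from raw counts of a text sample."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

# Hypothetical passage: 100 words in 5 sentences with 150 syllables.
grade = flesch_kincaid_grade(words=100, sentences=5, syllables=150)
```

For these made-up counts the score is 9.91, roughly a tenth-grade level. Longer sentences and more syllables per word both push the grade up, which is how medical text can reach the 12th-grade level reported above.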
Yang, Yongxin; Bu, Dengpan; Zhao, Xiaowei; Sun, Peng; Wang, Jiaqi; Zhou, Lingyun
2013-04-05
To aid in unraveling diverse genetic and biological unknowns, a proteomic approach was used to analyze the whey proteome in cow, yak, buffalo, goat, and camel milk based on the isobaric tag for relative and absolute quantification (iTRAQ) techniques. This analysis is the first to produce proteomic data for the milk from the above-mentioned animal species: 211 proteins have been identified and 113 proteins have been categorized according to molecular function, cellular components, and biological processes based on gene ontology annotation. The results of principal component analysis showed significant differences in proteomic patterns among goat, camel, cow, buffalo, and yak milk. Furthermore, 177 differentially expressed proteins were submitted to advanced hierarchical clustering. The resulting clustering pattern included three major sample clusters: (1) cow, buffalo, and yak milk; (2) goat, cow, buffalo, and yak milk; and (3) camel milk. Certain proteins were chosen as characterization traits for a given species: whey acidic protein and quinone oxidoreductase for camel milk, biglycan for goat milk, uncharacterized protein (Accession Number: F1MK50) for yak milk, clusterin for buffalo milk, and primary amine oxidase for cow milk. These results help reveal the quantitative milk whey proteome pattern for the analyzed species. This provides information for evaluating adulteration of milk from a specific species and may provide potential directions for application of specific milk protein production based on physiological differences among animal species.
Advanced Treatment Monitoring for Olympic-Level Athletes Using Unsupervised Modeling Techniques
Siedlik, Jacob A.; Bergeron, Charles; Cooper, Michael; Emmons, Russell; Moreau, William; Nabhan, Dustin; Gallagher, Philip; Vardiman, John P.
2016-01-01
Context: Analysis of injury and illness data collected at large international competitions provides the US Olympic Committee and the national governing bodies for each sport with information to best prepare for future competitions. Research in which authors have evaluated medical contacts to provide the expected level of medical care and sports medicine services at international competitions is limited. Objective: To analyze the medical-contact data for athletes, staff, and coaches who participated in the 2011 Pan American Games in Guadalajara, Mexico, using unsupervised modeling techniques to identify underlying treatment patterns. Design: Descriptive epidemiology study. Setting: Pan American Games. Patients or Other Participants: A total of 618 US athletes (337 males, 281 females) participated in the 2011 Pan American Games. Main Outcome Measure(s): Medical data were recorded from the injury-evaluation and injury-treatment forms used by clinicians assigned to the central US Olympic Committee Sport Medicine Clinic and satellite locations during the operational 17-day period of the 2011 Pan American Games. We used principal components analysis and agglomerative clustering algorithms to identify and define grouped modalities. Lift statistics were calculated for within-cluster subgroups. Results: Principal component analyses identified 3 components, accounting for 72.3% of the variability in datasets. Plots of the principal components showed that individual contacts focused on 4 treatment clusters: massage, paired manipulation and mobilization, soft tissue therapy, and general medical. Conclusions: Unsupervised modeling techniques were useful for visualizing complex treatment data and provided insights for improved treatment modeling in athletes. Given its ability to detect clinically relevant treatment pairings in large datasets, unsupervised modeling should be considered a feasible option for future analyses of medical-contact data from international competitions.
PMID:26794628
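The lift statistic mentioned in this abstract measures how much more often two items occur together than would be expected if they were independent: lift(A, B) = P(A and B) / (P(A) P(B)). The abstract does not say exactly how lift was computed within clusters, so the sketch below shows only the generic definition on hypothetical contact records; the function name `lift` and the `contacts` list are assumptions.

```python
def lift(records, a, b):
    """Lift of two items: P(a and b) / (P(a) * P(b)) over a list of sets."""
    n = len(records)
    p_a = sum(a in r for r in records) / n
    p_b = sum(b in r for r in records) / n
    p_ab = sum(a in r and b in r for r in records) / n
    return p_ab / (p_a * p_b)

# Hypothetical medical contacts, each listing the modalities applied.
contacts = [
    {"manipulation", "mobilization"},
    {"manipulation", "mobilization"},
    {"massage"},
    {"massage", "soft tissue"},
    {"general medical"},
    {"manipulation"},
]
paired = lift(contacts, "manipulation", "mobilization")
unpaired = lift(contacts, "massage", "manipulation")
```

A lift above 1 indicates that two modalities are paired more often than chance would suggest, and a lift below 1 indicates that they tend not to be given together. In this toy data manipulation and mobilization have a lift of 2, while massage and manipulation never co-occur.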
Recognizing different tissues in human fetal femur cartilage by label-free Raman microspectroscopy
NASA Astrophysics Data System (ADS)
Kunstar, Aliz; Leijten, Jeroen; van Leuveren, Stefan; Hilderink, Janneke; Otto, Cees; van Blitterswijk, Clemens A.; Karperien, Marcel; van Apeldoorn, Aart A.
2012-11-01
Traditionally, the composition of bone and cartilage is determined by standard histological methods. We used Raman microscopy, which provides a molecular "fingerprint" of the investigated sample, to detect differences between the zones in human fetal femur cartilage without the need for additional staining or labeling. Raman area scans were made from the (pre)articular cartilage, resting, proliferative, and hypertrophic zones of growth plate and endochondral bone within human fetal femora. Multivariate data analysis was performed on Raman spectral datasets to construct cluster images with corresponding cluster averages. Cluster analysis resulted in detection of individual chondrocyte spectra that could be separated from cartilage extracellular matrix (ECM) spectra and was verified by comparing cluster images with intensity-based Raman images for the deoxyribonucleic acid/ribonucleic acid (DNA/RNA) band. Specific dendrograms were created using Ward's clustering method, and principal component analysis (PCA) was performed with the separated and averaged Raman spectra of cells and ECM of all measured zones. Overall (dis)similarities between measured zones were effectively visualized on the dendrograms and main spectral differences were revealed by PCA allowing for label-free detection of individual cartilaginous zones and for label-free evaluation of proper cartilaginous matrix formation for future tissue engineering and clinical purposes.
High-throughput analysis of the satellitome illuminates satellite DNA evolution
NASA Astrophysics Data System (ADS)
Ruiz-Ruano, Francisco J.; López-León, María Dolores; Cabrero, Josefa; Camacho, Juan Pedro M.
2016-07-01
Satellite DNA (satDNA) is a major component yet the great unknown of eukaryote genomes and clearly underrepresented in genome sequencing projects. Here we show the high-throughput analysis of satellite DNA content in the migratory locust by means of the bioinformatic analysis of Illumina reads with the RepeatExplorer and RepeatMasker programs. This unveiled 62 satDNA families and we propose the term “satellitome” for the whole collection of different satDNA families in a genome. The finding that satDNAs were present in many contigs of the migratory locust draft genome indicates that they occupy many genomic locations invisible to fluorescent in situ hybridization (FISH). The cytological pattern of five satellites showing common descent (belonging to the SF3 superfamily) suggests that non-clustered satDNAs can become clustered through local amplification at any of the many genomic loci resulting from previous dissemination of short satDNA arrays. The fact that all kinds of satDNA (micro-, mini-, and satellites) can show the non-clustered and clustered states suggests that all these elements are mostly similar, except for repeat length. Finally, the presence of VNTRs in bacteria, showing similar properties to non-clustered satDNAs in eukaryotes, suggests that this kind of tandem repeat shows common properties in all living beings.
A novel unsupervised spike sorting algorithm for intracranial EEG.
Yadav, R; Shah, A K; Loeb, J A; Swamy, M N S; Agarwal, R
2011-01-01
This paper presents a novel, unsupervised spike classification algorithm for intracranial EEG. The method combines template matching and principal component analysis (PCA) for building a dynamic patient-specific codebook without a priori knowledge of the spike waveforms. The problem of misclassification due to overlapping classes is resolved by identifying similar classes in the codebook using hierarchical clustering. Cluster quality is visually assessed by projecting inter- and intra-clusters onto a 3D plot. Intracranial EEG from 5 patients was utilized to optimize the algorithm. The resulting codebook retains 82.1% of the detected spikes in non-overlapping and disjoint clusters. Initial results suggest a definite role of this method for both rapid review and quantitation of interictal spikes that could enhance both clinical treatment and research studies on epileptic patients.
Cumulative trauma, hyperarousal, and suicidality in the general population: a path analysis.
Briere, John; Godbout, Natacha; Dias, Colin
2015-01-01
Although trauma exposure and posttraumatic stress disorder (PTSD) both have been linked to suicidal thoughts and behavior, the underlying basis for this relationship is not clear. In a sample of 357 trauma-exposed individuals from the general population, younger participant age, cumulative trauma exposure, and all three Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition, PTSD clusters (reexperiencing, avoidance, and hyperarousal) were correlated with clinical levels of suicidality. However, logistic regression analysis indicated that when all PTSD clusters were considered simultaneously, only hyperarousal continued to be predictive. A path analysis confirmed that posttraumatic hyperarousal (but not other components of PTSD) fully mediated the relationship between extent of trauma exposure and degree of suicidal thoughts and behaviors.
First evidence of diffuse ultra-steep-spectrum radio emission surrounding the cool core of a cluster
NASA Astrophysics Data System (ADS)
Savini, F.; Bonafede, A.; Brüggen, M.; van Weeren, R.; Brunetti, G.; Intema, H.; Botteon, A.; Shimwell, T.; Wilber, A.; Rafferty, D.; Giacintucci, S.; Cassano, R.; Cuciti, V.; de Gasperin, F.; Röttgering, H.; Hoeft, M.; White, G.
2018-05-01
Diffuse synchrotron radio emission from cosmic-ray electrons is observed at the center of a number of galaxy clusters. These sources can be classified either as giant radio halos, which occur in merging clusters, or as mini halos, which are found only in cool-core clusters. In this paper, we present the first discovery of a cool-core cluster with an associated mini halo that also shows ultra-steep-spectrum emission extending well beyond the core that resembles radio halo emission. The large-scale component is discovered thanks to LOFAR observations at 144 MHz. We also analyse GMRT observations at 610 MHz to characterise the spectrum of the radio emission. An X-ray analysis reveals that the cluster is slightly disturbed, and we suggest that the steep-spectrum radio emission outside the core could be produced by a minor merger that powers electron re-acceleration without disrupting the cool core. This discovery suggests that, under particular circumstances, both a mini and giant halo could co-exist in a single cluster, opening new perspectives for particle acceleration mechanisms in galaxy clusters.
Energy Efficient Engine Low Pressure Subsystem Flow Analysis
NASA Technical Reports Server (NTRS)
Hall, Edward J.; Lynn, Sean R.; Heidegger, Nathan J.; Delaney, Robert A.
1998-01-01
The objective of this project is to provide the capability to analyze the aerodynamic performance of the complete low pressure subsystem (LPS) of the Energy Efficient Engine (EEE). The analyses were performed using three-dimensional Navier-Stokes numerical models employing advanced clustered processor computing platforms. The analysis evaluates the impact of steady aerodynamic interaction effects between the components of the LPS at design and off-design operating conditions. Mechanical coupling is provided by adjusting the rotational speed of common shaft-mounted components until a power balance is achieved. The Navier-Stokes modeling of the complete low pressure subsystem provides critical knowledge of component aero/mechanical interactions that previously were unknown to the designer until after hardware testing.
Energy Efficient Engine Low Pressure Subsystem Aerodynamic Analysis
NASA Technical Reports Server (NTRS)
Hall, Edward J.; Delaney, Robert A.; Lynn, Sean R.; Veres, Joseph P.
1998-01-01
The objective of this study was to demonstrate the capability to analyze the aerodynamic performance of the complete low pressure subsystem (LPS) of the Energy Efficient Engine (EEE). Detailed analyses were performed using three-dimensional Navier-Stokes numerical models employing advanced clustered processor computing platforms. The analysis evaluates the impact of steady aerodynamic interaction effects between the components of the LPS at design and off-design operating conditions. Mechanical coupling is provided by adjusting the rotational speed of common shaft-mounted components until a power balance is achieved. The Navier-Stokes modeling of the complete low pressure subsystem provides critical knowledge of component aero/mechanical interactions that previously were unknown to the designer until after hardware testing.
High- and low-level hierarchical classification algorithm based on source separation process
NASA Astrophysics Data System (ADS)
Loghmari, Mohamed Anis; Karray, Emna; Naceur, Mohamed Saber
2016-10-01
High-dimensional data applications have earned great attention in recent years. We focus on remote sensing data analysis on high-dimensional space like hyperspectral data. From a methodological viewpoint, remote sensing data analysis is not a trivial task. Its complexity is caused by many factors, such as large spectral or spatial variability as well as the curse of dimensionality. The latter describes the problem of data sparseness. In this particular ill-posed problem, a reliable classification approach requires appropriate modeling of the classification process. The proposed approach is based on a hierarchical clustering algorithm in order to deal with remote sensing data in high-dimensional space. Indeed, one obvious method to perform dimensionality reduction is to use the independent component analysis process as a preprocessing step. The first particularity of our method is the special structure of its cluster tree. Most of the hierarchical algorithms associate leaves to individual clusters, and start from a large number of individual classes equal to the number of pixels; however, in our approach, leaves are associated with the most relevant sources which are represented according to mutually independent axes to specifically represent some land covers associated with a limited number of clusters. These sources contribute to the refinement of the clustering by providing complementary rather than redundant information. The second particularity of our approach is that at each level of the cluster tree, we combine both a high-level divisive clustering and a low-level agglomerative clustering. This approach reduces the computational cost since the high-level divisive clustering is controlled by a simple Boolean operator, and optimizes the clustering results since the low-level agglomerative clustering is guided by the most relevant independent sources. 
Then at each new step we obtain a new finer partition that will participate in the clustering process to enhance semantic capabilities and give good identification rates.
Wu, Xiao; Yin, Hao; Shi, Zebin; Chen, Yangyang; Qi, Kaijie; Qiao, Xin; Wang, Guoming; Cao, Peng; Zhang, Shaoling
2018-01-01
An evaluation of fruit wax components will provide us with valuable information for pear breeding and enhancing fruit quality. Here, we dissected the epicuticular wax concentration, composition and structure of mature fruits from 35 pear cultivars belonging to five different species and hybrid interspecies. A total of 146 epicuticular wax compounds were detected, and the wax composition and concentration varied dramatically among species, with the highest level of 1.53 mg/cm2 in Pyrus communis and the lowest level of 0.62 mg/cm2 in Pyrus pyrifolia. Field emission scanning electron microscopy (FESEM) analysis showed amorphous structures of the epicuticular wax crystals of different pear cultivars. Cluster analysis revealed that the Pyrus bretschneideri cultivars were grouped much closer to Pyrus pyrifolia and Pyrus ussuriensis, and the Pyrus sinkiangensis cultivars were clustered into a distant group. Based on the principal component analysis (PCA), the cultivars could be divided into three groups and five groups according to seven main classes of epicuticular wax compounds and 146 wax compounds, respectively. PMID:29875784
Yi, YaXiong; Zhang, Yong; Ding, Yue; Lu, Lu; Zhang, Tong; Zhao, Yuan; Xu, XiaoJun; Zhang, YuXin
2016-11-01
J. Sep. Sci. 2016, 39, 4147-4157. DOI: 10.1002/jssc.201600284. Yinchenhao decoction (YCHD) is a famous Chinese herbal formula recorded in the Shang Han Lun, which was prescribed by Zhongjing Zhang during 150-219 AD. A novel quantitative analysis method was developed, based on ultrahigh performance liquid chromatography coupled with a diode array detector, for the simultaneous determination of 14 main active components in Yinchenhao decoction. Furthermore, the method has been applied for compositional difference analysis of the 14 components in eight normal extraction samples of Yinchenhao decoction, with the aid of hierarchical clustering analysis and similarity analysis. The present research could help hospitals, factories and laboratories choose the best way to prepare Yinchenhao decoction with better efficacy. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Effect of video server topology on contingency capacity requirements
NASA Astrophysics Data System (ADS)
Kienzle, Martin G.; Dan, Asit; Sitaram, Dinkar; Tetzlaff, William H.
1996-03-01
Video servers need to assign a fixed set of resources to each video stream in order to guarantee on-time delivery of the video data. If a server has insufficient resources to guarantee the delivery, it must reject the stream request rather than slowing down all existing streams. Large scale video servers are being built as clusters of smaller components, so as to be economical, scalable, and highly available. This paper uses a blocking model developed for telephone systems to evaluate video server cluster topologies. The goal is to achieve high utilization of the components and low per-stream cost combined with low blocking probability and high user satisfaction. The analysis shows substantial economies of scale achieved by larger server images. Simple distributed server architectures can result in partitioning of resources with low achievable resource utilization. By comparing achievable resource utilization of partitioned and monolithic servers, we quantify the cost of partitioning. Next, we present an architecture for a distributed server system that avoids resource partitioning and results in highly efficient server clusters. Finally, we show how, in these server clusters, further optimizations can be achieved through caching and batching of video streams.
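The cost of partitioning described in this abstract can be illustrated with a small calculation. The abstract does not name the specific telephone blocking model, so the sketch below assumes the classic Erlang B formula; the stream capacities (100 and 25) and the 1% blocking target are hypothetical values chosen only for illustration.

```python
def erlang_b(offered_load, servers):
    """Erlang B blocking probability for `servers` channels
    offered `offered_load` erlangs (standard stable recursion)."""
    b = 1.0
    for n in range(1, servers + 1):
        b = offered_load * b / (n + offered_load * b)
    return b


def max_offered_load(servers, target_blocking):
    """Largest offered load whose blocking stays below the target.
    Found by bisection, since blocking grows monotonically with load."""
    lo, hi = 0.0, 1.0
    while erlang_b(hi, servers) < target_blocking:
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if erlang_b(mid, servers) < target_blocking:
            lo = mid
        else:
            hi = mid
    return lo


def utilization(servers, target_blocking):
    """Carried load per channel at the target blocking probability."""
    load = max_offered_load(servers, target_blocking)
    carried = load * (1.0 - erlang_b(load, servers))
    return carried / servers


# Hypothetical comparison: one server image with 100 stream slots
# versus one of four partitions with 25 slots each.
monolithic = utilization(100, 0.01)
partitioned = utilization(25, 0.01)
print(f"monolithic utilization:  {monolithic:.3f}")
print(f"partitioned utilization: {partitioned:.3f}")
```

Under these assumptions the larger pool sustains a higher utilization at the same blocking probability, which is the economy-of-scale effect the abstract refers to.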
The fine-scale genetic structure and evolution of the Japanese population.
Takeuchi, Fumihiko; Katsuya, Tomohiro; Kimura, Ryosuke; Nabika, Toru; Isomura, Minoru; Ohkubo, Takayoshi; Tabara, Yasuharu; Yamamoto, Ken; Yokota, Mitsuhiro; Liu, Xuanyao; Saw, Woei-Yuh; Mamatyusupu, Dolikun; Yang, Wenjun; Xu, Shuhua; Teo, Yik-Ying; Kato, Norihiro
2017-01-01
The contemporary Japanese populations largely consist of three genetically distinct groups-Hondo, Ryukyu and Ainu. By principal-component analysis, while the three groups can be clearly separated, the Hondo people, comprising 99% of the Japanese, form one almost indistinguishable cluster. To understand fine-scale genetic structure, we applied powerful haplotype-based statistical methods to genome-wide single nucleotide polymorphism data from 1600 Japanese individuals, sampled from eight distinct regions in Japan. We then combined the Japanese data with 26 other Asian populations data to analyze the shared ancestry and genetic differentiation. We found that the Japanese could be separated into nine genetic clusters in our dataset, showing a marked concordance with geography; and that major components of ancestry profile of Japanese were from the Korean and Han Chinese clusters. We also detected and dated admixture in the Japanese. While genetic differentiation between Ryukyu and Hondo was suggested to be caused in part by positive selection, genetic differentiation among the Hondo clusters appeared to result principally from genetic drift. Notably, in Asians, we found the possibility that positive selection accentuated genetic differentiation among distant populations but attenuated genetic differentiation among close populations. These findings are significant for studies of human evolution and medical genetics.
Oberle, Michael; Wohlwend, Nadia; Jonas, Daniel; Maurer, Florian P.; Jost, Geraldine; Tschudin-Sutter, Sarah; Vranckx, Katleen; Egli, Adrian
2016-01-01
Background The technical, biological, and inter-center reproducibility of matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI TOF MS) typing data has not yet been explored. The aim of this study is to compare typing data from multiple centers employing bioinformatics using bacterial strains from two past outbreaks and non-related strains. Material/Methods Participants received twelve extended spectrum betalactamase-producing E. coli isolates and followed the same standard operating procedure (SOP) including a full-protein extraction protocol. All laboratories provided visually read spectra via flexAnalysis (Bruker, Germany). Raw data from each laboratory allowed calculating the technical and biological reproducibility between centers using BioNumerics (Applied Maths NV, Belgium). Results Technical and biological reproducibility ranged between 96.8–99.4% and 47.6–94.4%, respectively. The inter-center reproducibility showed a comparable clustering among identical isolates. Principal component analysis indicated a higher tendency to cluster within the same center. Therefore, we used a discriminant analysis, which completely separated the clusters. Next, we defined a reference center and performed a statistical analysis to identify specific peaks to identify the outbreak clusters. Finally, we used a classifier algorithm and a linear support vector machine on the determined peaks as classifier. A validation showed that within the set of the reference center, the identification of the cluster was 100% correct with a large contrast between the score with the correct cluster and the next best scoring cluster. Conclusions Based on the sufficient technical and biological reproducibility of MALDI-TOF MS based spectra, detection of specific clusters is possible from spectra obtained from different centers. However, we believe that a shared SOP and a bioinformatics approach are required to make the analysis robust and reliable. PMID:27798637
Slaus, Mario; Tomicić, Zeljko; Uglesić, Ante; Jurić, Radomir
2004-08-01
To determine the ethnic composition of the early medieval Croats, the location from which they migrated to the east coast of the Adriatic, and to separate early medieval Croats from Bijelo brdo culture members, using principal components analysis and discriminant function analysis of craniometric data from Central and South-East European medieval archaeological sites. Mean male values for 8 cranial measurements from 39 European and 5 Iranian sites were analyzed by principal components analysis. Raw data for 17 cranial measurements for 103 female and 112 male skulls were used to develop discriminant functions. The scatter-plot of the analyzed sites on the first 2 principal components showed a pattern of intergroup relationships consistent with geographical and archaeological information not included in the data set. The first 2 principal components separated the sites into 4 distinct clusters: Avaroslav sites west of the Danube, Avaroslav sites east of the Danube, Bijelo brdo sites, and Polish sites. All early medieval Croat sites were located in the cluster of Polish sites. Two discriminant functions successfully differentiated between early medieval Croats and Bijelo brdo members. Overall accuracies were high -- 89.3% for males, and 97.1% for females. Early medieval Croats seem to be of Slavic ancestry, and at one time shared a common homeland with medieval Poles. Application of unstandardized discriminant function coefficients to unclassified crania from 18 sites showed an expansion of early medieval Croats into continental Croatia during the 10th to 13th century.
Pain sensitivity profiles in patients with advanced knee osteoarthritis
Frey-Law, Laura A.; Bohr, Nicole L.; Sluka, Kathleen A.; Herr, Keela; Clark, Charles R.; Noiseux, Nicolas O.; Callaghan, John J; Zimmerman, M Bridget; Rakel, Barbara A.
2016-01-01
The development of patient profiles to subgroup individuals on a variety of variables has gained attention as a potential means to better inform clinical decision-making. Patterns of pain sensitivity response specific to quantitative sensory testing (QST) modality have been demonstrated in healthy subjects. It has not been determined if these patterns persist in a knee osteoarthritis population. In a sample of 218 participants, 19 QST measures along with pain, psychological factors, self-reported function, and quality of life were assessed prior to total knee arthroplasty. Component analysis was used to identify commonalities across the 19 QST assessments to produce standardized pain sensitivity factors. Cluster analysis then grouped individuals that exhibited similar patterns of standardized pain sensitivity component scores. The QST resulted in four pain sensitivity components: heat, punctate, temporal summation, and pressure. Cluster analysis resulted in five pain sensitivity profiles: a “low pressure pain” group, an “average pain” group, and three “high pain” sensitivity groups who were sensitive to different modalities (punctate, heat, and temporal summation). Pain and function differed between pain sensitivity profiles, along with sex distribution; however no differences in OA grade, medication use, or psychological traits were found. Residualizing QST data by age and sex resulted in similar components and pain sensitivity profiles. Further, these profiles are surprisingly similar to those reported in healthy populations suggesting that individual differences in pain sensitivity are a robust finding even in an older population with significant disease. PMID:27152688
Chemical Polymorphism of Essential Oils of Artemisia vulgaris Growing Wild in Lithuania.
Judzentiene, Asta; Budiene, Jurga
2018-02-01
Compositional variability of mugwort (Artemisia vulgaris L.) essential oils has been investigated in the study. Plant material (over ground parts at full flowering stage) was collected from forty-four wild populations in Lithuania. The oils from aerial parts were obtained by hydrodistillation and analyzed by GC(FID) and GC/MS. In total, up to 111 components were determined in the oils. As the major constituents were found: sabinene, 1,8-cineole, artemisia ketone, both thujone isomers, camphor, cis-chrysanthenyl acetate, davanone and davanone B. The compositional data were subjected to statistical analysis. The application of PCA (Principal Component Analysis) and AHC (Agglomerative Hierarchical Clustering) allowed grouping the oils into six clusters. AHC permitted to distinguish an artemisia ketone chemotype, which, to the best of our knowledge, is very scarce. Additionally, two rare cis-chrysanthenyl acetate and sabinene oil types were determined for the plants growing in Lithuania. Besides, davanone was found for the first time as a principal component in mugwort oils. The performed study revealed significant chemical polymorphism of essential oils in mugwort plants native to Lithuania; it has expanded our chemotaxonomic knowledge both of A. vulgaris species and Artemisia genus. © 2018 Wiley-VHCA AG, Zurich, Switzerland.
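As a rough illustration of agglomerative hierarchical clustering (AHC), the sketch below merges the two closest clusters repeatedly until a requested number remains. The abstract does not state which linkage or distance was used, so average linkage with Euclidean distance is an assumption here, and the two-dimensional points are made-up toy values, not essential-oil data.

```python
import math


def agglomerative_clusters(points, k):
    """Average-linkage agglomerative clustering down to k clusters.
    Returns a list of clusters, each a list of point indices."""
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Mean pairwise Euclidean distance between two clusters.
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > k:
        # Find the closest pair of clusters and merge them.
        _, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Toy data: three visibly separated pairs of points.
toy_points = [(0, 0), (0, 1), (10, 10), (10, 11), (20, 0), (21, 0)]
groups = agglomerative_clusters(toy_points, 3)
print(groups)
```

In a chemotype study the rows would be composition profiles (one value per constituent), often after scaling or a PCA projection, rather than raw coordinates.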
A Systematic Analysis of Candidate Genes Associated with Nicotine Addiction
Liu, Meng; Li, Xia; Fan, Rui; Liu, Xinhua; Wang, Ju
2015-01-01
Nicotine, as the major psychoactive component of tobacco, has broad physiological effects within the central nervous system, but our understanding of the molecular mechanism underlying its neuronal effects remains incomplete. In this study, we performed a systematic analysis on a set of nicotine addiction-related genes to explore their characteristics at network levels. We found that NAGenes tended to have a more moderate degree and weaker clustering coefficient and to be less central in the network compared to alcohol addiction-related genes or cancer genes. Further, clustering of these genes resulted in six clusters with themes in synaptic transmission, signal transduction, metabolic process, and apoptosis, which provided an intuitional view on the major molecular functions of the genes. Moreover, functional enrichment analysis revealed that neurodevelopment, neurotransmission activity, and metabolism related biological processes were involved in nicotine addiction. In summary, by analyzing the overall characteristics of the nicotine addiction related genes, this study provided valuable information for understanding the molecular mechanisms underlying nicotine addiction. PMID:26097843
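The network measures named in this abstract, node degree and clustering coefficient, can be computed directly from an adjacency structure. The sketch below is only a minimal illustration on a hypothetical four-node graph; the node names are placeholders and do not correspond to any gene in the study.

```python
from itertools import combinations


def degree(adj, node):
    """Number of neighbours of a node."""
    return len(adj[node])


def clustering_coefficient(adj, node):
    """Local clustering coefficient: the fraction of pairs of
    neighbours of `node` that are themselves connected."""
    neighbours = adj[node]
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))


# Hypothetical undirected graph with edges A-B, A-C, B-C, C-D.
toy_graph = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
for name in sorted(toy_graph):
    print(name, degree(toy_graph, name), clustering_coefficient(toy_graph, name))
```

A node whose neighbours are rarely connected to each other gets a low coefficient, which is the sense in which the abstract describes a weaker clustering coefficient.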
Gambling, games of skill and human ecology: a pilot study by a multidimensional analysis approach.
Valera, Luca; Giuliani, Alessandro; Gizzi, Alessio; Tartaglia, Francesco; Tambone, Vittoradolfo
2015-01-01
The present pilot study aims at analyzing the human activity of playing in the light of an indicator of human ecology (HE). We highlighted the four essential anthropological dimensions (FEAD), starting from the analysis of questionnaires administered to actual gamers. The coherence between theoretical construct and observational data is a remarkable proof-of-concept of the possibility of establishing an experimentally motivated link between a philosophical construct (coming from Huizinga's Homo ludens definition) and actual gamers' motivation pattern. The starting hypothesis is that the activity of playing becomes ecological (and thus not harmful) when it achieves harmony between the FEAD, thus realizing HE; conversely, it risks creating some form of addiction when it destroys the FEAD balance. We analyzed the data by means of variable clustering (oblique principal components) so as to verify experimentally the existence of the hypothesized dimensions. The subsequent projection of statistical units (gamers) on the orthogonal space spanned by principal components allowed us to generate a meaningful, albeit preliminary, clustering of gamer profiles.
Analysis of Helium Segregation on Surfaces of Plasma-Exposed Tungsten
NASA Astrophysics Data System (ADS)
Maroudas, Dimitrios; Hu, Lin; Hammond, Karl; Wirth, Brian
2015-11-01
We report a systematic theoretical and atomic-scale computational study of implanted helium segregation on surfaces of tungsten, which is considered as a plasma facing component in nuclear fusion reactors. We employ a hierarchy of atomic-scale simulations, including molecular statics to understand the origin of helium surface segregation, targeted molecular-dynamics (MD) simulations of near-surface cluster reactions, and large-scale MD simulations of implanted helium evolution in plasma-exposed tungsten. We find that small, mobile helium clusters (of 1-7 He atoms) in the near-surface region are attracted to the surface due to an elastic interaction force. This thermodynamic driving force induces drift fluxes of these mobile clusters toward the surface, facilitating helium segregation. Moreover, the clusters' drift toward the surface enables cluster reactions, most importantly trap mutation, at rates much higher than in the bulk material. This cluster dynamics has significant effects on the surface morphology, near-surface defect structures, and the amount of helium retained in the material upon plasma exposure.
Chen, Lei Tai; Sun, Ai Qing; Yang, Min; Chen, Lu Lu; Ma, Xue Li; Li, Mei Ling; Yin, Yan Ping
2016-09-01
A total of 16 wheat cultivars were selected to assess the seed vigor of different genotypes using a standard germination test, seed germination tests under stress conditions and a field emergence test. The adversity resistance indices of the seed vigor indices and the field emergence percentage under different germination conditions were used as the indices to evaluate adversity resistance. Principal component analysis and cluster analysis were used for the comprehensive evaluation of seed vigor. Results showed that drought stress, artificial aging and cold soaking treatments affected seed vigor to some extent. The adversity resistance indices of the artificial aging and cold soaking tests were significantly positively correlated with the field emergence percentage, while the adversity resistance index of the drought stress test had no significant correlation with the field emergence percentage. The 16 wheat cultivars were classified into three groups based on the principal component analysis and cluster analysis. Yunong 949, Yumai 49-198, Luyuan 502, Zhengyumai 9987, Shimai 21, Shannong 23, and Shixin 828 belonged to the high-vigor group. Xunong 5, Yunong 982, Tangmai 8, Jimai 20, Jimai 22, Jinan 17, and Shannong 20 belonged to the medium-vigor group. The other two cultivars, Chang 4738 and Lunxuan 061, belonged to the low-vigor group.
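To show how a principal component can condense several correlated indices into one score per sample, the sketch below extracts the first principal component by power iteration on the sample covariance matrix. This is a generic illustration, not the procedure used in the study; the two-column data are made-up toy numbers, and the all-ones starting vector is an assumption that would need changing if it happened to be orthogonal to the leading component.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (direction, scores) for the first principal component.
    `rows` is a list of equal-length numeric tuples (samples x variables)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [
        [sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
        for a in range(d)
    ]
    # Power iteration: repeated multiplication converges to the
    # eigenvector with the largest eigenvalue.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores


# Toy data lying roughly along the line y = 2x.
toy_rows = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8)]
direction, scores = first_principal_component(toy_rows)
print(direction)
print(scores)
```

The resulting scores could then be ranked or passed to a cluster analysis to group samples, which is the general pattern of a PCA-plus-clustering evaluation.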
Shape component analysis: structure-preserving dimension reduction on biological shape spaces.
Lee, Hao-Chih; Liao, Tao; Zhang, Yongjie Jessica; Yang, Ge
2016-03-01
Quantitative shape analysis is required by a wide range of biological studies across diverse scales, ranging from molecules to cells and organisms. In particular, high-throughput and systems-level studies of biological structures and functions have started to produce large volumes of complex high-dimensional shape data. Analysis and understanding of high-dimensional biological shape data require dimension-reduction techniques. We have developed a technique for non-linear dimension reduction of 2D and 3D biological shape representations on their Riemannian spaces. A key feature of this technique is that it preserves distances between different shapes in an embedded low-dimensional shape space. We demonstrate an application of this technique by combining it with non-linear mean-shift clustering on the Riemannian spaces for unsupervised clustering of shapes of cellular organelles and proteins. Source code and data for reproducing results of this article are freely available at https://github.com/ccdlcmu/shape_component_analysis_Matlab . The implementation was made in MATLAB and supported on MS Windows, Linux and Mac OS. Contact: geyang@andrew.cmu.edu. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
NeatMap--non-clustering heat map alternatives in R.
Rajaram, Satwik; Oono, Yoshi
2010-01-22
The clustered heat map is the most popular means of visualizing genomic data. It compactly displays a large amount of data in an intuitive format that facilitates the detection of hidden structures and relations in the data. However, it is hampered by its use of cluster analysis which does not always respect the intrinsic relations in the data, often requiring non-standardized reordering of rows/columns to be performed post-clustering. This sometimes leads to uninformative and/or misleading conclusions. Often it is more informative to use dimension-reduction algorithms (such as Principal Component Analysis and Multi-Dimensional Scaling) which respect the topology inherent in the data. Yet, despite their proven utility in the analysis of biological data, they are not as widely used. This is at least partially due to the lack of user-friendly visualization methods with the visceral impact of the heat map. NeatMap is an R package designed to meet this need. NeatMap offers a variety of novel plots (in 2 and 3 dimensions) to be used in conjunction with these dimension-reduction techniques. Like the heat map, but unlike traditional displays of such results, it allows the entire dataset to be displayed while visualizing relations between elements. It also allows superimposition of cluster analysis results for mutual validation. NeatMap is shown to be more informative than the traditional heat map with the help of two well-known microarray datasets. NeatMap thus preserves many of the strengths of the clustered heat map while addressing some of its deficiencies. It is hoped that NeatMap will spur the adoption of non-clustering dimension-reduction algorithms.
Clustering execution in a processing system to increase power savings
Bose, Pradip; Buyuktosunoglu, Alper; Jacobson, Hans M.; Vega, Augusto J.
2018-03-20
Embodiments relate to clustering execution in a processing system. An aspect includes accessing a control flow graph that defines a data dependency and an execution sequence of a plurality of tasks of an application that executes on a plurality of system components. The execution sequence of the tasks in the control flow graph is modified as a clustered control flow graph that clusters active and idle phases of a system component while maintaining the data dependency. The clustered control flow graph is sent to an operating system, where the operating system utilizes the clustered control flow graph for scheduling the tasks.
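One way to picture the idea of clustering active phases while maintaining data dependencies is a topological ordering that prefers to keep running tasks on the component that is already active. The sketch below is a simplified illustration and not the method claimed in the patent; the task names, component names and the greedy tie-breaking rule are all hypothetical.

```python
def clustered_order(task_component, dependencies):
    """Order tasks so that dependencies are respected and consecutive
    tasks stay on the same component whenever a ready task allows it.

    task_component: dict mapping task name -> component name
    dependencies:   list of (before, after) task pairs
    """
    indegree = {t: 0 for t in task_component}
    successors = {t: [] for t in task_component}
    for before, after in dependencies:
        successors[before].append(after)
        indegree[after] += 1

    ready = sorted(t for t in task_component if indegree[t] == 0)
    order, current = [], None
    while ready:
        # Prefer a ready task on the component that is already active.
        same = [t for t in ready if task_component[t] == current]
        chosen = same[0] if same else ready[0]
        ready.remove(chosen)
        order.append(chosen)
        current = task_component[chosen]
        for nxt in successors[chosen]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
        ready.sort()
    return order


def component_switches(order, task_component):
    """Count how often consecutive tasks run on different components."""
    return sum(
        1 for a, b in zip(order, order[1:])
        if task_component[a] != task_component[b]
    )


# Hypothetical application: two independent chains on two components.
toy_tasks = {"a1": "cpu", "a2": "cpu", "b1": "gpu", "b2": "gpu"}
toy_deps = [("a1", "a2"), ("b1", "b2")]
schedule = clustered_order(toy_tasks, toy_deps)
print(schedule, component_switches(schedule, toy_tasks))
```

Fewer switches mean each component has longer contiguous idle phases, which is where power-saving states could plausibly be applied.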
Pang, Yuanjie; Peng, Roger D; Jones, Miranda R; Francesconi, Kevin A; Goessler, Walter; Howard, Barbara V; Umans, Jason G; Best, Lyle G; Guallar, Eliseo; Post, Wendy S; Kaufman, Joel D; Vaidya, Dhananjay; Navas-Acien, Ana
2016-05-01
Natural and anthropogenic sources of metal exposure differ for urban and rural residents. We sought to identify patterns of metal mixtures which could suggest common environmental sources and/or metabolic pathways of different urinary metals, and compared metal mixtures in two population-based studies from urban/sub-urban and rural/town areas in the US: the Multi-Ethnic Study of Atherosclerosis (MESA) and the Strong Heart Study (SHS). We studied a random sample of 308 White, Black, Chinese-American, and Hispanic participants in MESA (2000-2002) and 277 American Indian participants in SHS (1998-2003). We used principal component analysis (PCA), cluster analysis (CA), and linear discriminant analysis (LDA) to evaluate nine urinary metals (antimony [Sb], arsenic [As], cadmium [Cd], lead [Pb], molybdenum [Mo], selenium [Se], tungsten [W], uranium [U] and zinc [Zn]). For arsenic, we used the sum of inorganic and methylated species (∑As). All nine urinary metals were higher in SHS compared to MESA participants. PCA and CA revealed the same patterns in SHS, suggesting 4 distinct principal components (PC) or clusters (∑As-U-W, Pb-Sb, Cd-Zn, Mo-Se). In MESA, CA showed 2 large clusters (∑As-Mo-Sb-U-W, Cd-Pb-Se-Zn), while PCA showed 4 PCs (Sb-U-W, Pb-Se-Zn, Cd-Mo, ∑As). LDA indicated that ∑As, U, W, and Zn were the most discriminant variables distinguishing MESA and SHS participants. In SHS, the ∑As-U-W cluster and PC might reflect groundwater contamination in rural areas, and the Cd-Zn cluster and PC could reflect common sources from meat products or metabolic interactions. Among the metals assayed, ∑As, U, W and Zn differed the most between MESA and SHS, possibly reflecting disproportionate exposure from drinking water and perhaps food in rural Native communities compared to urban communities around the US. Copyright © 2016 Elsevier Inc. All rights reserved.
Bessette, Katie L; Jenkins, Lisanne M; Skerrett, Kristy A; Gowins, Jennifer R; DelDonno, Sophie R; Zubieta, Jon-Kar; McInnis, Melvin G; Jacobs, Rachel H; Ajilore, Olusola; Langenecker, Scott A
2018-01-01
There is substantial variability across studies of default mode network (DMN) connectivity in major depressive disorder, and reliability and time-invariance are not reported. This study evaluates whether DMN dysconnectivity in remitted depression (rMDD) is reliable over time and symptom-independent, and explores convergent relationships with cognitive features of depression. A longitudinal study was conducted with 82 young adults free of psychotropic medications (47 rMDD, 35 healthy controls) who completed clinical structured interviews, neuropsychological assessments, and 2 resting-state fMRI scans across 2 study sites. Functional connectivity analyses from bilateral posterior cingulate and anterior hippocampal formation seeds in DMN were conducted at both time points within a repeated-measures analysis of variance to compare groups and evaluate reliability of group-level connectivity findings. Eleven hyper- (from posterior cingulate) and 6 hypo- (from hippocampal formation) connectivity clusters in rMDD were obtained with moderate to adequate reliability in all but one cluster (ICC range = 0.50 to 0.76 for 16 of 17). The significant clusters were reduced with a principal component analysis (5 components obtained) to explore these connectivity components, and were then correlated with cognitive features (rumination, cognitive control, learning and memory, and explicit emotion identification). At the exploratory level, for convergent validity, components consisting of posterior cingulate with cognitive control network hyperconnectivity in rMDD were related to cognitive control (inverse) and rumination (positive). Components consisting of anterior hippocampal formation with social emotional network and DMN hypoconnectivity were related to memory (inverse) and happy emotion identification (positive). Thus, time-invariant DMN connectivity differences exist early in the lifespan course of depression and are reliable. 
The nuanced results suggest a ventral within-network hypoconnectivity associated with poor memory and a dorsal cross-network hyperconnectivity linked to poorer cognitive control and elevated rumination. Study of early course remitted depression with attention to reliability and symptom independence could lead to more readily translatable clinical assessment tools for biomarkers.
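The test-retest reliability figures quoted in this abstract are intraclass correlation coefficients (ICCs). The abstract does not say which ICC form was used, so the following is only a minimal sketch that assumes the consistency form ICC(3,1) computed from a subjects-by-sessions table; the function name `icc_consistency` and the input layout are illustrative, not taken from the study.

```python
import numpy as np

def icc_consistency(scores):
    """ICC(3,1): two-way mixed model, single measurement, consistency.

    `scores` is an (n_subjects, k_sessions) array, e.g. one connectivity
    value per subject at each of two scans (assumed layout).
    """
    scores = np.asarray(scores, dtype=float)
    n, k = scores.shape
    grand = scores.mean()
    # Sums of squares from a two-way ANOVA without replication.
    ss_rows = k * ((scores.mean(axis=1) - grand) ** 2).sum()   # between subjects
    ss_cols = n * ((scores.mean(axis=0) - grand) ** 2).sum()   # between sessions
    ss_total = ((scores - grand) ** 2).sum()
    ms_rows = ss_rows / (n - 1)
    ms_err = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err)
```

Because this form measures consistency rather than absolute agreement, a constant shift between the two sessions does not lower the coefficient; an agreement form such as ICC(2,1) would penalize that shift.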
Constable, Fiona E.; Nancarrow, Narelle; Plummer, Kim M.; Rodoni, Brendan
2017-01-01
PCR amplicon next generation sequencing (NGS) analysis offers a broadly applicable and targeted approach to detect populations of both high- and low-frequency virus variants in one or more plant samples. In this study, amplicon NGS was used to explore the diversity of the tripartite genome virus, Prunus necrotic ringspot virus (PNRSV) from 53 PNRSV-infected trees using amplicons from conserved gene regions of each of PNRSV RNA1, RNA2 and RNA3. Sequencing of the amplicons from 53 PNRSV-infected trees revealed differing levels of polymorphism across the three different components of the PNRSV genome with a total number of 5040, 2083 and 5486 sequence variants observed for RNA1, RNA2 and RNA3 respectively. The RNA2 had the lowest diversity of sequences compared to RNA1 and RNA3, reflecting the lack of flexibility tolerated by the replicase gene that is encoded by this RNA component. Distinct PNRSV phylo-groups, consisting of closely related clusters of sequence variants, were observed in each of PNRSV RNA1, RNA2 and RNA3. Most plant samples had a single phylo-group for each RNA component. Haplotype network analysis showed that smaller clusters of PNRSV sequence variants were genetically connected to the largest sequence variant cluster within a phylo-group of each RNA component. Some plant samples had sequence variants occurring in multiple PNRSV phylo-groups in at least one of each RNA and these phylo-groups formed distinct clades that represent PNRSV genetic strains. Variants within the same phylo-group of each Prunus plant sample had ≥97% similarity and phylo-groups within a Prunus plant sample and between samples had ≤97% similarity. Based on the analysis of diversity, a definition of a PNRSV genetic strain was proposed. The proposed definition was applied to determine the number of PNRSV genetic strains in each of the plant samples and the complexity in defining genetic strains in multipartite genome viruses was explored. PMID:28632759
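The 97% similarity threshold in this abstract can be illustrated with a small grouping routine. This is a simplified sketch, not the authors' procedure: it assumes the variants are already aligned to equal length, uses plain percent identity as the similarity measure, and links variants by single linkage. The phylo-groups in the study came from phylogenetic and haplotype network analysis, and the helper names here are hypothetical.

```python
def percent_identity(a, b):
    """Percent of identical positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)

def group_variants(seqs, threshold=97.0):
    """Single-linkage grouping: variants sharing >= threshold identity are joined."""
    names = list(seqs)
    parent = {n: n for n in names}

    def find(n):
        # Union-find root lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if percent_identity(seqs[a], seqs[b]) >= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(groups.values())
```

With single linkage, two variants below the threshold can still end up in one group if a chain of intermediate variants connects them, which is one reason a tree-based definition may behave differently.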
NASA Technical Reports Server (NTRS)
Li, Z. K.
1985-01-01
A specialized program was developed for flow cytometric list-mode data using a hierarchical tree method for identifying and enumerating individual subpopulations, the method of principal components for a two-dimensional display of the 6-parameter data array, and a standard sorting algorithm for characterizing subpopulations. The program was tested against a published data set subjected to cluster analysis and experimental data sets from controlled flow cytometry experiments using a Coulter Electronics EPICS V Cell Sorter. A version of the program in compiled BASIC is usable on a 16-bit microcomputer with the MS-DOS operating system. It is specialized for 6 parameters and up to 20,000 cells. Its two-dimensional display of Euclidean distances reveals clusters clearly, as does its 1-dimensional display. The identified subpopulations can, in suitable experiments, be related to functional subpopulations of cells.
Genetic diversity studies in pea (Pisum sativum L.) using simple sequence repeat markers.
Kumari, P; Basal, N; Singh, A K; Rai, V P; Srivastava, C P; Singh, P K
2013-03-13
The genetic diversity among 28 pea (Pisum sativum L.) genotypes was analyzed using 32 simple sequence repeat markers. A total of 44 polymorphic bands, with an average of 2.1 bands per primer, were obtained. The polymorphism information content ranged from 0.309 to 0.657 with an average of 0.493. The variation in genetic diversity among these cultivars ranged from 0.11 to 0.73. Cluster analysis based on Jaccard's similarity coefficient using the unweighted pair-group method with arithmetic mean (UPGMA) revealed 2 distinct clusters, I and II, comprising 6 and 22 genotypes, respectively. Cluster II was further differentiated into 2 subclusters, IIA and IIB, with 12 and 10 genotypes, respectively. Principal component (PC) analysis revealed results similar to those of UPGMA. The first, second, and third PCs contributed 21.6, 16.1, and 14.0% of the variation, respectively; cumulative variation of the first 3 PCs was 51.7%.
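The clustering step in this abstract (Jaccard's similarity coefficient followed by UPGMA) can be sketched in a few lines. The band profiles below are invented for illustration and are not data from the study; the function names are likewise hypothetical. UPGMA is implemented as average linkage over all pairs of original genotypes.

```python
from itertools import combinations

def jaccard_similarity(a, b):
    """Jaccard coefficient between two binary band profiles (1 = band present)."""
    shared = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return shared / either if either else 1.0

def upgma(profiles):
    """Return the merge history as (cluster_a, cluster_b, distance) tuples."""
    # Pairwise Jaccard distances between the original genotypes.
    dist = {
        frozenset((a, b)): 1.0 - jaccard_similarity(profiles[a], profiles[b])
        for a, b in combinations(profiles, 2)
    }

    def average_distance(c1, c2):
        # Unweighted mean over every pair of original members.
        total = sum(dist[frozenset((i, j))] for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    active = [(name,) for name in profiles]
    merges = []
    while len(active) > 1:
        a, b = min(combinations(active, 2), key=lambda pair: average_distance(*pair))
        merges.append((a, b, average_distance(a, b)))
        active.remove(a)
        active.remove(b)
        active.append(a + b)
    return merges

# Hypothetical marker profiles for four genotypes (not from the study).
EXAMPLE_PROFILES = {
    "G1": [1, 1, 0, 1, 0, 0],
    "G2": [1, 1, 0, 1, 1, 0],
    "G3": [0, 0, 1, 0, 1, 1],
    "G4": [0, 0, 1, 0, 0, 1],
}
```

Calling `upgma(EXAMPLE_PROFILES)` joins G1 with G2 first, then G3 with G4, and finally the two pairs, which mirrors the two-cluster picture an UPGMA dendrogram would show for such data.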
NuSTAR observations of the bullet cluster: constraints on inverse Compton emission
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wik, Daniel R.; Hornstrup, A.; Molendi, S.
2014-08-13
Here, the search for diffuse non-thermal inverse Compton (IC) emission from galaxy clusters at hard X-ray energies has been undertaken with many instruments, with most detections being either of low significance or controversial. Because all prior telescopes sensitive at E > 10 keV do not focus light and have degree-scale fields of view, their backgrounds are both high and difficult to characterize. The associated uncertainties result in lower sensitivity to IC emission and a greater chance of false detection. In this work, we present 266 ks NuSTAR observations of the Bullet cluster, which is detected in the energy range 3-30 keV. NuSTAR's unprecedented hard X-ray focusing capability largely eliminates confusion between diffuse IC and point sources; however, at the highest energies, the background still dominates and must be well understood. To this end, we have developed a complete background model constructed of physically inspired components constrained by extragalactic survey field observations, the specific parameters of which are derived locally from data in non-source regions of target observations. Applying the background model to the Bullet cluster data, we find that the spectrum is well, but not perfectly, described as an isothermal plasma with kT = 14.2 ± 0.2 keV. To slightly improve the fit, a second temperature component is added, which appears to account for lower temperature emission from the cool core, pushing the primary component to kT ~ 15.3 keV. We see no convincing need to invoke an IC component to describe the spectrum of the Bullet cluster, and instead argue that it is dominated at all energies by emission from purely thermal gas. The conservatively derived 90% upper limit on the IC flux of 1.1 × 10^-12 erg s^-1 cm^-2 (50-100 keV), implying a lower limit on B ≳ 0.2 μG, is barely consistent with detected fluxes previously reported.
In addition to discussing the possible origin of this discrepancy, we remark on the potential implications of this analysis for the prospects for detecting IC in galaxy clusters in the future.
NASA Technical Reports Server (NTRS)
Charlton, Jane C.; Laguna, Pablo
1995-01-01
The globular clusters that we observe in galaxies may be only a fraction of the initial population. Among the evolutionary influences on the population is the destruction of globular clusters by tidal forces as the cluster moves through the field of influence of a disk, a bulge, and/or a putative nuclear component (black hole). We have conducted a series of N-body simulations of globular clusters on bound and marginally bound orbits through potentials that include black hole and spheroidal components. The degree of concentration of the spheroidal component can have a considerable impact on the extent to which a globular cluster is disrupted. If half the mass of a 10(exp 10) solar mass spheroid is concentrated within 800 pc, then only black holes with masses greater than 10(exp 9) solar mass can have a significant tidal influence over that already exerted by the bulge. However, if the matter in the spheroidal component is not so strongly concentrated toward the center of the galaxy, a more modest central black hole (down to 10(exp 8) solar mass) could have a dominant influence on the globular cluster distribution, particularly if many of the clusters were initially on highly radial orbits. Our simulations show that the stars that are stripped from a globular cluster follow orbits with roughly the same eccentricity as the initial cluster orbit, spreading out along the orbit like a 'string of pearls.' Since only clusters on close to radial orbits will suffer substantial disruption, the population of stripped stars will be on orbits of high eccentricity.
Ling, Y H; Zhang, X D; Yao, N; Ding, J P; Chen, H Q; Zhang, Z J; Zhang, Y H; Ren, C H; Ma, Y H; Zhang, X R
2012-02-01
To investigate the genetic diversity of seven Chinese indigenous meat goat breeds (Tibet goat, Guizhou white goat, Shannan white goat, Yichang white goat, Matou goat, Changjiangsanjiaozhou white goat and Anhui white goat), explain their genetic relationship and assess their integrity and degree of admixture, 302 individuals from these breeds and 42 Boer goats introduced from Africa as reference samples were genotyped for 11 microsatellite markers. Results indicated that the genetic diversity of Chinese indigenous meat goats was rich. The mean heterozygosity and the mean allelic richness (AR) for the 8 goat breeds varied from 0.697 to 0.738 and 6.21 to 7.35, respectively. Structure analysis showed that the Tibet goat breed was genetically distinct and was the first to separate; the other Chinese goats were then divided into two sub-clusters: Shannan white goat and Yichang white goat in one cluster; and Guizhou white goat, Matou goat, Changjiangsanjiaozhou white goat and Anhui white goat in the other cluster. This grouping pattern was further supported by clustering analysis and principal component analysis. These results may provide a scientific basis for the characterization, conservation and utilization of Chinese meat goats.
Consanguinity and family clustering of male factor infertility in Lebanon.
Inhorn, Marcia C; Kobeissi, Loulou; Nassar, Zaher; Lakkis, Da'ad; Fakih, Michael H
2009-04-01
Objective: To investigate the influence of consanguineous marriage on male factor infertility in Lebanon, where rates of consanguineous marriage remain high (29.6% among Muslims, 16.5% among Christians). Design: Clinic-based, case-control study, using reproductive history, risk factor interview, and laboratory-based semen analysis. Setting: Two IVF clinics in Beirut, Lebanon, during an 8-month period (January-August 2003). Patients: One hundred twenty infertile male patients and 100 fertile male controls, distinguished by semen analysis and reproductive history. Intervention: None. Main outcome measure: Standard clinical semen analysis. Results: The rates of consanguineous marriage were relatively high among the study sample. Patients (46%) were more likely than controls (37%) to report first-degree (parental) and second-degree (grandparental) consanguinity. The study demonstrated a clear pattern of family clustering of male factor infertility, with patients significantly more likely than controls to report infertility among close male relatives (odds ratio = 2.58). Men with azoospermia and severe oligospermia showed high rates of both consanguinity (50%) and family clustering (41%). Conclusion: Consanguineous marriage is a socially supported institution throughout the Muslim world, yet its relationship to infertility is poorly understood. This study demonstrated a significant association between consanguinity and family clustering of male factor infertility cases, suggesting a strong genetic component.
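The odds ratio reported in this case-control abstract comes from a 2x2 table of exposure by case status. The abstract does not give the underlying cell counts, so the sketch below uses invented counts purely to show the arithmetic; the function name and the Woolf (log-based) confidence interval are assumptions for illustration and may differ from the study's method.

```python
import math

def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls, z=1.96):
    """Odds ratio with an approximate 95% Woolf confidence interval."""
    a, b, c, d = exposed_cases, unexposed_cases, exposed_controls, unexposed_controls
    estimate = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper

# Hypothetical counts (NOT from the study): 30 of 120 cases and
# 10 of 100 controls report an infertile close male relative.
estimate, lower, upper = odds_ratio(30, 90, 10, 90)
```

With these made-up counts the estimate is 3.0 and the interval excludes 1, which is the pattern a "significantly more likely" statement corresponds to.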
Li, Dongsheng; Yang, Wei; Zhang, Wenyao
2017-05-01
Stress corrosion is the major failure type of bridge cable damage. The acoustic emission (AE) technique was applied to monitor the stress corrosion process of steel wires used in bridge cable structures. The damage evolution of stress corrosion in bridge cables was obtained according to the AE characteristic parameter figure. A particle swarm optimization cluster method was developed to determine the relationship between the AE signal and stress corrosion mechanisms. Results indicate that the main AE sources of stress corrosion in bridge cables included four types: passive film breakdown and detachment of the corrosion product, crack initiation, crack extension, and cable fracture. By analyzing different types of clustering data, the mean value of each damage pattern's AE characteristic parameters was determined. Different corrosion damage source AE waveforms and the peak frequency were extracted. AE particle swarm optimization cluster analysis based on principal component analysis was also proposed. This method can completely distinguish the four types of damage sources and simplifies the determination of the evolution process of corrosion damage and broken wire signals. Copyright © 2017. Published by Elsevier B.V.
Potashev, Konstantin; Sharonova, Natalia; Breus, Irina
2014-07-01
Clustering was employed for the analysis of an experimental data set (42 plants in total) on seed germination in leached chernozem contaminated with kerosene. Among the investigated plants were 31 cultivated plants from 11 families (27 species and 20 varieties) and 11 wild plant species from 7 families, 23 annual and 19 perennial/biannual plant species, 11 monocotyledonous and 31 dicotyledonous plants. A two-dimensional (two-parameter) clustering approach, allowing the estimation of tolerance of germinating seeds using a pair of independent parameters (C75%, V7%), was found to be most effective. These parameters characterized the ability of seeds both to withstand high concentrations of contaminants without a significant reduction in germination and to maintain a high germination rate within certain contaminant concentrations. The clustering revealed a number of plant features that define the relation of a particular plant to a particular tolerance cluster; it also demonstrated the possibility of generalizing the kerosene results to n-tridecane, which is one of the typical kerosene components. In contrast to the "manual" plant ranking based on the assessment of germination at discrete concentrations of the contaminant, the proposed clustering approach allowed a generalized characterization of the seed tolerance/sensitivity to hydrocarbon contaminants. Copyright © 2014 Elsevier B.V. All rights reserved.
Real Time Intelligent Target Detection and Analysis with Machine Vision
NASA Technical Reports Server (NTRS)
Howard, Ayanna; Padgett, Curtis; Brown, Kenneth
2000-01-01
We present an algorithm for detecting a specified set of targets for an Automatic Target Recognition (ATR) application. ATR involves processing images for detecting, classifying, and tracking targets embedded in a background scene. We address the problem of discriminating between targets and nontarget objects in a scene by evaluating 40x40 image blocks belonging to an image. Each image block is first projected onto a set of templates specifically designed to separate images of targets embedded in a typical background scene from those background images without targets. These filters are found using directed principal component analysis which maximally separates the two groups. The projected images are then clustered into one of n classes based on a minimum distance to a set of n cluster prototypes. These cluster prototypes have previously been identified using a modified clustering algorithm based on prior sensed data. Each projected image pattern is then fed into the associated cluster's trained neural network for classification. A detailed description of our algorithm will be given in this paper. We outline our methodology for designing the templates, describe our modified clustering algorithm, and provide details on the neural network classifiers. Evaluation of the overall algorithm demonstrates that our detection rates approach 96% with a false positive rate of less than 0.03%.
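The projection and minimum-distance steps described in this abstract can be sketched as follows. This is a rough illustration only: the templates and prototypes are assumed to be given as arrays, the function name is hypothetical, and the directed principal component analysis that produces the templates and the per-cluster neural networks that follow are not shown.

```python
import numpy as np

def classify_blocks(blocks, templates, prototypes):
    """Project image blocks onto templates and pick the nearest cluster prototype.

    blocks:     (N, 40, 40) image blocks
    templates:  (k, 1600) filters, one flattened template per row (assumed given)
    prototypes: (n, k) cluster prototypes in the projected space (assumed given)
    """
    blocks = np.asarray(blocks, dtype=float)
    flat = blocks.reshape(len(blocks), -1)                    # (N, 1600)
    projected = flat @ np.asarray(templates, dtype=float).T   # (N, k)
    # Euclidean distance from every projected block to every prototype.
    diffs = projected[:, None, :] - np.asarray(prototypes, dtype=float)[None, :, :]
    distances = np.linalg.norm(diffs, axis=2)                 # (N, n)
    return projected, distances.argmin(axis=1)
```

The returned cluster index is what would select which trained classifier receives each projected block in a pipeline of the kind described.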
NASA Astrophysics Data System (ADS)
Ebeling, H.; Qi, J.; Richard, J.
2017-11-01
We present the results of a multiwavelength investigation of the very X-ray luminous galaxy cluster MACSJ0553.4-3342 (z = 0.4270; hereafter MACSJ0553). Combining high-resolution data obtained with the Hubble Space Telescope and the Chandra X-ray Observatory with ground-based galaxy spectroscopy, our analysis establishes the system unambiguously as a binary, post-collision merger of massive clusters. Key characteristics include perfect alignment of luminous and dark matter for one component, a separation of almost 650 kpc (in projection) between the dark-matter peak of the other subcluster and the second X-ray peak, extremely hot gas (kT > 15 keV) at either end of the merger axis, a potential cold front in the east, an unusually low gas mass fraction of approximately 0.075 for the western component, a velocity dispersion of 1490 (+104/-130) km s^-1, and no indication of significant substructure along the line of sight. We propose that the MACSJ0553 merger proceeds not in the plane of the sky, but at a large inclination angle, is observed very close to turnaround, and that the eastern X-ray peak is the cool core of the slightly less massive western component that was fully stripped and captured by the eastern subcluster during the collision. If correct, this hypothesis would make MACSJ0553 a superb target for a competitive study of ram-pressure stripping and the collisional behaviour of luminous and dark matter during cluster formation.
Goekoop, Rutger; Goekoop, Jaap G.
2014-01-01
Introduction: The vast number of psychopathological syndromes that can be observed in clinical practice can be described in terms of a limited number of elementary syndromes that are differentially expressed. Previous attempts to identify elementary syndromes have shown limitations that have slowed progress in the taxonomy of psychiatric disorders. Aim: To examine the ability of network community detection (NCD) to identify elementary syndromes of psychopathology and move beyond the limitations of current classification methods in psychiatry. Methods: 192 patients with unselected mental disorders were tested on the Comprehensive Psychopathological Rating Scale (CPRS). Principal component analysis (PCA) was performed on the bootstrapped correlation matrix of symptom scores to extract the principal component structure (PCS). An undirected and weighted network graph was constructed from the same matrix. Network community structure (NCS) was optimized using a previously published technique. Results: In the optimal network structure, network clusters showed an 89% match with principal components of psychopathology. Some 6 network clusters were found, including "DEPRESSION", "MANIA", "ANXIETY", "PSYCHOSIS", "RETARDATION", and "BEHAVIORAL DISORGANIZATION". Network metrics were used to quantify the continuities between the elementary syndromes. Conclusion: We present the first comprehensive network graph of psychopathology that is free from the biases of previous classifications: a 'Psychopathology Web'. Clusters within this network represent elementary syndromes that are connected via a limited number of bridge symptoms. Many problems of previous classifications can be overcome by using a network approach to psychopathology. PMID:25427156
Measuring the Indonesian provinces competitiveness by using PCA technique
NASA Astrophysics Data System (ADS)
Runita, Ditha; Fajriyah, Rohmatul
2017-12-01
Indonesia is a country with a vast territory, comprising 34 provinces. Building local competitiveness is critical to enhance the long-term national competitiveness, especially for a country as diverse as Indonesia. A competitive local government can attract and maintain successful firms and increase living standards for its inhabitants, because investment and skilled workers gravitate from uncompetitive regions to more competitive ones. Although there are other methods for measuring competitiveness, here we demonstrate a simple method using principal component analysis (PCA), which can be applied directly to correlated, multivariate data. The analysis of Indonesian provinces yields 3 clusters based on the competitiveness measurement: Bad, Good and Best performing provinces.
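Since this abstract notes that PCA applies directly to correlated, multivariate data, a compact version of the computation may help. The sketch below assumes a provinces-by-indicators matrix and performs PCA on the correlation matrix; the function name is hypothetical, and the indicators and grouping rule used in the study are not given in the abstract.

```python
import numpy as np

def pca_correlation(data):
    """PCA on standardized variables (equivalent to PCA of the correlation matrix).

    data: (n_observations, n_variables), e.g. 34 provinces by several indicators.
    Returns explained-variance ratios, loadings (columns), and component scores.
    """
    data = np.asarray(data, dtype=float)
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    corr = np.corrcoef(z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)      # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]            # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    explained = eigvals / eigvals.sum()
    scores = z @ eigvecs
    return explained, eigvecs, scores
```

Scores on the leading components could then be passed to any clustering method to form groups; note that the sign of each component is arbitrary, so which end of the first component means "more competitive" has to be checked against the loadings.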
A Multivariate Analysis of Galaxy Cluster Properties
NASA Astrophysics Data System (ADS)
Ogle, P. M.; Djorgovski, S.
1993-05-01
We have assembled from the literature a database on 394 clusters of galaxies, with up to 16 parameters per cluster. They include optical and x-ray luminosities, x-ray temperatures, galaxy velocity dispersions, central galaxy and particle densities, optical and x-ray core radii and ellipticities, etc. In addition, derived quantities, such as the mass-to-light ratios and x-ray gas masses, are included. Doubtful measurements have been identified and deleted from the database. Our goal is to explore the correlations between these parameters, and interpret them in the framework of our understanding of evolution of clusters and large-scale structure, such as the Gott-Rees scaling hierarchy. Among the simple, monovariate correlations we found, the most significant include those between the optical and x-ray luminosities, x-ray temperatures, cluster velocity dispersions, and central galaxy densities, in various mutual combinations. While some of these correlations have been discussed previously in the literature, generally smaller samples of objects have been used. We will also present the results of a multivariate statistical analysis of the data, including a principal component analysis (PCA). Such an approach has not been used previously for studies of cluster properties, even though it is much more powerful and complete than the simple monovariate techniques which are commonly employed. The observed correlations may lead to powerful constraints for theoretical models of formation and evolution of galaxy clusters. P.M.O. was supported by a Caltech graduate fellowship. S.D. acknowledges partial support from the NASA contract NAS5-31348 and the NSF PYI award AST-9157412.
Clustering multilayer omics data using MuNCut.
Teran Hidalgo, Sebastian J; Ma, Shuangge
2018-03-14
Omics profiling is now a routine component of biomedical studies. In the analysis of omics data, clustering is an essential step and serves multiple purposes including, for example, revealing the unknown functionalities of omics units and assisting dimension reduction in outcome model building. In the most recent omics studies, a prominent trend is to conduct multilayer profiling, which collects multiple types of genetic, genomic, epigenetic and other measurements on the same subjects. In the literature, clustering methods tailored to multilayer omics data are still limited. Directly applying the existing clustering methods to multilayer omics data, and clustering each layer first and then combining across layers, are both "suboptimal" in that they do not accommodate the interconnections within layers and across layers in an informative way. In this study, we develop the MuNCut (Multilayer NCut) clustering approach. It is tailored to multilayer omics data and sufficiently accounts for both across- and within-layer connections. It is based on the novel NCut technique and also takes advantage of regularized sparse estimation. It has an intuitive formulation and is computationally very feasible. To facilitate implementation, we develop the function muncut in the R package NcutYX. Under a wide spectrum of simulation settings, it outperforms competitors. The analysis of TCGA (The Cancer Genome Atlas) data on breast cancer and cervical cancer shows that MuNCut generates biologically meaningful results which differ from those using the alternatives. We propose a more effective clustering analysis of multiple omics data. It provides a new avenue for jointly analyzing genetic, genomic, epigenetic and other measurements.
CROSS-CORRELATING THE γ-RAY SKY WITH CATALOGS OF GALAXY CLUSTERS
Branchini, Enzo; Camera, Stefano; Cuoco, Alessandro; ...
2017-01-18
In this article, we report the detection of a cross-correlation signal between Fermi Large Area Telescope diffuse γ-ray maps and catalogs of clusters. In our analysis, we considered three different catalogs: WHL12, redMaPPer, and PlanckSZ. They all show a positive correlation with different amplitudes, related to the average mass of the objects in each catalog, which also sets the catalog bias. The signal detection is confirmed by the results of a stacking analysis. The cross-correlation signal extends to rather large angular scales, around 1°, that correspond, at the typical redshift of the clusters in these catalogs, to a few to tens of megaparsecs, i.e., the typical scale-length of the large-scale structures in the universe. Most likely this signal is contributed by the cumulative emission from active galactic nuclei (AGNs) associated with the filamentary structures that converge toward the high peaks of the matter density field in which galaxy clusters reside. In addition, our analysis reveals the presence of a second component, more compact in size and compatible with a point-like emission from within individual clusters. At present, we cannot distinguish between the two most likely interpretations for such a signal, i.e., whether it is produced by AGNs inside clusters or if it is a diffuse γ-ray emission from the intracluster medium. Lastly, we argue that this latter, intriguing, hypothesis might be tested by applying this technique to a low-redshift large-mass cluster sample.
FRONTIER FIELDS CLUSTERS: DEEP CHANDRA OBSERVATIONS OF THE COMPLEX MERGER MACS J1149.6+2223
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ogrean, G. A.; Weeren, R. J. van; Jones, C.
2016-03-10
The Hubble Space Telescope Frontier Fields cluster MACS J1149.6+2223 is one of the most complex merging clusters, believed to consist of four dark matter halos. We present results from deep (365 ks) Chandra observations of the cluster, which reveal the most distant cold front (z = 0.544) discovered to date. In the cluster outskirts, we also detect hints of a surface brightness edge that could be the bow shock preceding the cold front. The substructure analysis of the cluster identified several components with large relative radial velocities, thus indicating that at least some collisions occur almost along the line of sight. The inclination of the mergers with respect to the plane of the sky poses significant observational challenges at X-ray wavelengths. MACS J1149.6+2223 possibly hosts a steep-spectrum radio halo. If the steepness of the radio halo is confirmed, then the radio spectrum, combined with the relatively regular ICM morphology, could indicate that MACS J1149.6+2223 is an old merging cluster.
Frontier Fields Clusters: Deep Chandra Observations of the Complex Merger MACS J1149.6+2223
Ogrean, G. A.; Weeren, R. J. van; Jones, C.; ...
2016-03-04
The Hubble Space Telescope Frontier Fields cluster MACS J1149.6+2223 is one of the most complex merging clusters, believed to consist of four dark matter halos. Here, we present results from deep (365 ks) Chandra observations of the cluster, which reveal the most distant cold front (z = 0.544) discovered to date. In the cluster outskirts, we also detect hints of a surface brightness edge that could be the bow shock preceding the cold front. The substructure analysis of the cluster identified several components with large relative radial velocities, thus indicating that at least some collisions occur almost along the line of sight. The inclination of the mergers with respect to the plane of the sky poses significant observational challenges at X-ray wavelengths. MACS J1149.6+2223 possibly hosts a steep-spectrum radio halo. Lastly, if the steepness of the radio halo is confirmed, then the radio spectrum, combined with the relatively regular ICM morphology, could indicate that MACS J1149.6+2223 is an old merging cluster.
NASA Astrophysics Data System (ADS)
Schellenberger, G.; Reiprich, T. H.
2017-08-01
The X-ray regime, where the most massive visible component of galaxy clusters, the intracluster medium, is visible, offers directly measured quantities, like the luminosity, and derived quantities, like the total mass, to characterize these objects. The aim of this project is to analyse a complete sample of galaxy clusters in detail and constrain cosmological parameters, like the matter density, Ωm, or the amplitude of initial density fluctuations, σ8. The purely X-ray flux-limited sample (HIFLUGCS) consists of the 64 X-ray brightest galaxy clusters, which are excellent targets to study the systematic effects, that can bias results. We analysed in total 196 Chandra observations of the 64 HIFLUGCS clusters, with a total exposure time of 7.7 Ms. Here, we present our data analysis procedure (including an automated substructure detection and an energy band optimization for surface brightness profile analysis) that gives individually determined, robust total mass estimates. These masses are tested against dynamical and Planck Sunyaev-Zeldovich (SZ) derived masses of the same clusters, where good overall agreement is found with the dynamical masses. The Planck SZ masses seem to show a mass-dependent bias to our hydrostatic masses; possible biases in this mass-mass comparison are discussed including the Planck selection function. Furthermore, we show the results for the (0.1-2.4) keV luminosity versus mass scaling relation. The overall slope of the sample (1.34) is in agreement with expectations and values from literature. Splitting the sample into galaxy groups and clusters reveals, even after a selection bias correction, that galaxy groups exhibit a significantly steeper slope (1.88) compared to clusters (1.06).
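The luminosity versus mass slopes quoted in this abstract describe a power law, L ∝ M^slope, which becomes a straight line in log-log space. The sketch below fits such a line by ordinary least squares. It is a simplification under stated assumptions: the pivot mass, units, and function name are illustrative, and the abstract indicates that the study additionally applied a selection bias correction, which is not reproduced here.

```python
import numpy as np

def fit_power_law(mass, luminosity, pivot=1e14):
    """Fit L = norm * (M / pivot)**slope by least squares in log10 space.

    `pivot` is an assumed reference mass; it only sets where `norm` is quoted.
    """
    x = np.log10(np.asarray(mass, dtype=float) / pivot)
    y = np.log10(np.asarray(luminosity, dtype=float))
    slope, intercept = np.polyfit(x, y, 1)
    return slope, 10.0 ** intercept
```

Fitting subsets of a sample separately (for example groups and clusters) with the same function is how one would compare slopes, although in real data measurement errors in both variables and selection effects would change the estimate.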
Holmes, Sean T; Iuliucci, Robbie J; Mueller, Karl T; Dybowski, Cecil
2015-11-10
Calculations of the principal components of magnetic-shielding tensors in crystalline solids require the inclusion of the effects of lattice structure on the local electronic environment to obtain significant agreement with experimental NMR measurements. We assess periodic (GIPAW) and GIAO/symmetry-adapted cluster (SAC) models for computing magnetic-shielding tensors by calculations on a test set containing 72 insulating molecular solids, with a total of 393 principal components of chemical-shift tensors from 13C, 15N, 19F, and 31P sites. When clusters are carefully designed to represent the local solid-state environment and when periodic calculations include sufficient variability, both methods predict magnetic-shielding tensors that agree well with experimental chemical-shift values, demonstrating the correspondence of the two computational techniques. At the basis-set limit, we find that the small differences in the computed values have no statistical significance for three of the four nuclides considered. Subsequently, we explore the effects of additional DFT methods available only with the GIAO/cluster approach, particularly the use of hybrid-GGA functionals, meta-GGA functionals, and hybrid meta-GGA functionals that demonstrate improved agreement in calculations on symmetry-adapted clusters. We demonstrate that meta-GGA functionals improve computed NMR parameters over those obtained by GGA functionals in all cases, and that hybrid functionals improve computed results over the respective pure DFT functional for all nuclides except 15N.
On the coherent rotation of diffuse matter in numerical simulations of clusters of galaxies
NASA Astrophysics Data System (ADS)
Baldi, Anna Silvia; De Petris, Marco; Sembolini, Federico; Yepes, Gustavo; Lamagna, Luca; Rasia, Elena
2017-03-01
We present a study on the coherent rotation of the intracluster medium and dark matter components of simulated galaxy clusters extracted from a volume-limited sample of the MUSIC project. The set is re-simulated with three different recipes for the gas physics: (I) non-radiative, (II) radiative without active galactic nuclei (AGN) feedback and (III) radiative with AGN feedback. Our analysis is based on the 146 most massive clusters identified as relaxed, 57 per cent of the total sample. We classify these objects as rotating and non-rotating according to the gas spin parameter, a quantity that can be related to cluster observations. We find that 4 per cent of the relaxed sample is rotating according to our criterion. By looking at the radial profiles of their specific angular momentum vector, we find that the solid body model is not a suitable description of rotational motions. The radial profiles of the velocity of the dark matter show a prevalence of the random velocity dispersion. Instead, the intracluster medium profiles are characterized by a comparable contribution from the tangential velocity and the dispersion. In general, the dark matter component dominates the dynamics of the clusters, as suggested by the correlation between its angular momentum and the gas one, and by the lack of relevant differences among the three sets of simulations.
AFLP-based genetic diversity assessment of commercially important tea germplasm in India.
Sharma, R K; Negi, M S; Sharma, S; Bhardwaj, P; Kumar, R; Bhattachrya, E; Tripathi, S B; Vijayan, D; Baruah, A R; Das, S C; Bera, B; Rajkumar, R; Thomas, J; Sud, R K; Muraleedharan, N; Hazarika, M; Lakshmikumaran, M; Raina, S N; Ahuja, P S
2010-08-01
India has a large repository of important tea accessions and, therefore, plays a major role in improving production and quality of tea across the world. Using seven AFLP primer combinations, we analyzed 123 commercially important tea accessions representing major populations in India. The overall genetic similarity recorded was 51%. No significant differences were recorded in average genetic similarity among tea populations cultivated in various geographic regions (northwest 0.60, northeast and south both 0.59). UPGMA cluster analysis grouped the tea accessions according to geographic locations, with a bias toward China or Assam/Cambod types. Cluster analysis results were congruent with principal component analysis. Further, analysis of molecular variance detected a high level of genetic variation (85%) within and limited genetic variation (15%) among the populations, suggesting their origin from a similar genetic pool.
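The UPGMA grouping mentioned in this abstract can be illustrated with a minimal sketch. It assumes that genetic similarity is converted to a distance as 1 - similarity, which the abstract does not state, and the four accession names and their distances are hypothetical values chosen only for illustration. At each step the two clusters with the smallest average pairwise distance are merged.

```python
from itertools import combinations


def upgma(labels, distance):
    """Naive UPGMA: repeatedly merge the two clusters whose members have the
    smallest mean pairwise distance. Returns a list of (a, b, height) merges."""
    clusters = [(label,) for label in labels]
    merges = []

    def average_distance(a, b):
        # Mean of all original distances between members of a and members of b.
        total = sum(distance[frozenset((x, y))] for x in a for y in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: average_distance(*pair))
        merges.append((a, b, average_distance(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# Hypothetical distances (1 - similarity) between four made-up accessions.
distance = {
    frozenset(("A", "B")): 0.2,
    frozenset(("C", "D")): 0.3,
    frozenset(("A", "C")): 0.6,
    frozenset(("A", "D")): 0.7,
    frozenset(("B", "C")): 0.5,
    frozenset(("B", "D")): 0.6,
}
merges = upgma(["A", "B", "C", "D"], distance)
for a, b, height in merges:
    print(a, b, round(height, 3))
```

The merge heights are what a dendrogram would plot on its distance axis. This naive version recomputes averages from the original matrix and is only suitable for small inputs.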
NASA Technical Reports Server (NTRS)
Fabbiano, G.
1995-01-01
X-ray studies of galaxies by the Smithsonian Astrophysical Observatory (SAO) and MIT are described. Activities at SAO include ROSAT PSPC x-ray data reduction and analysis pipeline; x-ray sources in nearby Sc galaxies; optical, x-ray, and radio study of ongoing galactic merger; a radio, far infrared, optical, and x-ray study of the Sc galaxy NGC247; and a multiparametric analysis of the Einstein sample of early-type galaxies. Activities at MIT included continued analysis of observations with ROSAT and ASCA, and continued development of new approaches to spectral analysis with ASCA and AXAF. Also, a new method for characterizing structure in galactic clusters was developed and applied to ROSAT images of a large sample of clusters. An appendix contains preprints generated by the research.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Branchini, Enzo; Camera, Stefano; Cuoco, Alessandro
In this article, we report the detection of a cross-correlation signal between Fermi Large Area Telescope diffuse γ-ray maps and catalogs of clusters. In our analysis, we considered three different catalogs: WHL12, redMaPPer, and PlanckSZ. They all show a positive correlation with different amplitudes, related to the average mass of the objects in each catalog, which also sets the catalog bias. The signal detection is confirmed by the results of a stacking analysis. The cross-correlation signal extends to rather large angular scales, around 1°, that correspond, at the typical redshift of the clusters in these catalogs, to a few to tens of megaparsecs, i.e., the typical scale-length of the large-scale structures in the universe. Most likely this signal is contributed by the cumulative emission from active galactic nuclei (AGNs) associated with the filamentary structures that converge toward the high peaks of the matter density field in which galaxy clusters reside. In addition, our analysis reveals the presence of a second component, more compact in size and compatible with a point-like emission from within individual clusters. At present, we cannot distinguish between the two most likely interpretations for such a signal, i.e., whether it is produced by AGNs inside clusters or if it is a diffuse γ-ray emission from the intracluster medium. Lastly, we argue that this latter, intriguing, hypothesis might be tested by applying this technique to a low-redshift large-mass cluster sample.
Strong-lensing analysis of A2744 with MUSE and Hubble Frontier Fields images
NASA Astrophysics Data System (ADS)
Mahler, G.; Richard, J.; Clément, B.; Lagattuta, D.; Schmidt, K.; Patrício, V.; Soucail, G.; Bacon, R.; Pello, R.; Bouwens, R.; Maseda, M.; Martinez, J.; Carollo, M.; Inami, H.; Leclercq, F.; Wisotzki, L.
2018-01-01
We present an analysis of Multi Unit Spectroscopic Explorer (MUSE) observations obtained on the massive Frontier Fields (FFs) cluster A2744. This new data set covers the entire multiply imaged region around the cluster core. The combined catalogue consists of 514 spectroscopic redshifts (with 414 new identifications). We use this redshift information to perform a strong-lensing analysis revising multiple images previously found in the deep FF images, and add three new MUSE-detected multiply imaged systems with no obvious Hubble Space Telescope counterpart. The combined strong-lensing constraints include a total of 60 systems producing 188 images altogether, out of which 29 systems and 83 images are spectroscopically confirmed, making A2744 one of the most well-constrained clusters to date. Thanks to the large number of spectroscopic redshifts, we model the influence of substructures at larger radii, using a parametrization including two cluster-scale components in the cluster core and several group-scale components in the outskirts. The resulting model accurately reproduces all the spectroscopic multiple systems, reaching an rms of 0.67 arcsec in the image plane. The large number of MUSE spectroscopic redshifts gives us a robust model, which we estimate reduces the systematic uncertainty on the 2D mass distribution by up to ∼2.5 times the statistical uncertainty in the cluster core. In addition, from a combination of the parametrization and the set of constraints, we estimate the relative systematic uncertainty to be up to 9 per cent at 200 kpc.
Vasilaki, V; Volcke, E I P; Nandi, A K; van Loosdrecht, M C M; Katsou, E
2018-04-26
Multivariate statistical analysis was applied to investigate the dependencies and underlying patterns between N2O emissions and online operational variables (dissolved oxygen and nitrogen component concentrations, temperature and influent flow-rate) during biological nitrogen removal from wastewater. The system under study was a full-scale reactor, for which hourly sensor data were available. The 15-month long monitoring campaign was divided into 10 sub-periods based on the profile of N2O emissions, using Binary Segmentation. The dependencies between operating variables and N2O emissions fluctuated according to Spearman's rank correlation. The correlation between N2O emissions and nitrite concentrations ranged between 0.51 and 0.78. Correlation >0.7 between N2O emissions and nitrate concentrations was observed at sub-periods with average temperature lower than 12 °C. Hierarchical k-means clustering and principal component analysis linked N2O emission peaks with precipitation events and ammonium concentrations higher than 2 mg/L, especially in sub-periods characterized by low N2O fluxes. Additionally, the highest ranges of measured N2O fluxes belonged to clusters corresponding with NO3-N concentration less than 1 mg/L in the upstream plug-flow reactor (middle of oxic zone), indicating slow nitrification rates. The results showed that the range of N2O emissions partially depends on the prior behavior of the system. The principal component analysis validated the findings from the clustering analysis and showed that ammonium, nitrate, nitrite and temperature explained a considerable percentage of the variance in the system for the majority of the sub-periods. The applied statistical methods linked the different ranges of emissions with the system variables, provided insights on the effect of operating conditions on N2O emissions in each sub-period and can be integrated into N2O emissions data processing at wastewater treatment plants.
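Spearman's rank correlation, named in this abstract, is the Pearson correlation of the ranks of two series. The sketch below shows that definition with tied values receiving their average rank. The two short series are hypothetical readings invented for illustration, not data from the study.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical paired readings (e.g. nitrite concentration vs. N2O flux).
nitrite = [0.2, 0.5, 0.9, 1.4, 2.0]
n2o_flux = [1.0, 4.0, 9.0, 16.0, 25.0]
rho = spearman(nitrite, n2o_flux)
print(rho)
```

Because only ranks enter the calculation, any monotonically increasing relationship gives a coefficient of 1 regardless of its shape, which is why the measure suits sensor data with nonlinear dependencies.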
Copyright © 2018. Published by Elsevier Ltd.
Zhang, Shan; Xu, Lu; Liu, Yang-Xi; Fu, Hai-Yan; Xiao, Zuo-Bing; She, Yuan-Bin
2018-04-01
E-jiao (Colla Corii Asini, CCA) has been widely used as a healthy food and Chinese medicine. Although authentic CCA is characterized by its typical sweet and neutral fragrance, its aroma components have been rarely investigated. This work investigated the aroma-active components and antioxidant activity of 19 CCAs from different geographical origins. CCA extracts obtained by simultaneous distillation and extraction were analyzed by gas chromatography-mass spectrometry (GC-MS), gas chromatography-olfactometry (GC-O) and sensory analysis. The antioxidant activity of CCAs was determined by ABTS and DPPH assays. A total of 65 volatile compounds were identified and quantified by GC-MS and 23 aroma-active compounds were identified by GC-O and aroma extract dilution analysis. The most powerful aroma-active compounds were identified based on the flavor dilution factor and their contents were compared among the 19 CCAs. Principal component analysis of the 23 aroma-active components showed 3 significant clusters. Canonical correlation analysis between antioxidant assays and the 23 aroma-active compounds indicates strong correlation (r = 0.9776, p = 0.0281). Analysis of aroma-active components shows potential for quality evaluation and discrimination of CCAs from different geographical origins.
Valdespino-Gómez, Víctor Manuel; Valdespino-Castillo, Patricia Margarita; Valdespino-Castillo, Víctor Edmundo
2015-01-01
Nowadays, cellular physiology is best understood by analysing its interacting molecular components. Proteins are the major components of cells. Different proteins are organised in the form of functional clusters, pathways or networks. These molecules are ordered in clusters of receptor molecules of extracellular signals, transducers, sensors and biological response effectors. The identification of these intracellular signaling pathways in different cell types has required a long journey of experimental work. More than 300 intracellular signaling pathways have been identified in human cells. They participate in cell homeostasis processes for structural and functional maintenance. Some of them participate simultaneously or in a nearly-consecutive progression to generate a cellular phenotypic change. In this review, an analysis is performed on the main intracellular signaling pathways that take part in the cellular proliferation process, and the potential use of some components of these pathways as targets for therapeutic intervention is also highlighted. Copyright © 2015 Academia Mexicana de Cirugía A.C. Published by Masson Doyma México S.A. All rights reserved.
Spatial and temporal characterizations of water quality in Kuwait Bay.
Al-Mutairi, N; Abahussain, A; El-Battay, A
2014-06-15
The spatial and temporal patterns of water quality in Kuwait Bay have been investigated using data from six stations between 2009 and 2011. The results showed that most water quality parameters, such as phosphorus (PO4), nitrate (NO3), dissolved oxygen (DO), and Total Suspended Solids (TSS), fluctuated over time and space. Based on Water Quality Index (WQI) data, the six stations were significantly clustered into two main classes using cluster analysis: one group located on the western side of the Bay and the other on the eastern side. Three principal components are responsible for water quality variations in the Bay. The first component included DO and pH. The second included PO4, TSS and NO3, and the last component contained seawater temperature and turbidity. The spatial and temporal patterns of water quality in Kuwait Bay are mainly controlled by seasonal variations and discharges from point sources of pollution along Kuwait Bay's coast as well as from Shatt Al-Arab River. Copyright © 2014 Elsevier Ltd. All rights reserved.
Clustering execution in a processing system to increase power savings
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bose, Pradip; Buyuktosunoglu, Alper; Jacobson, Hans M.
Embodiments relate to clustering execution in a processing system. An aspect includes accessing a control flow graph that defines a data dependency and an execution sequence of a plurality of tasks of an application that executes on a plurality of system components. The execution sequence of the tasks in the control flow graph is modified as a clustered control flow graph that clusters active and idle phases of a system component while maintaining the data dependency. The clustered control flow graph is sent to an operating system, where the operating system utilizes the clustered control flow graph for scheduling the tasks.
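The idea of reordering tasks so that each component's active phases sit together, without violating data dependencies, can be sketched as a greedy topological ordering. This is an illustrative toy and not the method claimed in the record: the function name, the task and component names, and the tie-breaking rule (prefer a ready task on the component that ran last) are all assumptions.

```python
def clustered_order(task_component, prerequisites):
    """Order tasks so dependencies are respected and, where possible,
    consecutive tasks run on the same component.

    task_component: dict mapping task name -> component name
    prerequisites:  dict mapping task name -> set of tasks it depends on
    """
    done = set()
    order = []
    current = None  # component used by the most recently scheduled task
    while len(order) < len(task_component):
        ready = sorted(
            t for t in task_component
            if t not in done and prerequisites.get(t, set()) <= done
        )
        if not ready:
            raise ValueError("dependency cycle detected")
        # Prefer a ready task on the currently active component.
        same = [t for t in ready if task_component[t] == current]
        pick = same[0] if same else ready[0]
        order.append(pick)
        done.add(pick)
        current = task_component[pick]
    return order


# Hypothetical four-task application on two components.
task_component = {"t1": "cpu", "t2": "gpu", "t3": "cpu", "t4": "gpu"}
prerequisites = {"t3": {"t1"}, "t4": {"t2"}}
order = clustered_order(task_component, prerequisites)
print(order)
```

In this toy input a plain alphabetical topological order would alternate between the two components, while the greedy rule groups the two `cpu` tasks and then the two `gpu` tasks, leaving each component with one contiguous active phase and one contiguous idle phase.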
A new approach for computing a flood vulnerability index using cluster analysis
NASA Astrophysics Data System (ADS)
Fernandez, Paulo; Mourato, Sandra; Moreira, Madalena; Pereira, Luísa
2016-08-01
A Flood Vulnerability Index (FloodVI) was developed using Principal Component Analysis (PCA) and a new aggregation method based on Cluster Analysis (CA). PCA simplifies a large number of variables into a few uncorrelated factors representing the social, economic, physical and environmental dimensions of vulnerability. CA groups areas that have the same characteristics in terms of vulnerability into vulnerability classes. The grouping of the areas determines their classification contrary to other aggregation methods in which the areas' classification determines their grouping. While other aggregation methods distribute the areas into classes, in an artificial manner, by imposing a certain probability for an area to belong to a certain class, as determined by the assumption that the aggregation measure used is normally distributed, CA does not constrain the distribution of the areas by the classes. FloodVI was designed at the neighbourhood level and was applied to the Portuguese municipality of Vila Nova de Gaia where several flood events have taken place in the recent past. The FloodVI sensitivity was assessed using three different aggregation methods: the sum of component scores, the first component score and the weighted sum of component scores. The results highlight the sensitivity of the FloodVI to different aggregation methods. Both sum of component scores and weighted sum of component scores have shown similar results. The first component score aggregation method classifies almost all areas as having medium vulnerability and finally the results obtained using the CA show a distinct differentiation of the vulnerability where hot spots can be clearly identified. The information provided by records of previous flood events corroborate the results obtained with CA, because the inundated areas with greater damages are those that are identified as high and very high vulnerability areas by CA. This supports the fact that CA provides a reliable FloodVI.
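The three score-based aggregation methods compared in this abstract can be written out for a single area as follows. This is a sketch under stated assumptions: the abstract does not say how the weighted sum is weighted, so weights proportional to each component's share of explained variance are assumed here, and the component scores and variance shares are hypothetical numbers.

```python
def aggregate(component_scores, variance_explained):
    """Return three candidate index values for one area.

    component_scores:   PCA component scores of the area, first component first
    variance_explained: variance share of each component (assumed weights)
    """
    total = sum(variance_explained)
    weights = [v / total for v in variance_explained]
    return {
        "sum": sum(component_scores),
        "first": component_scores[0],
        "weighted_sum": sum(w * s for w, s in zip(weights, component_scores)),
    }


# Hypothetical scores of one neighbourhood on three retained components.
result = aggregate([1.0, -0.5, 0.5], [0.5, 0.3, 0.2])
print(result)
```

Each of these reduces an area to one number that is then cut into classes, whereas the cluster-based aggregation described in the abstract groups areas on all component scores at once.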
Least-dependent-component analysis based on mutual information
NASA Astrophysics Data System (ADS)
Stögbauer, Harald; Kraskov, Alexander; Astakhov, Sergey A.; Grassberger, Peter
2004-12-01
We propose to use precise estimators of mutual information (MI) to find the least dependent components in a linearly mixed signal. On the one hand, this seems to lead to better blind source separation than with any other presently available algorithm. On the other hand, it has the advantage, compared to other implementations of “independent” component analysis (ICA), some of which are based on crude approximations for MI, that the numerical values of the MI can be used for (i) estimating residual dependencies between the output components; (ii) estimating the reliability of the output by comparing the pairwise MIs with those of remixed components; and (iii) clustering the output according to the residual interdependencies. For the MI estimator, we use a recently proposed k-nearest-neighbor-based algorithm. For time sequences, we combine this with delay embedding, in order to take into account nontrivial time correlations. After several tests with artificial data, we apply the resulting MILCA (mutual-information-based least dependent component analysis) algorithm to a real-world dataset, the ECG of a pregnant woman.
Kmeans-ICA based automatic method for ocular artifacts removal in a motor imagery classification.
Bou Assi, Elie; Rihana, Sandy; Sawan, Mohamad
2014-01-01
Electroencephalogram (EEG) recordings are used as inputs of a motor imagery based BCI system. Eye blinks contaminate the spectral frequency of the EEG signals. Independent Component Analysis (ICA) has already proven effective for removing these artifacts, whose frequency band overlaps with the EEG of interest. However, existing ICA-based methods use a reference lead such as the ElectroOculoGram (EOG) to identify the ocular artifact components. In this study, artifactual components were identified using an adaptive thresholding by means of Kmeans clustering. The denoised EEG signals were fed into a feature extraction algorithm extracting the band power, the coherence and the phase locking value, and then passed to a linear discriminant analysis classifier for motor imagery classification.
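Adaptive thresholding by k-means can be illustrated in one dimension: a single score per independent component is split into two clusters, and the midpoint between the two cluster centres serves as the threshold. The abstract does not say which component feature is clustered, so the scores below are hypothetical values, and the function is a sketch of the general technique, not the authors' implementation.

```python
def two_means_threshold(scores, max_iterations=100):
    """Run 1-D k-means with k=2 and return the midpoint of the two centres."""
    low_centre, high_centre = min(scores), max(scores)
    if low_centre == high_centre:
        raise ValueError("all scores are identical; no threshold exists")
    for _ in range(max_iterations):
        # Assign each score to its nearest centre.
        low = [s for s in scores
               if abs(s - low_centre) <= abs(s - high_centre)]
        high = [s for s in scores
                if abs(s - low_centre) > abs(s - high_centre)]
        new_low, new_high = sum(low) / len(low), sum(high) / len(high)
        if (new_low, new_high) == (low_centre, high_centre):
            break  # assignments no longer change
        low_centre, high_centre = new_low, new_high
    return (low_centre + high_centre) / 2


# Hypothetical artifact scores, one per independent component.
scores = [0.10, 0.20, 0.15, 0.12, 0.90, 0.18]
threshold = two_means_threshold(scores)
artifact_components = [i for i, s in enumerate(scores) if s > threshold]
print(threshold, artifact_components)
```

The threshold adapts to each recording because it is derived from the scores themselves, so no reference EOG channel or fixed cut-off is needed to flag the outlying component.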
Grouping of Bulgarian wines according to grape variety by using statistical methods
NASA Astrophysics Data System (ADS)
Milev, M.; Nikolova, Kr.; Ivanova, Ir.; Minkova, St.; Evtimov, T.; Krustev, St.
2017-12-01
68 different types of Bulgarian wines were studied in accordance with 9 optical parameters as follows: color parameters in XYZ and CIE Lab color systems, lightness, hue angle, chroma, fluorescence intensity and emission wavelength. The main objective of this research is to use hierarchical cluster analysis to evaluate the similarity and the distance between the examined types of Bulgarian wines and to group them based on physical parameters. We have found that wines are grouped in clusters on the basis of the degree of identity between them. There are two main clusters, each with two subclusters. The first one contains white wines and Sira, the second contains red wines and rose. The results from cluster analysis are presented graphically by a dendrogram. The other statistical technique used is factor analysis performed by the method of Principal Component Analysis (PCA). The aim is to reduce the large number of variables to a few factors by grouping the correlated variables into one factor and subdividing the noncorrelated variables into different factors. Moreover, the factor analysis made it possible to determine the parameters with the greatest influence over the distribution of samples in different clusters. In our study, after rotation of the factors with the Varimax method, the parameters were combined into two factors, which explain about 80% of the total variation. The first one explains 61.49% of the variation and correlates with color characteristics; the second one explains 18.34% of the variation and correlates with the parameters connected with fluorescence spectroscopy.
Assessment of Depression in a Rodent Model of Spinal Cord Injury
Luedtke, Kelsey; Bouchard, Sioui Maldonado; Woller, Sarah A.; Funk, Mary Katherine; Aceves, Miriam
2014-01-01
Despite an increased incidence of depression in patients after spinal cord injury (SCI), there is no animal model of depression after SCI. To address this, we used a battery of established tests to assess depression after a rodent contusion injury. Subjects were acclimated to the tasks, and baseline scores were collected before SCI. Testing was conducted on days 9–10 (acute) and 19–20 (chronic) postinjury. To categorize depression, subjects' scores on each behavioral measure were averaged across the acute and chronic stages of injury and subjected to a principal component analysis. This analysis revealed a two-component structure, which explained 72.2% of between-subjects variance. The data were then analyzed with a hierarchical cluster analysis, identifying two clusters that differed significantly on the sucrose preference, open field, social exploration, and burrowing tasks. One cluster (9 of 26 subjects) displayed characteristics of depression. Using these data, a discriminant function analysis was conducted to derive an equation that could classify subjects as “depressed” on days 9–10. The discriminant function was used in a second experiment examining whether the depression-like symptoms could be reversed with the antidepressant, fluoxetine. Fluoxetine significantly decreased immobility in the forced swim test (FST) in depressed subjects identified with the equation. Subjects that were depressed and treated with saline displayed significantly increased immobility on the FST, relative to not depressed, saline-treated controls. These initial experiments validate our tests of depression, generating a powerful model system for further understanding the relationships between molecular changes induced by SCI and the development of depression. PMID:24564232
Clustering of food and activity preferences in primary school children.
Rodenburg, Gerda; Oenema, Anke; Pasma, Marleen; Kremers, Stef P J; van de Mheen, Dike
2013-01-01
This study examined clustering of food and activity preferences in Dutch primary school children. It also explored whether the preference clusters are associated with child and parental background characteristics and with parenting practices. Data were used from 1480 parent-child dyads participating in the IVO Nutrition and Physical Activity Child cohort (INPACT). Children aged 8-11 years reported their preferences for food (e.g. fruit and sweet snacks) and activities (e.g. biking and watching television) at school with a newly-developed, visual instrument designed for primary school children. Parents completed a questionnaire at home. Principal component analysis was used to identify preference clusters. Backward regression analyses were used to examine the relationship between child and parental characteristics with cluster scores. We found (1) a clustering of preferences for unhealthy foods and unhealthy drinks, (2) a clustering of preferences for various physical activity behaviours, and (3) a clustering of preferences for unhealthy drinks and sedentary behaviour. Boys had a higher cluster score than girls on all three preference clusters. In addition, physical activity-related parenting practices were negatively related to unhealthy preference clusters and positively to the physical-activity-preference cluster. The next step is to relate our preference clusters to child dietary and activity behaviours, with special attention to gender differences. This may help in the development of interventions aimed at improving children's food and activity preferences. Copyright © 2012 Elsevier Ltd. All rights reserved.
Transforming Graph Data for Statistical Relational Learning
2012-10-01
Jordan, 2003), PLSA (Hofmann, 1999), ? Classification via RMN (Taskar et al., 2003) or SVM (Hasan, Chaoji, Salem, & Zaki, 2006) ? Hierarchical...dimensionality reduction methods such as Principal Component Analysis (PCA), Principal Factor Analysis (PFA), and...clustering algorithm. Journal of the Royal Statistical Society. Series C, Applied statistics, 28, 100–108. Hasan, M. A., Chaoji, V., Salem, S., & Zaki, M
Katayama, K; Sato, T; Arai, T; Amao, H; Ohta, Y; Ozawa, T; Kenyon, P R; Hickson, R E; Tazaki, H
2013-02-01
Simple liquid chromatography-mass spectrometry (LC-MS) was applied to non-targeted metabolic analyses to discover new metabolic markers in animal plasma. Principal component analysis (PCA) and partial least squares-discriminant analysis (PLS-DA) were used to analyse LC-MS multivariate data. PCA clearly generated two separate clusters for artificially induced diabetic mice and healthy control mice. PLS-DA of time-course changes in plasma metabolites of chicks after feeding generated three clusters (pre- and immediately after feeding, 0.5-3 h after feeding and 4 h after feeding). Two separate clusters were also generated for plasma metabolites of pregnant Angus heifers with differing live-weight change profiles (gaining or losing). The accompanying PLS-DA loading plot detailed the metabolites that contribute the most to the cluster separation. In each case, the same highly hydrophilic metabolite was strongly correlated to the group separation. The metabolite was identified as betaine by LC-MS/MS. This result indicates that betaine and its metabolic precursor, choline, may be useful biomarkers to evaluate the nutritional and metabolic status of animals. © 2011 Blackwell Verlag GmbH.
Mishra, K K; Pal, R S; Arunkumar, R; Chandrashekara, C; Jain, S K; Bhatt, J C
2013-06-01
Total phenolics, radical scavenging activity (RSA) on DPPH, ascorbic acid content and chelating activity on Fe(2+) of Pleurotus citrinopileatus, Pleurotus djamor, Pleurotus eryngii, Pleurotus flabellatus, Pleurotus florida, Pleurotus ostreatus, Pleurotus sajor-caju and Hypsizygus ulmarius have been evaluated. The assayed mushrooms contained 3.94-21.67 mg TAE of phenolics, 13.63-69.67% DPPH scavenging activity, 3.76-6.76 mg ascorbic acid and 60.25-82.7% chelating activity. Principal Component Analysis (PCA) revealed that significantly higher total phenolics, RSA on DPPH and growth/day was present in P. eryngii whereas P. citrinopileatus showed higher ascorbic acid and chelating activity. Agglomerative hierarchical clustering analysis revealed that studied mushroom species fall into two clusters; Cluster I included P. djamor, P. eryngii and P. flabellatus, while Cluster II included H. ulmarius, P. sajor-caju, P. citrinopileatus, P. ostreatus and P. florida. Enhanced yield of P. eryngii was achieved on spent compost casing material. Use of casing materials enhanced yield by 21-107% over non-cased substrate. Copyright © 2012 Elsevier Ltd. All rights reserved.
Seasonal and spatial variations of water quality and trophic status in Daya Bay, South China Sea.
Wu, Mei-Lin; Wang, You-Shao; Wang, Yu-Tu; Sun, Fu-Lin; Sun, Cui-Ci; Cheng, Hao; Dong, Jun-De
2016-11-15
Coastal water quality and trophic status are subject to intensive environmental stress induced by human activities and climate change. Quarterly cruises were conducted to identify environmental characteristics in Daya Bay in 2013. Water quality is spatially and temporally dynamic in the bay. Cluster analysis (CA) groups 12 monitoring stations into two clusters. Cluster I consists of stations (S1, S2, S4-S7, S9, and S12) located in the central, eastern, and southern parts of the bay, representing less polluted regions. Cluster II includes stations (S3, S8, S10, and S11) located in the western and northern parts of the bay, indicating the highly polluted regions receiving a high amount of wastewater and freshwater discharge. Principal component analysis (PCA) identified that water quality experiences seasonal changes (summer, winter, and spring-autumn seasons) because of two monsoons in the study area. Eutrophication in the bay is graded as high by Assessment of Estuarine Trophic Status (ASSETS). Copyright © 2016 Elsevier Ltd. All rights reserved.
Conserved DNA motifs in the type II-A CRISPR leader region.
Van Orden, Mason J; Klein, Peter; Babu, Kesavan; Najar, Fares Z; Rajan, Rakhi
2017-01-01
The Clustered Regularly Interspaced Short Palindromic Repeats associated (CRISPR-Cas) systems consist of RNA-protein complexes that provide bacteria and archaea with sequence-specific immunity against bacteriophages, plasmids, and other mobile genetic elements. Bacteria and archaea become immune to phage or plasmid infections by inserting short pieces of the intruder DNA (spacer) site-specifically into the leader-repeat junction in a process called adaptation. Previous studies have shown that parts of the leader region, especially the 3' end of the leader, are indispensable for adaptation. However, a comprehensive analysis of leader ends remains absent. Here, we have analyzed the leader, repeat, and Cas proteins from 167 type II-A CRISPR loci. Our results indicate two distinct conserved DNA motifs at the 3' leader end: ATTTGAG (noted previously in the CRISPR1 locus of Streptococcus thermophilus DGCC7710) and a newly defined CTRCGAG, associated with the CRISPR3 locus of S. thermophilus DGCC7710. A third group with a very short CG DNA conservation at the 3' leader end is observed mostly in lactobacilli. Analysis of the repeats and Cas proteins revealed clustering of these CRISPR components that mirrors the leader motif clustering, in agreement with the coevolution of CRISPR-Cas components. Based on our analysis of the type II-A CRISPR loci, we implicate leader end sequences that could confer site-specificity for the adaptation-machinery in the different subsets of type II-A CRISPR loci.
Conserved DNA motifs in the type II-A CRISPR leader region
Babu, Kesavan; Najar, Fares Z.
2017-01-01
The Clustered Regularly Interspaced Short Palindromic Repeats associated (CRISPR-Cas) systems consist of RNA-protein complexes that provide bacteria and archaea with sequence-specific immunity against bacteriophages, plasmids, and other mobile genetic elements. Bacteria and archaea become immune to phage or plasmid infections by inserting short pieces of the intruder DNA (spacer) site-specifically into the leader-repeat junction in a process called adaptation. Previous studies have shown that parts of the leader region, especially the 3′ end of the leader, are indispensable for adaptation. However, a comprehensive analysis of leader ends remains absent. Here, we have analyzed the leader, repeat, and Cas proteins from 167 type II-A CRISPR loci. Our results indicate two distinct conserved DNA motifs at the 3′ leader end: ATTTGAG (noted previously in the CRISPR1 locus of Streptococcus thermophilus DGCC7710) and a newly defined CTRCGAG, associated with the CRISPR3 locus of S. thermophilus DGCC7710. A third group with a very short CG DNA conservation at the 3′ leader end is observed mostly in lactobacilli. Analysis of the repeats and Cas proteins revealed clustering of these CRISPR components that mirrors the leader motif clustering, in agreement with the coevolution of CRISPR-Cas components. Based on our analysis of the type II-A CRISPR loci, we implicate leader end sequences that could confer site-specificity for the adaptation-machinery in the different subsets of type II-A CRISPR loci. PMID:28392985
Impact of SZ cluster residuals in CMB maps and CMB-LSS cross-correlations
NASA Astrophysics Data System (ADS)
Chen, T.; Remazeilles, M.; Dickinson, C.
2018-06-01
Residual foreground contamination in cosmic microwave background (CMB) maps, such as the residual contamination from thermal Sunyaev-Zeldovich (SZ) effect in the direction of galaxy clusters, can bias the cross-correlation measurements between CMB and large-scale structure optical surveys. It is thus essential to quantify those residuals and, if possible, to null out SZ cluster residuals in CMB maps. We quantify for the first time the amount of SZ cluster contamination in the released Planck 2015 CMB maps through (i) the stacking of CMB maps in the direction of the clusters, and (ii) the computation of cross-correlation power spectra between CMB maps and the SDSS-IV large-scale structure data. Our cross-power spectrum analysis yields a 30σ detection at the cluster scale (ℓ = 1500-2500) and a 39σ detection on larger scales (ℓ = 500-1500) due to clustering of SZ clusters, giving an overall 54σ detection of SZ cluster residuals in the Planck CMB maps. The Planck 2015 NILC CMB map is shown to have 44 ± 4% of thermal SZ foreground emission left in it. Using the 'Constrained ILC' component separation technique, we construct an alternative Planck CMB map, the 2D-ILC map, which is shown to have negligible SZ contamination, at the cost of being slightly more contaminated by Galactic foregrounds and noise. We also discuss the impact of the SZ residuals in CMB maps on the measurement of the ISW effect, which is shown to be negligible based on our analysis.
Zautner, Andreas Erich; Masanta, Wycliffe Omurwa; Tareen, Abdul Malik; Weig, Michael; Lugert, Raimond; Groß, Uwe; Bader, Oliver
2013-11-07
Campylobacter jejuni, the most common bacterial pathogen causing gastroenteritis, shows a wide genetic diversity. Previously, we demonstrated by the combination of multilocus sequence typing (MLST)-based UPGMA-clustering and analysis of 16 genetic markers that twelve different C. jejuni subgroups can be distinguished. Among these are two prominent subgroups. The first subgroup contains the majority of hyperinvasive strains and is characterized by a dimeric form of the chemotaxis-receptor Tlp7(m+c). The second has an extended amino acid metabolism and is characterized by the presence of a periplasmic asparaginase (ansB) and gamma-glutamyl-transpeptidase (ggt). Phyloproteomic principal component analysis (PCA) hierarchical clustering of MALDI-TOF based intact cell mass spectrometry (ICMS) spectra was able to group particular C. jejuni subgroups of phylogenetically related isolates in distinct clusters. In particular, the aforementioned Tlp7(m+c)(+) and ansB+/ggt+ subgroups could be discriminated by PCA. Overlay of ICMS spectra of all isolates led to the identification of characteristic biomarker ions for these specific C. jejuni subgroups. Thus, mass peak shifts can be used to identify the C. jejuni subgroup with an extended amino acid metabolism. Although the PCA hierarchical clustering of ICMS-spectra groups the tested isolates in a different order compared to MLST-based UPGMA-clustering, the isolates of the indicator-groups form predominantly coherent clusters. These clusters reflect phenotypic aspects better than phylogenetic clustering, indicating that the genes corresponding to the biomarker ions are phylogenetically coupled to the tested marker genes. Thus, PCA clustering could be an additional tool for analyzing the relatedness of bacterial isolates.
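The UPGMA clustering mentioned in this abstract can be illustrated with a minimal sketch. The isolate names and pairwise distances below are hypothetical and are not taken from the study; the function `upgma` is an illustrative helper, not the authors' software. At each step the two clusters with the smallest average leaf-to-leaf distance are merged.

```python
def upgma(dist, names):
    """Average-linkage agglomerative clustering (UPGMA).

    dist maps frozenset({a, b}) to the distance between leaves a and b.
    Returns the merge history as (cluster_a, cluster_b, height) tuples.
    """
    clusters = [(name,) for name in names]

    def average_distance(c1, c2):
        # UPGMA distance: mean over all leaf pairs drawn from the two clusters.
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    merges = []
    while len(clusters) > 1:
        pairs = [
            (average_distance(c1, c2), c1, c2)
            for i, c1 in enumerate(clusters)
            for c2 in clusters[i + 1:]
        ]
        d, c1, c2 = min(pairs, key=lambda item: item[0])
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
        merges.append((c1, c2, d / 2))  # node height is half the merge distance
    return merges


# Hypothetical pairwise distances between four isolates (illustration only).
names = ["A", "B", "C", "D"]
dist = {
    frozenset(("A", "B")): 2.0,
    frozenset(("A", "C")): 6.0,
    frozenset(("B", "C")): 6.0,
    frozenset(("A", "D")): 10.0,
    frozenset(("B", "D")): 10.0,
    frozenset(("C", "D")): 10.0,
}
merges = upgma(dist, names)
for left, right, height in merges:
    print(left, right, height)
```

The quadratic search over cluster pairs keeps the sketch short; it is adequate for small distance matrices only.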
A Typology of Students Based on Academic Entitlement
ERIC Educational Resources Information Center
Luckett, Michael; Trocchia, Philip J.; Noel, Noel Mark; Marlin, Dan
2017-01-01
Two hundred ninety-three university business students were surveyed using an academic entitlement (AE) scale updated to include new technologies. Using factor analysis, three components of AE were identified: grade entitlement, behavioral entitlement, and service entitlement. A k-means clustering procedure was then applied to identify four groups…
SELF-ORGANIZING MAPS FOR INTEGRATED ASSESSMENT OF THE MID-ATLANTIC REGION
A new method was developed to perform an environmental assessment for the Mid-Atlantic Region (MAR). This was a combination of the self-organizing map (SOM) neural network and principal component analysis (PCA). The method is capable of clustering ecosystems in terms of envi...
Human mitochondrial MIA40 (CHCHD4) is a component of the Fe-S cluster export machinery.
Murari, Anjaneyulu; Thiriveedi, Venkata Ramana; Mohammad, Fareed; Vengaldas, Viswamithra; Gorla, Madhavi; Tammineni, Prasad; Krishnamoorthy, Thanuja; Sepuri, Naresh Babu V
2015-10-15
Mitochondria play an essential role in synthesis and export of iron-sulfur (Fe-S) clusters to other sections of a cell. Although the mechanism of Fe-S cluster synthesis is well elucidated, information on the identity of the proteins involved in the export pathway is limited. The present study identifies hMIA40 (human mitochondrial intermembrane space import and assembly protein 40), also known as CHCHD4 (coiled-coil-helix-coiled-coil-helix domain-containing 4), as a component of the mitochondrial Fe-S cluster export machinery. hMIA40 is an iron-binding protein with the ability to bind iron in vivo and in vitro. hMIA40 harbours CPC (Cys-Pro-Cys) motif-dependent Fe-S clusters that are sensitive to oxidation. Depletion of hMIA40 results in accumulation of iron in mitochondria concomitant with decreases in the activity and stability of Fe-S-containing cytosolic enzymes. Intriguingly, overexpression of either the mitochondrial export component or the cytosolic Fe-S cluster assembly component does not have any effect on the phenotype of hMIA40-depleted cells. Taken together, our results demonstrate an indispensable role for hMIA40 for the export of Fe-S clusters from mitochondria. © 2015 Authors; published by Portland Press Limited.
Analysis of cytokine release assay data using machine learning approaches.
Xiong, Feiyu; Janko, Marco; Walker, Mindi; Makropoulos, Dorie; Weinstock, Daniel; Kam, Moshe; Hrebien, Leonid
2014-10-01
The possible onset of Cytokine Release Syndrome (CRS) is an important consideration in the development of monoclonal antibody (mAb) therapeutics. In this study, several machine learning approaches are used to analyze CRS data. The analyzed data come from a human blood in vitro assay which was used to assess the potential of mAb-based therapeutics to produce cytokine release similar to that induced by Anti-CD28 superagonistic (Anti-CD28 SA) mAbs. The data contain 7 mAbs and two negative controls, a total of 423 samples coming from 44 donors. Three (3) machine learning approaches were applied in combination to observations obtained from that assay, namely (i) Hierarchical Cluster Analysis (HCA); (ii) Principal Component Analysis (PCA) followed by K-means clustering; and (iii) Decision Tree Classification (DTC). All three approaches were able to identify the treatment that caused the most severe cytokine response. HCA was able to provide information about the expected number of clusters in the data. PCA coupled with K-means clustering allowed classification of treatments sample by sample, and visualizing clusters of treatments. DTC models showed the relative importance of various cytokines such as IFN-γ, TNF-α and IL-10 to CRS. The use of these approaches in tandem provides better selection of parameters for one method based on outcomes from another, and an overall improved analysis of the data through complementary approaches. Moreover, the DTC analysis showed in addition that IL-17 may be correlated with CRS reactions, although this correlation has not yet been corroborated in the literature. Copyright © 2014 Elsevier B.V. All rights reserved.
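The K-means step of approach (ii) can be sketched as follows. This is a minimal illustration of Lloyd's algorithm, assuming the samples have already been projected onto two principal components; the scores and starting centroids are made up and do not come from the assay, and `kmeans` is an illustrative helper rather than the authors' implementation.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm on a list of equal-length tuples.

    `centroids` is a caller-supplied starting guess (a hypothetical choice here).
    Returns (labels, centroids).
    """
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


# Made-up 2-D scores standing in for the first two principal components.
scores = [(0.0, 0.1), (0.2, -0.1), (0.1, 0.0), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
labels, centroids = kmeans(scores, centroids=[scores[0], scores[3]])
print(labels)
print(centroids)
```

The result of K-means depends on the starting centroids, which is one reason a preliminary method such as hierarchical clustering can be useful for choosing the number of clusters.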
Busch, Vincent; Van Stel, Henk F; Schrijvers, Augustinus J P; de Leeuw, Johannes R J
2013-12-04
Recent studies show several health-related behaviors to cluster in adolescents. This has important implications for public health. Interrelated behaviors have been shown to be most effectively targeted by multimodal interventions addressing wider-ranging improvements in lifestyle instead of via separate interventions targeting individual behaviors. However, few previous studies have taken into account a broad, multi-disciplinary range of health-related behaviors and connected these behavioral patterns to health-related outcomes. This paper presents an analysis of the clustering of a broad range of health-related behaviors with relevant demographic factors and several health-related outcomes in adolescents. Self-report questionnaire data were collected from a sample of 2,690 Dutch high school adolescents. Behavioral patterns were deduced via Principal Components Analysis. Subsequently a Two-Step Cluster Analysis was used to identify groups of adolescents with similar behavioral patterns and health-related outcomes. Four distinct behavioral patterns describe the analyzed individual behaviors: (1) risk-prone behavior, (2) bully behavior, (3) problematic screen time use, and (4) sedentary behavior. Subsequent cluster analysis identified four clusters of adolescents. Multi-problem behavior was associated with problematic physical and psychosocial health outcomes, as opposed to those exhibiting relatively few unhealthy behaviors. These associations were relatively independent of demographics such as ethnicity, gender and socio-economic status. The results show that health-related behaviors tend to cluster, indicating that specific behavioral patterns underlie individual health behaviors. In addition, specific patterns of health-related behaviors were associated with specific health outcomes and demographic factors. In general, unhealthy behavior on account of multiple health-related behaviors was associated with both poor psychosocial and physical health. These findings have significant meaning for future public health programs, which should be more tailored with use of such knowledge on behavioral clustering via e.g. Transfer Learning.
2013-01-01
Background: Recent studies show several health-related behaviors to cluster in adolescents. This has important implications for public health. Interrelated behaviors have been shown to be most effectively targeted by multimodal interventions addressing wider-ranging improvements in lifestyle instead of via separate interventions targeting individual behaviors. However, few previous studies have taken into account a broad, multi-disciplinary range of health-related behaviors and connected these behavioral patterns to health-related outcomes. This paper presents an analysis of the clustering of a broad range of health-related behaviors with relevant demographic factors and several health-related outcomes in adolescents. Methods: Self-report questionnaire data were collected from a sample of 2,690 Dutch high school adolescents. Behavioral patterns were deduced via Principal Components Analysis. Subsequently a Two-Step Cluster Analysis was used to identify groups of adolescents with similar behavioral patterns and health-related outcomes. Results: Four distinct behavioral patterns describe the analyzed individual behaviors: (1) risk-prone behavior, (2) bully behavior, (3) problematic screen time use, and (4) sedentary behavior. Subsequent cluster analysis identified four clusters of adolescents. Multi-problem behavior was associated with problematic physical and psychosocial health outcomes, as opposed to those exhibiting relatively few unhealthy behaviors. These associations were relatively independent of demographics such as ethnicity, gender and socio-economic status. Conclusions: The results show that health-related behaviors tend to cluster, indicating that specific behavioral patterns underlie individual health behaviors. In addition, specific patterns of health-related behaviors were associated with specific health outcomes and demographic factors. In general, unhealthy behavior on account of multiple health-related behaviors was associated with both poor psychosocial and physical health. These findings have significant meaning for future public health programs, which should be more tailored with use of such knowledge on behavioral clustering via e.g. Transfer Learning. PMID:24305509
Butaciu, Sinziana; Senila, Marin; Sarbu, Costel; Ponta, Michaela; Tanaselia, Claudiu; Cadar, Oana; Roman, Marius; Radu, Emil; Sima, Mihaela; Frentiu, Tiberiu
2017-04-01
The study proposes a combined model based on diagrams (Gibbs, Piper, Stuyfzand Hydrogeochemical Classification System) and unsupervised statistical approaches (Cluster Analysis, Principal Component Analysis, Fuzzy Principal Component Analysis, Fuzzy Hierarchical Cross-Clustering) to describe natural enrichment of inorganic arsenic and co-occurring species in groundwater in the Banat Plain, southwestern Romania. Speciation of inorganic As (arsenite, arsenate) was performed, and ion concentrations (Na⁺, K⁺, Ca²⁺, Mg²⁺, HCO₃⁻, Cl⁻, F⁻, SO₄²⁻, PO₄³⁻, NO₃⁻), pH, redox potential, conductivity and total dissolved substances were measured. Classical diagrams provided the hydrochemical characterization, while statistical approaches were helpful to establish (i) the natural mechanism of occurrence of As and F⁻ species and the anthropogenic one for NO₃⁻, SO₄²⁻, PO₄³⁻ and K⁺ and (ii) the classification of groundwater based on content of arsenic species. The HCO₃⁻ type of local groundwater and alkaline pH (8.31-8.49) were found to be responsible for the enrichment of arsenic species and occurrence of F⁻, but by different paths. The PO₄³⁻-AsO₄³⁻ ion exchange and water-rock interaction (silicate hydrolysis and desorption from clay) were associated with arsenate enrichment in the oxidizing aquifer. Fuzzy Hierarchical Cross-Clustering was the strongest tool for the rapid simultaneous classification of groundwaters as a function of arsenic content and hydrogeochemical characteristics. The approach indicated the Na⁺-F⁻-pH cluster as a marker for groundwater with naturally elevated As and highlighted which parameters need to be monitored. A chemical conceptual model illustrating the natural and anthropogenic paths and enrichment of As and co-occurring species in the local groundwater, supported by mineralogical analysis of rocks, was established. Copyright © 2016 Elsevier Ltd. All rights reserved.
Software system for data management and distributed processing of multichannel biomedical signals.
Franaszczuk, P J; Jouny, C C
2004-01-01
The presented software is designed for efficient utilization of a cluster of PC computers for signal analysis of multichannel physiological data. The system consists of three main components: 1) a library of input and output procedures, 2) a database storing additional information about location in a storage system, 3) a user interface for selecting data for analysis, choosing programs for analysis, and distributing computing and output data on cluster nodes. The system allows for processing multichannel time series data in multiple binary formats. The description of data format, channels and time of recording is included in separate text files. Definition and selection of multiple channel montages is possible. Epochs for analysis can be selected both manually and automatically. Implementation of new signal processing procedures is possible with minimal programming overhead for the input/output processing and user interface. The number of nodes in the cluster used for computations and the amount of storage can be changed with no major modification to the software. Current implementations include the time-frequency analysis of multiday, multichannel recordings of intracranial EEG of epileptic patients as well as evoked response analyses of repeated cognitive tasks.
NASA Astrophysics Data System (ADS)
Beerenwinkel, Anne; von Arx, Matthias
2017-04-01
For the last three decades, moderate constructivism has become an increasingly prominent perspective in science education. Researchers have defined characteristics of constructivist-oriented science classrooms, but the implementation of such science teaching in daily classroom practice seems difficult. Against this background, we conducted a sub-study within the tri-national research project Quality of Instruction in Physics (QuIP) analysing 60 videotaped physics classes involving a large sample of students (N = 1192) from Finland, Germany and Switzerland in order to investigate the kinds of constructivist components and teaching patterns that can be found in regular classrooms without any intervention. We applied a newly developed coding scheme to capture constructivist facets of science teaching and conducted principal component and cluster analyses to explore which components and patterns were most prominent in the classes observed. Two underlying components were found, resulting in two scales, Structured Knowledge Acquisition and Fostering Autonomy, which describe key aspects of constructivist teaching. Only the first scale was rather well established in the lessons investigated. Classes were clustered based on these scales. The analysis of the different clusters suggested that teaching physics in a structured way combined with fostering students' autonomy contributes to students' motivation. However, our regression models indicated that content knowledge is a more important predictor for students' motivation, and there was no homogeneous pattern for all gender- and country-specific subgroups investigated. The results are discussed in light of recent discussions on the feasibility of constructivism in practice.
Yennurajalingam, Sriram; Williams, Janet L; Chisholm, Gary; Bruera, Eduardo
2016-03-01
Advanced cancer patients frequently experience debilitating symptoms that occur in clusters, but few pharmacological studies have targeted symptom clusters. Our objective was to examine the effects of dexamethasone on symptom clusters in patients with advanced cancer. We reviewed the data from a previous randomized clinical trial to determine the effects of dexamethasone on cancer symptoms. Symptom clusters were identified according to baseline symptoms by using principal component analysis. Correlations and change in the severity of symptom clusters were analyzed after study treatment. A total of 114 participants were included in this study. Three clusters were identified: fatigue/anorexia-cachexia/depression (FAD), sleep/anxiety/drowsiness (SAD), and pain/dyspnea (PD). Changes in severity of FAD and PD significantly correlated over time (at baseline, day 8, and day 15). The FAD cluster was associated with significant improvement in severity at day 8 and day 15, whereas no significant change was observed with the SAD cluster or PD cluster after dexamethasone treatment. The results of this preliminary study suggest significant correlation over time and improvement in the FAD cluster at day 8 and day 15 after treatment with dexamethasone. These findings suggest that fatigue, anorexia-cachexia, and depression may share a common pathophysiologic basis. Further studies are needed to investigate this cluster and target anti-inflammatory therapies. ©AlphaMed Press.
FRONTIER FIELDS CLUSTERS: CHANDRA AND JVLA VIEW OF THE PRE-MERGING CLUSTER MACS J0416.1-2403
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ogrean, G. A.; Weeren, R. J. van; Jones, C.
2015-10-20
Merging galaxy clusters leave long-lasting signatures on the baryonic and non-baryonic cluster constituents, including shock fronts, cold fronts, X-ray substructure, radio halos, and offsets between the dark matter (DM) and the gas components. Using observations from Chandra, the Jansky Very Large Array, the Giant Metrewave Radio Telescope, and the Hubble Space Telescope, we present a multiwavelength analysis of the merging Frontier Fields cluster MACS J0416.1-2403 (z = 0.396), which consists of NE and SW subclusters whose cores are separated on the sky by ∼250 kpc. We find that the NE subcluster has a compact core and hosts an X-ray cavity, yet it is not a cool core. Approximately 450 kpc south–southwest of the SW subcluster, we detect a density discontinuity that corresponds to a compression factor of ∼1.5. The discontinuity was most likely caused by the interaction of the SW subcluster with a less massive structure detected in the lensing maps SW of the subcluster's center. For both the NE and the SW subclusters, the DM and the gas components are well-aligned, suggesting that MACS J0416.1-2403 is a pre-merging system. The cluster also hosts a radio halo, which is unusual for a pre-merging system. The halo has a 1.4 GHz power of (1.3 ± 0.3) × 10²⁴ W Hz⁻¹, which is somewhat lower than expected based on the X-ray luminosity of the cluster if the spectrum of the halo is not ultra-steep. We suggest that we are either witnessing the birth of a radio halo, or have discovered a rare ultra-steep spectrum halo.
Optical polarimetric and near-infrared photometric study of the RCW95 Galactic H II region
NASA Astrophysics Data System (ADS)
Vargas-González, J.; Roman-Lopes, A.; Santos, F. P.; Franco, G. A. P.; Santos, J. F. C.; Maia, F. F. S.; Sanmartim, D.
2018-02-01
We carried out an optical polarimetric study in the direction of the RCW 95 star-forming region in order to probe the sky-projected magnetic field structure by using the distribution of linear polarization segments which seem to be well aligned with the more extended cloud component. A mean polarization angle of θ = 49.8° ± 7.7° was derived. Through the spectral dependence analysis of polarization it was possible to obtain the total-to-selective extinction ratio (RV) by fitting the Serkowski function, resulting in a mean value of RV = 2.93 ± 0.47. The foreground polarization component was estimated and is in agreement with previous studies in this direction of the Galaxy. Further, near-infrared (NIR) images from Vista Variables in the Via Láctea (VVV) survey were collected to improve the study of the stellar population associated with the H II region. The Automated Stellar Cluster Analysis algorithm was employed to derive structural parameters for two clusters in the region, and a set of PAdova and TRieste Stellar Evolution Code (PARSEC) isochrones was superimposed on the decontaminated colour-magnitude diagrams to estimate an age of about 3 Myr for both clusters. Finally, from the NIR photometry study combined with spectra obtained with the Ohio State Infrared Imager and Spectrometer mounted at the Southern Astrophysics Research Telescope we derived the spectral classification of the main ionizing sources in the clusters associated with IRAS 15408-5356 and IRAS 15412-5359, both objects classified as O4V stars.
The fine-scale genetic structure and evolution of the Japanese population
Katsuya, Tomohiro; Kimura, Ryosuke; Nabika, Toru; Isomura, Minoru; Ohkubo, Takayoshi; Tabara, Yasuharu; Yamamoto, Ken; Yokota, Mitsuhiro; Liu, Xuanyao; Saw, Woei-Yuh; Mamatyusupu, Dolikun; Yang, Wenjun; Xu, Shuhua
2017-01-01
The contemporary Japanese populations largely consist of three genetically distinct groups: Hondo, Ryukyu and Ainu. By principal-component analysis, while the three groups can be clearly separated, the Hondo people, comprising 99% of the Japanese, form one almost indistinguishable cluster. To understand fine-scale genetic structure, we applied powerful haplotype-based statistical methods to genome-wide single nucleotide polymorphism data from 1600 Japanese individuals, sampled from eight distinct regions in Japan. We then combined the Japanese data with data from 26 other Asian populations to analyze the shared ancestry and genetic differentiation. We found that the Japanese could be separated into nine genetic clusters in our dataset, showing a marked concordance with geography; and that major components of the ancestry profile of Japanese were from the Korean and Han Chinese clusters. We also detected and dated admixture in the Japanese. While genetic differentiation between Ryukyu and Hondo was suggested to be caused in part by positive selection, genetic differentiation among the Hondo clusters appeared to result principally from genetic drift. Notably, in Asians, we found the possibility that positive selection accentuated genetic differentiation among distant populations but attenuated genetic differentiation among close populations. These findings are significant for studies of human evolution and medical genetics. PMID:29091727
A modified procedure for mixture-model clustering of regional geochemical data
Ellefsen, Karl J.; Smith, David B.; Horton, John D.
2014-01-01
A modified procedure is proposed for mixture-model clustering of regional-scale geochemical data. The key modification is the robust principal component transformation of the isometric log-ratio transforms of the element concentrations. This principal component transformation and the associated dimension reduction are applied before the data are clustered. The principal advantage of this modification is that it significantly improves the stability of the clustering. The principal disadvantage is that it requires subjective selection of the number of clusters and the number of principal components. To evaluate the efficacy of this modified procedure, it is applied to soil geochemical data that comprise 959 samples from the state of Colorado (USA) for which the concentrations of 44 elements are measured. The distributions of element concentrations that are derived from the mixture model and from the field samples are similar, indicating that the mixture model is a suitable representation of the transformed geochemical data. Each cluster and the associated distributions of the element concentrations are related to specific geologic and anthropogenic features. In this way, mixture model clustering facilitates interpretation of the regional geochemical data.
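The isometric log-ratio (ilr) transform that precedes the principal component step can be sketched as follows. This minimal example uses pivot coordinates, which is one common choice of ilr basis and is an assumption here, since the abstract does not say which basis was used. The three-part composition is made up, and the robust principal component transformation and the mixture-model fit are not shown.

```python
import math


def ilr(composition):
    """Isometric log-ratio transform using pivot coordinates.

    Maps a composition with D strictly positive parts to D - 1 real coordinates.
    """
    if any(x <= 0 for x in composition):
        raise ValueError("all parts must be strictly positive")
    logs = [math.log(x) for x in composition]
    coords = []
    for i in range(len(logs) - 1):
        rest = logs[i + 1:]                    # parts that follow part i
        mean_log_rest = sum(rest) / len(rest)  # log of their geometric mean
        scale = math.sqrt(len(rest) / (len(rest) + 1))
        coords.append(scale * (logs[i] - mean_log_rest))
    return coords


# Made-up three-part composition, first in percent and then as fractions.
sample = [10.0, 20.0, 70.0]
in_percent = ilr(sample)
as_fractions = ilr([x / 100 for x in sample])
print(in_percent)
print(as_fractions)
```

Because only ratios between parts enter the transform, rescaling the whole composition (percent versus fractions) leaves the coordinates unchanged up to rounding, which is why such transforms are used before applying Euclidean methods to concentration data.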
Kenzaka, Tsuneaki; Kumabe, Ayako; Kosami, Koki; Matsuoka, Yasufumi; Minami, Kensuke; Ninomiya, Daisuke; Noda, Ayako; Okayama, Masanobu
2017-05-01
To investigate the items that are considered by physicians when making decisions regarding the resumption of oral intake among patients with aspiration pneumonia who have undergone short-term fasting. We surveyed 2490 Japanese hospitals that had internal medicine and respiratory medicine departments. We mailed questionnaires that contained 24 items related to oral intake resumption after aspiration pneumonia to the head of the department at each hospital. Cronbach statistics, principal component analysis and cluster analysis were used to analyze the results. We received responses from 350 hospitals; 89.7% of the respondents answered that they "Strongly agree" that "level of consciousness" is a useful criterion for resuming oral intake. Furthermore, 66%, 66%, 63.4%, 58.5% and 51% of the respondents answered that they "strongly agree" regarding the use of SpO₂, the discretion of the attending physician, body temperature, swallowing function test results, mental state and respiratory rate, respectively. In the cluster analysis, level of consciousness, body temperature, SpO₂, respiratory rate, mental state and the discretion of the attending physician belonged to the first cluster. The second cluster consisted of the patient's request, the family's request, the opinions of the medical staff and non-physician healthcare providers, and performance status. Physicians consider several criteria during decision-making regarding oral intake resumption, which can be assigned to two clusters. Future studies are required to develop generalizable and objective criteria. Geriatr Gerontol Int 2017; 17: 810-818. © 2016 The Authors. Geriatrics & Gerontology International published by John Wiley & Sons Australia, Ltd on behalf of Japan Geriatrics Society.
Huang, Chih-Sheng; Yang, Wen-Yu; Chuang, Chun-Hsiang; Wang, Yu-Kai
2018-01-01
Electroencephalogram (EEG) signals are usually contaminated with various artifacts, such as signal associated with muscle activity, eye movement, and body motion, which have a noncerebral origin. The amplitude of such artifacts is larger than that of the electrical activity of the brain, so they mask the cortical signals of interest, resulting in biased analysis and interpretation. Several blind source separation methods have been developed to remove artifacts from the EEG recordings. However, the iterative process for measuring separation within multichannel recordings is computationally intractable. Moreover, manually excluding the artifact components requires a time-consuming offline process. This work proposes a real-time artifact removal algorithm that is based on canonical correlation analysis (CCA), feature extraction, and the Gaussian mixture model (GMM) to improve the quality of EEG signals. The CCA was used to decompose EEG signals into components followed by feature extraction to extract representative features and GMM to cluster these features into groups to recognize and remove artifacts. The feasibility of the proposed algorithm was demonstrated by effectively removing artifacts caused by blinks, head/body movement, and chewing from EEG recordings while preserving the temporal and spectral characteristics of the signals that are important to cognitive research. PMID:29599950
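The clustering role of the Gaussian mixture model can be illustrated with a minimal sketch of the assignment rule: once a mixture has been fitted, a feature value is assigned to the component with the highest posterior probability. The one-dimensional feature, the two components and all parameter values below are hypothetical; the CCA decomposition, the actual feature set and the model fitting used in the paper are not reproduced.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a univariate normal distribution."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def posterior(x, weights, means, sds):
    """Posterior probability of each mixture component given the value x."""
    joint = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds)]
    total = sum(joint)
    return [j / total for j in joint]


def assign(x, weights, means, sds):
    """Index of the component with the highest posterior probability."""
    post = posterior(x, weights, means, sds)
    return max(range(len(post)), key=post.__getitem__)


# Hypothetical fitted model: component 0 stands for brain-like components,
# component 1 for artifact-like components (illustrative values only).
weights, means, sds = [0.7, 0.3], [1.0, 6.0], [1.0, 2.0]
for value in (0.5, 3.0, 8.0):
    print(value, assign(value, weights, means, sds), posterior(value, weights, means, sds))
```

In a real pipeline the parameters would be estimated from the extracted features, for example with the EM algorithm, and the features would usually be multivariate.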
Giorio, Chiara; Tapparo, Andrea; Dall'Osto, Manuel; Beddows, David C S; Esser-Gietl, Johanna K; Healy, Robert M; Harrison, Roy M
2015-03-17
Positive matrix factorization (PMF) has been applied to single particle ATOFMS spectra collected on a six lane heavily trafficked road in central London (Marylebone Road), which well represents an urban street canyon. PMF analysis successfully extracted 11 factors from mass spectra of about 700,000 particles as a complement to information on particle types (from K-means cluster analysis). The factors were associated with specific sources and represent the contribution of different traffic related components (i.e., lubricating oils, fresh elemental carbon, organonitrogen and aromatic compounds), secondary aerosol locally produced (i.e., nitrate, oxidized organic aerosol and oxidized organonitrogen compounds), urban background together with regional transport (aged elemental carbon and ammonium) and fresh sea spray. An important result from this study is the evidence that rapid chemical processes occur in the street canyon with production of secondary particles from road traffic emissions. These locally generated particles, together with aging processes, dramatically affected aerosol composition producing internally mixed particles. These processes may become important with stagnant air conditions and in countries where gasoline vehicles are predominant and need to be considered when quantifying the impact of traffic emissions.
Hydrochemical and multivariate analysis of groundwater quality in the northwest of Sinai, Egypt.
El-Shahat, M F; Sadek, M A; Salem, W M; Embaby, A A; Mohamed, F A
2017-08-01
The northwestern coast of Sinai is home to many economic activities and development programs, thus evaluation of the potentiality and vulnerability of water resources is important. The present work has been conducted on the groundwater resources of this area for describing the major features of groundwater quality and the principal factors that control salinity evolution. The major ionic content of 39 groundwater samples collected from the Quaternary aquifer shows high coefficients of variation reflecting asymmetry of aquifer recharge. The groundwater samples have been classified into four clusters (using hierarchical cluster analysis); these match the variety of total dissolved solids, water types and ionic orders. The principal component analysis combined the ionic parameters of the studied groundwater samples into two principal components. The first represents about 56% of the whole sample variance, reflecting salinization due to evaporation, leaching, dissolution of marine salts and/or seawater intrusion. The second represents about 15.8%, reflecting dilution with rain water and the El-Salam Canal. Most groundwater samples were not suitable for human consumption and about 41% are suitable for irrigation. However, all groundwater samples are suitable for cattle, and about 69% and 15% are suitable for horses and poultry, respectively.
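The variance percentages quoted for the two principal components follow from the eigenvalues of the correlation (or covariance) matrix: each component's share is its eigenvalue divided by the sum of all eigenvalues. The sketch below uses invented eigenvalues for eight standardized variables, chosen only so that the first two shares match the 56% and 15.8% reported above; they are not the study's eigenvalues.

```python
def explained_variance_ratio(eigenvalues):
    """Share of total variance carried by each principal component."""
    total = sum(eigenvalues)
    return [value / total for value in eigenvalues]


# Invented eigenvalues (illustration only); for eight standardized variables they sum to 8.
eigenvalues = [4.48, 1.264, 0.9, 0.6, 0.4, 0.2, 0.1, 0.056]
ratios = explained_variance_ratio(eigenvalues)
cumulative_first_two = ratios[0] + ratios[1]
print([round(r, 3) for r in ratios])
print(round(cumulative_first_two, 3))
```

With these numbers the two retained components together account for about 71.8% of the variance, which is the sum of the two percentages given in the abstract.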
Hwang, Hyundoo; Barnes, Dawn E; Matsunaga, Yohei; Benian, Guy M; Ono, Shoichiro; Lu, Hang
2016-01-29
The sarcomere, the fundamental unit of muscle contraction, is a highly-ordered complex of hundreds of proteins. Despite decades of genetics work, the functional relationships and the roles of those sarcomeric proteins in animal behaviors remain unclear. In this paper, we demonstrate that optogenetic activation of the motor neurons that induce muscle contraction can facilitate quantitative studies of muscle kinetics in C. elegans. To increase the throughput of the study, we trapped multiple worms in parallel in a microfluidic device and illuminated for photoactivation of channelrhodopsin-2 to induce contractions in body wall muscles. Using image processing, the change in body size was quantified over time. A total of five parameters including rate constants for contraction and relaxation were extracted from the optogenetic assay as descriptors of sarcomere functions. To potentially relate the genes encoding the sarcomeric proteins functionally, a hierarchical clustering analysis was conducted on the basis of those parameters. Because it assesses physiological output different from conventional assays, this method provides a complement to the phenotypic analysis of C. elegans muscle mutants currently performed in many labs; the clusters may provide new insights and drive new hypotheses for functional relationships among the many sarcomere components.
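Extracting a rate constant from a body-size trace can be sketched as follows. The abstract does not state the kinetic model, so this example assumes a single-exponential approach to a known plateau, value(t) = plateau + A·exp(−k·t), and estimates k by a least-squares line through the log-transformed data. The trace is synthetic and `fit_rate_constant` is an illustrative helper, not the authors' analysis code.

```python
import math


def fit_rate_constant(times, values, plateau):
    """Estimate k in value(t) = plateau + A * exp(-k * t).

    Uses ordinary least squares on log(value - plateau) versus time,
    so every value must lie strictly above the plateau.
    """
    ys = [math.log(v - plateau) for v in values]
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(ys) / n
    covariance = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, ys))
    variance = sum((t - t_mean) ** 2 for t in times)
    return -covariance / variance  # the slope of the fitted line is -k


# Synthetic contraction trace: relative body length decays from 1.00 toward 0.90
# with an assumed rate constant of 2.0 per second.
times = [0.1 * i for i in range(20)]
trace = [0.90 + 0.10 * math.exp(-2.0 * t) for t in times]
k = fit_rate_constant(times, trace, plateau=0.90)
print(round(k, 3))
```

With noisy measurements the plateau is usually unknown and a nonlinear fit of all parameters is more appropriate; the log-linear version is shown only because it needs no external libraries.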
Ergatis: a web interface and scalable software system for bioinformatics workflows
Orvis, Joshua; Crabtree, Jonathan; Galens, Kevin; Gussman, Aaron; Inman, Jason M.; Lee, Eduardo; Nampally, Sreenath; Riley, David; Sundaram, Jaideep P.; Felix, Victor; Whitty, Brett; Mahurkar, Anup; Wortman, Jennifer; White, Owen; Angiuoli, Samuel V.
2010-01-01
Motivation: The growth of sequence data has been accompanied by an increasing need to analyze data on distributed computer clusters. The use of these systems for routine analysis requires scalable and robust software for data management of large datasets. Software is also needed to simplify data management and make large-scale bioinformatics analysis accessible and reproducible to a wide class of target users. Results: We have developed a workflow management system named Ergatis that enables users to build, execute and monitor pipelines for computational analysis of genomics data. Ergatis contains preconfigured components and template pipelines for a number of common bioinformatics tasks such as prokaryotic genome annotation and genome comparisons. Outputs from many of these components can be loaded into a Chado relational database. Ergatis was designed to be accessible to a broad class of users and provides a user friendly, web-based interface. Ergatis supports high-throughput batch processing on distributed compute clusters and has been used for data management in a number of genome annotation and comparative genomics projects. Availability: Ergatis is an open-source project and is freely available at http://ergatis.sourceforge.net Contact: jorvis@users.sourceforge.net PMID:20413634
R, GeethaRamani; Balasubramanian, Lakshmi
2018-07-01
Macula segmentation and fovea localization are among the primary tasks in retinal analysis, as these structures are responsible for detailed vision. Existing approaches required segmentation of retinal structures, viz. the optic disc and blood vessels, for this purpose. This work avoids knowledge of other retinal structures and applies data mining techniques to segment the macula. An unsupervised clustering algorithm is exploited for this purpose. Selection of initial cluster centres has a great impact on the performance of clustering algorithms. A heuristic-based clustering in which initial centres are selected based on measures defining the statistical distribution of the data is incorporated in the proposed methodology. The initial phase of the proposed framework includes image cropping, green channel extraction, contrast enhancement and application of mathematical closing. Then, the pre-processed image is subjected to heuristic-based clustering, yielding a binary map. The binary image is post-processed to eliminate unwanted components. Finally, the component with the minimum intensity is finalized as the macula, and its centre constitutes the fovea. The proposed approach outperforms existing works, reporting that 100% of HRF, 100% of DRIVE, 96.92% of DIARETDB0, 97.75% of DIARETDB1, 98.81% of HEI-MED, 90% of STARE and 99.33% of MESSIDOR images satisfy the 1R criterion, a standard adopted for evaluating performance of macula and fovea identification. The proposed system thus helps ophthalmologists identify the macula, thereby making it easier to identify any abnormality present within the macula region. Copyright © 2018 Elsevier B.V. All rights reserved.
Efficient generation of low-energy folded states of a model protein
NASA Astrophysics Data System (ADS)
Gordon, Heather L.; Kwan, Wai Kei; Gong, Chunhang; Larrass, Stefan; Rothstein, Stuart M.
2003-01-01
A number of short simulated annealing runs are performed on a highly-frustrated 46-"residue" off-lattice model protein. We perform, in an iterative fashion, a principal component analysis of the 946 nonbonded interbead distances, followed by two varieties of cluster analyses: hierarchical and k-means clustering. We identify several distinct sets of conformations with reasonably consistent cluster membership. Nonbonded distance constraints are derived for each cluster and are employed within a distance geometry approach to generate many new conformations, previously unidentified by the simulated annealing experiments. Subsequent analyses suggest that these new conformations are members of the parent clusters from which they were generated. Furthermore, several novel, previously unobserved structures with low energy were uncovered, augmenting the ensemble of simulated annealing results, and providing a complete distribution of low-energy states. The computational cost of this approach to generating low-energy conformations is small when compared to the expense of further Monte Carlo simulated annealing runs.
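As a small worked check of the figure quoted above, the count of 946 nonbonded interbead distances for a 46-bead chain can be reproduced if one assumes that "nonbonded" means pairs of beads at least three positions apart along the chain (that is, excluding directly bonded neighbours and next-nearest neighbours). This definition is an assumption made here for illustration; the abstract does not state it. The function name is hypothetical.

```python
from itertools import combinations

def nonbonded_pairs(n_beads, min_separation=3):
    """Return index pairs (i, j), i < j, that are at least `min_separation`
    positions apart along a linear chain of `n_beads` beads."""
    return [
        (i, j)
        for i, j in combinations(range(n_beads), 2)
        if j - i >= min_separation
    ]

pairs = nonbonded_pairs(46)
print(len(pairs))  # 946 = 1035 total pairs - 45 (i, i+1) pairs - 44 (i, i+2) pairs
```

Each conformation could then be represented as a vector of these 946 distances before principal component analysis and clustering.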
NASA Technical Reports Server (NTRS)
Goldenberg, Stanley B.; Houze, Robert A., Jr.; Churchill, Dean D.
1990-01-01
The horizontal precipitation structure of cloud clusters observed over the South China Sea during the Winter Monsoon Experiment (WMONEX) is analyzed using a convective-stratiform technique (CST) developed by Adler and Negri (1988). The technique was modified by altering the method for identifying convective cells in the satellite data, accounting for the extremely cold cloud tops characteristic of the WMONEX region, and modifying the threshold infrared temperature for the boundary of the stratiform rain area. The precipitation analysis was extended to the entire history of the cloud cluster by applying the modified CST to IR imagery from geosynchronous-satellite observations. The ship and aircraft data from the later period of the cluster's lifetime make it possible to check the locations of convective and stratiform precipitation identified by the CST using in situ observations. The extended CST is considered to be effective for determining the climatology of the convective-stratiform structure of tropical cloud clusters.
Spatial Clustering of Occupational Injuries in Communities
Friedman, Lee; Chin, Brian; Madigan, Dana
2015-01-01
Objectives. Using the social-ecological model, we hypothesized that the home residences of injured workers would be clustered predictably and geographically. Methods. We linked health care and publicly available datasets by home zip code for traumatically injured workers in Illinois from 2000 to 2009. We calculated numbers and rates of injuries, determined the spatial relationships, and developed 3 models. Results. Among the 23 200 occupational injuries, 80% of cases were located in 20% of zip codes and clustered in 10 locations. After component analysis, numbers and clusters of injuries correlated directly with immigrants; injury rates inversely correlated with urban poverty. Conclusions. Traumatic occupational injuries were clustered spatially by home location of the affected workers and in a predictable way. This put an inequitable burden on communities and provided evidence for the possible value of community-based interventions for prevention of occupational injuries. Work should be included in health disparities research. Stakeholders should determine whether and how to intervene at the community level to prevent occupational injuries. PMID:25905838
Dual beam organic depth profiling using large argon cluster ion beams
Holzweber, M; Shard, AG; Jungnickel, H; Luch, A; Unger, WES
2014-01-01
Argon cluster sputtering of an organic multilayer reference material consisting of two organic components, 4,4′-bis[N-(1-naphthyl-1-)-N-phenyl-amino]-biphenyl (NPB) and aluminium tris-(8-hydroxyquinolate) (Alq3), materials commonly used in the organic light-emitting diode industry, was carried out using time-of-flight SIMS in dual beam mode. The sample used in this study consists of a ~400-nm-thick NPB matrix with 3-nm marker layers of Alq3 at depths of ~50, 100, 200 and 300 nm. Argon cluster sputtering provides a constant sputter yield throughout the depth profiles, and the sputter yield volumes and depth resolution are presented for Ar-cluster sizes of 630, 820, 1000, 1250 and 1660 atoms at a kinetic energy of 2.5 keV. The effect of cluster size in this material and over this range is shown to be negligible. © 2014 The Authors. Surface and Interface Analysis published by John Wiley & Sons Ltd. PMID:25892830
Study of BenW (n = 1-12) clusters: An electron collision perspective
NASA Astrophysics Data System (ADS)
Modak, Paresh; Kaur, Jaspreet; Antony, Bobby
2017-08-01
This article explores electron scattering cross sections by Beryllium-Tungsten clusters (BenW). Beryllium and tungsten are important elements for plasma facing wall components, especially for the deuterium/tritium phase of ITER and in the recently installed JET. The present study focuses on different electron impact interactions in terms of elastic cross section (Qel), inelastic cross section (Qinel), ionization cross section (Qion), and momentum transfer cross section (Qmtcs) for the first twelve clusters belonging to the BenW family. It also predicts the evolution of the cross section with the size of the cluster. These cross sections are used as an input to model processes in plasma. The ionization cross section presented here is compared with the available reported data. This is the first comprehensive report on cross section data for all the above-mentioned scattering channels, to the best of our knowledge. Such broad analysis of cross section data gives vital insight into the study of local chemistry of electron interactions with BenW (n = 1-12) clusters in plasma.
Yang, Lu; Cheng, Ping; Wang, Jin-Hui; Li, Hong
2017-10-23
This study investigated the volatile flavor compounds and antioxidant properties of the essential oil of chrysanthemums that was extracted from the fresh flowers of 10 taxa of Chrysanthemum morifolium from three species, namely Dendranthema morifolium (Ramat.) Yellow, Dendranthema morifolium (Ramat.) Red, Dendranthema morifolium (Ramat.) Pink, Dendranthema morifolium (Ramat.) White, Pericallis hybrid Blue, Pericallis hybrid Pink, Pericallis hybrid Purple, Bellis perennis Pink, Bellis perennis Yellow, and Bellis perennis White. The antioxidant capacity of the essential oil was assayed by spectrophotometric analysis. The volatile flavor compounds from the fresh flowers were collected using dynamic headspace collection, analyzed using auto thermal desorber-gas chromatography/mass spectrometry, and identified with quantification using the external standard method. The antioxidant activities of Chrysanthemum morifolium were evaluated by DPPH and FRAP assays, and the results showed that the antioxidant activity differed among the samples. The different varieties of fresh Chrysanthemum morifolium flowers were distinguished and classified by fingerprint similarity evaluation, principal component analysis (PCA), and cluster analysis. The results showed that the floral volatile component profiles were significantly different among the different Chrysanthemum morifolium varieties. A total of 36 volatile flavor compounds were identified with eight functional groups: hydrocarbons, terpenoids, aromatic compounds, alcohols, ketones, ethers, aldehydes, and esters. Moreover, the first three principal components (PC1, PC2, and PC3) accounted for 96.509% of the total variance among the Chrysanthemum morifolium samples (55.802%, 30.599%, and 10.108%, respectively). PCA indicated that there were marked differences among Chrysanthemum morifolium varieties. The cluster analysis confirmed the results of the PCA. In conclusion, the results of this study provide a basis for breeding Chrysanthemum cultivars with desirable floral scents, and they further support the view that some plants are promising sources of natural antioxidants.
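The variance figures quoted in this abstract can be checked with a short calculation. The sketch below only re-adds the three reported percentages; it does not reproduce the study's PCA, and the helper name is hypothetical.

```python
# Per-component shares of total variance as reported in the abstract (percent).
explained = {"PC1": 55.802, "PC2": 30.599, "PC3": 10.108}

def cumulative_share(shares):
    """Return the running totals of the explained-variance percentages, in order."""
    totals, running = [], 0.0
    for value in shares:
        running += value
        totals.append(round(running, 3))
    return totals

cumulative = cumulative_share(explained.values())
print(cumulative)  # the last entry is the combined share of PC1-PC3
```

The final running total matches the 96.509% stated for the first three components.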
Sakhteman, Amirhossein; Faridi, Pouya; Daneshamouz, Saeid; Akbarizadeh, Amin Reza; Borhani-Haghighi, Afshin; Mohagheghzadeh, Abdolali
2017-01-01
Herbal oils have been widely used in Iran as medicinal compounds for thousands of years. Chamomile oil is a widely used example of a traditional oil. We remade chamomile oils and tried to modify them using current knowledge and facilities. Six types of oil (traditional and modified) were prepared. Microbial limit tests and physicochemical tests were performed on them. In addition, principal component analysis, hierarchical cluster analysis, and partial least squares discriminant analysis were applied to attenuated total reflectance–infrared spectral data in order to gain insight into the classification pattern of the samples. The results show that modified versions of the chamomile oils (modified Clevenger-type apparatus method and microwave method) can be used with the same content as traditional ones, with less microbial contamination and better physicochemical properties. PMID:28585466
Zargaran, Arman; Sakhteman, Amirhossein; Faridi, Pouya; Daneshamouz, Saeid; Akbarizadeh, Amin Reza; Borhani-Haghighi, Afshin; Mohagheghzadeh, Abdolali
2017-10-01
Herbal oils have been widely used in Iran as medicinal compounds for thousands of years. Chamomile oil is a widely used example of a traditional oil. We remade chamomile oils and tried to modify them using current knowledge and facilities. Six types of oil (traditional and modified) were prepared. Microbial limit tests and physicochemical tests were performed on them. In addition, principal component analysis, hierarchical cluster analysis, and partial least squares discriminant analysis were applied to attenuated total reflectance-infrared spectral data in order to gain insight into the classification pattern of the samples. The results show that modified versions of the chamomile oils (modified Clevenger-type apparatus method and microwave method) can be used with the same content as traditional ones, with less microbial contamination and better physicochemical properties.
Roshan, Abdul-Rahman A; Gad, Haidy A; El-Ahmady, Sherweit H; Khanbash, Mohamed S; Abou-Shoer, Mohamed I; Al-Azizi, Mohamed M
2013-08-14
This work describes a simple model developed for the authentication of monofloral Yemeni Sidr honey using UV spectroscopy together with chemometric techniques of hierarchical cluster analysis (HCA), principal component analysis (PCA), and soft independent modeling of class analogy (SIMCA). The model was constructed using 13 genuine Sidr honey samples and challenged with 25 honey samples of different botanical origins. HCA and PCA were successfully able to present a preliminary clustering pattern to segregate the genuine Sidr samples from the lower priced local polyfloral and non-Sidr samples. The SIMCA model presented a clear demarcation of the samples and was used to identify genuine Sidr honey samples as well as detect admixture with lower priced polyfloral honey by detection limits >10%. The constructed model presents a simple and efficient method of analysis and may serve as a basis for the authentication of other honey types worldwide.
Self-organization in a bimotility mixture of model microswimmers
NASA Astrophysics Data System (ADS)
Agrawal, Adyant; Babu, Sujin B.
2018-02-01
We study the cooperation and segregation dynamics in a bimotility mixture of microorganisms which swim at low Reynolds numbers via periodic deformations along the body. We employ a multiparticle collision dynamics method to simulate a two-component mixture of artificial swimmers, termed Taylor lines, which differ from each other only in propulsion speed. The analysis reveals that the contribution of the slower swimmers towards clustering is, on average, much larger than that of the faster ones. We notice distinctive self-organizing dynamics, depending on the percentage difference in the speed of the two kinds. If this difference is large, the faster ones fragment the clusters of the slower ones in order to reach the boundary and form segregated clusters. In contrast, when it is small, both kinds mix together at first, with the faster ones usually leading the cluster, and then the slower ones gradually slide out, thereby also leading to segregation.
2016-01-01
The aim of this study was to determine how representative wear scars of simulator-tested polyethylene (PE) inserts compare with retrieved PE inserts from total knee replacement (TKR). By means of a nonparametric self-organizing feature map (SOFM), wear scar images of 21 postmortem- and 54 revision-retrieved components were compared with six simulator-tested components that were tested either in displacement or in load control according to ISO protocols. The SOFM network was then trained with the wear scar images of postmortem-retrieved components since those are considered well-functioning at the time of retrieval. Based on this training process, eleven clusters were established, suggesting considerable variability among wear scars despite an uncomplicated loading history inside their hosts. The remaining components (revision-retrieved and simulator-tested) were then assigned to these established clusters. Five out of six simulator components were clustered together, suggesting that the network was able to identify similarities in loading history. However, the simulator-tested components ended up in a cluster at the fringe of the map containing only 10.8% of retrieved components. This may suggest that current ISO testing protocols were not fully representative of this TKR population, and protocols that better resemble patients' gait after TKR containing activities other than walking may be warranted. PMID:27597955
Bowers, Andrew; Saltuklaroglu, Tim; Harkrider, Ashley; Cuellar, Megan
2013-01-01
Background: Constructivist theories propose that articulatory hypotheses about incoming phonetic targets may function to enhance perception by limiting the possibilities for sensory analysis. To provide evidence for this proposal, it is necessary to map ongoing, high-temporal resolution changes in sensorimotor activity (i.e., the sensorimotor μ rhythm) to accurate speech and non-speech discrimination performance (i.e., correct trials). Methods: Sixteen participants (15 female and 1 male) were asked to passively listen to or actively identify speech and tone-sweeps in a two-alternative forced-choice discrimination task while the electroencephalogram (EEG) was recorded from 32 channels. The stimuli were presented at high signal-to-noise ratios (SNRs), at which discrimination accuracy was high (i.e., 80–100%), and at low SNRs producing discrimination performance at chance. EEG data were decomposed using independent component analysis (ICA) and clustered across participants using principal component methods in EEGLAB. Results: ICA revealed left and right sensorimotor µ components for 14/16 and 13/16 participants respectively that were identified on the basis of scalp topography, spectral peaks, and localization to the precentral and postcentral gyri. Time-frequency analysis of left and right lateralized µ component clusters revealed significant (pFDR<.05) suppression in the traditional beta frequency range (13–30 Hz) prior to, during, and following syllable discrimination trials. No significant differences from baseline were found for passive tasks. Tone conditions produced right µ beta suppression following stimulus onset only. For the left µ, significant differences in the magnitude of beta suppression were found for correct speech discrimination trials relative to chance trials following stimulus offset. Conclusions: Findings are consistent with constructivist, internal model theories proposing that early forward motor models generate predictions about likely phonemic units that are then synthesized with incoming sensory cues during active as opposed to passive processing. Future directions and possible translational value for clinical populations in which sensorimotor integration may play a functional role are discussed. PMID:23991030
First principles study of vibrational dynamics of ceria-titania hybrid clusters
NASA Astrophysics Data System (ADS)
Majid, Abdul; Bibi, Maryam
2017-04-01
Density functional theory based calculations were performed to study vibrational properties of ceria, titania, and ceria-titania hybrid clusters. The findings revealed the dominance of vibrations related to oxygen when compared to those of metallic atoms in the clusters. In the case of the hybrid cluster, the softening of normal modes related to exterior oxygen atoms in ceria and softening/hardening of high/low frequency modes related to titania dimers are observed. The results calculated for monomers conform to symmetry predictions according to which three IR and three Raman active modes were detected for TiO2, whereas two IR active and one Raman active modes were observed for CeO2. The comparative analysis indicates that the hybrid cluster CeTiO4 contains simultaneous vibrational fingerprints of the component dimers. The symmetry, nature of vibrations, IR and Raman activity, intensities, and atomic involvement in different modes of the clusters are described in detail. The study points to engineering of CeTiO4 to tailor its properties for technological visible region applications in photocatalytic and electrochemical devices.
Osborne, Peter W; Benoit, Gérard; Laudet, Vincent; Schubert, Michael; Ferrier, David E K
2009-03-01
The ParaHox cluster is the evolutionary sister to the Hox cluster. Like the Hox cluster, the ParaHox cluster displays spatial and temporal regulation of the component genes along the anterior/posterior axis in a manner that correlates with the gene positions within the cluster (a feature called collinearity). The ParaHox cluster is however a simpler system to study because it is composed of only three genes. We provide a detailed analysis of the amphioxus ParaHox cluster and, for the first time in a single species, examine the regulation of the cluster in response to a single developmental signalling molecule, retinoic acid (RA). Embryos treated with either RA or RA antagonist display altered ParaHox gene expression: AmphiGsx expression shifts in the neural tube, and the endodermal boundary between AmphiXlox and AmphiCdx shifts its anterior/posterior position. We identified several putative retinoic acid response elements and in vitro assays suggest some may participate in RA regulation of the ParaHox genes. By comparison to vertebrate ParaHox gene regulation we explore the evolutionary implications. This work highlights how insights into the regulation and evolution of more complex vertebrate arrangements can be obtained through studies of a simpler, unduplicated amphioxus gene cluster.
ERIC Educational Resources Information Center
Moss, S. C.; Hogg, J.
1990-01-01
Principal components analysis was employed on the Adaptive Behavior Scales with scores of 122 older (mean age 63.5) individuals with severe intellectual impairment living in England. The study found the structure of adaptive skills and interpersonal maladaptive behaviors similar to that found for younger retarded adults. Two factors, personal…
The properties of small Ag clusters bound to DNA bases.
Soto-Verdugo, Víctor; Metiu, Horia; Gwinn, Elisabeth
2010-05-21
We study the binding of neutral silver clusters, Ag(n) (n=1-6), to the DNA bases adenine (A), cytosine (C), guanine (G), and thymine (T) and the absorption spectra of the silver cluster-base complexes. Using density functional theory (DFT), we find that the clusters prefer to bind to the doubly bonded ring nitrogens and that binding to T is generally much weaker than to C, G, and A. Ag(3) and Ag(4) make the stronger bonds. Bader charge analysis indicates a mild electron transfer from the base to the clusters for all bases, except T. The donor bases (C, G, and A) bind to the sites on the cluster where the lowest unoccupied molecular orbital has a pronounced protrusion. The site where the cluster binds to the base is controlled by the shape of the higher occupied states of the base. Time-dependent DFT calculations show that different base-cluster isomers may have very different absorption spectra. In particular, we find new excitations in base-cluster molecules, at energies well below those of the isolated components, and with strengths that depend strongly on the orientations of planar clusters with respect to the base planes. Our results suggest that geometric constraints on binding, imposed by designed DNA structures, may be a feasible route to engineering the selection of specific cluster-base assemblies.
The differentiation of camel breeds based on meat measurements using discriminant analysis.
Al-Atiyat, Raed Mahmoud; Suliman, Gamal; AlSuhaibani, Entissar; El-Waziry, Ahmad; Al-Owaimer, Abdullah; Basmaeil, Saeid
2016-06-01
The meat productivity of camels in the tropics is still under investigation to identify a better meat breed or type. Therefore, four one-humped Saudi Arabian (SA) camel breeds, Majaheem, Maghateer, Hamrah, and Safrah, were studied in order to differentiate them from each other based on meat measurements. The measurements were biometrical meat traits measured on six intact males from each breed. The results showed higher values for the Majaheem breed than for the other breeds, except in a few cases such as dressing percentage and rib-eye area. In the differentiation analysis, the most discriminating meat variables were myofibrillar protein index, meat color components (L*, a*, and b*), and cooking loss. Consequently, the Safrah and the Majaheem breeds presented the largest dissimilarity, as evidenced by their multivariate means. The canonical discriminant analysis allowed an additional understanding of the differentiation between breeds. Furthermore, two large clusters were found. One was formed by Hamrah and Maghateer in one group along with Safrah; this classification may place these breeds in one cluster because they are better as meat producers. The Majaheem was clustered alone in another cluster, which might be a result of its being better as a milk producer. Nevertheless, the productivity type of the camel breeds of SA needs further morphological and genetic description.
Raman spectroscopy of normal oral buccal mucosa tissues: study on intact and incised biopsies
NASA Astrophysics Data System (ADS)
Deshmukh, Atul; Singh, S. P.; Chaturvedi, Pankaj; Krishna, C. Murali
2011-12-01
Oral squamous cell carcinoma is among the top 10 malignancies. Optical spectroscopy, including Raman, is being actively pursued as an alternative/adjunct for cancer diagnosis. Earlier studies have demonstrated the feasibility of classifying normal, premalignant, and malignant oral ex vivo tissues. Spectral features showed predominance of lipids and proteins in normal and cancer conditions, respectively, which were attributed to membrane lipids and surface proteins. In view of recent developments in deep tissue Raman spectroscopy, we have recorded Raman spectra from superior and inferior surfaces of 10 normal oral tissues on intact, as well as incised, biopsies after separation of epithelium from connective tissue. Spectral variations and similarities among different groups were explored by unsupervised (principal component analysis) and supervised (linear discriminant analysis, factorial discriminant analysis) methodologies. Clusters of spectra from superior and inferior surfaces of intact tissues show a high overlap, whereas spectra from separated epithelium and connective tissue sections yielded clear clusters, though they also overlap with the clusters of intact tissues. Spectra of all four groups of normal tissues gave exclusive clusters when tested against malignant spectra. Thus, this study demonstrates that spectra recorded from the superior surface of an intact tissue may have contributions from deeper layers, but that this has no bearing on the classification of malignant tissues.
Galaxy clusters in the SDSS Stripe 82 based on photometric redshifts
Durret, F.; Adami, C.; Bertin, E.; ...
2015-06-10
Based on a recent photometric redshift galaxy catalogue, we have searched for galaxy clusters in the Stripe 82 region of the Sloan Digital Sky Survey by applying the Adami & MAzure Cluster FInder (AMACFI). Extensive tests were made to fine-tune the AMACFI parameters and make the cluster detection as reliable as possible. The same method was applied to the Millennium simulation to estimate our detection efficiency and the approximate masses of the detected clusters. Considering all the cluster galaxies (i.e. within a 1 Mpc radius of the cluster to which they belong and with a photoz differing by less than 0.05 from that of the cluster), we stacked clusters in various redshift bins to derive colour-magnitude diagrams and galaxy luminosity functions (GLFs). For each galaxy with absolute magnitude brighter than -19.0 in the r band, we computed the disk and spheroid components by applying SExtractor, and by stacking clusters we determined how the disk-to-spheroid flux ratio varies with cluster redshift and mass. We also detected 3663 clusters in the redshift range 0.1513 and a few 10 14 solar masses. Furthermore, by stacking the cluster galaxies in various redshift bins, we find a clear red sequence in the (g'-r') versus r' colour-magnitude diagrams, and the GLFs are typical of clusters, though with a possible contamination from field galaxies. The morphological analysis of the cluster galaxies shows that the fraction of late-type to early-type galaxies shows an increase with redshift (particularly in high mass clusters) and a decrease with detection level, i.e. cluster mass. From the properties of the cluster galaxies, the majority of the candidate clusters detected here seem to be real clusters with typical cluster properties.
Kalgin, Igor V; Caflisch, Amedeo; Chekmarev, Sergei F; Karplus, Martin
2013-05-23
A new analysis of the 20 μs equilibrium folding/unfolding molecular dynamics simulations of the three-stranded antiparallel β-sheet miniprotein (beta3s) in implicit solvent is presented. The conformation space is reduced in dimensionality by introduction of linear combinations of hydrogen bond distances as the collective variables making use of a specially adapted principal component analysis (PCA); i.e., to make structured conformations more pronounced, only the formed bonds are included in determining the principal components. It is shown that a three-dimensional (3D) subspace gives a meaningful representation of the folding behavior. The first component, to which eight native hydrogen bonds make the major contribution (four in each beta hairpin), is found to play the role of the reaction coordinate for the overall folding process, while the second and third components distinguish the structured conformations. The representative points of the trajectory in the 3D space are grouped into conformational clusters that correspond to locally stable conformations of beta3s identified in earlier work. A simplified kinetic network based on the three components is constructed, and it is complemented by a hydrodynamic analysis. The latter, making use of "passive tracers" in 3D space, indicates that the folding flow is much more complex than suggested by the kinetic network. A 2D representation of streamlines shows there are vortices which correspond to repeated local rearrangement, not only around minima of the free energy surface but also in flat regions between minima. The vortices revealed by the hydrodynamic analysis are apparently not evident in folding pathways generated by transition-path sampling. Making use of the fact that the values of the collective hydrogen bond variables are linearly related to the Cartesian coordinate space, the RMSD between clusters is determined. 
Interestingly, the transition rates show an approximate exponential correlation with distance in the hydrogen bond subspace. Comparison with the many published studies shows good agreement with the present analysis for the parts that can be compared, supporting the robust character of our understanding of this "hydrogen atom" of protein folding.
Kinematic gait patterns in healthy runners: A hierarchical cluster analysis.
Phinyomark, Angkoon; Osis, Sean; Hettinga, Blayne A; Ferber, Reed
2015-11-05
Previous studies have demonstrated distinct clusters of gait patterns in both healthy and pathological groups, suggesting that different movement strategies may be represented. However, these studies have used discrete time point variables and usually focused on only one specific joint and plane of motion. Therefore, the first purpose of this study was to determine if running gait patterns for healthy subjects could be classified into homogeneous subgroups using three-dimensional kinematic data from the ankle, knee, and hip joints. The second purpose was to identify differences in joint kinematics between these groups. The third purpose was to investigate the practical implications of clustering healthy subjects by comparing these kinematics with runners experiencing patellofemoral pain (PFP). A principal component analysis (PCA) was used to reduce the dimensionality of the entire gait waveform data and then a hierarchical cluster analysis (HCA) determined group sets of similar gait patterns and homogeneous clusters. The results show two distinct running gait patterns were found with the main between-group differences occurring in frontal and sagittal plane knee angles (P<0.001), independent of age, height, weight, and running speed. When these two groups were compared to PFP runners, one cluster exhibited greater while the other exhibited reduced peak knee abduction angles (P<0.05). The variability observed in running patterns across this sample could be the result of different gait strategies. These results suggest care must be taken when selecting samples of subjects in order to investigate the pathomechanics of injured runners. Copyright © 2015 Elsevier Ltd. All rights reserved.
[A study of Boletus bicolor from different areas using Fourier transform infrared spectrometry].
Zhou, Zai-Jin; Liu, Gang; Ren, Xian-Pei
2010-04-01
It is hard to differentiate samples of the same species of wild-growing mushroom from different areas by macromorphological features. In this paper, Fourier transform infrared (FTIR) spectroscopy combined with principal component analysis was used to identify 58 samples of Boletus bicolor from five different areas. Based on the fingerprint infrared spectra of the Boletus bicolor samples, principal component analysis was conducted on the 58 spectra in the range of 1350-750 cm(-1) using the statistical software SPSS 13.0. According to the result, the first three principal components have a cumulative contribution ratio of 88.87% and thus capture almost all the information in the samples. The two-dimensional projection plot of the first and second principal components shows satisfactory clustering for the classification and discrimination of Boletus bicolor. All Boletus bicolor samples were divided into five groups with a classification accuracy of 98.3%. The study demonstrated that wild-growing Boletus bicolor of the same species from different areas can be identified by FTIR spectra combined with principal component analysis.
Differential gene expression profiles of peripheral blood mononuclear cells in childhood asthma.
Kong, Qian; Li, Wen-Jing; Huang, Hua-Rong; Zhong, Ying-Qiang; Fang, Jian-Pei
2015-05-01
Asthma is a common childhood disease with strong genetic components. This study compared whole-genome expression differences between asthmatic young children and healthy controls to identify gene signatures of childhood asthma. Total RNA extracted from peripheral blood mononuclear cells (PBMC) was subjected to microarray analysis. QRT-PCR was performed to verify the microarray results. Classification and functional characterization of differential genes were illustrated by hierarchical clustering and gene ontology analysis. Multiple logistic regression (MLR) analysis, receiver operating characteristic (ROC) curve analysis, and discriminating power were used to screen for asthma-specific diagnostic markers. For fold-change > 2 and p < 0.05, there were 758 named differential genes. The QRT-PCR results confirmed the array data. Hierarchical clustering divided 29 highly possible genes into seven categories, and the genes in the same cluster were likely to possess similar expression patterns or functions. Gene ontology analysis showed that differential genes were primarily enriched in the biological processes of immune response, response to stress or stimulus, and regulation of apoptosis. MLR and ROC curve analysis revealed that the combination of ADAM33, Smad7, and LIGHT possessed excellent discriminating power. The combination of ADAM33, Smad7, and LIGHT would be a reliable and useful childhood asthma model for prediction and diagnosis.
Power Analysis for Models of Change in Cluster Randomized Designs
ERIC Educational Resources Information Center
Li, Wei; Konstantopoulos, Spyros
2017-01-01
Field experiments in education frequently assign entire groups such as schools to treatment or control conditions. These experiments sometimes incorporate a longitudinal component where, for example, students are followed over time to assess differences in the average rate of linear change, or rate of acceleration. In this study, we provide methods…
Mpc-scale diffuse radio emission in two massive cool-core clusters of galaxies
NASA Astrophysics Data System (ADS)
Sommer, Martin W.; Basu, Kaustuv; Intema, Huib; Pacaud, Florian; Bonafede, Annalisa; Babul, Arif; Bertoldi, Frank
2017-04-01
Radio haloes are diffuse synchrotron sources on scales of ~1 Mpc that are found in merging clusters of galaxies, and are believed to be powered by electrons re-accelerated by merger-driven turbulence. We present measurements of extended radio emission on similarly large scales in two clusters of galaxies hosting cool cores: Abell 2390 and Abell 2261. The analysis is based on interferometric imaging with the Karl G. Jansky Very Large Array, Very Large Array and Giant Metrewave Radio Telescope. We present detailed radio images of the targets, subtract the compact emission components and measure the spectral indices for the diffuse components. The radio emission in A2390 extends beyond a known sloshing-like brightness discontinuity, and has a very steep in-band spectral slope at 1.5 GHz that is similar to some known ultrasteep spectrum radio haloes. The diffuse signal in A2261 is more extended than in A2390 but has lower luminosity. X-ray morphological indicators, derived from XMM-Newton X-ray data, place these clusters in the category of relaxed or regular systems, although some asymmetric features that can indicate past minor mergers are seen in the X-ray brightness images. If these two Mpc-scale radio sources are categorized as giant radio haloes, they question the common assumption of radio haloes occurring exclusively in clusters undergoing violent merging activity, in addition to commonly used criteria for distinguishing between radio haloes and minihaloes.
Koželj, Vesna; Vegnuti, Miljana; Drevenšek, Martina; Hortis-Dzierzbicka, Maria; Gonzalez-Landa, Gonzalo; Hanstein, Siiri; Klimova, Irena; Kobus, Kazimierz; Kobus-Zaleśna, Katarzyna; Semb, Gunvor; Shaw, Bill
2012-11-01
To compare palatal dimensions in 6-year-old children with unilateral cleft lip and palate (UCLP) treated by different protocols with those of noncleft children. Retrospective intercenter outcome study. Patients: Upper dental casts from 129 children with repaired UCLP and 30 controls were analyzed by the trigonometric method. Six European cleft centers. Main outcome measures: Sagittal, transverse, and vertical dimensions of the palate were observed. Palate variables were analyzed with descriptive methods and nonparametric tests. Because many characteristics were measured on a relatively small number of subjects, hierarchical clustering, k-means clustering, and principal component analysis were used. Mean values of the observed dimensions for five cleft groups differed significantly from the control (p < .05). The group with one-stage closure of the cleft differed significantly from all other cleft groups in most variables (p < .05). Principal component analysis of all 159 cases identified three clusters with specific morphologic characteristics of the palate. A similar number of treated children were classified into each cluster, while all children without clefts were classified in the same cluster. The percentage of treated children from a particular group that fit this cluster ranged from 0% to 70% and increased with age at palatal closure and number of primary surgical procedures. At 6 years of age, children with stepwise repair and hard palate closure after the age of 2 years more frequently have palatal dimensions similar to those of noncleft controls than children with earlier palatal closure and one-stage cleft repair.
Worldwide Topology of the Scientific Subject Profile: A Macro Approach in the Country Level
Moya-Anegón, Félix; Herrero-Solana, Víctor
2013-01-01
Background: Models for the production of knowledge and systems of innovation and science are key elements for characterizing a country in view of its scientific thematic profile. With regard to scientific output and publication in journals of international visibility, the countries of the world may be classified into three main groups according to their thematic bias. Methodology/Principal Findings: This paper aims to classify the countries of the world in several broad groups, described in terms of behavioural models that attempt to sum up the characteristics of their systems of knowledge and innovation. We perceive three clusters in our analysis: 1) the biomedical cluster, 2) the basic science & engineering cluster, and 3) the agricultural cluster. The countries are conceptually associated with the clusters via Principal Component Analysis (PCA), and a Multidimensional Scaling (MDS) map with all the countries is presented. Conclusions/Significance: As we have seen, insofar as scientific output and publication in journals of international visibility is concerned, the countries of the world may be classified into three main groups according to their thematic profile. These groups can be described in terms of behavioral models that attempt to sum up the characteristics of their systems of knowledge and innovation. PMID:24349467
Van Cann, Joannes; Virgilio, Massimiliano; Jordaens, Kurt; De Meyer, Marc
2015-01-01
Previous attempts to resolve the Ceratitis FAR complex (Ceratitis fasciventris, Ceratitis anonae, Ceratitis rosa, Diptera, Tephritidae) showed contrasting results and revealed the occurrence of five microsatellite genotypic clusters (A, F1, F2, R1, R2). In this paper we explore the potential of wing morphometrics for the diagnosis of FAR morphospecies and genotypic clusters. We considered a set of 227 specimens previously morphologically identified and genotyped at 16 microsatellite loci. Seventeen wing landmarks and 6 wing band areas were used for morphometric analyses. Permutational multivariate analysis of variance detected significant differences both across morphospecies and genotypic clusters (for both males and females). Unconstrained and constrained ordinations did not properly resolve groups corresponding to morphospecies or genotypic clusters. However, posterior group membership probabilities (PGMPs) of the Discriminant Analysis of Principal Components (DAPC) allowed the consistent identification of a relevant proportion of specimens (but with performances differing across morphospecies and genotypic clusters). This study suggests that wing morphometrics and PGMPs might represent a possible tool for the diagnosis of species within the FAR complex. Here, we propose a tentative diagnostic method and provide a first reference library of morphometric measures that might be used for the identification of additional and unidentified FAR specimens.
NASA Astrophysics Data System (ADS)
Miyamoto, Yuki; Mizoguchi, Asao; Kanamori, Hideto
2017-03-01
The bleaching process in the C-F stretching mode (ν3 band) of CH3F-(ortho-H2)n [n = 0 and 1] clusters in solid para-H2 was monitored using pump and probe laser spectroscopy on the C-H stretching mode (ν1 and 2ν5 bands). From an analysis of the depleted spectral profiles, the transition frequency and linewidth of each cluster were directly determined. The results agree with the values previously derived from a deconvolution analysis of the broadened ν1/2ν5 spectrum observed by FTIR spectroscopy. The complementary increase and decrease between the n = 0 and 1 components were also verified through monitoring the ν1 and 2ν5 bands, which suggests a closed system among the CH3F-(ortho-H2)n clusters. These observations provide experimental verification of the CH3F-(ortho-H2)n cluster model. On the other hand, a trial to observe the bleaching process by pumping the C-H stretching mode was not successful. This result may be important for understanding the dynamics of vibrational relaxation processes in CH3F-(ortho-H2)n in solid para-H2.
Local Prediction Models on Mid-Atlantic Ridge MORB by Principal Component Regression
NASA Astrophysics Data System (ADS)
Ling, X.; Snow, J. E.; Chin, W.
2017-12-01
The isotopic compositions of the daughter isotopes of long-lived radioactive systems (Sr, Nd, Hf and Pb) can be used to map the scale and history of mantle heterogeneities beneath mid-ocean ridges. Our goal is to relate the multidimensional structure in the existing isotopic dataset with an underlying physical reality of mantle sources. The numerical technique of Principal Component Analysis is useful to reduce the linear dependence of the data to a minimum set of orthogonal eigenvectors encapsulating the information contained (cf. Agranier et al., 2005). The dataset used for this study covers almost all the MORBs along the Mid-Atlantic Ridge (MAR), from 54°S to 77°N and 8.8°W to -46.7°W, comprising the published dataset of Agranier et al. (2005) plus 53 basalt samples dredged and analyzed since then (data from PetDB). The principal components PC1 and PC2 account for 61.56% and 29.21%, respectively, of the total variability in the isotope ratios. Samples with compositions similar to HIMU, EM and DM are identified to better understand the PCs. PC1 and PC2 account for HIMU and EM, whereas PC2 has limited control over the DM source. PC3 is more strongly controlled by the depleted mantle source than PC2. This means that all three principal components have a high degree of significance relevant to the established mantle sources. We also tested the relationship between mantle heterogeneity and sample locality. The K-means clustering algorithm is a type of unsupervised learning that finds groups in the data based on feature similarity. The PC factor scores of each sample are clustered into three groups. Clusters one and three alternate along the north and south MAR. Cluster two occurs from 45.18°N to 0.79°N and -27.9°W to -30.40°W, alternating with cluster one. The ridge has been preliminarily divided into 16 sections considering both the clusters and ridge segments. The principal component regression models the section based on 6 isotope ratios and PCs. 
The prediction residual is about 1-2 km. It means that the combined 5 isotopes are a strong predictor of geographic location along the ridge, a slightly surprising result. PCR is a robust and powerful method for both visualizing and manipulating the multidimensional representation of isotope data.
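The workflow this abstract describes (PCA on isotope ratios, then k-means on the PC scores) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic random numbers standing in for six isotope ratios, and the function names `pca` and `kmeans`, the standardization step, and the random seeds are all hypothetical choices made here for illustration.

```python
import numpy as np

def pca(X, n_components):
    """PCA on standardized columns via SVD (standardization is an assumption).

    Returns the sample scores on the leading components and the fraction
    of total variance each of those components explains.
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    variance = s**2 / (len(Z) - 1)
    ratio = variance / variance.sum()
    scores = Z @ Vt[:n_components].T
    return scores, ratio[:n_components]

def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; initial centers are k randomly chosen points."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Distance from every point to every center, then nearest-center labels.
        dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # Recompute centers; keep the old center if a cluster became empty.
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Synthetic stand-in for a samples-by-isotope-ratios table (hypothetical data).
rng = np.random.default_rng(42)
X = rng.normal(size=(90, 6))

scores, explained = pca(X, n_components=3)
labels, centers = kmeans(scores, k=3)
print("explained variance ratios:", np.round(explained, 3))
print("samples per cluster:", np.bincount(labels, minlength=3))
```

On real data, the explained-variance ratios play the role of the 61.56% and 29.21% figures quoted for PC1 and PC2, and the cluster labels could then be compared against sample latitude. K-means depends on its starting centers, so several seeds are worth trying.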
Applications of modern statistical methods to analysis of data in physical science
NASA Astrophysics Data System (ADS)
Wicker, James Eric
Modern methods of statistical and computational analysis offer solutions to dilemmas confronting researchers in physical science. Although the ideas behind modern statistical and computational analysis methods were originally introduced in the 1970's, most scientists still rely on methods written during the early era of computing. These researchers, who analyze increasingly voluminous and multivariate data sets, need modern analysis methods to extract the best results from their studies. The first section of this work showcases applications of modern linear regression. Since the 1960's, many researchers in spectroscopy have used classical stepwise regression techniques to derive molecular constants. However, problems with thresholds of entry and exit for model variables plague this analysis method. Other criticisms of this kind of stepwise procedure include its inefficient searching method, the order in which variables enter or leave the model and problems with overfitting data. We implement an information scoring technique that overcomes the assumptions inherent in the stepwise regression process to calculate molecular model parameters. We believe that this kind of information based model evaluation can be applied to more general analysis situations in physical science. The second section proposes new methods of multivariate cluster analysis. The K-means algorithm and the EM algorithm, introduced in the 1960's and 1970's respectively, formed the basis of multivariate cluster analysis methodology for many years. However, several shortcomings of these methods include strong dependence on initial seed values and inaccurate results when the data seriously depart from hypersphericity. We propose new cluster analysis methods based on genetic algorithms that overcome the strong dependence on initial seed values. 
In addition, we propose a generalization of the Genetic K-means algorithm which can accurately identify clusters with complex hyperellipsoidal covariance structures. We then use this new algorithm in a genetic algorithm based Expectation-Maximization process that can accurately calculate parameters describing complex clusters in a mixture model routine. Using the accuracy of this GEM algorithm, we assign information scores to cluster calculations in order to best identify the number of mixture components in a multivariate data set. We will showcase how these algorithms can be used to process multivariate data from astronomical observations.
NASA Astrophysics Data System (ADS)
Mantini, D.; Alleva, G.; Comani, S.
2005-10-01
Fetal magnetocardiography (fMCG) allows monitoring the fetal heart function through algorithms able to retrieve the fetal cardiac signal, but no standardized automatic model has become available so far. In this paper, we describe an automatic method that restores the fetal cardiac trace from fMCG recordings by means of a weighted summation of fetal components separated with independent component analysis (ICA) and identified through dedicated algorithms that analyse the frequency content and temporal structure of each source signal. Multichannel fMCG datasets of 66 healthy and 4 arrhythmic fetuses were used to validate the automatic method with respect to a classical procedure requiring the manual classification of fetal components by an expert investigator. ICA was run with input clusters of different dimensions to simulate various MCG systems. Detection rates, true negative and false positive component categorization, QRS amplitude, standard deviation and signal-to-noise ratio of reconstructed fetal signals, and real and per cent QRS differences between paired fetal traces retrieved automatically and manually were calculated to quantify the performances of the automatic method. Its robustness and reliability, particularly evident with the use of large input clusters, might increase the diagnostic role of fMCG during the prenatal period.
AGE AND DISTANCE FOR THE OLD OPEN CLUSTER NGC 188 FROM THE ECLIPSING BINARY MEMBER V 12
DOE Office of Scientific and Technical Information (OSTI.GOV)
Meibom, Soeren; Mathieu, Robert D.; Grundahl, Frank
2009-06-15
We present time series radial velocity and photometric observations of a solar-type double-lined eclipsing binary star (V 12) in the old open cluster NGC 188. We use these data to determine the spectroscopic orbit and the photometric elements for V 12. From our analysis, we determine accurate masses (M_p = 1.103 ± 0.007 M⊙, M_s = 1.081 ± 0.007 M⊙) and radii (R_p = 1.424 ± 0.019 R⊙, R_s = 1.373 ± 0.019 R⊙) for the primary (p) and secondary (s) binary components. We adopt a reddening of E(B-V) = 0.087 for NGC 188, and derive component effective temperatures of 5900 ± 100 K and 5875 ± 100 K, respectively, for the primary and secondary stars. From their absolute dimensions, the two components of V 12 yield identical distance moduli of V_0 - M_V = 11.24 ± 0.09, corresponding to 1770 ± 75 pc. Both stars are near the end of their main-sequence evolutionary phase, and are located at the cluster turnoff in the color-magnitude diagram. We determine an age of 6.2 ± 0.2 Gyr for V 12 and NGC 188, from a comparison with theoretical isochrones in the mass-radius diagram. This age is independent of distance, reddening, and color-temperature transformations. We use isochrones from Victoria-Regina (VRSS) and Yonsei-Yale (Y²) with [Fe/H] = -0.1 and [Fe/H] = 0.0. From the solar metallicity isochrones, an age of 6.4 Gyr provides the best fit to the binary components for both sets of models. For the isochrones with [Fe/H] = -0.1, ages of 6.0 Gyr and 5.9 Gyr provide the best fits for the VRSS and Y² models, respectively. We use the distance and age estimates for V 12, together with best estimates for the metallicity and reddening of NGC 188, to investigate the locations of the corresponding VRSS and Y² isochrones relative to cluster members in the color-magnitude diagram. 
Plausible changes in the model metallicity and distance to better match the isochrones to the cluster sequences result in a range of ages for NGC 188 that is more than 3 times that resulting from our analysis of V 12.
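The step from distance modulus to distance in this abstract follows the standard relation d = 10^(μ/5 + 1) pc, where μ is the extinction-corrected distance modulus. The short sketch below applies it to the quoted modulus and propagates the uncertainty to first order. The function name is hypothetical, and the simple linear error propagation is an assumption made here, not necessarily how the authors derived their quoted uncertainty.

```python
import math

def distance_from_modulus(mu, mu_err=0.0):
    """Convert an extinction-corrected distance modulus (mag) to parsecs.

    Uses d = 10**(mu / 5 + 1); the error is propagated to first order
    as sigma_d = d * ln(10) / 5 * sigma_mu.
    """
    d = 10 ** (mu / 5 + 1)
    d_err = d * math.log(10) / 5 * mu_err
    return d, d_err

# Distance modulus quoted in the abstract for V 12 in NGC 188.
d, d_err = distance_from_modulus(11.24, 0.09)
print(f"distance = {d:.0f} pc, first-order uncertainty = {d_err:.0f} pc")
```

The result comes out close to the 1770 pc quoted in the abstract, with a first-order uncertainty of similar size to the quoted 75 pc.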
Wang, Lei; Csallany, A Saari; Kerr, Brian J; Shurson, Gerald C; Chen, Chi
2016-05-18
In this study, the kinetics of aldehyde formation in heated frying oils was characterized by 2-hydrazinoquinoline derivatization, liquid chromatography-mass spectrometry (LC-MS) analysis, principal component analysis (PCA), and hierarchical cluster analysis (HCA). The aldehydes contributing to time-dependent separation of heated soybean oil (HSO) in a PCA model were grouped by the HCA into three clusters (A1, A2, and B) on the basis of their kinetics and fatty acid precursors. The increases of 4-hydroxynonenal (4-HNE) and the A2-to-B ratio in HSO were well-correlated with the duration of thermal stress. Chemometric and quantitative analysis of three frying oils (soybean, corn, and canola oils) and French fry extracts further supported the associations between aldehyde profiles and fatty acid precursors and also revealed that the concentrations of pentanal, hexanal, acrolein, and the A2-to-B ratio in French fry extracts were more comparable to their values in the frying oils than other unsaturated aldehydes. All of these results suggest the roles of specific aldehydes or aldehyde clusters as novel markers of the lipid oxidation status for frying oils or fried foods.
NASA Technical Reports Server (NTRS)
Storrie-Lombardi, Michael C.; Hoover, Richard B.
2005-01-01
Last year we presented techniques for the detection of fossils during robotic missions to Mars using both structural and chemical signatures [Storrie-Lombardi and Hoover, 2004]. Analyses included lossless compression of photographic images to estimate the relative complexity of a putative fossil compared to the rock matrix [Corsetti and Storrie-Lombardi, 2003] and elemental abundance distributions to provide mineralogical classification of the rock matrix [Storrie-Lombardi and Fisk, 2004]. We presented a classification strategy employing two exploratory classification algorithms (Principal Component Analysis and Hierarchical Cluster Analysis) and a non-linear stochastic neural network to produce a Bayesian estimate of classification accuracy. We now present an extension of our previous experiments exploring putative fossil forms morphologically resembling cyanobacteria discovered in the Orgueil meteorite. Elemental abundances (C6, N7, O8, Na11, Mg12, Al13, Si14, P15, S16, Cl17, K19, Ca20, Fe26) obtained for both extant cyanobacteria and fossil trilobites produce signatures readily distinguishing them from meteorite targets. When compared to elemental abundance signatures for extant cyanobacteria, Orgueil structures exhibit decreased abundances for C6, N7, Na11, Al13, P15, Cl17, K19, Ca20 and increases in Mg12, S16, Fe26. Diatoms and silicified portions of cyanobacterial sheaths exhibiting high levels of silicon and correspondingly low levels of carbon cluster more closely with terrestrial fossils than with extant cyanobacteria. Compression indices verify that variations in random and redundant textural patterns between perceived forms and the background matrix contribute significantly to morphological visual identification. The results provide a quantitative probabilistic methodology for discriminating putative fossils from the surrounding rock matrix and from extant organisms using both structural and chemical information. 
The techniques described appear applicable to the geobiological analysis of meteoritic samples or in situ exploration of the Mars regolith. Keywords: cyanobacteria, microfossils, Mars, elemental abundances, complexity analysis, multifactor analysis, principal component analysis, hierarchical cluster analysis, artificial neural networks, paleo-biosignatures
The emergence of the galactic stellar mass function from a non-universal IMF in clusters
NASA Astrophysics Data System (ADS)
Dib, Sami; Basu, Shantanu
2018-06-01
We investigate the dependence of a single-generation galactic mass function (SGMF) on variations in the initial stellar mass functions (IMF) of stellar clusters. We show that cluster-to-cluster variations of the IMF lead to a multi-component SGMF where each component in a given mass range can be described by a distinct power-law function. We also show that a dispersion of ≈0.3 M⊙ in the characteristic mass of the IMF, as observed for young Galactic clusters, leads to a low-mass slope of the SGMF that matches the observed Galactic stellar mass function even when the IMFs in the low-mass end of individual clusters are much steeper.
[Study on HPLC fingerprint of Oldenlandia diffusa].
Chen, Yan; Yao, Zhi-Hong; Dai, Yi; Cheng, Hong; Wen, Li-Rong; Zhou, Guang-Xiong; Yao, Xin-Sheng
2012-06-01
To establish the HPLC fingerprint chromatogram of Oldenlandia diffusa, coupled with chemometric methods, for the quality control of multiple batches of medicinal material. The separation was developed on a C18 column (4.6 mm x 250 mm, 5 μm) by gradient elution with acetonitrile-water (both containing 0.1 per thousand (V/V) acetic acid) as mobile phase at a flow rate of 0.8 mL/min, with the detection wavelength at 238 nm and the column temperature at 30 °C. The HPLC fingerprint chromatogram of Oldenlandia diffusa was set up and the main characteristic peaks were identified by comparison with chemical reference substances. The quality of 22 batches of medicinal material was evaluated by similarity assay as well as principal component analysis (PCA) and cluster analysis. The established HPLC fingerprint chromatogram of Oldenlandia diffusa was specific, precise, reproducible and stable. Eleven peaks were chemically identified. The similarity of the 17 batches of Oldenlandia diffusa was obviously higher than that of the 5 batches of adulterants. PCA showed that the 17 batches of Oldenlandia diffusa fell within one domain and the 5 batches of adulterants lay far from that domain. The cluster analysis of the 22 batches of medicinal material showed that the 17 batches of Oldenlandia diffusa were in one cluster while the 5 batches of adulterants were excluded. Further cluster analysis was carried out for the quality consistency of the 17 batches of Oldenlandia diffusa and accordingly they were divided into 4 clusters. With the combination of chemometric methods, the HPLC fingerprint chromatogram provides a method for evaluation of authenticity and quality control of Oldenlandia diffusa, which is favorable for improving the overall quality control of Oldenlandia diffusa.
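The abstract does not say which similarity measure was used. As a hedged illustration only, the sketch below assumes the cosine of the angle between peak-area vectors, compared against the mean profile of the authentic batches. All peak areas are invented toy numbers and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equally long peak-area vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("zero vector has no direction")
    return dot / (norm_a * norm_b)


def mean_profile(batches):
    """Peak-by-peak average over several batches (the reference fingerprint)."""
    return [sum(col) / len(col) for col in zip(*batches)]


# Toy peak areas for four characteristic peaks (hypothetical values).
authentic = [[10, 5, 2, 8], [11, 5, 2, 7], [9, 6, 2, 8]]
adulterant = [1, 9, 8, 1]

reference = mean_profile(authentic)
authentic_scores = [cosine_similarity(batch, reference) for batch in authentic]
adulterant_score = cosine_similarity(adulterant, reference)
print(authentic_scores, adulterant_score)
```

Batches whose peak pattern matches the reference score close to 1, while a sample with a different peak pattern scores much lower, which is the kind of separation the abstract reports between authentic material and adulterants.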
Non Thermal Emission from Clusters of Galaxies: the Importance of a Joint LOFAR/Simbol-X View
NASA Astrophysics Data System (ADS)
Ferrari, C.
2009-05-01
Deep radio observations of galaxy clusters have revealed the existence of diffuse radio sources ("halos" and "relics") related to the presence of relativistic electrons and weak magnetic fields in the intracluster volume. I will outline our current knowledge about the presence and properties of this non-thermal cluster component. Despite the recent progress made in observational and theoretical studies of the non-thermal emission in galaxy clusters, a number of open questions about its origin and its effects on the thermo-dynamical evolution of galaxy clusters need to be answered. I will show the importance of combining galaxy cluster observations by new-generation instruments such as LOFAR and Simbol-X. A deeper knowledge of the non-thermal cluster component, together with statistical studies of radio halos and relics, will allow us to test the current cluster formation scenario and to better constrain the physics of large scale structure evolution.
Nakamura, Kengo; Kuwatani, Tatsu; Kawabe, Yoshishige; Komai, Takeshi
2016-02-01
Tsunami deposits accumulated on the Tohoku coastal area in Japan due to the impact of the Tohoku-oki earthquake. In the study reported in this paper, we applied principal component analysis (PCA) and cluster analysis (CA) to determine the concentrations of heavy metals in tsunami deposits that had been diluted with water or digested using 1 M HCl. The results suggest that the environmental risk is relatively low, evidenced by the following geometric mean concentrations: Pb, 16 mg kg(-1) and 0.003 ml L(-1); As, 1.8 mg kg(-1) and 0.004 ml L(-1); and Cd, 0.17 mg kg(-1) and 0.0001 ml L(-1). CA was performed after outliers were excluded using PCA. The analysis grouped the concentrations of heavy metals for leaching in water and acid. For the acid case, the first cluster contained Ni, Fe, Cd, Cu, Al, Cr, Zn, and Mn; while the second contained Pb, Sb, As, and Mo. For water, the first cluster contained Ni, Fe, Al, and Cr; and the second cluster contained Mo, Sb, As, Cu, Zn, Pb, and Mn. Statistical analysis revealed that the typical toxic elements, As, Pb, and Cd have steady correlations for acid leaching but are relatively sparse for water leaching. Pb and As from the tsunami deposits seemed to reveal a kind of redox elution mechanism using 1 M HCl. Copyright © 2015 The Authors. Published by Elsevier Ltd.. All rights reserved.
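The geometric mean quoted in this abstract is the exponential of the mean log-concentration, which damps the influence of a few very high samples. A minimal sketch follows; the sample values are hypothetical and are not taken from the study.

```python
import math


def geometric_mean(values):
    """exp(mean(ln x)); defined only for strictly positive values."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean needs a non-empty list of positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical concentrations in mg/kg, with one high outlier.
samples = [8.0, 12.0, 15.0, 20.0, 95.0]
print(geometric_mean(samples), sum(samples) / len(samples))
```

For skewed data like these the geometric mean lies below the arithmetic mean, which is why it is often preferred for environmental concentrations.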
Nijman, Henk; Simpson, Alan; Jones, Julia
2010-01-01
Background Conflict (aggression, substance use, absconding, etc.) and containment (coerced medication, manual restraint, etc.) threaten the safety of patients and staff on psychiatric wards. Previous work has suggested that staff variables may be significant in explaining differences between wards in their rates of these behaviours, and that structure (ward organisation, rules and daily routines) might be the most critical of these. This paper describes the exploration of a large dataset to assess the relationship between structure and other staff variables. Methods A multivariate cross-sectional design was utilised. Data were collected from staff on 136 acute psychiatric wards in 26 NHS Trusts in England, measuring leadership, teamwork, structure, burnout and attitudes towards difficult patients. Relationships between these variables were explored through principal components analysis (PCA), structural equation modelling and cluster analysis. Results Principal components analysis resulted in the identification of each questionnaire as a separate factor, indicating that the selected instruments assessed a number of non-overlapping items relevant for ward functioning. Structural equation modelling suggested a linear model in which leadership influenced teamwork; teamwork influenced structure; structure influenced burnout; and burnout influenced feelings about difficult patients. Finally, cluster analysis identified two significantly distinct groups of wards: a larger group, which had particularly good leadership, teamwork, structure and attitudes towards patients and low burnout; and a smaller group, which was poor on all variables and high on burnout. The better functioning cluster of wards had significantly lower rates of containment events. Conclusion The overall performance of staff teams is associated with differing rates of containment on wards. 
Interventions to reduce rates of containment on wards may need to address staff issues at every level, from leadership through to staff attitudes. PMID:20082064
Odoi, Agricola; Wray, Ron; Emo, Marion; Birch, Stephen; Hutchison, Brian; Eyles, John; Abernathy, Tom
2005-01-01
Background Population health planning aims to improve the health of the entire population and to reduce health inequities among population groups. Socioeconomic factors are increasingly being recognized as major determinants of many aspects of health and causes of health inequities. Knowledge of socioeconomic characteristics of neighbourhoods is necessary to identify their unique health needs and enhance identification of socioeconomically disadvantaged populations. Careful integration of this knowledge into health planning activities is necessary to ensure that health planning and service provision are tailored to unique neighbourhood population health needs. In this study, we identify unique neighbourhood socioeconomic characteristics and classify the neighbourhoods based on these characteristics. Principal components analysis (PCA) of 18 socioeconomic variables was used to identify the principal components explaining most of the variation in socioeconomic characteristics across the neighbourhoods. Cluster analysis was used to classify neighbourhoods based on their socioeconomic characteristics. Results Results of the PCA and cluster analysis were similar but the latter were more objective and easier to interpret. Five neighbourhood types with distinguishing socioeconomic and demographic characteristics were identified. The methodology provides a more complete picture of the neighbourhood socioeconomic characteristics than when a single variable (e.g. income) is used to classify neighbourhoods. Conclusion Cluster analysis is useful for generating neighbourhood population socioeconomic and demographic characteristics that can be useful in guiding neighbourhood health planning and service provision. 
This study is the first of a series of studies designed to investigate health inequalities at the neighbourhood level with a view to providing evidence-base for health planners, service providers and policy makers to help address health inequity issues at the neighbourhood level. Subsequent studies will investigate inequalities in health outcomes both within and across the neighbourhood types identified in the current study. PMID:16092969
Armitage, Emily G; Godzien, Joanna; Peña, Imanol; López-Gonzálvez, Ángeles; Angulo, Santiago; Gradillas, Ana; Alonso-Herranz, Vanesa; Martín, Julio; Fiandor, Jose M; Barrett, Michael P; Gabarro, Raquel; Barbas, Coral
2018-05-18
A lack of viable hits, increasing resistance, and limited knowledge on mode of action is hindering drug discovery for many diseases. To optimize prioritization and accelerate the discovery process, a strategy to cluster compounds based on more than chemical structure is required. We show the power of metabolomics in comparing effects on metabolism of 28 different candidate treatments for Leishmaniasis (25 from the GSK Leishmania box, two analogues of Leishmania box series, and amphotericin B as a gold standard treatment), tested in the axenic amastigote form of Leishmania donovani. Capillary electrophoresis-mass spectrometry was applied to identify the metabolic profile of Leishmania donovani, and principal components analysis was used to cluster compounds on potential mode of action, offering a medium throughput screening approach in drug selection/prioritization. The comprehensive and sensitive nature of the data has also made detailed effects of each compound obtainable, providing a resource to assist in further mechanistic studies and prioritization of these compounds for the development of new antileishmanial drugs.
Dynamic fractals in spatial evolutionary games
NASA Astrophysics Data System (ADS)
Kolotev, Sergei; Malyutin, Aleksandr; Burovski, Evgeni; Krashakov, Sergei; Shchur, Lev
2018-06-01
We investigate critical properties of a spatial evolutionary game based on the Prisoner's Dilemma. Simulations demonstrate a jump in the component densities accompanied by drastic changes in average sizes of the component clusters. We argue that the cluster boundary is a random fractal. Our simulations are consistent with the fractal dimension of the boundary being equal to 2, and the cluster boundaries are hence asymptotically space filling as the system size increases.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jee, M. James; Ng, Karen Y.; Hughes, John P.
2014-04-10
We present a Hubble Space Telescope weak-lensing study of the merging galaxy cluster 'El Gordo' (ACT-CL J0102–4915) at z = 0.87 discovered by the Atacama Cosmology Telescope (ACT) collaboration as the strongest Sunyaev-Zel'dovich decrement in its ∼1000 deg$^2$ survey. Our weak-lensing analysis confirms that ACT-CL J0102–4915 is indeed an extreme system consisting of two massive (≳ $10^{15}\,M_\odot$ each) subclusters with a projected separation of ∼0.7 $h_{70}^{-1}$ Mpc. This binary mass structure revealed by our lensing study is consistent with the cluster galaxy distribution and the dynamical study carried out with 89 spectroscopic members. We estimate the mass of ACT-CL J0102–4915 by simultaneously fitting two axisymmetric Navarro-Frenk-White (NFW) profiles allowing their centers to vary. We use only a single parameter for the NFW mass profile by enforcing the mass-concentration relation from numerical simulations. Our Markov-Chain-Monte-Carlo analysis shows that the masses of the northwestern (NW) and the southeastern (SE) components are $M_{200c}=(1.38\pm0.22)\times10^{15}\,h_{70}^{-1}\,M_\odot$ and $(0.78\pm0.20)\times10^{15}\,h_{70}^{-1}\,M_\odot$, respectively, where the quoted errors include only 1σ statistical uncertainties determined by the finite number of source galaxies. These mass estimates are subject to additional uncertainties (20%-30%) due to the possible presence of triaxiality, correlated/uncorrelated large scale structure, and departure of the cluster profile from the NFW model. The lensing-based velocity dispersions are $1133_{-61}^{+58}$ km s$^{-1}$ and $1064_{-66}^{+62}$ km s$^{-1}$ for the NW and SE components, respectively, which are consistent with their spectroscopic measurements (1290 ± 134 km s$^{-1}$ and 1089 ± 200 km s$^{-1}$, respectively). The centroids of both components are tightly constrained (∼4'') and close to the optical luminosity centers. 
The X-ray and mass peaks are spatially offset by ∼8'' (∼62 $h_{70}^{-1}$ kpc), which is significant at the ∼2σ confidence level. The mass peak, however, does not lead the gas peak in the direction expected if we are viewing the cluster soon after first core passage during a high speed merger. Under the assumption that the merger is happening in the plane of the sky, extrapolation of the two NFW halos to a radius $r_{200a}=2.4\,h_{70}^{-1}$ Mpc yields a combined mass of $M_{200a}=(3.13\pm0.56)\times10^{15}\,h_{70}^{-1}\,M_\odot$. This extrapolated total mass is consistent with our two-component-based dynamical analysis and previous X-ray measurements, projecting ACT-CL J0102–4915 to be the most massive cluster at z > 0.6 known to date.
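For readers unfamiliar with the NFW profile fitted in this study, the standard textbook form is sketched below. The symbols $\rho_s$, $r_s$ and $c_{200}$ are the usual scale density, scale radius and concentration; they are notation assumed here, since the abstract itself does not define them.

```latex
\rho(r) = \frac{\rho_s}{(r/r_s)\,\left(1 + r/r_s\right)^{2}},
\qquad
c_{200} \equiv \frac{r_{200}}{r_s},
\qquad
M_{200} = \frac{4\pi}{3}\,200\,\rho_{\rm ref}\,r_{200}^{3}
        = 4\pi\,\rho_s\,r_s^{3}\left[\ln\!\left(1 + c_{200}\right) - \frac{c_{200}}{1 + c_{200}}\right]
```

Here $\rho_{\rm ref}$ is the critical density for $M_{200c}$ and the mean matter density for $M_{200a}$. Once a mass-concentration relation $c_{200}(M_{200})$ is imposed, the profile of each halo is fixed by the mass alone, which is how a single free parameter per halo arises.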
Vlasov Simulation of Electrostatic Solitary Structures in Multi-Component Plasmas
NASA Technical Reports Server (NTRS)
Umeda, Takayuki; Ashour-Abdalla, Maha; Pickett, Jolene S.; Goldstein, Melvyn L.
2012-01-01
Electrostatic solitary structures have been observed in the Earth's magnetosheath by the Cluster spacecraft. Recent theoretical work has suggested that these solitary structures are modeled by electron acoustic solitary waves existing in a four-component plasma system consisting of core electrons, two counter-streaming electron beams, and one species of background ions. In this paper, the excitation of electron acoustic waves and the formation of solitary structures are studied by means of a one-dimensional electrostatic Vlasov simulation. The present result first shows that either electron acoustic solitary waves with negative potential or electron phase-space holes with positive potential are excited in four-component plasma systems. However, these electrostatic solitary structures have longer duration times and higher wave amplitudes than the solitary structures observed in the magnetosheath. The result indicates that a high-speed and small free energy source may be needed as a fifth component. An additional simulation of a five-component plasma consisting of a stable four-component plasma and a weak electron beam shows the generation of small and fast electron phase-space holes by the bump-on-tail instability. The physical properties of the small and fast electron phase-space holes are very similar to those obtained by the previous theoretical analysis. The amplitude and duration time of solitary structures in the simulation are also in agreement with the Cluster observation.
VizieR Online Data Catalog: Photometric analysis of contact binaries (Lapasset+ 1996)
NASA Astrophysics Data System (ADS)
Lapasset, E.; Gomez, M.; Farinas, R.
1996-09-01
We present BV light-curve synthetic analyses of three short-period contact (W UMa) binaries: HY Pavonis (P ≈ 0.35 days), AW Virginis (P ≈ 0.35 days), and BP Velorum (P ≈ 0.26 days). Different possible configurations for a wide range of the mass ratio were explored in each case making use of the Wilson-Devinney code. The photometric parameters of the systems were determined from the synthetic light-curve solutions that best fit the observations. AW Vir has two components of very similar temperatures and therefore the subtype (A or W) remains undetermined. HY Pav and BP Vel are best modeled by W-type configurations and the asymmetries in the light curves are reproduced by introducing cool spots on the more massive secondary components. Although BP Vel lies in the region of the open cluster Cr 173, its distance modulus, in principle, rules it out as a cluster member. (6 data files).
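The membership argument at the end of this abstract rests on the distance modulus, m − M = 5 log10(d / 10 pc). The sketch below shows the conversion in both directions. It neglects interstellar extinction, and the numbers in the usage lines are hypothetical, not values from the catalog.

```python
import math


def distance_modulus(distance_pc: float) -> float:
    """m - M for a source at the given distance in parsecs (no extinction)."""
    if distance_pc <= 0:
        raise ValueError("distance must be positive")
    return 5.0 * math.log10(distance_pc) - 5.0


def distance_pc(modulus: float) -> float:
    """Distance in parsecs implied by a distance modulus m - M (no extinction)."""
    return 10.0 ** (modulus / 5.0 + 1.0)


# Hypothetical comparison of a binary with a cluster along the same line of sight.
binary_modulus = 6.5
cluster_modulus = 8.0
print(distance_pc(binary_modulus), distance_pc(cluster_modulus))
```

A difference of 1.5 mag in distance modulus corresponds to a factor of about two in distance, so a star can lie in the same direction as a cluster and still be well in front of it.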
Xu, Liang; Liu, Haiping; Ma, Yucui; Wu, Cui; Li, Ruiqi; Chao, Zhimao
2018-06-13
The differences in volatile components between male (MFB) and female flower buds (FFB) of Populus × tomentosa were analysed and compared by HS-SPME with GC-MS for the first time. A total of 34 compounds were identified. Hierarchical clustering analysis clearly separated the samples into a male cluster and a female cluster. Both the male and female flower buds showed methyl salicylate (22.83 and 24.09%, respectively) and 2-hydroxy-benzaldehyde (10.05 and 12.41%, respectively) as the main volatile constituents. The content of 2-cyclohexen-1-one, benzyl benzoate, and methyl benzoate in FFB was remarkably higher than in MFB. In contrast, the content of ethyl benzoate in MFB was greater than that in FFB. These results show the characteristic differences between MFB and FFB of P. × tomentosa and enrich the basic studies on dioecious plants.
Cluster Physics with Merging Galaxy Clusters
NASA Astrophysics Data System (ADS)
Molnar, Sandor
Collisions between galaxy clusters provide a unique opportunity to study matter in a parameter space which cannot be explored in our laboratories on Earth. In the standard ΛCDM model, where the total density is dominated by the cosmological constant (Λ) and the matter density by cold dark matter (CDM), structure formation is hierarchical, and clusters grow mostly by merging. Mergers of two massive clusters are the most energetic events in the universe after the Big Bang, hence they provide a unique laboratory to study cluster physics. The two main mass components in clusters behave differently during collisions: the dark matter is nearly collisionless, responding only to gravity, while the gas is subject to pressure forces and dissipation, and shocks and turbulence are developed during collisions. In the present contribution we review the different methods used to derive the physical properties of merging clusters. Different physical processes leave their signatures on different wavelengths, thus our review is based on a multifrequency analysis. In principle, the best way to analyze multifrequency observations of merging clusters is to model them using N-body/HYDRO numerical simulations. We discuss the results of such detailed analyses. New high spatial and spectral resolution ground and space based telescopes will come online in the near future. Motivated by these new opportunities, we briefly discuss methods which will be feasible in the near future in studying merging clusters.
Lo, Kenneth
2011-01-01
Cluster analysis is the automated search for groups of homogeneous observations in a data set. A popular modeling approach for clustering is based on finite normal mixture models, which assume that each cluster is modeled as a multivariate normal distribution. However, the normality assumption that each component is symmetric is often unrealistic. Furthermore, normal mixture models are not robust against outliers; they often require extra components for modeling outliers and/or give a poor representation of the data. To address these issues, we propose a new class of distributions, multivariate t distributions with the Box-Cox transformation, for mixture modeling. This class of distributions generalizes the normal distribution with the more heavy-tailed t distribution, and introduces skewness via the Box-Cox transformation. As a result, this provides a unified framework to simultaneously handle outlier identification and data transformation, two interrelated issues. We describe an Expectation-Maximization algorithm for parameter estimation along with transformation selection. We demonstrate the proposed methodology with three real data sets and simulation studies. Compared with a wealth of approaches including the skew-t mixture model, the proposed t mixture model with the Box-Cox transformation performs favorably in terms of accuracy in the assignment of observations, robustness against model misspecification, and selection of the number of components. PMID:22125375
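The Box-Cox transformation that this abstract combines with t mixtures can be sketched in its textbook one-parameter form. This is an illustration of the transformation only, not of the authors' EM algorithm, and their exact parameterization may differ; the function name is hypothetical.

```python
import math


def box_cox(y: float, lam: float) -> float:
    """Textbook Box-Cox transform: (y**lam - 1) / lam, with log(y) as the lam -> 0 limit."""
    if y <= 0:
        raise ValueError("Box-Cox is defined for positive values only")
    if abs(lam) < 1e-12:
        return math.log(y)
    return (y ** lam - 1.0) / lam


# Right-skewed toy data (hypothetical): smaller lam pulls the long tail in more strongly.
skewed = [1.0, 2.0, 4.0, 8.0, 64.0]
for lam in (1.0, 0.5, 0.0):
    print(lam, [round(box_cox(v, lam), 3) for v in skewed])
```

With lam = 1 the data are only shifted, lam = 0.5 behaves like a square-root transform and lam = 0 is the log transform, so selecting lam per component lets a symmetric t component fit skewed data.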
Lo, Kenneth; Gottardo, Raphael
2012-01-01
Cluster analysis is the automated search for groups of homogeneous observations in a data set. A popular modeling approach for clustering is based on finite normal mixture models, which assume that each cluster is modeled as a multivariate normal distribution. However, the normality assumption that each component is symmetric is often unrealistic. Furthermore, normal mixture models are not robust against outliers; they often require extra components for modeling outliers and/or give a poor representation of the data. To address these issues, we propose a new class of distributions, multivariate t distributions with the Box-Cox transformation, for mixture modeling. This class of distributions generalizes the normal distribution with the more heavy-tailed t distribution, and introduces skewness via the Box-Cox transformation. As a result, this provides a unified framework to simultaneously handle outlier identification and data transformation, two interrelated issues. We describe an Expectation-Maximization algorithm for parameter estimation along with transformation selection. We demonstrate the proposed methodology with three real data sets and simulation studies. Compared with a wealth of approaches including the skew-t mixture model, the proposed t mixture model with the Box-Cox transformation performs favorably in terms of accuracy in the assignment of observations, robustness against model misspecification, and selection of the number of components.
THE GALAXY LUMINOSITY FUNCTIONS DOWN TO $M_R = -10$ IN THE COMA CLUSTER
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yamanoi, Hitomi; Komiyama, Yutaka; Yagi, Masafumi
2012-08-15
We derived the luminosity function (LF) of dwarf galaxies in the Coma Cluster down to $M_R = -10$ at three fields located at the center, intermediate region, and outskirts of the cluster. The LF ($-19 < M_R < -10$) shows no significant differences among the three fields. It shows a clear dip at $M_R \approx -13$ and is composed of two distinct components of different slopes; the bright component with $-19 < M_R < -13$ has a flatter slope than the faint component with $-13 < M_R < -10$, which has a steep slope. The bright component ($-19 < M_R < -13$) consists mostly of red extended galaxies including few blue galaxies whose colors are typical of late-type galaxies. On the other hand, the faint component ($-13 < M_R < -10$) consists largely of point-spread-function-like compact galaxies. We found that both these compact galaxies and some extended galaxies are present in the center while only compact galaxies are seen in the outskirts. In the faint component, the fraction of blue galaxies is larger in the outskirts than in the center. We suggest that the dwarf galaxies in the Coma Cluster, which make up the two components in the LF, are heterogeneous with some different origins.
1988-01-01
We report the organization of the human genes encoding the complement components C4-binding protein (C4BP), C3b/C4b receptor (CR1), decay accelerating factor (DAF), and C3dg receptor (CR2) within the regulator of complement activation (RCA) gene cluster. Using pulsed field gel electrophoresis analysis, these genes have been physically linked and aligned as CR1-CR2-DAF-C4BP in an 800-kb DNA segment. The very tight linkage between the CR1 and the C4BP loci, contrasted with the relatively long DNA distance between these genes, suggests the existence of mechanisms interfering with recombination within the RCA gene cluster. PMID:2450163
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hajian, Amir; Alvarez, Marcelo A.; Bond, J. Richard, E-mail: ahajian@cita.utoronto.ca, E-mail: malvarez@cita.utoronto.ca, E-mail: bond@cita.utoronto.ca
Making mock simulated catalogs is an important component of astrophysical data analysis. Selection criteria for observed astronomical objects are often too complicated to be derived from first principles. However, the existence of an observed group of objects is a well-suited problem for machine learning classification. In this paper we use one-class classifiers to learn the properties of an observed catalog of clusters of galaxies from ROSAT and to pick clusters from mock simulations that resemble the observed ROSAT catalog. We show how this method can be used to study the cross-correlations of thermal Sunyaev-Zel'dovich signals with number density maps of X-ray selected cluster catalogs. The method reduces the bias due to hand-tuning the selection function and is readily scalable to large catalogs with a high-dimensional space of astrophysical features.
Kinematic foot types in youth with equinovarus secondary to hemiplegia.
Krzak, Joseph J; Corcos, Daniel M; Damiano, Diane L; Graf, Adam; Hedeker, Donald; Smith, Peter A; Harris, Gerald F
2015-02-01
Elevated kinematic variability of the foot and ankle segments exists during gait among individuals with equinovarus secondary to hemiplegic cerebral palsy (CP). Clinicians have previously addressed such variability by developing classification schemes to identify subgroups of individuals based on their kinematics. To identify kinematic subgroups among youth with equinovarus secondary to CP using 3-dimensional multi-segment foot and ankle kinematics during locomotion as inputs for principal component analysis (PCA) and K-means cluster analysis. In a single assessment session, multi-segment foot and ankle kinematics using the Milwaukee Foot Model (MFM) were collected in 24 children/adolescents with equinovarus and 20 typically developing children/adolescents. PCA was used as a data reduction technique on 40 variables. K-means cluster analysis was performed on the first six principal components (PCs) which accounted for 92% of the variance of the dataset. The PCs described the location and plane of involvement in the foot and ankle. Five distinct kinematic subgroups were identified using K-means clustering. Participants with equinovarus presented with variable involvement ranging from primary hindfoot or forefoot deviations to deformity that included both segments in multiple planes. This study provides further evidence of the variability in foot characteristics associated with equinovarus secondary to hemiplegic CP. These findings would not have been detected using a single segment foot model. The identification of multiple kinematic subgroups with unique foot and ankle characteristics has the potential to improve treatment since similar patients within a subgroup are likely to benefit from the same intervention(s). Copyright © 2014 Elsevier B.V. All rights reserved.
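The clustering step described here (K-means applied to principal-component scores) can be sketched with Lloyd's algorithm. The PCA step is not shown: the two-dimensional points below are invented stand-ins for PC scores. The deterministic farthest-point initialisation is an assumption made so the toy run is reproducible, not a detail from the study.

```python
def sqdist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_centers(points, k):
    """Farthest-point initialisation: start at points[0], then add the point
    farthest from all centers chosen so far."""
    centers = [list(points[0])]
    while len(centers) < k:
        farthest = max(points, key=lambda p: min(sqdist(p, c) for c in centers))
        centers.append(list(farthest))
    return centers


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm: alternate nearest-center assignment and mean update."""
    centers = init_centers(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: sqdist(p, centers[j])) for p in points]
        if new_labels == labels:        # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:                 # keep the old center if a cluster empties
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Hypothetical scores on two principal components for six participants.
pc_scores = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
labels, centers = kmeans(pc_scores, k=2)
print(labels, centers)
```

Clustering on a handful of PC scores rather than on all 40 original variables keeps the distance computation from being dominated by correlated or noisy measurements.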
Buononato, Elena Viola; De Luca, Daniela; Galeandro, Innocenzo Cataldo; Congedo, Maria Luisa; Cavone, Domenica; Intranuovo, Graziana; Guastadisegno, Chiara Monica; Corrado, Vincenzo; Ferri, Giovanni Maria
2016-06-01
The monitoring of heavy metals in industrialized areas to study their association with different occupational and environmental factors is carried out in different ways. In this study, scalp hair analysis was used for the assessment of exposure to these metals in the industrial city of Taranto, characterized by severe environmental pollution. The highest median values were observed for aluminum, barium, cadmium, lead, mercury, and uranium. Moreover, in the industrial area of Taranto, high levels of barium, cadmium, lead, mercury, nickel, and silver were observed in comparison with other Apulia areas. The risk odds ratios (ORs) for observing values above the 50th percentile were elevated for mercury and fish consumption, uranium and milk consumption, lead and female sex, and aluminum and mineral water consumption. No significant increased risk was observed for occupational activities. In a dendrogram of a cluster analysis, three clusters were observed for the different areas of Taranto (Borgo, San Vito, and Statte). A scree plot and score variables plot underline the presence of two principal components: the first regarding antimony, lead, tin, aluminum and silver; the second regarding mercury and uranium. The observed clusters (Borgo, San Vito, and Statte) showed that lead, antimony, tin, aluminum, and silver made up the main component. The highest values above the 50th percentile of these minerals, especially lead, were observed in the Borgo area. The observed metal concentration in the Borgo area is compatible with the presence in Taranto of a military dockyard and a reported increase of lung cancer risk among residents of that area.
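The odds ratios reported here compare the odds of an above-median metal level between exposed and unexposed groups. The sketch below computes a crude odds ratio from a 2x2 table with a Woolf (log-scale) 95% confidence interval. The counts are hypothetical, and the study may well have used adjusted estimates instead.

```python
import math


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Crude OR for a 2x2 table:
    a = exposed & above median,   b = exposed & below median,
    c = unexposed & above median, d = unexposed & below median."""
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this simple estimator")
    return (a * d) / (b * c)


def woolf_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Approximate confidence interval: exp(ln(OR) +/- z * SE), SE from cell counts."""
    log_or = math.log(odds_ratio(a, b, c, d))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)


# Hypothetical counts: frequent fish eaters vs. others, mercury above/below the median.
or_value = odds_ratio(20, 10, 10, 20)
lower, upper = woolf_ci(20, 10, 10, 20)
print(or_value, lower, upper)
```

An interval that excludes 1 is the usual informal criterion for calling the association elevated.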
Kinematic foot types in youth with equinovarus secondary to hemiplegia
Krzak, Joseph J.; Corcos, Daniel M.; Damiano, Diane L.; Graf, Adam; Hedeker, Donald; Smith, Peter A.; Harris, Gerald F.
2015-01-01
Background Elevated kinematic variability of the foot and ankle segments exists during gait among individuals with equinovarus secondary to hemiplegic cerebral palsy (CP). Clinicians have previously addressed such variability by developing classification schemes to identify subgroups of individuals based on their kinematics. Objective To identify kinematic subgroups among youth with equinovarus secondary to CP using 3-dimensional multi-segment foot and ankle kinematics during locomotion as inputs for principal component analysis (PCA) and K-means cluster analysis. Methods In a single assessment session, multi-segment foot and ankle kinematics using the Milwaukee Foot Model (MFM) were collected in 24 children/adolescents with equinovarus and 20 typically developing children/adolescents. Results PCA was used as a data reduction technique on 40 variables. K-means cluster analysis was performed on the first six principal components (PCs) which accounted for 92% of the variance of the dataset. The PCs described the location and plane of involvement in the foot and ankle. Five distinct kinematic subgroups were identified using K-means clustering. Participants with equinovarus presented with variable involvement ranging from primary hindfoot or forefoot deviations to deformity that included both segments in multiple planes. Conclusion This study provides further evidence of the variability in foot characteristics associated with equinovarus secondary to hemiplegic CP. These findings would not have been detected using a single segment foot model. The identification of multiple kinematic subgroups with unique foot and ankle characteristics has the potential to improve treatment since similar patients within a subgroup are likely to benefit from the same intervention(s). PMID:25467429
NASA Astrophysics Data System (ADS)
Farshadfar, M.; Farshadfar, E.
The present research was conducted to determine the genetic variability of 18 Lucerne cultivars, based on morphological and biochemical markers. The traits studied were plant height, tiller number, biomass, dry yield, dry yield/biomass, dry leaf/dry yield, macro and micro elements, crude protein, dry matter, crude fiber and ash percentage, and SDS-PAGE in seed and leaf samples. Field experiments included 18 plots of two meter rows. Data based on morphological, chemical and SDS-PAGE markers were analyzed using SPSSWIN software and the multivariate statistical procedures cluster analysis (UPGMA) and principal component analysis. Analysis of variance and mean comparison for morphological traits reflected significant differences among genotypes. Genotypes 13 and 15 had the greatest values for most traits. Across characters, the Phenotypic Coefficient of Variation (PCV) ranged from 12.49 to 26.58% and the Genotypic Coefficient of Variation (GCV) ranged from 6.84 to 18.84%. The greatest value of Heritability (Hb) was 0.94 for stem number. Based on morphological traits, the Lucerne genotypes could be classified into four clusters, and 94% of the variance among the genotypes was explained by two principal components. Based on chemical traits, they were classified into five groups, and 73.492% of the variance was explained by four principal components. Dry matter, protein, fiber, P, K, Na, Mg and Zn had higher variance. Based on the SDS-PAGE patterns, all genotypes were classified into three clusters. The greatest genetic distance was between cultivar 10 and the others; therefore, they would be suitable parents in a breeding program.
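The three variability parameters named in this abstract have standard definitions: GCV and PCV are the genotypic and phenotypic standard deviations expressed as a percentage of the trait mean, and broad-sense heritability is the ratio of genotypic to phenotypic variance. The sketch below assumes those textbook definitions; the variance components are hypothetical inputs that would normally come from an analysis of variance.

```python
import math


def gcv(genotypic_variance: float, trait_mean: float) -> float:
    """Genotypic coefficient of variation, in percent."""
    return 100.0 * math.sqrt(genotypic_variance) / trait_mean


def pcv(phenotypic_variance: float, trait_mean: float) -> float:
    """Phenotypic coefficient of variation, in percent."""
    return 100.0 * math.sqrt(phenotypic_variance) / trait_mean


def broad_sense_heritability(genotypic_variance: float, phenotypic_variance: float) -> float:
    """Share of the phenotypic variance attributable to genotype."""
    return genotypic_variance / phenotypic_variance


# Hypothetical trait: mean 50, genotypic variance 16, phenotypic variance 25.
print(gcv(16.0, 50.0), pcv(25.0, 50.0), broad_sense_heritability(16.0, 25.0))
```

Because the phenotypic variance includes the genotypic variance, GCV can never exceed PCV for the same trait, and heritability equals (GCV/PCV) squared.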
Construction of an integrated social vulnerability index in urban areas prone to flash flooding
NASA Astrophysics Data System (ADS)
Aroca-Jimenez, Estefania; Bodoque, Jose Maria; Garcia, Juan Antonio; Diez-Herrero, Andres
2017-09-01
Among the natural hazards, flash flooding is the leading cause of weather-related deaths. Flood risk management (FRM) in this context requires a comprehensive assessment of the social risk component. In this regard, integrated social vulnerability (ISV) can incorporate spatial distribution and contribution and the combined effect of exposure, sensitivity and resilience to total vulnerability, although these components are often disregarded. ISV is defined by the demographic and socio-economic characteristics that condition a population's capacity to cope with, resist and recover from risk and can be expressed as the integrated social vulnerability index (ISVI). This study describes a methodological approach towards constructing the ISVI in urban areas prone to flash flooding in Castilla y León (Castile and León, northern central Spain, 94 223 km², 2 478 376 inhabitants). A hierarchical segmentation analysis (HSA) was performed prior to the principal components analysis (PCA), which helped to overcome the sample size limitation inherent in PCA. ISVI was obtained from weighting vulnerability factors based on the tolerance statistic. In addition, latent class cluster analysis (LCCA) was carried out to identify spatial patterns of vulnerability within the study area. Our results show that the ISVI has high spatial variability. Moreover, the source of vulnerability in each urban area cluster can be identified from LCCA. These findings make it possible to design tailor-made strategies for FRM, thereby increasing the efficiency of plans and policies and helping to reduce the cost of mitigation measures.
A data fusion-based drought index
NASA Astrophysics Data System (ADS)
Azmi, Mohammad; Rüdiger, Christoph; Walker, Jeffrey P.
2016-03-01
Drought and water stress monitoring plays an important role in the management of water resources, especially during periods of extreme climate conditions. Here, a data fusion-based drought index (DFDI) has been developed and analyzed for three different locations of varying land use and climate regimes in Australia. The proposed index comprehensively considers all types of drought through a selection of indices and proxies associated with each drought type. In deriving the proposed index, weekly data from three different data sources (OzFlux Network, Asia-Pacific Water Monitor, and MODIS-Terra satellite) were employed to first derive commonly used individual standardized drought indices (SDIs), which were then grouped using an advanced clustering method. Next, three different multivariate methods (principal component analysis, factor analysis, and independent component analysis) were utilized to aggregate the SDIs located within each group. For the two clusters in which the grouped SDIs best reflected the water availability and vegetation conditions, the variables were aggregated based on an averaging between the standardized first principal components of the different multivariate methods. Then, considering those two aggregated indices as well as the classifications of months (dry/wet months and active/non-active months), the proposed DFDI was developed. Finally, the symbolic regression method was used to derive mathematical equations for the proposed DFDI. The results presented here show that the proposed index has revealed new aspects in water stress monitoring which previous indices were not able to, by simultaneously considering both hydrometeorological and ecological concepts to define the real water stress of the study areas.
Spatial assessment of water quality using chemometrics in the Pearl River Estuary, China
NASA Astrophysics Data System (ADS)
Wu, Meilin; Wang, Youshao; Dong, Junde; Sun, Fulin; Wang, Yutu; Hong, Yiguo
2017-03-01
A cruise was commissioned in the summer of 2009 to evaluate water quality in the Pearl River Estuary (PRE). Chemometrics such as Principal Component Analysis (PCA), Cluster analysis (CA) and Self-Organizing Map (SOM) were employed to identify anthropogenic and natural influences on estuary water quality. The scores of stations in the surface layer in the first principal component (PC1) were related to NH4-N, PO4-P, NO2-N, NO3-N, TP, and Chlorophyll a, while those in the second principal component (PC2) were related to salinity, turbidity, and SiO3-Si. Similarly, the scores of stations in the bottom layers in PC1 were related to PO4-P, NO2-N, NO3-N, and TP, while those in PC2 were related to salinity, Chlorophyll a, NH4-N, and SiO3-Si. Results of the PCA identified the spatial distribution of the surface and bottom water quality, namely the Guangzhou urban reach, Middle reach, and Lower reach of the estuary. Cluster analysis and PCA produced similar results. The self-organizing map delineated the Guangzhou urban reach of the Pearl River, which was mainly influenced by human activities. The middle and lower reaches of the PRE were mainly influenced by the waters in the South China Sea. The information extracted by PCA, CA, and SOM would be very useful to regional agencies in developing a strategy to carry out scientific plans for resource use based on marine system functions.
NASA Astrophysics Data System (ADS)
Lee, S.; Maharani, Y. N.; Ki, S. J.
2015-12-01
Applying the Self-Organizing Map (SOM) to analyze social vulnerability and to recognize resilience within sites is a challenging task. The aim of this study is to propose a computational method to identify sites according to their similarity and to determine the most relevant variables characterizing the social vulnerability in each cluster. For this purpose, SOM is considered an effective platform for the analysis of high-dimensional data. By considering the cluster structure, the characteristics of social vulnerability in the identified sites can be fully understood. In this study, the social vulnerability variable is constructed from 17 variables, i.e. 12 independent variables which represent the socio-economic concepts and 5 dependent variables which represent the damage and losses due to the Merapi eruption in 2010. These variables collectively represent the local situation of the study area, based on fieldwork conducted in September 2013. By using both independent and dependent variables, we can identify whether the social vulnerability is reflected in the actual situation, in this case the 2010 Merapi eruption. However, social vulnerability analysis in local communities involves a number of variables that represent their socio-economic condition, and some of the variables employed in this study might be more or less redundant. Therefore, SOM is used to remove redundant variables by selecting representative variables using the component planes and the correlation coefficients between variables, in order to find the effective sample size. The selected dataset was then clustered according to similarity. Finally, this approach can produce reliable clustering estimates and recognize the most significant variables, and it could be useful for social vulnerability assessment, especially for stakeholders acting as decision makers.
This research was supported by a grant 'Development of Advanced Volcanic Disaster Response System considering Potential Volcanic Risk around Korea' [MPSS-NH-2015-81] from the Natural Hazard Mitigation Research Group, National Emergency Management Agency of Korea. Keywords: Self-organizing map, Component Planes, Correlation coefficient, Cluster analysis, Sites identification, Social vulnerability, Merapi eruption 2010
Zeng, Yanling; Lu, Yang; Chen, Zhao; Tan, Jiawei; Bai, Jie; Li, Pengyue; Wang, Zhixin; Du, Shouying
2018-05-11
Bolbostemma paniculatum is a traditional Chinese medicine (TCM) that shows various therapeutic effects. Owing to its complex chemical composition, few investigations have acquired a comprehensive understanding of the chemical profile of this herb or explained the differences between samples collected from different places. In this study, a strategy based on UPLC tandem LTQ-Orbitrap MSⁿ was established for characterizing chemical components of B. paniculatum. Through a systematic identification strategy, a total of 60 components in B. paniculatum were rapidly separated in 30 min and identified. Then, based on peak intensities of all the characterized components, principal component analysis (PCA) and hierarchical cluster analysis (HCA) were employed to classify 18 batches of B. paniculatum into four groups, which were highly consistent with the four climate types of their places of origin. Five compounds were finally screened out as chemical markers to discriminate the internal quality of B. paniculatum. As the first study to systematically characterize the chemical components of B. paniculatum by UPLC-MSⁿ, the above results could offer essential data for its pharmacological research. The current strategy could also provide a useful reference for future investigations on the discovery of important chemical constituents in TCM, as well as the establishment of quality control and evaluation methods.
Arciniega, Marcelino; Beck, Philipp; Lange, Oliver F.; Groll, Michael; Huber, Robert
2014-01-01
Two clusters of configurations of the main proteolytic subunit β5 were identified by principal component analysis of crystal structures of the yeast proteasome core particle (yCP). The apo-cluster encompasses unliganded species and complexes with nonpeptidic ligands, and the pep-cluster comprises complexes with peptidic ligands. The murine constitutive CP structures conform to the yeast system, with the apo-form settled in the apo-cluster and the PR-957 (a peptidic ligand) complex in the pep-cluster. In striking contrast, the murine immune CP classifies into the pep-cluster in both the apo and the PR-957–liganded species. The two clusters differ essentially by multiple small structural changes and a domain motion enabling enclosure of the peptidic ligand and formation of specific hydrogen bonds in the pep-cluster. The immune CP species is in optimal peptide binding configuration also in its apo form. This favors productive ligand binding and may help to explain the generally increased functional activity of the immunoproteasome. Molecular dynamics simulations of the representative murine species are consistent with the experimentally observed configurations. A comparison of all 28 subunits of the unliganded species with the peptidic liganded forms demonstrates a greatly enhanced plasticity of β5 and suggests specific signaling pathways to other subunits. PMID:24979800
Austin, Peter C
2010-04-22
Multilevel logistic regression models are increasingly being used to analyze clustered data in medical, public health, epidemiological, and educational research. Procedures for estimating the parameters of such models are available in many statistical software packages. There is currently little evidence on the minimum number of clusters necessary to reliably fit multilevel regression models. We conducted a Monte Carlo study to compare the performance of different statistical software procedures for estimating multilevel logistic regression models when the number of clusters was low. We examined procedures available in BUGS, HLM, R, SAS, and Stata. We found that there were qualitative differences in the performance of different software procedures for estimating multilevel logistic models when the number of clusters was low. Among the likelihood-based procedures, estimation methods based on adaptive Gauss-Hermite approximations to the likelihood (glmer in R and xtlogit in Stata) or adaptive Gaussian quadrature (Proc NLMIXED in SAS) tended to have superior performance for estimating variance components when the number of clusters was small, compared to software procedures based on penalized quasi-likelihood. However, only Bayesian estimation with BUGS allowed for accurate estimation of variance components when there were fewer than 10 clusters. For all statistical software procedures, estimation of variance components tended to be poor when there were only five subjects per cluster, regardless of the number of clusters.
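As a rough illustration of the kind of data such a Monte Carlo study generates, the sketch below simulates binary outcomes from a random-intercept logistic model and computes the latent-variable intraclass correlation, sigma_u^2 / (sigma_u^2 + pi^2/3). The function names, parameter values and the choice of a single random intercept are assumptions for illustration; the abstract does not describe the simulation design in this detail.

```python
import math
import random

def simulate_clustered_binary(n_clusters, n_per_cluster, beta0, sigma_u, seed=0):
    """Simulate (cluster_id, outcome) pairs from a random-intercept logistic model."""
    rng = random.Random(seed)
    data = []
    for cluster in range(n_clusters):
        u = rng.gauss(0.0, sigma_u)  # cluster-specific random intercept
        p = 1.0 / (1.0 + math.exp(-(beta0 + u)))  # inverse logit
        for _ in range(n_per_cluster):
            data.append((cluster, 1 if rng.random() < p else 0))
    return data

def latent_icc(sigma_u):
    """Latent-variable ICC: the level-1 logistic residual variance is pi^2 / 3."""
    return sigma_u ** 2 / (sigma_u ** 2 + math.pi ** 2 / 3)

# Hypothetical design: 5 clusters of 5 subjects, the small-sample case discussed above.
data = simulate_clustered_binary(n_clusters=5, n_per_cluster=5, beta0=-0.5, sigma_u=1.0)
print(len(data), round(latent_icc(1.0), 3))
```

With so few clusters, the random-intercept variance is informed by only a handful of draws, which gives some intuition for why variance components are hard to estimate in this setting.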
Hadjithomas, Michalis; Chen, I-Min Amy; Chu, Ken; Ratner, Anna; Palaniappan, Krishna; Szeto, Ernest; Huang, Jinghua; Reddy, T B K; Cimermančič, Peter; Fischbach, Michael A; Ivanova, Natalia N; Markowitz, Victor M; Kyrpides, Nikos C; Pati, Amrita
2015-07-14
In the discovery of secondary metabolites, analysis of sequence data is a promising exploration path that remains largely underutilized due to the lack of computational platforms that enable such a systematic approach on a large scale. In this work, we present IMG-ABC (https://img.jgi.doe.gov/abc), an atlas of biosynthetic gene clusters within the Integrated Microbial Genomes (IMG) system, which is aimed at harnessing the power of "big" genomic data for discovering small molecules. IMG-ABC relies on IMG's comprehensive integrated structural and functional genomic data for the analysis of biosynthetic gene clusters (BCs) and associated secondary metabolites (SMs). SMs and BCs serve as the two main classes of objects in IMG-ABC, each with a rich collection of attributes. A unique feature of IMG-ABC is the incorporation of both experimentally validated and computationally predicted BCs in genomes as well as metagenomes, thus identifying BCs in uncultured populations and rare taxa. We demonstrate the strength of IMG-ABC's focused integrated analysis tools in enabling the exploration of microbial secondary metabolism on a global scale, through the discovery of phenazine-producing clusters for the first time in Alphaproteobacteria. IMG-ABC strives to fill the long-existent void of resources for computational exploration of the secondary metabolism universe; its underlying scalable framework enables traversal of uncovered phylogenetic and chemical structure space, serving as a doorway to a new era in the discovery of novel molecules. IMG-ABC is the largest publicly available database of predicted and experimental biosynthetic gene clusters and the secondary metabolites they produce. The system also includes powerful search and analysis tools that are integrated with IMG's extensive genomic/metagenomic data and analysis tool kits. 
As new research on biosynthetic gene clusters and secondary metabolites is published and more genomes are sequenced, IMG-ABC will continue to expand, with the goal of becoming an essential component of any bioinformatic exploration of the secondary metabolism world. Copyright © 2015 Hadjithomas et al.
Xue, Gang; Song, Wen-qi; Li, Shu-chao
2015-01-01
In order to rapidly identify different brands of fire resistive coating for steel structure in circulation, a new method for the fast discrimination of varieties of fire resistive coating for steel structure by means of near infrared spectroscopy was proposed. A raster scanning near infrared spectroscopy instrument and near infrared diffuse reflectance spectroscopy were applied to collect the spectral curves of different brands of fire resistive coating for steel structure, and the spectral data were preprocessed with standard normal variate transformation (SNV) and Norris second derivative. Principal component analysis (PCA) was applied to the near infrared spectra for cluster analysis. The analysis results showed that the cumulative reliability of PC1 to PC5 was 99.791%. The 3-dimensional plot was drawn with the scores of PC1, PC2 and PC3 X 10, which appeared to provide the best clustering of the varieties of fire resistive coating for steel structure. A total of 150 fire resistive coating samples were divided randomly into a calibration set and a validation set; the calibration set had 125 samples with 25 samples of each variety, and the validation set had 25 samples with 5 samples of each variety. According to the principal component scores of unknown samples, Mahalanobis distance values between each variety and the unknown samples were calculated to discriminate the different varieties. The qualitative analysis model for external verification of unknown samples has a 10% recognition ratio. The results demonstrated that this identification method can be used as a rapid, accurate method to identify the classification of fire resistive coating for steel structure and provide a technical reference for market regulation.
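The discrimination step described above (assigning an unknown sample to the variety with the smallest Mahalanobis distance in principal component score space) can be sketched as follows. This is a minimal two-dimensional version with a separate covariance matrix per variety; the brand names and scores are hypothetical, and the study's actual number of components and covariance treatment are not stated in the abstract.

```python
def mean_and_cov(points):
    """Sample mean and covariance (sxx, sxy, syy) of 2-D points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def mahalanobis_sq(point, mean, cov):
    """Squared Mahalanobis distance, with the 2x2 covariance inverse written out."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det

def classify(point, calibration):
    """Assign `point` to the variety with the smallest Mahalanobis distance."""
    stats = {name: mean_and_cov(pts) for name, pts in calibration.items()}
    return min(stats, key=lambda name: mahalanobis_sq(point, *stats[name]))

# Hypothetical PC1/PC2 scores of calibration samples for two varieties.
calibration = {
    "brand_A": [(1.0, 2.0), (1.2, 2.1), (0.9, 1.8), (1.1, 2.3)],
    "brand_B": [(4.0, 0.5), (4.2, 0.7), (3.9, 0.2), (4.1, 0.6)],
}
print(classify((1.0, 2.0), calibration), classify((4.0, 0.4), calibration))
```

Unlike plain Euclidean distance, the Mahalanobis distance scales each direction by the spread of the calibration class, so elongated clusters in score space are handled more sensibly.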
Polymorphisms and linkage analysis for ICAM-1 and the selectin gene cluster
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vora, D.K.; Rosenbloom, C.L.; Cottingham, R.W.
1994-06-01
Genetic polymorphisms in leukocyte and endothelial cell adhesion molecules may be important variables with regard to susceptibility to multifactorial disease processes that include an inflammatory component. For this reason, polymorphisms were sought for intercellular adhesion molecule-1 (ICAM-1; gene symbol ICAM1) and for the three genes in the selectin cluster, P-selectin, L-selectin, and E-selectin (gene symbols SELP, SELL, and SELE, respectively). Two amino acid polymorphisms were identified for ICAM-1: Gly or Arg at codon 241 and Lys or Glu at codon 469. Dinucleotide repeat polymorphisms were identified in the 3′-untranslated region for ICAM-1 and in intron 9 for P-selectin. Restriction fragment length polymorphisms were found using cDNAs for each of the three selectin genes as probes: E-selectin with BglII, P-selectin with ScaI, and L-selectin with HincII. Linkage analysis was performed for the selectin gene cluster and for ICAM-1 using the CEPH families; ICAM-1 is very tightly linked to the LDL receptor on chromosome 19, and the selectin cluster is linked to markers at chromosome 1q23. 41 refs., 2 tabs.
Kandadai, Venk; Yang, Haodong; Jiang, Ling; Yang, Christopher C; Fleisher, Linda; Winston, Flaura Koplin
2016-05-05
Little is known about the ability of individual stakeholder groups to achieve health information dissemination goals through Twitter. This study aimed to develop and apply methods for the systematic evaluation and optimization of health information dissemination by stakeholders through Twitter. Tweet content from 1790 followers of @SafetyMD (July-November 2012) was examined. User emphasis, a new indicator of Twitter information dissemination, was defined and applied to retweets across two levels of retweeters originating from @SafetyMD. User interest clusters were identified based on principal component analysis (PCA) and hierarchical cluster analysis (HCA) of a random sample of 170 followers. User emphasis of keywords remained across levels but decreased by 9.5 percentage points. PCA and HCA identified 12 statistically unique clusters of followers within the @SafetyMD Twitter network. This study is one of the first to develop methods for use by stakeholders to evaluate and optimize their use of Twitter to disseminate health information. Our new methods provide preliminary evidence that individual stakeholders can evaluate the effectiveness of health information dissemination and create content-specific clusters for more specific targeted messaging.
Li, Chun-Hong; Zuo, Hua-Li; Zhang, Qian; Wang, Feng-Qin; Hu, Yuan-Jia; Qian, Zheng-Ming; Li, Wen-Jia; Xia, Zhi-Ning; Yang, Feng-Qing
2017-01-01
Background: As one of the bioactive components in Cordyceps sinensis (CS), proteins were rarely used as index components to study the correlation between the protein components and producing areas of natural CS. Objective: Protein components of 26 natural CS samples produced in Qinghai, Tibet, and Sichuan provinces were analyzed and compared to investigate the relationship among 26 different producing areas. Materials and Methods: Proteins from 26 different producing areas were extracted by Tris-HCl buffer with Triton X-100, and separated using sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) and two-dimensional electrophoresis (2-DE). Results: The SDS-PAGE results indicated that the number of protein bands and the optical density curves of proteins in the 26 CS samples were slightly different. However, the 2-DE results showed that the numbers and abundance of protein spots in the protein profiles of the 26 samples were obviously different and showed a certain association with producing areas. Conclusions: Based on the expression values of matched protein spots, the 26 batches of CS samples can be divided into two main categories (Tibet and Qinghai) by hierarchical cluster analysis. SUMMARY The number of protein bands and the optical density curves of proteins in 26 Cordyceps sinensis samples were slightly different on the sodium dodecyl sulfate-polyacrylamide gel electrophoresis protein profiles. Numbers and abundance of protein spots in the protein profiles of the 26 samples were obviously different on two-dimensional electrophoresis maps. Natural Cordyceps sinensis samples from twenty-six different producing areas were divided into two main categories (Tibet and Qinghai) by hierarchical cluster analysis based on the values of matched protein spots. Abbreviations Used: SDS-PAGE: Sodium dodecyl sulfate polyacrylamide gel electrophoresis, 2-DE: Two-dimensional electrophoresis, CS: Cordyceps sinensis, TCMs: Traditional Chinese medicines. PMID:28250651
Wang, Chao; Zhang, Chenxia; Kong, Yawen; Peng, Xiaopei; Li, Changwen; Liu, Shunhang; Du, Liping; Xiao, Dongguang; Xu, Yongquan
2017-10-01
Dianhong teas produced from fresh leaves of different tea cultivars (YK is Yunkang No. 10, XY is Xueya 100, CY is Changyebaihao, SS is Shishengmiao) were compared in terms of volatile compounds and descriptive sensory analysis. A total of 73 volatile compounds in 16 tea samples were tentatively identified. YK, XY, CY, and SS contained 55, 53, 49, and 51 volatile compounds, respectively. Partial least squares-discriminant analysis (PLS-DA) and hierarchical cluster analysis (HCA) were used to classify the samples, and 40 key components were selected based on variable importance in the projection. Moreover, 11 flavor attributes, namely, floral, fruity, grass/green, woody, sweet, roasty, caramel, mellow and thick, bitter, astringent, and sweet aftertaste were identified through descriptive sensory analysis (DSA). In general, innate differences among the tea varieties significantly affected the intensities of most of the key sensory attributes of Dianhong teas, possibly because of the different amounts of aroma-active and taste components in Dianhong teas. Copyright © 2017 Elsevier Ltd. All rights reserved.
Anderson, Laura N; Lebovic, Gerald; Hamilton, Jill; Hanley, Anthony J; McCrindle, Brian W; Maguire, Jonathon L; Parkin, Patricia C; Birken, Catherine S
2016-03-01
Obesity has its origins in early childhood; however, there is limited evidence of the association between anthropometric indicators and cardiometabolic risk factors in young children. Our aim was to evaluate the associations between body mass index (BMI) and waist circumference (WC) in relation to cardiometabolic risk factors and to explore the clustering of these factors. A cross-sectional study was conducted in children aged 1-5 years through TARGet Kids! (n = 2917). Logistic regression was used to evaluate associations between BMI and WC z-scores and individual traditional and possible non-traditional cardiometabolic risk factors. The underlying clustering of these measures was evaluated using principal components analysis (PCA). Child obesity (BMI z-score >2) was associated with high (>90th percentile) leptin [odds ratio (OR) 8.15, 95% confidence interval (CI) 4.56, 14.58] and insulin (OR = 1.76; 95% CI 1.05, 2.94). WC z-score >1 was associated with high insulin (OR 1.59, 95% CI 1.11, 2.28), leptin (OR 5.48, 95% CI 3.48, 8.63) and 25-hydroxyvitamin D < 75 nmol/L (OR 1.39, 95% CI 1.08, 1.79). BMI and WC were not associated with other traditional cardiometabolic risk factors, including non-High Density Lipoprotein (HDL) cholesterol, and glucose. Among children 3-5 years (n = 1035) the PCA of traditional risk factors identified three components: adiposity/blood pressure, metabolic, and lipids. The inclusion of non-traditional risk factors identified four additional components but contributed minimally to the total variation explained. Anthropometric indicators are associated with selected cardiometabolic risk factors in early childhood, although the clustering of risk factors suggests that adiposity is only one distinct component of cardiometabolic risk. The measurement of other risk factors beyond BMI and WC may be important in defining cardiometabolic risk in early childhood. © 2015 John Wiley & Sons Ltd.
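The odds ratios above come from logistic regression. As a simpler stand-in, the sketch below computes an unadjusted odds ratio and a Wald-type 95% confidence interval (Woolf's method) from a 2x2 table. This is a swapped-in illustration, not the study's model, and the counts are hypothetical.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted OR and Wald CI from a 2x2 table.

    a: exposed with outcome,   b: exposed without outcome,
    c: unexposed with outcome, d: unexposed without outcome.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under Woolf's approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts: e.g. high insulin among children with and without obesity.
or_, lo, hi = odds_ratio_ci(a=20, b=80, c=10, d=90)
print(round(or_, 2), round(lo, 2), round(hi, 2))
```

An interval that excludes 1 corresponds to a statistically significant association at the chosen level, which is how the intervals quoted in the abstract are usually read.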
The Size Distribution Of Cluster Galaxies
NASA Astrophysics Data System (ADS)
Kuchner, U.; Ziegler, B.; Bamford, S.; Verdugo, M.; Haeussler, B.
2017-06-01
We establish a sample of 560 spectroscopically confirmed cluster members of MACS J1206.2-0847 at z = 0.45 and utilize multi-wavelength and multi-component Sersic profile fitting to provide luminosities and sizes for the key structural components, bulge and disk. While the differences between field and cluster galaxy properties are mostly due to a preference for cluster members to be early-type (quiescent, bulge-dominated), we see evidence for an outer disk fading and a sharp rise in the number of red disks with smaller effective radii at the tidally active cluster region around R200. Even though red disks are already virialized according to their velocity distribution, they are clearly not part of the old population found in the innermost region; they represent an important population of transitional objects in clusters.
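For readers unfamiliar with the fitting function, a Sersic profile gives the surface brightness as I(R) = I_e exp(-b_n [(R/R_e)^(1/n) - 1]). The sketch below evaluates a bulge-plus-disk model using the common approximation b_n ≈ 2n - 1/3. The parameter values are hypothetical, and the indices n = 4 (bulge) and n = 1 (disk) are conventional choices assumed here rather than values taken from the study.

```python
import math

def sersic(r, i_e, r_e, n):
    """Sersic surface brightness at radius r (same units as r_e)."""
    b_n = 2.0 * n - 1.0 / 3.0  # common approximation for the normalisation constant
    return i_e * math.exp(-b_n * ((r / r_e) ** (1.0 / n) - 1.0))

def bulge_plus_disk(r, bulge, disk):
    """Two-component model: sum of a bulge and a disk Sersic profile."""
    return sersic(r, **bulge) + sersic(r, **disk)

# Hypothetical components: de Vaucouleurs-like bulge and exponential disk.
bulge = {"i_e": 5.0, "r_e": 2.0, "n": 4}
disk = {"i_e": 1.0, "r_e": 6.0, "n": 1}
profile = [bulge_plus_disk(r, bulge, disk) for r in (0.5, 2.0, 6.0, 12.0)]
print([round(v, 3) for v in profile])
```

By construction each component equals its I_e at its own effective radius R_e, which is a quick sanity check on any implementation.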
Shan, Mingqiu; Li, Sam Fong Yau; Yu, Sheng; Qian, Yan; Guo, Shuchen; Zhang, Li; Ding, Anwei
2018-01-01
Platycladi cacumen (dried twigs and leaves of Platycladus orientalis (L.) Franco) is a frequently utilized Chinese medicinal herb. To evaluate the quality of the phytomedicine, an ultra-performance liquid chromatographic method with diode array detection was established for chemical fingerprinting and quantitative analysis. In this study, 27 batches of P. cacumen from different regions were collected for analysis. A chemical fingerprint with 20 common peaks was obtained using Similarity Evaluation System for Chromatographic Fingerprint of Traditional Chinese Medicine (Version 2004A). Among these 20 components, seven flavonoids (myricitrin, isoquercitrin, quercitrin, afzelin, cupressuflavone, amentoflavone and hinokiflavone) were identified and determined simultaneously. In the method validation, the seven analytes showed good regressions (R ≥ 0.9995) within linear ranges and good recoveries from 96.4% to 103.3%. Furthermore, with the contents of these seven flavonoids, hierarchical clustering analysis was applied to distinguish the 27 batches into five groups. The chemometric results showed that these groups were almost consistent with geographical positions and climatic conditions of the production regions. Integrating fingerprint analysis, simultaneous determination and hierarchical clustering analysis, the established method is rapid, sensitive, accurate and readily applicable, and also provides a significant foundation for quality control of P. cacumen efficiently. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
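The grouping of batches by compound contents can be illustrated with a minimal agglomerative clustering sketch. The abstract does not state the linkage rule or distance measure, so average linkage with Euclidean distance is assumed here, and the two-compound contents are hypothetical. Real data would normally be standardised first.

```python
import math
from itertools import combinations

def average_linkage(points, n_clusters):
    """Agglomerative clustering: repeatedly merge the two closest clusters."""
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Average pairwise Euclidean distance between members of a and b.
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        a, b = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)
    return sorted(sorted(c) for c in clusters)

# Hypothetical contents (mg/g) of two flavonoids in five batches.
batches = [(1.0, 0.5), (1.1, 0.6), (5.0, 4.8), (5.2, 5.1), (9.0, 0.4)]
groups = average_linkage(batches, n_clusters=3)
print(groups)
```

Stopping at a fixed number of clusters is equivalent to cutting the dendrogram at the corresponding height.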
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kaluzny, J.; Rozyczka, M.; Schwarzenberg-Czerny, A.
2015-11-15
We use photometric and spectroscopic observations of the detached eclipsing binaries V40 and V41 in the globular cluster NGC 6362 to derive masses, radii, and luminosities of the component stars. The orbital periods of these systems are 5.30 and 17.89 days, respectively. The measured masses of the primary and secondary components (M_p, M_s) are (0.8337 ± 0.0063, 0.7947 ± 0.0048) M_⊙ for V40 and (0.8215 ± 0.0058, 0.7280 ± 0.0047) M_⊙ for V41. The measured radii (R_p, R_s) are (1.3253 ± 0.0075, 0.997 ± 0.013) R_⊙ for V40 and (1.0739 ± 0.0048, 0.7307 ± 0.0046) R_⊙ for V41. Based on the derived luminosities, we find that the distance modulus of the cluster is 14.74 ± 0.04 mag, in good agreement with 14.72 mag obtained from color–magnitude diagram (CMD) fitting. We compare the absolute parameters of component stars with theoretical isochrones in mass–radius and mass–luminosity diagrams. For assumed abundances [Fe/H] = −1.07, [α/Fe] = 0.4, and Y = 0.25 we find the most probable age of V40 to be 11.7 ± 0.2 Gyr, compatible with the age of the cluster derived from CMD fitting (12.5 ± 0.5 Gyr). V41 seems to be markedly younger than V40. If independently confirmed, this result will suggest that V41 belongs to the younger of the two stellar populations recently discovered in NGC 6362. The orbits of both systems are eccentric. Given the orbital period and age of V40, its orbit should have been tidally circularized some ∼7 Gyr ago. The observed eccentricity is most likely the result of a relatively recent close stellar encounter.
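A distance modulus converts to a distance through m - M = 5 log10(d / 10 pc). The short sketch below applies this to the quoted value of 14.74 mag. It treats the modulus as extinction-free, which is an assumption made here for illustration; the abstract does not say whether the value is an apparent or a true modulus.

```python
def distance_from_modulus(mu):
    """Distance in parsecs from a distance modulus mu = m - M (no extinction term)."""
    return 10 ** (mu / 5 + 1)

# Distance modulus quoted in the abstract, interpreted without extinction.
d_pc = distance_from_modulus(14.74)
print(round(d_pc / 1000, 2), "kpc")
```

Because the relation is logarithmic, the quoted ± 0.04 mag uncertainty corresponds to roughly a 2% uncertainty in distance.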
USDA-ARS's Scientific Manuscript database
The United States Department of Agriculture (USDA), Agricultural Research Service (ARS), Plant Genetic Resources Conservation Unit's (PGRCU) sunn hemp (Crotalaria juncea L.) germplasm collection consists of 22 accessions. Sixteen (16) of the most seed-productive accessions were selected. These access...
1983-06-16
has been advocated by Gnanadesikan and Wilk (1969), and others in the literature. This suggests that, if we use the formal significance test type...American Statistical Asso., 62, 1159-1178. Gnanadesikan, R., and Wilk, M.B. (1969). Data Analytic Methods in Multivariate Statistical Analysis. In
An analysis of infrared emission spectra from the regions near the Galactic Centre
NASA Astrophysics Data System (ADS)
Contini, Marcella
2009-11-01
We present consistent modelling of line and continuum infrared (IR) spectra in the region close to the Galactic Centre. The models account for the coupled effect of shocks and photoionization from an external source. The results show that the shock velocities range between ~65 and 80 km s⁻¹ and the pre-shock densities from 1 cm⁻³ in the interstellar medium (ISM) to 200 cm⁻³ in the filamentary structures. The pre-shock magnetic field increases from 5 × 10⁻⁶ G in the surrounding ISM to ~8 × 10⁻⁵ G in the arched filaments. The stellar temperatures are ~38000 K in the Quintuplet cluster and ~27000 K in the Arches Cluster. The ionization parameter is relatively low (<0.01) with the highest values near the clusters, reaching a maximum >0.01 near the Arches Cluster. Depletion from the gaseous phase of Si is found throughout the whole observed region, indicating the presence of silicate dust. Grains including iron are concentrated throughout the arched filaments. The modelling of the continuum spectral energy distribution in the IR range indicates that a component of dust at temperatures of ~100-200 K is present in the central region of the Galaxy. Radio emission appears to be thermal bremsstrahlung in the E2-W1 filaments crossing strip; however, a synchrotron component is not excluded. More data are necessary to resolve these questions.
NASA Astrophysics Data System (ADS)
Sanchez, J. L.; Osipowicz, T.; Tang, S. M.; Tay, T. S.; Win, T. T.
1997-07-01
The trace element concentrations found in geological samples can shed light on the formation process. In the case of gemstones, which might be of artificial or natural origin, there is also considerable interest in the development of methods that provide identification of the origin of a sample. For rubies, trace element concentrations present in natural samples were shown previously to be significant indicators of the region of origin [S.M. Tang et al., Appl. Spectr. 42 (1988) 44, and 43 (1989) 219]. Here we report the results of micro-PIXE analyses of trace element (Ti, V, Cr, Fe, Cu and Ga) concentrations of a large set ( n = 130) of natural rough rubies from nine locations in Myanmar (Burma). The resulting concentrations are subjected to statistical analysis. Six of the nine groups form clusters when the data base is evaluated using tree clustering and principal component analysis.
Steindl, Theodora M; Crump, Carolyn E; Hayden, Frederick G; Langer, Thierry
2005-10-06
The development and application of a sophisticated virtual screening and selection protocol to identify potential, novel inhibitors of the human rhinovirus coat protein employing various computer-assisted strategies are described. A large commercially available database of compounds was screened using a highly selective, structure-based pharmacophore model generated with the program Catalyst. A docking study and a principal component analysis were carried out within the software package Cerius and served to validate and further refine the obtained results. These combined efforts led to the selection of six candidate structures, for which in vitro anti-rhinoviral activity could be shown in a biological assay.
Experimental nanocalorimetry of protonated and deprotonated water clusters
DOE Office of Scientific and Technical Information (OSTI.GOV)
Boulon, Julien; Braud, Isabelle; Zamith, Sébastien
2014-04-28
An experimental nanocalorimetric study of mass selected protonated (H₂O)ₙH⁺ and deprotonated (H₂O)ₙ₋₁OH⁻ water clusters is reported in the size range n = 20–118. The heat capacities of the water clusters exhibit a change of slope at size dependent temperatures varying from 90 to 140 K, which is ascribed to phase or structural transition. For both anionic and cationic species, these transition temperatures strongly vary at small sizes, with higher amplitude for protonated than for deprotonated clusters, and change more smoothly above roughly n ≈ 35. There is a correlation between bonding energies and transition temperatures, which is split in two components for protonated clusters while only one component is observed for deprotonated clusters. These features are tentatively interpreted in terms of structural properties of water clusters.
Kimokoti, Ruth W.; Gona, Philimon; Zhu, Lei; Newby, P. K.; Millen, Barbara E.; Brown, Lisa S.; D’Agostino, Ralph B.; Fung, Teresa T.
2012-01-01
Data on the relationship between empirical dietary patterns and metabolic syndrome (MetS) and its components in prospective study designs are limited. In addition, demographic and lifestyle determinants of MetS may modify the association between dietary patterns and the syndrome. We prospectively examined the relationship between empirically derived patterns and MetS and MetS components among 1146 women in the Framingham Offspring/Spouse cohort. They were aged 25–77 y with BMI ≥18.5 kg/m2 and free of cardiovascular disease, diabetes, cancer, and MetS at baseline, and followed for a mean of 7 y. Five dietary patterns, Heart Healthier, Lighter Eating, Wine and Moderate Eating, Higher Fat, and Empty Calorie, were previously identified using cluster analysis from food intake collected using a FFQ. After adjusting for potential confounders, we observed lower odds for abdominal obesity for Higher Fat [OR = 0.48 (95% CI: 0.25, 0.91)] and Wine and Moderate Eating clusters [OR = 0.28 (95% CI: 0.11, 0.72)] compared with the Empty Calorie cluster. Additional adjustment for BMI somewhat attenuated these OR [Higher Fat OR = 0.52 (95% CI: 0.27, 1.00); Wine and Moderate Eating OR = 0.34 (95% CI: 0.13, 0.89)]. None of the clusters was associated with MetS or other MetS components. Baseline smoking status and age did not modify the relation between dietary patterns and MetS. The Higher Fat and Wine and Moderate Eating patterns showed an inverse association with abdominal obesity; certain foods might be targeted in these habitual patterns to achieve optimal dietary patterns for MetS prevention. PMID:22833658
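The odds ratios above come from models adjusted for confounders, which cannot be reproduced from the abstract. As a simplified illustration of how an odds ratio and its 95% confidence interval are formed, the sketch below computes the crude (unadjusted) odds ratio with a Wald interval from a 2×2 table. The counts in the usage line are hypothetical and are not data from the study.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Crude odds ratio and Wald confidence interval for a 2x2 table.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    z = 1.96 gives an approximate 95% interval.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts, chosen only to demonstrate the calculation.
    print(odds_ratio_ci(10, 90, 20, 80))
```

An interval that includes 1.00, as for the BMI-adjusted Higher Fat estimate quoted above, is at the conventional boundary of statistical significance.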
2013-01-01
Background The paper presents the evaluation of soil contamination with total, water-available, mobile, semi-mobile and non-mobile Hg fractions in the surroundings of a former chlor-alkali plant in connection with several chemical soil characteristics. Principal Component Analysis and Cluster Analysis were used to evaluate the chemical composition variability of soil and factors influencing the fate of Hg in such areas. The sequential extraction EPA 3200-Method and the determination technique based on capacitively coupled microplasma optical emission spectrometry were checked. Results A case study was conducted in the Turda town, Romania. The results revealed a high contamination with Hg in the area of the former chlor-alkali plant and waste landfills, where soils were categorized as hazardous waste. The weight of the Hg fractions decreased in the order semi-mobile > non-mobile > mobile > water leachable. Principal Component Analysis revealed 7 factors describing chemical composition variability of soil, of which 3 attributed to Hg species. Total Hg, semi-mobile, non-mobile and mobile fractions were observed to have a strong influence, while the water leachable fraction a weak influence. The two-dimensional plot of PCs highlighted 3 groups of sites according to the Hg contamination factor. The statistical approach has shown that the Hg fate in soil is dependent on pH, content of organic matter, Ca, Fe, Mn, Cu and SO42- rather than natural components, such as aluminosilicates. Cluster analysis of soil characteristics revealed 3 clusters, one of which including Hg species. Soil contamination with Cu as sulfate and Zn as nitrate was also observed. Conclusions The approach based on speciation and statistical interpretation of data developed in this study could be useful in the investigation of other chlor-alkali contaminated areas. 
According to the Bland and Altman test the 3-step sequential extraction scheme is suitable for Hg speciation in soil, while the used determination method of Hg is appropriate. PMID:24252185
Underdetermined blind separation of three-way fluorescence spectra of PAHs in water
NASA Astrophysics Data System (ADS)
Yang, Ruifang; Zhao, Nanjing; Xiao, Xue; Zhu, Wei; Chen, Yunan; Yin, Gaofang; Liu, Jianguo; Liu, Wenqing
2018-06-01
In this work, an underdetermined blind decomposition method is developed to recognize individual components from the three-way fluorescence spectra of their mixtures by using sparse component analysis (SCA). The mixing matrix is estimated from the mixtures using a fuzzy data clustering algorithm together with the scatter points corresponding to local energy maxima in the time-frequency domain, and the spectra of the target components are recovered by a pseudo-inverse technique. As an example, using this method the spectra of three and four pure components can be blindly extracted from two samples of their mixture, with similarities between resolved and reference spectra all above 0.80. This work opens a new and effective path to monitoring PAHs in water by three-way fluorescence spectroscopy.
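To make the pseudo-inverse recovery step concrete, the sketch below computes the minimum-norm estimate s = Aᵀ(AAᵀ)⁻¹x for two mixtures of n > 2 sources once a mixing matrix A is available. This is only an illustration under stated assumptions: the mixing matrix and data are hypothetical, the function name is invented, and the published method additionally exploits sparsity (and estimates A by clustering), which is not reproduced here.

```python
def pinv_recover(A, x):
    """Minimum-norm source estimate s = A^T (A A^T)^(-1) x.

    A is a 2 x n mixing matrix (two observed mixtures, n sources) given as
    two lists of length n; x holds the two mixture values at one sample point.
    """
    n = len(A[0])
    # Gram matrix G = A A^T, which is 2 x 2 and symmetric.
    g11 = sum(A[0][k] * A[0][k] for k in range(n))
    g12 = sum(A[0][k] * A[1][k] for k in range(n))
    g22 = sum(A[1][k] * A[1][k] for k in range(n))
    det = g11 * g22 - g12 * g12
    if abs(det) < 1e-12:
        raise ValueError("rows of A are linearly dependent")
    # Solve G y = x with the closed-form 2 x 2 inverse.
    y0 = (g22 * x[0] - g12 * x[1]) / det
    y1 = (-g12 * x[0] + g11 * x[1]) / det
    # Map back to source space: s = A^T y.
    return [A[0][k] * y0 + A[1][k] * y1 for k in range(n)]


if __name__ == "__main__":
    A = [[1.0, 0.5, 0.2], [0.3, 1.0, 0.8]]  # hypothetical mixing matrix
    print(pinv_recover(A, [1.0, 2.0]))
```

The estimate always reproduces the observed mixtures exactly, but in the underdetermined case it is one of infinitely many solutions; sparsity constraints are what single out the physically meaningful one.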
Liu, Wei; Wang, Dongmei; Liu, Jianjun; Li, Dengwu; Yin, Dongxue
2016-01-01
The present study was performed to assess the quality of Potentilla fruticosa L. sampled from distinct regions of China using high performance liquid chromatography (HPLC) fingerprinting coupled with a suite of chemometric methods. For this quantitative analysis, the main active phytochemical compositions and the antioxidant activity in P. fruticosa were also investigated. Considering the high percentages and antioxidant activities of phytochemicals, P. fruticosa samples from Kangding, Sichuan were selected as the most valuable raw materials. Similarity analysis (SA) of HPLC fingerprints, hierarchical cluster analysis (HCA), principal component analysis (PCA), and discriminant analysis (DA) were further employed to provide accurate classification and quality estimates of P. fruticosa. Two principal components (PCs) were collected by PCA. PC1 separated samples from Kangding, Sichuan, capturing 57.64% of the variance, whereas PC2 contributed to further separation, capturing 18.97% of the variance. Two kinds of discriminant functions with a 100% discrimination ratio were constructed. The results strongly supported the conclusion that the eight samples from different regions were clustered into three major groups, corresponding with their morphological classification, for which HPLC analysis confirmed the considerable variation in phytochemical compositions and that P. fruticosa samples from Kangding, Sichuan were of high quality. The results of SA, HCA, PCA, and DA were in agreement and performed well for the quality assessment of P. fruticosa. Consequently, HPLC fingerprinting coupled with chemometric techniques provides a highly flexible and reliable method for the quality evaluation of traditional Chinese medicines. PMID:26890416
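Together, the two reported components account for 57.64% + 18.97% = 76.61% of the variance. The sketch below shows one way such a "fraction of variance captured by PC1" can be computed: it builds the sample covariance matrix and finds its leading eigenvalue by power iteration. The data in the usage line are hypothetical, and this is not the chemometric software used in the study.

```python
import math


def pc1_variance_ratio(rows, iters=500):
    """Fraction of total variance captured by the first principal component."""
    n, p = len(rows), len(rows[0])
    if n < 2:
        raise ValueError("need at least two observations")
    # Centre each variable on its mean.
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    X = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix (p x p).
    cov = [[sum(X[i][a] * X[i][b] for i in range(n)) / (n - 1)
            for b in range(p)] for a in range(p)]
    total = sum(cov[a][a] for a in range(p))
    if total <= 0.0:
        raise ValueError("data have zero variance")
    # Power iteration converges to the leading eigenvector of cov.
    v = [1.0] * p
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(t * t for t in w))
        if norm == 0.0:
            raise ValueError("start vector is orthogonal to all variance")
        v = [t / norm for t in w]
    # Rayleigh quotient gives the leading eigenvalue (v has unit length).
    lam = sum(v[a] * sum(cov[a][b] * v[b] for b in range(p)) for a in range(p))
    return lam / total


if __name__ == "__main__":
    # Hypothetical two-variable data lying close to a straight line.
    print(pc1_variance_ratio([[0.0, 0.1], [1.0, 1.9], [2.0, 4.2], [3.0, 5.8]]))
```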
Weerasekera, Manjula M; Sissons, Chris H; Wong, Lisa; Anderson, Sally A; Holmes, Ann R; Cannon, Richard D
2017-10-01
The aim was to investigate the relationship between groups of bacteria identified by cluster analysis of the DGGE fingerprints and the amounts and diversity of yeast present. Bacterial and yeast populations in saliva samples from 24 adults were analysed using denaturing gradient gel electrophoresis (DGGE) of the bacteria present and by yeast culture. Eubacterial DGGE banding patterns showed considerable variation between individuals. Seventy-one different amplicon bands were detected; the band number per saliva sample ranged from 21 to 39 (mean ± SD = 29.3 ± 4.9). Cluster and principal component analysis of the bacterial DGGE patterns yielded three major clusters containing 20 of the samples. Seventeen of the 24 (71%) saliva samples were yeast positive with concentrations up to 10³ cfu/mL. Candida albicans was the predominant species in saliva samples although six other yeast species, including Candida dubliniensis, Candida tropicalis, Candida krusei, Candida guilliermondii, Candida rugosa and Saccharomyces cerevisiae, were identified. The presence, concentration, and species of yeast in samples showed no clear relationship to the bacterial clusters. Despite indications of in vitro bacteria-yeast interactions, there was a lack of association between the presence, identity and diversity of yeasts and the bacterial DGGE fingerprint clusters in saliva. This suggests significant ecological individual-specificity of these associations in highly complex in vivo oral biofilm systems under normal oral conditions. Copyright © 2017 Elsevier Ltd. All rights reserved.
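Cluster analysis of banding patterns starts from a pairwise similarity between samples. The abstract does not say which measure was used; the sketch below assumes the Dice coefficient on band presence, a common choice for gel fingerprints, with hypothetical band positions.

```python
def dice_similarity(bands_a, bands_b):
    """Dice coefficient between two band sets: 2 * shared / (size_a + size_b).

    Each argument is an iterable of band identifiers (for example, gel
    positions). The choice of Dice is an assumption of this sketch.
    """
    a, b = set(bands_a), set(bands_b)
    if not a and not b:
        return 1.0  # two empty profiles are treated as identical
    return 2 * len(a & b) / (len(a) + len(b))


if __name__ == "__main__":
    # Hypothetical band positions for two saliva samples.
    sample_1 = {3, 7, 12, 18, 25}
    sample_2 = {3, 7, 13, 18, 30, 41}
    print(dice_similarity(sample_1, sample_2))
```

A matrix of such similarities (or 1 minus similarity as a distance) can then be passed to a hierarchical clustering routine.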
Nagata, Tomoyuki; Shinagawa, Shunichiro; Nakajima, Shinichiro; Plitman, Eric; Mihashi, Yukiko; Hayashi, Shogo; Mimura, Masaru; Nakayama, Kazuhiko
2016-01-01
The Neuropsychiatric Inventory (NPI) comprises 12 items, which were conventionally determined by psychopathological symptoms of patients with dementia. Clinical rating scales with structured questionnaires have been useful for evaluating neuropsychiatric symptoms (NPSs) of patients with dementia over the past twenty years. The aim of this study was to classify the conventional NPSs in patients with Alzheimer's disease (AD) requiring antipsychotic treatment for their NPSs into distinct clusters to simplify assessment of these numerous symptoms. Twelve item scores (product of severity and frequency of each symptom) in the NPI taken from the baseline visit were classified into subgroups by principal component analysis using data from 421 outpatients with AD enrolled in the Clinical Antipsychotic Trials of Intervention Effectiveness-Alzheimer's Disease (CATIE-AD) Phase 1. Chi-square tests were conducted to examine the co-occurrence of the subgroups. We found four distinct clusters: aggressiveness (agitation and irritability), apathy and eating problems (apathy and appetite/eating disturbance), psychosis (delusions and hallucinations), and emotion and disinhibition (depression, euphoria, and disinhibition). Anxiety, aberrant motor behavior, and sleep disturbance were not included in these clusters. Apathy and eating problems, and emotion and disinhibition co-occurred (p = 0.002), whereas aggressiveness and psychosis occurred independently of the other clusters. Four distinct category clusters were identified from NPSs in patients with AD requiring antipsychotic treatment. Future studies should investigate psychosocial backgrounds or risk factors of each distinct cluster, in addition to their longitudinal course over treatment intervention.
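A co-occurrence test of the kind mentioned above can be sketched as a Pearson chi-square test on a 2×2 table (cluster A present/absent against cluster B present/absent). The version below omits the continuity correction and uses the exact one-degree-of-freedom tail probability; the counts are hypothetical and the abstract does not specify these details.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic and p-value for the table [[a, b], [c, d]].

    No continuity correction is applied. For 1 degree of freedom the upper
    tail probability equals erfc(sqrt(chi2 / 2)).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("a marginal total is zero")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical counts: rows = cluster A present/absent,
    # columns = cluster B present/absent.
    print(chi_square_2x2(20, 5, 5, 20))
```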
Cardiometabolic risk clustering in spinal cord injury: results of exploratory factor analysis.
Libin, Alexander; Tinsley, Emily A; Nash, Mark S; Mendez, Armando J; Burns, Patricia; Elrod, Matt; Hamm, Larry F; Groah, Suzanne L
2013-01-01
Evidence suggests an elevated prevalence of cardiometabolic risks among persons with spinal cord injury (SCI); however, the unique clustering of risk factors in this population has not been fully explored. The purpose of this study was to describe unique clustering of cardiometabolic risk factors differentiated by level of injury. One hundred twenty-one subjects (mean 37 ± 12 years; range, 18-73) with chronic C5 to T12 motor complete SCI were studied. Assessments included medical histories, anthropometrics and blood pressure, and fasting serum lipids, glucose, insulin, and hemoglobin A1c (HbA1c). The most common cardiometabolic risk factors were overweight/obesity, high levels of low-density lipoprotein (LDL-C), and low levels of high-density lipoprotein (HDL-C). Risk clustering was found in 76.9% of the population. Exploratory principal component factor analysis using varimax rotation revealed a 3-factor model in persons with paraplegia (65.4% variance) and a 4-factor solution in persons with tetraplegia (73.3% variance). The differences between groups were emphasized by the varied composition of the extracted factors: Lipid Profile A (total cholesterol [TC] and LDL-C), Body Mass-Hypertension Profile (body mass index [BMI], systolic blood pressure [SBP], and fasting insulin [FI]); Glycemic Profile (fasting glucose and HbA1c), and Lipid Profile B (TG and HDL-C). BMI and SBP formed a separate factor only in persons with tetraplegia. Although the majority of the population with SCI has risk clustering, the composition of the risk clusters may be dependent on level of injury, based on a factor analysis group comparison. This is clinically plausible and relevant as tetraplegics tend to be hypo- to normotensive and more sedentary, resulting in lower HDL-C and a greater propensity toward impaired carbohydrate metabolism.
Incipient fault detection study for advanced spacecraft systems
NASA Technical Reports Server (NTRS)
Milner, G. Martin; Black, Michael C.; Hovenga, J. Mike; Mcclure, Paul F.
1986-01-01
A feasibility study to investigate the application of vibration monitoring to the rotating machinery of planned NASA advanced spacecraft components is described. Factors investigated include: (1) special problems associated with small, high RPM machines; (2) application across multiple component types; (3) microgravity; (4) multiple fault types; (5) eight different analysis techniques including signature analysis, high frequency demodulation, cepstrum, clustering, amplitude analysis, and pattern recognition are compared; and (6) small sample statistical analysis is used to compare performance by computation of probability of detection and false alarm for an ensemble of repeated baseline and faulted tests. Both detection and classification performance are quantified. Vibration monitoring is shown to be an effective means of detecting the most important problem types for small, high RPM fans and pumps typical of those planned for the advanced spacecraft. A preliminary monitoring system design and implementation plan is presented.
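The performance comparison in item (6) rests on two empirical rates. A minimal sketch, assuming each test run is reduced to a single scalar fault indicator and an alarm is raised when it exceeds a threshold (the scores and threshold below are hypothetical):

```python
def detection_rates(baseline_scores, faulted_scores, threshold):
    """Return (probability of detection, probability of false alarm).

    An alarm is raised when a score exceeds the threshold. Detection is the
    alarm rate over faulted tests; false alarm is the alarm rate over
    baseline (healthy) tests.
    """
    if not baseline_scores or not faulted_scores:
        raise ValueError("both ensembles must be non-empty")
    p_detect = sum(s > threshold for s in faulted_scores) / len(faulted_scores)
    p_false = sum(s > threshold for s in baseline_scores) / len(baseline_scores)
    return p_detect, p_false


if __name__ == "__main__":
    baseline = [0.1, 0.2, 0.3, 0.6]  # hypothetical healthy-machine scores
    faulted = [0.5, 0.7, 0.9, 0.8]   # hypothetical faulted-machine scores
    print(detection_rates(baseline, faulted, threshold=0.55))
```

With ensembles this small the two rates are coarse estimates, which is why the study describes its comparison as small sample statistical analysis.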
Yoo, Minjae; Shin, Jimin; Kim, Hyunmin; Kim, Jihye; Kang, Jaewoo; Tan, Aik Choon
2018-04-04
Traditional Chinese Medicine (TCM) has been practiced over thousands of years in China and other Asian countries for treating various symptoms and diseases. However, the underlying molecular mechanisms of TCM are poorly understood, partly due to the "multi-component, multi-target" nature of TCM. To uncover the molecular mechanisms of TCM, we perform comprehensive gene expression analysis using connectivity map. We interrogated gene expression signatures obtained from 102 TCM components using the next generation Connectivity Map (CMap) resource. We performed systematic data mining and analysis on the mechanism of action (MoA) of these TCM components based on the CMap results. We clustered the 102 TCM components into four groups based on their MoAs using the next generation CMap resource. We performed gene set enrichment analysis on these components to provide additional support for explaining these molecular mechanisms. We also provided literature evidence to validate the MoAs identified through this bioinformatics analysis. Finally, we developed the Traditional Chinese Medicine Drug Repurposing Hub (TCM Hub), a connectivity map resource to facilitate the elucidation of TCM MoA for drug repurposing research. TCMHub is freely available at http://tanlab.ucdenver.edu/TCMHub. Molecular mechanisms of TCM could be uncovered by using gene expression signatures and connectivity map. Through this analysis, we found that many of the TCM components possess diverse MoAs, which may explain the applications of TCM in treating various symptoms and diseases. Copyright © 2018 The Authors. Published by Elsevier B.V. All rights reserved.
Independent component processes underlying emotions during natural music listening
Zollinger, Nina; Elmer, Stefan; Jäncke, Lutz
2016-01-01
The aim of this study was to investigate the brain processes underlying emotions during natural music listening. To address this, we recorded high-density electroencephalography (EEG) from 22 subjects while presenting a set of individually matched whole musical excerpts varying in valence and arousal. Independent component analysis was applied to decompose the EEG data into functionally distinct brain processes. A k-means cluster analysis calculated on the basis of a combination of spatial (scalp topography and dipole location mapped onto the Montreal Neurological Institute brain template) and functional (spectra) characteristics revealed 10 clusters referring to brain areas typically involved in music and emotion processing, namely in the proximity of thalamic-limbic and orbitofrontal regions as well as at frontal, fronto-parietal, parietal, parieto-occipital, temporo-occipital and occipital areas. This analysis revealed that arousal was associated with a suppression of power in the alpha frequency range. On the other hand, valence was associated with an increase in theta frequency power in response to excerpts inducing happiness compared to sadness. These findings are partly compatible with the model proposed by Heller, arguing that the frontal lobe is involved in modulating valenced experiences (the left frontal hemisphere for positive emotions) whereas the right parieto-temporal region contributes to the emotional arousal. PMID:27217116
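The k-means step can be sketched in a few lines. In the study the feature vectors combined scalp topography, dipole location and spectra; here hypothetical two-dimensional points stand in for them, and the centroids are seeded with the first k points purely to keep the example deterministic.

```python
def kmeans(points, k, max_iter=100):
    """Lloyd's k-means algorithm; returns (labels, centroids)."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    # Deterministic seeding: the first k points (an illustrative choice).
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k),
                key=lambda c: sum((x - m) ** 2 for x, m in zip(p, centroids[c])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


if __name__ == "__main__":
    pts = [(0, 0), (10, 10), (0, 1), (1, 0), (10, 11), (11, 10)]
    print(kmeans(pts, 2))
```

In practice the result depends on the seeding, so several random restarts are usually compared before settling on a solution.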
Qian, Chang-Min; Song, Zhao-Hui; Zhang, Lan-Lan; Zhou, Shui-Ping; Feng, Feng
2013-09-01
An ultra performance liquid chromatography (UPLC) method was established and validated to simultaneously determine the contents of six aconitum alkaloids in mother, daughter and fibrous roots of 19 batches of Aconitum carmichaelii from Sichuan province. The separation of the six alkaloids was achieved on an ACQUITY UPLC BEH C18 (2.1 mm × 100 mm, 1.7 μm) column at 40 °C with a mobile phase consisting of acetonitrile in 30 mmol·L⁻¹ ammonium acetate buffer solution (adjusted to pH 10.0 with aqueous ammonia) in gradient mode. The data and plots showed that the six aconitum alkaloids have different distributions. The contents of four aconitum alkaloids were almost the same in mother and daughter roots, the exceptions being benzoylmesaconine and mesaconitine, while the fibrous root differed from the other two roots. The comparisons of significant differences of six aconitum alkaloids between the mother and daughter roots demonstrated that benzoylmesaconine and mesaconitine were the representative components. The 38 tested samples were classified into two clusters by hierarchical clustering analysis (HCA) and principal component analysis (PCA); the results indicated that the mother root was different from the daughter root on a chemical material basis. The study might contribute to the reasonable clinical application of A. carmichaelii.
NASA Astrophysics Data System (ADS)
Sokolov, Anton; Dmitriev, Egor; Delbarre, Hervé; Augustin, Patrick; Gengembre, Cyril; Fourmenten, Marc
2016-04-01
The problem of atmospheric contamination by principal air pollutants was considered in the industrialized coastal region of the English Channel in Dunkirk influenced by north European metropolitan areas. MESO-NH nested models were used for the simulation of the local atmospheric dynamics and the online calculation of Lagrangian backward trajectories with 15-minute temporal resolution and horizontal resolution down to 500 m. The one-month mesoscale numerical simulation was coupled with local pollution measurements of volatile organic components, particulate matter, ozone, sulphur dioxide and nitrogen oxides. Principal atmospheric pathways were determined by a clustering technique applied to the simulated backward trajectories. Six clusters were obtained that describe the local atmospheric dynamics: four with winds blowing through the English Channel, one with wind coming from the south, and the biggest cluster with small wind speeds. This last cluster includes mostly sea breeze events. The analysis of meteorological data and pollution measurements allows relating the principal atmospheric pathways with local air contamination events. It was shown that contamination events are mostly connected with a channelling of pollution from local sources and low-turbulent states of the local atmosphere.
Viscous self interacting dark matter and cosmic acceleration
NASA Astrophysics Data System (ADS)
Atreya, Abhishek; Bhatt, Jitesh R.; Mishra, Arvind
2018-02-01
Self interacting dark matter (SIDM) provides us with a consistent solution to certain astrophysical observations in conflict with the collision-less cold DM paradigm. In this work we estimate the shear viscosity (η) and bulk viscosity (ζ) of SIDM, within kinetic theory formalism, for galactic and cluster size SIDM halos. To that end we make use of the recent constraints on the SIDM cross-section for dwarf galaxies, LSB galaxies and clusters. We also estimate the change in the solution of Einstein's equation due to these viscous effects and find that σ/m constraints on SIDM from astrophysical data provide us with sufficient viscosity to account for the observed cosmic acceleration at the present epoch, without the need of any additional dark energy component. Using the estimates of dark matter density for galactic and cluster size halos we find that the mean free path of dark matter is ~ a few Mpc. Thus the smallest scale at which viscous effects start playing a role is the cluster scale. Astrophysical data for dwarf galaxies, LSB galaxies and clusters also seem to suggest the same. The entire analysis is independent of any specific particle physics motivated model for SIDM.
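The order of magnitude of the mean free path follows from the kinetic-theory relation λ = 1 / (ρ · σ/m), where ρ is the dark matter mass density. The sketch below evaluates it; the density and cross-section used in the example are illustrative assumptions chosen only to show the scale, not values taken from the paper.

```python
CM_PER_MPC = 3.0857e24  # centimetres in one megaparsec


def mean_free_path_mpc(rho_g_cm3, sigma_over_m_cm2_g):
    """Mean free path lambda = 1 / (rho * sigma/m), returned in Mpc.

    rho_g_cm3:           dark matter mass density in g/cm^3
    sigma_over_m_cm2_g:  self-interaction cross-section per unit mass in cm^2/g
    """
    if rho_g_cm3 <= 0 or sigma_over_m_cm2_g <= 0:
        raise ValueError("density and cross-section must be positive")
    return 1.0 / (rho_g_cm3 * sigma_over_m_cm2_g) / CM_PER_MPC


if __name__ == "__main__":
    # Illustrative inputs only (assumed, not from the paper):
    # rho = 1e-25 g/cm^3 and sigma/m = 1 cm^2/g.
    print(mean_free_path_mpc(1e-25, 1.0))
```

Because λ scales inversely with both inputs, lower densities or smaller cross-sections push the mean free path to correspondingly larger scales.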
Liévanos, Raoul S
2015-11-01
This article contributes to environmental inequality outcomes research on the spatial and demographic factors associated with cumulative air-toxic health risks at multiple geographic scales across the United States. It employs a rigorous spatial cluster analysis of census tract-level 2005 estimated lifetime cancer risk (LCR) of ambient air-toxic emissions from stationary (e.g., facility) and mobile (e.g., vehicular) sources to locate spatial clusters of air-toxic LCR risk in the continental United States. It then tests intersectional environmental inequality hypotheses on the predictors of tract presence in air-toxic LCR clusters with tract-level principal component factor measures of economic deprivation by race and immigrant status. Logistic regression analyses show that net of controls, isolated Latino immigrant-economic deprivation is the strongest positive demographic predictor of tract presence in air-toxic LCR clusters, followed by black-economic deprivation and isolated Asian/Pacific Islander immigrant-economic deprivation. Findings suggest scholarly and practical implications for future research, advocacy, and policy. Copyright © 2015 Elsevier Inc. All rights reserved.
2013-01-01
Background Various diet- and activity-related parenting practices are positive determinants of child dietary and activity behaviour, including home availability, parental modelling and parental policies. There is evidence that parenting practices cluster within the dietary domain and within the activity domain. This study explores whether diet- and activity-related parenting practices cluster across the dietary and activity domain. Also examined is whether the clusters are related to child and parental background characteristics. Finally, to indicate the relevance of the clusters in influencing child dietary and activity behaviour, we examined whether clusters of parenting practices are related to these behaviours. Methods Data were used from 1480 parent–child dyads participating in the Dutch IVO Nutrition and Physical Activity Child cohorT (INPACT). Parents of children aged 8–11 years completed questionnaires at home assessing their diet- and activity-related parenting practices, child and parental background characteristics, and child dietary and activity behaviours. Principal component analysis (PCA) was used to identify clusters of parenting practices. Backward regression analysis was used to examine the relationship between child and parental background characteristics with cluster scores, and partial correlations to examine associations between cluster scores and child dietary and activity behaviours. Results PCA revealed five clusters of parenting practices: 1) high visibility and accessibility of screens and unhealthy food, 2) diet- and activity-related rules, 3) low availability of unhealthy food, 4) diet- and activity-related positive modelling, and 5) positive modelling on sports and fruit. Low parental education was associated with unhealthy cluster 1, while high(er) education was associated with healthy clusters 2, 3 and 5. 
Separate clusters were related to both child dietary and activity behaviour in the hypothesized directions: healthy clusters were positively related to obesity-reducing behaviours and negatively to obesity-inducing behaviours. Conclusion Parenting practices cluster across the dietary and activity domain. Parental education can be seen as an indicator of a broader parental context in which clusters of parenting practices operate. Separate clusters are related to both child dietary and activity behaviour. Interventions that focus on clusters of parenting practices to assist parents (especially low-educated parents) in changing their child’s dietary and activity behaviour seems justified. PMID:23531232
Employment relations and global health: a typological study of world labor markets.
Chung, Haejoo; Muntaner, Carles; Benach, Joan
2010-01-01
In this study, the authors investigate the global labor market and employment relations, which are central building blocks of the welfare state; the aim is to propose a global typology of labor markets to explain global inequalities in population health. Countries are categorized into core (21), semi-peripheral (42), and peripheral (71) countries, based on gross national product per capita (Atlas method). Labor market-related variables and factors are then used to generate clusters of countries with principal components and cluster analysis methods. The authors then examine the relationship between the resulting clusters and health outcomes. The clusters of countries are largely geographically defined, each cluster with similar historical background and developmental strategy. However, there are interesting exceptions, which warrant further elaboration. The relationship between health outcomes and clusters largely follows the authors' expectations (except for communicable diseases): more egalitarian labor institutions have better health outcomes. The world system, then, can be divided according to different types of labor markets that are predictive of population health outcomes at each level of economic development. As is the case for health and social policies, variability in labor market characteristics is likely to reflect, in part, the relative strength of a country's political actors.
CAMPAIGN: an open-source library of GPU-accelerated data clustering algorithms.
Kohlhoff, Kai J; Sosnick, Marc H; Hsu, William T; Pande, Vijay S; Altman, Russ B
2011-08-15
Data clustering techniques are an essential component of a good data analysis toolbox. Many current bioinformatics applications are inherently compute-intense and work with very large datasets. Sequential algorithms are inadequate for providing the necessary performance. For this reason, we have created Clustering Algorithms for Massively Parallel Architectures, Including GPU Nodes (CAMPAIGN), a central resource for data clustering algorithms and tools that are implemented specifically for execution on massively parallel processing architectures. CAMPAIGN is a library of data clustering algorithms and tools, written in 'C for CUDA' for Nvidia GPUs. The library provides up to two orders of magnitude speed-up over respective CPU-based clustering algorithms and is intended as an open-source resource. New modules from the community will be accepted into the library, and its layout is such that it can easily be extended to promising future platforms such as OpenCL. Releases of the CAMPAIGN library are freely available for download under the LGPL from https://simtk.org/home/campaign. Source code can also be obtained through anonymous subversion access as described on https://simtk.org/scm/?group_id=453.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Caruso, Francesco; Bellacicca, Andrea; Milani, Paolo, E-mail: pmilani@mi.infn.it
We report the rapid prototyping of passive electrical components (resistors and capacitors) on plain paper by an additive and parallel technology consisting of supersonic cluster beam deposition (SCBD) coupled with shadow mask printing. Cluster-assembled films have a growth mechanism substantially different from that of atom-assembled ones providing the possibility of a fine tuning of their electrical conduction properties around the percolative conduction threshold. Exploiting the precise control on cluster beam intensity and shape typical of SCBD, we produced, in a one-step process, batches of resistors with resistance values spanning a range of two orders of magnitude. Parallel plate capacitors with paper as the dielectric medium were also produced with capacitance in the range of tens of picofarads. Compared to standard deposition technologies, SCBD allows for a very efficient use of raw materials and the rapid production of components with different shape and dimensions while controlling independently the electrical characteristics. Discrete electrical components produced by SCBD are very robust against deformation and bending, and they can be easily assembled to build circuits with desired characteristics. The availability of large batches of these components enables the rapid and cheap prototyping and integration of electrical components on paper as building blocks of more complex systems.
Tuttolomondo, Teresa; La Bella, Salvatore; Licata, Mario; Virga, Giuseppe; Leto, Claudio; Saija, Antonella; Trombetta, Domenico; Tomaino, Antonio; Speciale, Antonio; Napoli, Edoardo M; Siracusa, Laura; Pasquale, Andrea; Curcuruto, Giusy; Ruberto, Giuseppe
2013-03-01
An extensive survey of wild Sicilian oregano was made. A total of 57 samples were collected from various sites, followed by taxonomic characterization from an agronomic perspective. Based on morphological and production characteristics obtained from the 57 samples, cluster analysis was used to divide the samples into homogeneous groups, to identify the best biotypes. All samples were analyzed for their phytochemical content, applying a cascade-extraction protocol and hydrodistillation, to obtain the non volatile components and the essential oils, respectively. The extracts contained thirteen polyphenol derivatives, i.e., four flavanones, seven flavones, and two organic acids. Their qualitative and quantitative characterization was carried out by LC/MS analyses. The essential oils were characterized using a combination of GC-FID and GC/MS analyses; a total of 81 components were identified. The major components of the oils were thymol, p-cymene, and γ-terpinene. Cluster analysis was carried out on both phytochemical profiles and resulted in the division of the oregano samples into different chemical groups. The antioxidant activity of the essential oils and extracts was investigated by the Folin-Ciocalteau (FC) colorimetric assay, by UV radiation-induced peroxidation in liposomal membranes (UV-IP test), and by determining the O(2)(∙-)-scavenging activity. Copyright © 2013 Verlag Helvetica Chimica Acta AG, Zürich.
Brambilla, Giovanni; Maffei, Luigi; Di Gabriele, Maria; Gallo, Veronica
2013-07-01
An experimental study was carried out in 20 squares in the center of Rome, covering a wide range of different uses, sonic environments, geometry, and architectural styles. Soundwalks along the perimeter of each square were performed during daylight and weekdays taking binaural and video recordings, as well as spot measurements of illuminance. The cluster analysis performed on the physical parameters, not only acoustic, provided two clusters that are in satisfactory agreement with the "a priori" classification. Applying the principal component analysis (PCA) to five physical parameters, two main components were obtained which might be associated with two environmental features, namely, "chaotic/calm" and "open/enclosed." On the basis of these two features, six squares were selected for the laboratory audio-video tests where 32 subjects took part filling in a questionnaire. The PCA performed on the subjective ratings on the sonic environment showed two main components which might be associated with two emotional meanings, namely, "calmness" and "vibrancy." The linear regression modeling between five objective parameters and the mean value of subjective ratings on chaotic/calm and enclosed/open attributes showed a good correlation. Notwithstanding these interesting results being limited to the specific data set, it is worth pointing out that the complexity of the soundscape quality assessment can be more comprehensively examined merging the field measurements of physical parameters with the subjective ratings provided by field and/or laboratory tests.
PCA/HEXTE Observations of Coma and A2319
NASA Technical Reports Server (NTRS)
Rephaeli, Yoel
1998-01-01
The Coma cluster was observed in 1996 for 90 ks by the PCA and HEXTE instruments aboard the RXTE satellite, the first simultaneous, pointing measurement of Coma in the broad, 2-250 keV, energy band. The high sensitivity achieved during this long observation allows precise determination of the spectrum. Our analysis of the measurements clearly indicates that in addition to the main thermal emission from hot intracluster gas at kT=7.5 keV, a second spectral component is required to best-fit the data. If thermal, it can be described with a temperature of 4.7 keV contributing about 20% of the total flux. The additional spectral component can also be described by a power-law, possibly due to Compton scattering of relativistic electrons by the CMB. This interpretation is based on the diffuse radio synchrotron emission, which has a spectral index of 2.34, within the range allowed by fits to the RXTE spectral data. A Compton origin of the measured nonthermal component would imply that the volume-averaged magnetic field in the central region of Coma is B =0.2 micro-Gauss, a value deduced directly from the radio and X-ray measurements (and thus free of the usual assumption of energy equipartition). Barring the presence of unknown systematic errors in the RXTE source or background measurements, our spectral analysis yields considerable evidence for Compton X-ray emission in the Coma cluster.
Nucleus and cytoplasm segmentation in microscopic images using K-means clustering and region growing
Sarrafzadeh, Omid; Dehnavi, Alireza Mehri
2015-01-01
Background: Segmentation of leukocytes acts as the foundation for all automated image-based hematological disease recognition systems. Most of the time, hematologists are interested in evaluation of white blood cells only. Digital image processing techniques can help them in their analysis and diagnosis. Materials and Methods: The main objective of this paper is to detect leukocytes from a blood smear microscopic image and segment them into their two dominant elements, nucleus and cytoplasm. The segmentation is conducted using two stages of applying K-means clustering. First, the nuclei are segmented using K-means clustering. Then, a proposed method based on region growing is applied to separate the connected nuclei. Next, the nuclei are subtracted from the original image. Finally, the cytoplasm is segmented using the second stage of K-means clustering. Results: The results indicate that the proposed method is able to extract the nucleus and cytoplasm regions accurately and works well even though there is no significant contrast between the components in the image. Conclusions: In this paper, a method based on K-means clustering and region growing is proposed in order to detect leukocytes from a blood smear microscopic image and segment its components, the nucleus and the cytoplasm. As the region growing step of the algorithm relies on edge information, it cannot separate connected nuclei accurately where edges are poor; it requires at least a weak edge to exist between the nuclei. The nucleus and cytoplasm segments of a leukocyte can be used for feature extraction and classification, which leads to automated leukemia detection. PMID:26605213
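The two K-means stages described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes grayscale intensities in which the nucleus is the darkest region, uses a deterministic, evenly spaced initialisation, and omits the region-growing step entirely. The function names and the pixel values are hypothetical.

```python
def kmeans_1d(values, k, iters=50):
    """Lloyd's algorithm on scalar values; returns (centers, labels)."""
    lo, hi = min(values), max(values)
    # Assumption: deterministic start, centers spread evenly over the value range.
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: each value goes to its nearest center.
        labels = [min(range(k), key=lambda j: abs(v - centers[j])) for v in values]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            new_centers.append(sum(members) / len(members) if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


def two_stage_segment(pixels):
    """Return (nucleus_indices, cytoplasm_indices) for a flat list of intensities."""
    # Stage 1: split all pixels in two; the darker group is taken as nucleus.
    centers, labels = kmeans_1d(pixels, 2)
    dark = centers.index(min(centers))
    nucleus = [i for i, lab in enumerate(labels) if lab == dark]
    # "Subtract" the nucleus: keep only the remaining pixel indices.
    rest = [i for i, lab in enumerate(labels) if lab != dark]
    # Stage 2: split the remainder; the darker group is taken as cytoplasm.
    centers2, labels2 = kmeans_1d([pixels[i] for i in rest], 2)
    dark2 = centers2.index(min(centers2))
    cytoplasm = [i for i, lab in zip(rest, labels2) if lab == dark2]
    return nucleus, cytoplasm


# Hypothetical intensities: ~20 nucleus, ~150 cytoplasm, ~230 background.
pixels = [20, 150, 230, 22, 155, 228, 18, 148, 235]
nucleus, cytoplasm = two_stage_segment(pixels)
print("nucleus pixel indices:", nucleus)
print("cytoplasm pixel indices:", cytoplasm)
```

Running the second stage only on what remains after the first mirrors the "subtract the nuclei, then cluster again" order given in the abstract; a real image would use colour features and two-dimensional pixel positions.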
Blecha, Kevin A.; Alldredge, Mat W.
2015-01-01
Animal space use studies using GPS collar technology are increasingly incorporating behavior based analysis of spatio-temporal data in order to expand inferences of resource use. GPS location cluster analysis is one such technique applied to large carnivores to identify the timing and location of feeding events. For logistical and financial reasons, researchers often implement predictive models for identifying these events. We present two separate improvements for predictive models that future practitioners can implement. Thus far, feeding prediction models have incorporated a small range of covariates, usually limited to spatio-temporal characteristics of the GPS data. Using GPS collared cougar (Puma concolor) we include activity sensor data as an additional covariate to increase prediction performance of feeding presence/absence. Integral to the predictive modeling of feeding events is a ground-truthing component, in which GPS location clusters are visited by human observers to confirm the presence or absence of feeding remains. Failing to account for sources of ground-truthing false-absences can bias the number of predicted feeding events to be low. Thus we account for some ground-truthing error sources directly in the model with covariates and when applying model predictions. Accounting for these errors resulted in a 10% increase in the number of clusters predicted to be feeding events. Using a double-observer design, we show that the ground-truthing false-absence rate is relatively low (4%) using a search delay of 2–60 days. Overall, we provide two separate improvements to the GPS cluster analysis techniques that can be expanded upon and implemented in future studies interested in identifying feeding behaviors of large carnivores. PMID:26398546
Hamad, Ismail; AbdElgawad, Hamada; Al Jaouni, Soad; Zinta, Gaurav; Asard, Han; Hassan, Sherif; Hegab, Momtaz; Hagagy, Nashwa; Selim, Samy
2015-07-27
Date palm is an important crop, especially in the hot-arid regions of the world. Date palm fruits have high nutritional and therapeutic value and possess significant antibacterial and antifungal properties. In this study, we performed bioactivity analyses and metabolic profiling of date fruits of 12 cultivars from Saudi Arabia to assess their nutritional value. Our results showed that the date extracts from different cultivars have different free radical scavenging and anti-lipid peroxidation activities. Moreover, the cultivars showed significant differences in their chemical composition, e.g., the phenolic content (10.4-22.1 mg/100 g DW), amino acids (37-108 μmol·g-1 FW) and minerals (237-969 mg/100 g DW). Principal component analysis (PCA) showed a clear separation of the cultivars into four different groups. The first group consisted of the Sokary and Nabtit Ali cultivars, the second group of Khlas Al Kharj, Khla Al Qassim, Mabroom and Khlas Al Ahsa, the third group of Khals Elshiokh, Nabot Saif and Khodry, and the fourth group of the Ajwa Al Madinah, Saffawy and Rashodia cultivars. Hierarchical cluster analysis (HCA) revealed clustering of date cultivars into two groups. The first cluster consisted of the Sokary, Rashodia and Nabtit Ali cultivars, and the second cluster contained all the other tested cultivars. These results indicate that date fruits have high nutritive value, and different cultivars have different chemical composition.
NASA Astrophysics Data System (ADS)
Kumar, Raj; Sharma, Vishal
2017-03-01
The present research is focused on the analysis of writing inks using destructive UV-Vis spectroscopy (dissolution of ink by the solvent) and non-destructive diffuse reflectance UV-Vis-NIR spectroscopy along with chemometrics. Fifty-seven samples of blue ballpoint pen inks were analyzed under optimum conditions to determine the differences in spectral features of inks among same and different manufacturers. Normalization was performed on the spectroscopic data before chemometric analysis. Principal Component Analysis (PCA) and K-means cluster analysis were used on the data to ascertain whether the blue ballpoint pen inks could be differentiated by their UV-Vis/UV-Vis NIR spectra. The discriminating power is calculated by qualitative analysis by the visual comparison of the spectra (absorbance peaks), produced by the destructive and non-destructive methods. In the latter two methods, the pairwise comparison is made by incorporating the clustering method. It is found that the chemometric method provides better discriminating power (98.72% and 99.46%, in destructive and non-destructive, respectively) in comparison to the qualitative analysis (69.67%).
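A pairwise discriminating power of the kind reported here is commonly computed as the fraction of sample pairs that can be told apart. The sketch below assumes that definition and treats two inks as discriminated when they fall in different clusters; the abstract does not state the exact counting rule, and the labels are hypothetical.

```python
from itertools import combinations
from math import comb


def discriminating_power(labels):
    """Fraction of sample pairs whose members carry different cluster labels."""
    pairs = list(combinations(range(len(labels)), 2))
    separated = sum(1 for i, j in pairs if labels[i] != labels[j])
    return separated / len(pairs)


# Hypothetical cluster labels for six ink samples.
labels = ["A", "A", "B", "C", "C", "C"]
dp = discriminating_power(labels)
print(f"discriminating power: {dp:.4f}")

# Number of pairwise comparisons available for 57 samples.
n_pairs_57 = comb(57, 2)
print("pairs among 57 samples:", n_pairs_57)
```

With 57 samples there are 57 × 56 / 2 = 1596 possible pairs, which is the denominator such a calculation would use if every pair were compared.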
10B+α states with chain-like structures in 14N
NASA Astrophysics Data System (ADS)
Kanada-En'yo, Yoshiko
2015-12-01
I investigate 10B+α -cluster states of 14N with a 10B+α -cluster model. Near the α -decay threshold energy, I obtain Kπ=3+ and Kπ=1+ rotational bands having 10B(3+) +α and 10B(1+) +α components, respectively. I assign the bandhead state of the Kπ=3+ band to the experimental 3+ at Ex=13.19 MeV of 14N observed in α scattering reactions by 10B and show that the calculated α -decay width is consistent with the experimental data. I discuss an α -cluster motion around the 10B cluster and show that the Kπ=3+ and Kπ=1+ rotational bands contain an enhanced component of a linear-chain 3 α configuration, in which an α cluster is localized in the longitudinal direction around the deformed 10B cluster.
Cuellar, M; Harkrider, A W; Jenson, D; Thornton, D; Bowers, A; Saltuklaroglu, T
2016-07-01
Electroencephalography (EEG) was used to map the temporal dynamics of sensorimotor integration relative to the strength and timing of muscular activity during swallowing. 64-channel EEG data and surface electromyographic (sEMG) data were recorded from 25 neurologically-healthy adults during swallowing and tongue-tapping. Events were demarcated so that sensorimotor activity primarily from the pharyngeal and esophageal phases of swallowing could be compared to activity resulting from tongue tapping. Independent component analysis identified bilateral clusters of sensorimotor mu components localized to the premotor and primary motor cortices as well as an infrahyoid myogenic cluster. Subsequent event-related spectral perturbations (ERSP) analyses showed event-related desynchronization (ERD) in the spectral power in the alpha (8-13Hz) and beta (15-25Hz) frequency bands of the mu clusters in both tasks. Mu ERD was stronger during swallowing when compared to tongue tapping (pFDR<.05) and the differences in sensorimotor processing between conditions were greater in the right hemisphere than the left, suggesting stronger right hemisphere lateralization for swallowing than tongue-tapping. Mu activity was interpreted as representing a normal feed forward and feedback driven sensorimotor loop during the later stages of swallowing. Results support further use of this novel neuroimaging technique to concurrently map neural and muscle activity during swallowing in clinical populations using EEG. Copyright © 2016 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved.
In vivo confocal microscopic analysis of normal human anterior limbal stroma
Mathews, Saumi; Chidambaram, Jaya Devi; Lanjewar, Shruti; Mascarenhas, Jeena; Prajna, Namperumalsamy Venkatesh; Muthukkaruppan, Veerappan; Chidambaranathan, Gowri Priya
2015-01-01
Purpose To characterize the microarchitecture of the anterior limbal stroma in healthy individuals using in vivo confocal microscopy (IVCM) and to correlate it with mesenchymal stem cells (MSCs), a component of the limbal-niche. Methods The corneal side of the superior limbus was scanned in 30 eyes of 17 normal subjects beyond the basal epithelium, deep into the stroma using a HRT III laser scanning microscope. The IVCM findings were correlated with the immunohistochemical features of MSCs in the anterior limbal stroma. Results Clusters of hyperreflective structures were observed in the anterior limbal stroma, subjacent to the basal epithelium (depth: 50.2±8.7 - 98±12.8 μm), but not in the corneal stroma. The structures showed unique morphology compared to epithelial cells, keratocytes, neurons and dendritic cells. In parallel, confocal analysis of immunostained sections showed clusters of cells, double positive for MSC specific markers (CD90 and CD105) in the anterior limbal stroma at a depth of 55.3±12.7 μm to 72±37.6 μm. The organization and distribution of the MSC clusters locates them within the hyperreflective region in the anterior limbal stroma. Conclusions The hyperreflective structures, demonstrated for the first time in the human anterior limbal stroma, probably represent an important component of the limbal-niche. Our approach of in vivo imaging may pave the way for assessing the limbal stromal health. PMID:25742388
Sajil Kumar, P J
2013-03-01
The aim of this study was to investigate the controls of leather industries on fluoride contamination in and around a tannery cluster in Vaniyambadi. Hydrochemical analysis, mineral saturation indices and statistical methods were used to evaluate the intervening factors that control the contamination processes. Fluoride in groundwater exceeded the WHO guideline value (1.5 mg/L) in 62 % of the samples, mostly with Na-HCO3 and Na-Cl types of water. Results of the principal component analysis grouped Na, F, HCO3 and NO3 under component 1. This result was in agreement with the cross plot indicating high positive correlation between F and Na (r² = 0.87), HCO3 (r² = 0.84) and NO3 (r² = 0.55). Fluorite (CaF2) and halite (NaCl) were undersaturated, while calcite (CaCO3) was oversaturated for all the samples. This suggests more dissolution of F-rich minerals under the active support of Na. Bivariate plots of Na versus Cl and Na + K versus HCO3 showed a combined origin of Na from tannery effluent as well as silicate weathering. Two major clusters, based on the Na, HCO3 and F concentration, showed that groundwater is affected by tanneries and silicate weathering. Fluoride concentration in 38 % of samples (n = 5) was significantly affected by the high Na concentration from tanneries.
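The squared correlation coefficients quoted for F against Na, HCO3 and NO3 are the kind of value obtained from a simple cross plot. A minimal sketch of that calculation follows, assuming a plain Pearson correlation; the concentration values are invented for illustration and are not from the study.

```python
def r_squared(x, y):
    """Squared Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


# Hypothetical paired measurements (arbitrary units), not data from the abstract.
sodium = [1.0, 2.0, 3.0, 4.0]
fluoride = [1.0, 3.0, 2.0, 4.0]
r2 = r_squared(sodium, fluoride)
print(f"r^2 = {r2:.2f}")
```

A value near 1 means that one variable is almost a linear function of the other; it does not by itself identify the source of either ion, which is why the abstract combines it with saturation indices and bivariate plots.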
Kalisvaart, Hanneke; van Broeckhuysen, Saskia; Bühring, Martina; Kool, Marianne B; van Dulmen, Sandra; Geenen, Rinie
2012-01-01
How a patient is connected with one's body is core to rehabilitation of somatoform disorder but a common model to describe body-relatedness is missing. The aim of our study was to investigate the components and hierarchical structure of body-relatedness as perceived by patients with severe somatoform disorder and their therapists. Interviews with patients and therapists yielded statements about components of body-relatedness. Patients and therapists individually sorted these statements according to similarity. Hierarchical cluster analysis was applied to these sortings. Analysis of variance was used to compare the perceived importance of the statements between patients and therapists. The hierarchical structure included 71 characteristics of body-relatedness. It consisted of three levels with eight clusters at the lowest level: 1) understanding, 2) acceptance, 3) adjustment, 4) respect for the body, 5) regulation, 6) confidence, 7) self-esteem, and 8) autonomy. The cluster 'understanding' was considered most important by patients and therapists. Patients valued 'regulating the body' more than therapists. According to patients with somatoform disorders and their therapists, body-relatedness includes awareness of the body and self by understanding, accepting and adjusting to bodily signals, by respecting and regulating the body, by confiding and esteeming oneself and by being autonomous. This definition and structure of body-relatedness may help professionals to improve interdisciplinary communication, assessment, and treatment, and it may help patients to better understand their symptoms and treatment. (German language abstract, Abstract S1; Spanish language abstract, Abstract S2).
[From "deadly quartet" to "metabolic syndrome". An analysis of its clinical relevance].
Vancheri, Federico; Burgio, Antonio; Dovico, Rossana
2007-03-01
The metabolic syndrome denotes a clustering of specific risk factors for both cardiovascular disease and type 2 diabetes, whose underlying pathophysiology is believed to include insulin resistance. It has been widely reported that the syndrome is a simple clinical tool to identify people at high long term risk of cardiovascular disease and diabetes. However, its clinical importance is under debate. There are substantial uncertainties about the clinical definition of the syndrome, as to whether the risk factors clustering indicates a single unifying disorder, whether the risk conferred by the condition as a whole is higher than that of its individual components, and whether its predictive value of future cardiovascular events or diabetes is greater than established predicting models such as the Framingham Risk Score and the Diabetes Risk Score. We undertook an extensive review of the literature. Our analysis indicates that current definitions of the syndrome are incomplete or ambiguous; more than one pathophysiological process underlies the syndrome, although the combination of insulin resistance and hyperinsulinemia is related to most cases; the risk associated with the syndrome is no greater than that explained by the presence of its components; and the syndrome is less effective in predicting the future development of cardiovascular events and diabetes than established predicting models. Although the syndrome has some importance in understanding the pathophysiology of cardiovascular and diabetes risk factors clustering, its use as a clinical syndrome is not justified by current data.
Shyamalamma, S; Chandra, S B C; Hegde, M; Naryanswamy, P
2008-07-22
Artocarpus heterophyllus Lam., commonly called jackfruit, is a medium-sized evergreen tree that bears high yields of the largest known edible fruit. Yet, it has been little explored commercially due to wide variation in fruit quality. The genetic diversity and genetic relatedness of 50 jackfruit accessions were studied using amplified fragment length polymorphism markers. Of 16 primer pairs evaluated, eight were selected for screening of genotypes based on the number and quality of polymorphic fragments produced. These primer combinations produced 5976 bands, 1267 (22%) of which were polymorphic. Among the jackfruit accessions, the similarity coefficient ranged from 0.137 to 0.978; the accessions also shared a large number of monomorphic fragments (78%). Cluster analysis and principal component analysis grouped all jackfruit genotypes into three major clusters. Cluster I included the genotypes grown in a jackfruit region of Karnataka, called Tamaka, with very dry conditions; cluster II contained the genotypes collected from locations having medium to heavy rainfall in Karnataka; cluster III grouped the genotypes in distant locations with different environmental conditions. Strong coincidence of these amplified fragment length polymorphism-based groupings with geographical localities as well as morphological characters was observed. We found moderate genetic diversity in these jackfruit accessions. This information should be useful for tree breeding programs, as part of our effort to popularize jackfruit as a commercial crop.
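A similarity coefficient between two accessions is typically computed from their presence/absence band profiles. The abstract does not name the coefficient used, so the sketch below assumes the Jaccard coefficient purely for illustration; the band profiles are hypothetical.

```python
def jaccard_similarity(profile_a, profile_b):
    """Shared bands divided by bands present in at least one profile."""
    both = sum(1 for a, b in zip(profile_a, profile_b) if a and b)
    either = sum(1 for a, b in zip(profile_a, profile_b) if a or b)
    # Two profiles with no bands at all are treated as identical (assumption).
    return both / either if either else 1.0


# Hypothetical AFLP band profiles: 1 = band present, 0 = band absent.
accession_1 = [1, 1, 0, 1, 0, 1]
accession_2 = [1, 0, 0, 1, 1, 1]
similarity = jaccard_similarity(accession_1, accession_2)
print(f"similarity coefficient: {similarity:.3f}")
```

A matrix of such pairwise values is what a cluster analysis or dendrogram would then be built from.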
Li, Siyue; Zhang, Quanfa
2010-04-15
A data matrix (4032 observations), obtained during a 2-year monitoring period (2005-2006) from 42 sites in the upper Han River, is subjected to various multivariate statistical techniques including cluster analysis, principal component analysis (PCA), factor analysis (FA), correlation analysis and analysis of variance to determine the spatial characterization of dissolved trace elements and heavy metals. Our results indicate that waters in the upper Han River are primarily polluted by Al, As, Cd, Pb, Sb and Se, and the potential pollutants include Ba, Cr, Hg, Mn and Ni. Spatial distribution of trace metals indicates that the polluted sections are mainly concentrated in the Danjiang, Danjiangkou Reservoir catchment and Hanzhong Plain, and the most contaminated river is in the Hanzhong Plain. Q-model clustering depends on geographical location of sampling sites and groups the 42 sampling sites into four clusters, i.e., Danjiang, Danjiangkou Reservoir region (lower catchment), upper catchment and one river in headwaters pertaining to water quality. The headwaters, Danjiang and lower catchment, and upper catchment correspond to very highly polluted, moderately polluted and relatively low-polluted regions, respectively. Additionally, PCA/FA and correlation analysis demonstrate that Al, Cd, Mn, Ni, Fe, Si and Sr are controlled by natural sources, whereas the other metals appear to be primarily controlled by anthropogenic origins, though geogenic sources also contribute to them. 2009 Elsevier B.V. All rights reserved.
Clustering the Orion B giant molecular cloud based on its molecular emission.
Bron, Emeric; Daudon, Chloé; Pety, Jérôme; Levrier, François; Gerin, Maryvonne; Gratier, Pierre; Orkisz, Jan H; Guzman, Viviana; Bardeau, Sébastien; Goicoechea, Javier R; Liszt, Harvey; Öberg, Karin; Peretto, Nicolas; Sievers, Albrecht; Tremblin, Pascal
2018-02-01
Previous attempts at segmenting molecular line maps of molecular clouds have focused on using position-position-velocity data cubes of a single molecular line to separate the spatial components of the cloud. In contrast, wide field spectral imaging over a large spectral bandwidth in the (sub)mm domain now allows one to combine multiple molecular tracers to understand the different physical and chemical phases that constitute giant molecular clouds (GMCs). We aim at using multiple tracers (sensitive to different physical processes and conditions) to segment a molecular cloud into physically/chemically similar regions (rather than spatially connected components), thus disentangling the different physical/chemical phases present in the cloud. We use a machine learning clustering method, namely the Meanshift algorithm, to cluster pixels with similar molecular emission, ignoring spatial information. Clusters are defined around each maximum of the multidimensional Probability Density Function (PDF) of the line integrated intensities. Simple radiative transfer models were used to interpret the astrophysical information uncovered by the clustering analysis. A clustering analysis based only on the J = 1-0 lines of three isotopologues of CO proves sufficient to reveal distinct density/column density regimes (nH ~ 100 cm⁻³, ~ 500 cm⁻³, and > 1000 cm⁻³), closely related to the usual definitions of diffuse, translucent and high-column-density regions. Adding two UV-sensitive tracers, the J = 1-0 line of HCO⁺ and the N = 1-0 line of CN, allows us to distinguish two clearly distinct chemical regimes, characteristic of UV-illuminated and UV-shielded gas. The UV-illuminated regime shows overbright HCO⁺ and CN emission, which we relate to a photochemical enrichment effect. We also find a tail of high CN/HCO⁺ intensity ratio in UV-illuminated regions. Finer distinctions in density classes (nH ~ 7 × 10³ cm⁻³, ~ 4 × 10⁴ cm⁻³) for the densest regions are also identified, likely related to the higher critical density of the CN and HCO⁺ (1-0) lines. These distinctions are only possible because the high-density regions are spatially resolved. Molecules are versatile tracers of GMCs because their line intensities bear the signature of the physics and chemistry at play in the gas. The association of simultaneous multi-line, wide-field mapping and powerful machine learning methods such as the Meanshift clustering algorithm reveals how to decode the complex information available in these molecular tracers.
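The idea of defining clusters around density maxima can be illustrated with a one-dimensional mean-shift sketch. It assumes a flat (top-hat) kernel and a hand-picked bandwidth, whereas the study works with multidimensional line intensities; the function name, the merge rule and the sample values are hypothetical.

```python
def mean_shift_1d(points, bandwidth, iters=100, tol=1e-9):
    """Flat-kernel mean shift on scalars; returns (mode_centers, labels)."""
    modes = []
    for start in points:
        x = start
        for _ in range(iters):
            # Shift x to the mean of all points inside the current window.
            window = [q for q in points if abs(q - x) <= bandwidth]
            new_x = sum(window) / len(window)
            if abs(new_x - x) < tol:
                break
            x = new_x
        modes.append(x)
    # Points whose trajectories end at (nearly) the same mode share a cluster.
    centers, labels = [], []
    for mode in modes:
        for idx, center in enumerate(centers):
            if abs(mode - center) < bandwidth / 2:
                labels.append(idx)
                break
        else:
            centers.append(mode)
            labels.append(len(centers) - 1)
    return centers, labels


# Hypothetical integrated intensities (arbitrary units) with three density peaks.
intensities = [1.0, 1.2, 0.9, 5.0, 5.1, 4.8, 9.0, 9.3]
centers, labels = mean_shift_1d(intensities, bandwidth=1.0)
print("cluster labels:", labels)
print("number of modes found:", len(centers))
```

Unlike K-means, the number of clusters is not fixed in advance: it follows from how many maxima the chosen bandwidth resolves, which is the property the abstract relies on.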
Bignell, Dawn R D; Seipke, Ryan F; Huguet-Tapia, José C; Chambers, Alan H; Parry, Ronald J; Loria, Rosemary
2010-02-01
Plant-pathogenic Streptomyces spp. cause scab disease on economically important root and tuber crops, the most important of which is potato. Key virulence determinants produced by these species include the cellulose synthesis inhibitor, thaxtomin A, and the secreted Nec1 protein that is required for colonization of the plant host. Recently, the genome sequence of Streptomyces scabies 87-22 was completed, and a biosynthetic cluster was identified that is predicted to synthesize a novel compound similar to coronafacic acid (CFA), a component of the virulence-associated coronatine phytotoxin produced by the plant-pathogenic bacterium Pseudomonas syringae. Southern analysis indicated that the cfa-like cluster in S. scabies 87-22 is likely conserved in other strains of S. scabies but is absent from two other pathogenic streptomycetes, S. turgidiscabies and S. acidiscabies. Transcriptional analyses demonstrated that the cluster is expressed during plant-microbe interactions and that expression requires a transcriptional regulator embedded in the cluster as well as the bldA tRNA. A knockout strain of the biosynthetic cluster displayed a reduced virulence phenotype on tobacco seedlings compared with the wild-type strain. Thus, the cfa-like biosynthetic cluster is a newly discovered locus in S. scabies that contributes to host-pathogen interactions.
Mo, Yun; Zhang, Zhongzhao; Meng, Weixiao; Ma, Lin; Wang, Yao
2014-01-01
Indoor positioning systems based on the fingerprint method are widely used due to the large number of existing devices with a wide range of coverage. However, extensive positioning regions with a massive fingerprint database may cause high computational complexity and error margins; clustering methods are therefore widely applied as a solution. Traditional clustering methods in positioning systems, however, can only measure the similarity of the Received Signal Strength without being concerned with the continuity of physical coordinates. Besides, outage of access points could result in asymmetric matching problems which severely affect the fine positioning procedure. To solve these issues, in this paper we propose a positioning system based on the Spatial Division Clustering (SDC) method for clustering the fingerprint dataset subject to physical distance constraints. With the Genetic Algorithm and Support Vector Machine techniques, SDC can achieve higher coarse positioning accuracy than traditional clustering algorithms. In terms of fine localization, based on the Kernel Principal Component Analysis method, the proposed positioning system outperforms its counterparts based on other feature extraction methods in low dimensionality. Apart from balancing online matching computational burden, the new positioning system exhibits advantageous performance on radio map clustering, and also shows better robustness and adaptability in the asymmetric matching problem aspect. PMID:24451470
Groundwater flow and hydrogeochemical evolution in the Jianghan Plain, central China
NASA Astrophysics Data System (ADS)
Gan, Yiqun; Zhao, Ke; Deng, Yamin; Liang, Xing; Ma, Teng; Wang, Yanxin
2018-05-01
Hydrogeochemical analysis and multivariate statistics were applied to identify flow patterns and major processes controlling the hydrogeochemistry of groundwater in the Jianghan Plain, which is located in central Yangtze River Basin (central China) and characterized by intensive surface-water/groundwater interaction. Although HCO3-Ca-(Mg) type water predominated in the study area, the 457 (21 surface water and 436 groundwater) samples were effectively classified into five clusters by hierarchical cluster analysis. The hydrochemical variations among these clusters were governed by three factors from factor analysis. Major components (e.g., Ca, Mg and HCO3) in surface water and groundwater originated from carbonate and silicate weathering (factor 1). Redox conditions (factor 2) influenced the geogenic Fe and As contamination in shallow confined groundwater. Anthropogenic activities (factor 3) primarily caused high levels of Cl and SO4 in surface water and phreatic groundwater. Furthermore, the factor score 1 of samples in the shallow confined aquifer gradually increased along the flow paths. This study demonstrates that enhanced information on hydrochemistry in complex groundwater flow systems, by multivariate statistical methods, improves the understanding of groundwater flow and hydrogeochemical evolution due to natural and anthropogenic impacts.
Diabetes Changes Symptoms Cluster Patterns in Persons Living With HIV.
Zuniga, Julie Ann; Bose, Eliezer; Park, Jungmin; Lapiz-Bluhm, M Danet; García, Alexandra A
Approximately 10-15% of persons living with HIV (PLWH) have a comorbid diagnosis of diabetes mellitus (DM). Both of these long-term chronic conditions are associated with high rates of symptom burden. The purpose of our study was to describe symptom patterns for PLWH with DM (PLWH+DM) using a large secondary dataset. The prevalence, burden, and bothersomeness of symptoms reported by patients in routine clinic visits during 2015 were assessed using the 20-item HIV Symptom Index. Principal component analysis was used to identify symptom clusters. Three main clusters were identified: (a) neurological/psychological, (b) gastrointestinal/flu-like, and (c) physical changes. The most prevalent symptoms were fatigue, poor sleep, aches, neuropathy, and sadness. When compared to a previous symptom study with PLWH, symptoms clustered differently in our sample of patients with dual diagnoses of HIV and diabetes. Clinicians should appropriately assess symptoms for their patients' comorbid conditions. Copyright © 2017 Association of Nurses in AIDS Care. Published by Elsevier Inc. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Glagolev, Mikhail K.; Vasilevskaya, Valentina V., E-mail: vvvas@polly.phys.msu.ru; Khokhlov, Alexei R.
Impact of mixture composition on self-organization in concentrated solutions of stiff helical and flexible macromolecules was studied by means of molecular dynamics simulation. The macromolecules were composed of identical amphiphilic monomer units but a fraction f of macromolecules had stiff helical backbones and the remaining chains were flexible. In poor solvents the compacted flexible macromolecules coexist with bundles or filament clusters from few intertwined stiff helical macromolecules. The increase of relative content f of helical macromolecules leads to increase of the length of helical clusters, to alignment of clusters with each other, and then to liquid-crystalline-like ordering along a single direction. The formation of filament clusters causes segregation of helical and flexible macromolecules and the alignment of the filaments induces effective liquid-like ordering of flexible macromolecules. A visual analysis and calculation of order parameter relaying the anisotropy of diffraction allow concluding that transition from disordered to liquid-crystalline state proceeds sharply at relatively low content of stiff components.
Viscoelasticity promotes collective swimming of sperm
NASA Astrophysics Data System (ADS)
Tung, Chih-Kuan; Harvey, Benedict B.; Fiore, Alyssa G.; Ardon, Florencia; Suarez, Susan S.; Wu, Mingming
From flocking birds to swarming insects, interactions of organisms large and small lead to the emergence of collective dynamics. Here, we report striking collective swimming of bovine sperm, with sperm orienting in the same direction within each cluster, enabled by the viscoelasticity of the fluid. A long-chain polyacrylamide solution was used as a model viscoelastic fluid such that its rheology can be fine-tuned to mimic that of bovine cervical mucus. In viscoelastic fluid, sperm formed dynamic clusters, and the cluster size increased with elasticity of the polyacrylamide solution. In contrast, sperm swam randomly and individually in Newtonian fluids of similar viscosity. Analysis of the fluid motion surrounding individual swimming sperm indicated that sperm-fluid interaction is facilitated by the elastic component of the fluid. We note that almost all biological fluids (e.g. mucus and blood) are viscoelastic in nature; this finding highlights the importance of fluid elasticity in biological function. We will discuss what the orientation fluctuation within a cluster reveals about the interaction strength. Supported by NIH Grant 1R01HD070038.
Quality Evaluation of Agricultural Distillates Using an Electronic Nose
Dymerski, Tomasz; Gębicki, Jacek; Wardencki, Waldemar; Namieśnik, Jacek
2013-01-01
The paper presents the application of an electronic nose instrument to fast evaluation of agricultural distillates differing in quality. The investigations were carried out using a prototype electronic nose equipped with a set of six semiconductor sensors by FIGARO Co., an electronic circuit converting signal into digital form and a set of thermostats able to provide gradient temperature characteristics to a gas mixture. A volatile fraction of the agricultural distillate samples differing in quality was obtained by barbotage. Interpretation of the results involved three data analysis techniques: principal component analysis, single-linkage cluster analysis and cluster analysis with spheres method. The investigations prove the usefulness of the presented technique in the quality control of agricultural distillates. Optimum measurement conditions were also defined, including volumetric flow rate of carrier gas (15 L/h), thermostat temperature during the barbotage process (15 °C) and time of sensor signal acquisition from the onset of the barbotage process (60 s). PMID:24287525
Failure Mode Identification Through Clustering Analysis
NASA Technical Reports Server (NTRS)
Arunajadai, Srikesh G.; Stone, Robert B.; Tumer, Irem Y.; Clancy, Daniel (Technical Monitor)
2002-01-01
Research has shown that nearly 80% of the costs and problems are created in product development and that cost and quality are essentially designed into products in the conceptual stage. Currently, failure identification procedures (such as FMEA (Failure Modes and Effects Analysis), FMECA (Failure Modes, Effects and Criticality Analysis) and FTA (Fault Tree Analysis)) and design of experiments are being used for quality control and for the detection of potential failure modes during the detail design stage or post-product launch. Though all of these methods have their own advantages, they do not give information as to what are the predominant failures that a designer should focus on while designing a product. This work uses a functional approach to identify failure modes, which hypothesizes that similarities exist between different failure modes based on the functionality of the product/component. In this paper, a statistical clustering procedure is proposed to retrieve information on the set of predominant failures that a function experiences. The various stages of the methodology are illustrated using a hypothetical design example.
NASA Astrophysics Data System (ADS)
Benninghoff, L.; von Czarnowski, D.; Denkhaus, E.; Lemke, K.
1997-07-01
For the determination of trace element distributions of more than 20 elements in malignant and normal tissues of the human colon, tissue samples (approx. 400 mg wet weight) were digested with 3 ml of nitric acid (sub-boiled quality) by use of an autoclave system. The accuracy of measurements has been investigated by using certified materials. The analytical results were evaluated by using a spreadsheet program to give an overview of the element distribution in cancerous samples and in normal colon tissues. A further application, cluster analysis of the analytical results, was introduced to demonstrate the possibility of classification for cancer diagnosis. To confirm the results of cluster analysis, multivariate three-way principal component analysis was performed. Additionally, microtome frozen sections (10 μm) were prepared from the same tissue samples to compare the analytical results, i.e. the mass fractions of elements, according to the preparation method and to exclude systematic errors depending on the inhomogeneity of the tissues.
Multivariate Statistical Analysis of Water Quality data in Indian River Lagoon, Florida
NASA Astrophysics Data System (ADS)
Sayemuzzaman, M.; Ye, M.
2015-12-01
The Indian River Lagoon, part of the longest barrier island complex in the United States, is a region of particular concern to environmental scientists because of the rapid rate of human development throughout the region and its geographical position between the colder temperate zone and the warmer sub-tropical zone. Thus, surface water quality analysis in this region continually yields new information. In the present study, multivariate statistical procedures were applied to analyze the spatial and temporal water quality in the Indian River Lagoon over the period 1998-2013. Twelve parameters were analyzed at twelve key water monitoring stations in and beside the lagoon using monthly datasets (total of 27,648 observations). The dataset was treated using cluster analysis (CA), principal component analysis (PCA) and non-parametric trend analysis. The CA was used to cluster the twelve monitoring stations into four groups, with stations having similar surrounding characteristics falling in the same group. The PCA was then applied to each group to find the important water quality parameters. The principal components (PCs) PC1 to PC5 were considered based on explained cumulative variances of 75% to 85% in each cluster group. Nutrient species (phosphorus and nitrogen), salinity, specific conductivity and erosion factors (TSS, turbidity) were the major variables involved in the construction of the PCs. Statistically significant positive or negative trends and abrupt trend shifts were detected by applying the Mann-Kendall trend test and the Sequential Mann-Kendall (SQMK) test to each individual station for the important water quality parameters. Land use and land cover change patterns, local anthropogenic activities and extreme climate events such as drought might be associated with these trends. This study presents a multivariate statistical assessment in order to obtain better information about the quality of surface water.
Thus, effective pollution control/management of the surface waters can be undertaken.
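As an illustration of the non-parametric trend analysis named in the abstract above, the following is a minimal sketch of the basic Mann-Kendall test. It is not the authors' implementation: the function name `mann_kendall` is hypothetical, the variance formula assumes no tied values, and the normal approximation is used for the p-value. Seasonal variants or tie corrections, which monthly water-quality data may well require, are left out.

```python
import math


def mann_kendall(series):
    """Basic Mann-Kendall trend test (assumes no ties, normal approximation).

    Returns (S, Z, p) where S is the Mann-Kendall statistic, Z the
    continuity-corrected standard normal score and p the two-sided p-value.
    """
    n = len(series)
    s = 0
    # S counts concordant minus discordant pairs (later value vs. earlier value).
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)

    # Variance of S under the null hypothesis of no trend, without tie correction.
    var_s = n * (n - 1) * (2 * n + 5) / 18.0

    # Continuity correction: shrink |S| by one before standardising.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0

    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return s, z, p


if __name__ == "__main__":
    # Made-up, strictly increasing values purely for demonstration.
    s, z, p = mann_kendall([1.2, 1.5, 1.9, 2.4, 2.6, 3.1, 3.3, 3.8])
    print(s, round(z, 3), p < 0.05)
```

A positive S with a small p-value suggests an increasing trend, a negative S a decreasing one. The sequential (SQMK) variant used for detecting abrupt shifts is a separate procedure and is not sketched here.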
NASA Astrophysics Data System (ADS)
Wagstaff, Kiri L.
2012-03-01
On obtaining a new data set, the researcher is immediately faced with the challenge of obtaining a high-level understanding from the observations. What does a typical item look like? What are the dominant trends? How many distinct groups are included in the data set, and how is each one characterized? Which observable values are common, and which rarely occur? Which items stand out as anomalies or outliers from the rest of the data? This challenge is exacerbated by the steady growth in data set size [11] as new instruments push into new frontiers of parameter space, via improvements in temporal, spatial, and spectral resolution, or by the desire to "fuse" observations from different modalities and instruments into a larger-picture understanding of the same underlying phenomenon. Data clustering algorithms provide a variety of solutions for this task. They can generate summaries, locate outliers, compress data, identify dense or sparse regions of feature space, and build data models. It is useful to note up front that "clusters" in this context refer to groups of items within some descriptive feature space, not (necessarily) to "galaxy clusters" which are dense regions in physical space. The goal of this chapter is to survey a variety of data clustering methods, with an eye toward their applicability to astronomical data analysis. In addition to improving the individual researcher’s understanding of a given data set, clustering has led directly to scientific advances, such as the discovery of new subclasses of stars [14] and gamma-ray bursts (GRBs) [38]. All clustering algorithms seek to identify groups within a data set that reflect some observed, quantifiable structure. Clustering is traditionally an unsupervised approach to data analysis, in the sense that it operates without any direct guidance about which items should be assigned to which clusters. 
There has been a recent trend in the clustering literature toward supporting semisupervised or constrained clustering, in which some partial information about item assignments or other components of the resulting output are already known and must be accommodated by the solution. Some algorithms seek a partition of the data set into distinct clusters, while others build a hierarchy of nested clusters that can capture taxonomic relationships. Some produce a single optimal solution, while others construct a probabilistic model of cluster membership. More formally, clustering algorithms operate on a data set X composed of items represented by one or more features (dimensions). These could include physical location, such as right ascension and declination, as well as other properties such as brightness, color, temporal change, size, texture, and so on. Let D be the number of dimensions used to represent each item, x_i ∈ R^D. The clustering goal is to produce an organization P of the items in X that optimizes an objective function f : P -> R, which quantifies the quality of solution P. Often f is defined so as to maximize similarity within a cluster and minimize similarity between clusters. To that end, many algorithms make use of a measure d : X x X -> R of the distance between two items. A partitioning algorithm produces a set of clusters P = {c_1, ..., c_k} such that the clusters are nonoverlapping (c_i intersected with c_j = empty set, i != j) subsets of the data set (Union_i c_i = X). Hierarchical algorithms produce a series of partitions P = {p_1, ..., p_n}. For a complete hierarchy, the number of partitions n' = n, the number of items in the data set; the top partition is a single cluster containing all items, and the bottom partition contains n clusters, each containing a single item. For model-based clustering, each cluster c_j is represented by a model m_j, such as the cluster center or a Gaussian distribution.
The wide array of available clustering algorithms may seem bewildering, and covering all of them is beyond the scope of this chapter. Choosing among them for a particular application involves considerations of the kind of data being analyzed, algorithm runtime efficiency, and how much prior knowledge is available about the problem domain, which can dictate the nature of clusters sought. Fundamentally, the clustering method and its representations of clusters carry with them a definition of what a cluster is, and it is important that this be aligned with the analysis goals for the problem at hand. In this chapter, I emphasize this point by identifying for each algorithm the cluster representation as a model, m_j, even for algorithms that are not typically thought of as creating a “model.” This chapter surveys a basic collection of clustering methods useful to any practitioner who is interested in applying clustering to a new data set. The algorithms include k-means (Section 25.2), EM (Section 25.3), agglomerative (Section 25.4), and spectral (Section 25.5) clustering, with side mentions of variants such as kernel k-means and divisive clustering. The chapter also discusses each algorithm’s strengths and limitations and provides pointers to additional in-depth reading for each subject. Section 25.6 discusses methods for incorporating domain knowledge into the clustering process. This chapter concludes with a brief survey of interesting applications of clustering methods to astronomy data (Section 25.7). The chapter begins with k-means because it is both generally accessible and so widely used that understanding it can be considered a necessary prerequisite for further work in the field. EM can be viewed as a more sophisticated version of k-means that uses a generative model for each cluster and probabilistic item assignments.
Agglomerative clustering is the most basic form of hierarchical clustering and provides a basis for further exploration of algorithms in that vein. Spectral clustering permits a departure from feature-vector-based clustering and can operate on data sets instead represented as affinity, or similarity, matrices—cases in which only pairwise information is known. The list of algorithms covered in this chapter is representative of those most commonly in use, but it is by no means comprehensive. There is an extensive collection of existing books on clustering that provide additional background and depth. Three early books that remain useful today are Anderberg’s Cluster Analysis for Applications [3], Hartigan’s Clustering Algorithms [25], and Gordon’s Classification [22]. The latter covers basics on similarity measures, partitioning and hierarchical algorithms, fuzzy clustering, overlapping clustering, conceptual clustering, validation methods, and visualization or data reduction techniques such as principal components analysis (PCA), multidimensional scaling, and self-organizing maps. More recently, Jain et al. provided a useful and informative survey [27] of a variety of different clustering algorithms, including those mentioned here as well as fuzzy, graph-theoretic, and evolutionary clustering. Everitt’s Cluster Analysis [19] provides a modern overview of algorithms, similarity measures, and evaluation methods.
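To make the partitioning formalism in the abstract above concrete, here is a minimal sketch of k-means (Lloyd's iteration) in plain Python. It is an illustration under stated assumptions rather than the chapter's own code: the names `kmeans`, `squared_distance` and `within_cluster_sse` are hypothetical, the distance d is taken to be squared Euclidean distance, each cluster model m_j is its centroid, and the objective f is the within-cluster sum of squared distances (to be minimized).

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(items, k, initial_centers=None, n_iter=100, seed=0):
    """Partition `items` (a list of D-dimensional tuples) into k clusters.

    Returns (assignments, centers): assignments[i] is the cluster index of
    items[i], and centers[j] is the centroid acting as the model m_j.
    """
    if initial_centers is None:
        # Assumption: initialise with k items drawn at random from the data.
        centers = random.Random(seed).sample(items, k)
    else:
        centers = list(initial_centers)

    assignments = None
    for _ in range(n_iter):
        # Assignment step: each item joins its nearest center.
        new_assignments = [
            min(range(k), key=lambda j: squared_distance(x, centers[j]))
            for x in items
        ]
        if new_assignments == assignments:
            break  # No item changed cluster, so the partition is stable.
        assignments = new_assignments

        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [x for x, a in zip(items, assignments) if a == j]
            if members:  # An empty cluster keeps its previous center.
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return assignments, centers


def within_cluster_sse(items, assignments, centers):
    """Objective f: total squared distance of items to their cluster center."""
    return sum(squared_distance(x, centers[a]) for x, a in zip(items, assignments))


if __name__ == "__main__":
    # Two made-up, well-separated groups in D = 2 dimensions.
    data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
    labels, centers = kmeans(data, k=2, initial_centers=[data[0], data[3]])
    print(labels, within_cluster_sse(data, labels, centers))
```

Because every item receives exactly one label, the result satisfies the nonoverlapping and covering conditions of a partition. The outcome depends on the initial centers, which is why practical use typically repeats the run from several starting points and keeps the solution with the lowest objective value.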
Web Program for Development of GUIs for Cluster Computers
NASA Technical Reports Server (NTRS)
Czikmantory, Akos; Cwik, Thomas; Klimeck, Gerhard; Hua, Hook; Oyafuso, Fabiano; Vinyard, Edward
2003-01-01
WIGLAF (a Web Interface Generator and Legacy Application Facade) is a computer program that provides a Web-based, distributed, graphical-user-interface (GUI) framework that can be adapted to any of a broad range of application programs, written in any programming language, that are executed remotely on any cluster computer system. WIGLAF enables the rapid development of a GUI for controlling and monitoring a specific application program running on the cluster and for transferring data to and from the application program. The only prerequisite for the execution of WIGLAF is a Web-browser program on a user's personal computer connected with the cluster via the Internet. WIGLAF has a client/server architecture: The server component is executed on the cluster system, where it controls the application program and serves data to the client component. The client component is an applet that runs in the Web browser. WIGLAF utilizes the Extensible Markup Language to hold all data associated with the application software, Java to enable platform-independent execution on the cluster system and the display of a GUI generator through the browser, and the Java Remote Method Invocation software package to provide simple, effective client/server networking.
Khosravi, Rasoul; Rezaei, Hamid Reza; Kaboli, Mohammad
2013-01-01
The genetic threat due to hybridization with free-ranging dogs is one major concern in wolf conservation. The identification of hybrids and extent of hybridization is important in the conservation and management of wolf populations. Genetic variation was analyzed at 15 unlinked loci in 28 dogs, 28 wolves, four known hybrids, two black wolves, and one dog with abnormal traits in Iran. Pritchard's model, multivariate ordination by principal component analysis and neighbor joining clustering were used for population clustering and individual assignment. Analysis of genetic variation showed that genetic variability is high in both wolf and dog populations in Iran. Values of H(E) in dog and wolf samples ranged from 0.75-0.92 and 0.77-0.92, respectively. The results of AMOVA showed that the two groups of dog and wolf were significantly different (F(ST) = 0.05 and R(ST) = 0.36; P < 0.001). In each of the three methods, wolf and dog samples were separated into two distinct clusters. Two dark wolves were assigned to the wolf cluster. Also these models detected D32 (dog with abnormal traits) and some other samples, which were assigned to more than one cluster and could be a hybrid. This study is the beginning of a genetic study in wolf populations in Iran, and our results reveal that as in other countries, hybridization between wolves and dogs is sporadic in Iran and can be a threat to wolf populations if human perturbations increase.
Analysis of the structure and dynamics of human serum albumin.
Guizado, T R Cuya
2014-10-01
Human serum albumin (HSA) is a biologically relevant protein that binds a variety of drugs and other small molecules. No less than 50 structures are deposited in the RCSB Protein Data Bank (PDB). Based on these structures, we first performed a clustering analysis. Despite the diversity of ligands, only two well defined conformations are detected, with a deviation of 0.46 nm between the average structures of the two clusters, while deviations within each cluster are smaller than 0.08 nm. Those two conformations are representative of the apoprotein and the HSA-myristate complex already identified in previous literature. Considering the structures within each cluster as a representative sample of the dynamical states of the corresponding conformation, we scrutinize the structural and dynamical differences between both conformations. Analysis of the fluctuations within each cluster set reveals that domain II is the most rigid one and better matches both structures. Then, taking this domain as reference, we show that the structural difference between both conformations can be expressed in terms of twist and hinge motions of domains I and III, respectively. We also characterize the dynamical difference between conformations by computing correlations and principal components for each set of dynamical states. The two conformations display different collective motions. The results are compared with those obtained from the trajectories of short molecular dynamics simulations, giving consistent outcomes. Let us remark that, beyond the relevance of the results for the structural and dynamical characterization of HSA conformations, the present methodology could be extended to other proteins in the PDB archive.
Costa, Patrício Soares; Santos, Nadine Correia; Cunha, Pedro; Cotter, Jorge; Sousa, Nuno
2013-01-01
The main focus of this study was to illustrate the applicability of multiple correspondence analysis (MCA) in detecting and representing underlying structures in large datasets used to investigate cognitive ageing. Principal component analysis (PCA) was used to obtain main cognitive dimensions, and MCA was used to detect and explore relationships between cognitive, clinical, physical, and lifestyle variables. Two PCA dimensions were identified (general cognition/executive function and memory), and two MCA dimensions were retained. Poorer cognitive performance was associated with older age, fewer years of schooling, unhealthier lifestyle indicators, and presence of pathology. The first MCA dimension indicated the clustering of general/executive function and lifestyle indicators and education, while the second association was between memory and clinical parameters and age. The clustering analysis with object scores method was used to identify groups sharing similar characteristics. The weaker cognitive clusters in terms of memory and executive function comprised individuals with characteristics contributing to a higher MCA dimensional mean score (age, less education, and presence of indicators of unhealthier lifestyle habits and/or clinical pathologies). MCA provided a powerful tool to explore complex ageing data, covering multiple and diverse variables, showing if a relationship exists and how variables are related, and offering statistical results that can be seen both analytically and visually.
Chemometric analysis of minerals in gluten-free products.
Gliszczyńska-Świgło, Anna; Klimczak, Inga; Rybicka, Iga
2018-06-01
Numerous studies indicate mineral deficiencies in people on a gluten-free (GF) diet. These deficiencies may indicate that GF products are a less valuable source of minerals than gluten-containing products. In the study, the nutritional quality of 50 GF products is discussed taking into account the nutritional requirements for minerals expressed as percentage of recommended daily allowance (%RDA) or percentage of adequate intake (%AI) for a model celiac patient. Elements analyzed were calcium, potassium, magnesium, sodium, copper, iron, manganese, and zinc. Analysis of %RDA or %AI was performed using principal component analysis (PCA) and hierarchical cluster analysis (HCA). Using PCA, the differentiation between products based on rice, corn, potato, GF wheat starch and based on buckwheat, chickpea, millet, oats, amaranth, teff, quinoa, chestnut, and acorn was possible. In the HCA, four clusters were created. The main criterion determining the adherence of the sample to the cluster was the content of all minerals included in the HCA (K, Mg, Cu, Fe, Mn); however, only the Mn content differentiated the four formed groups. GF products made of buckwheat, chickpea, millet, oats, amaranth, teff, quinoa, chestnut, and acorn are a better source of minerals than those based on other GF raw materials, which was confirmed by PCA and HCA. © 2017 Society of Chemical Industry.