Sample records for statistical analysis principal

  1. [The principal components analysis--method to classify the statistical variables with applications in medicine].

    PubMed

    Dascălu, Cristina Gena; Antohe, Magda Ecaterina

    2009-01-01

    Based on the eigenvalues and the eigenvectors analysis, the principal component analysis has the purpose to identify the subspace of the main components from a set of parameters, which are enough to characterize the whole set of parameters. Interpreting the data for analysis as a cloud of points, we find through geometrical transformations the directions where the cloud's dispersion is maximal--the lines that pass through the cloud's center of weight and have a maximal density of points around them (by defining an appropriate criteria function and its minimization. This method can be successfully used in order to simplify the statistical analysis on questionnaires--because it helps us to select from a set of items only the most relevant ones, which cover the variations of the whole set of data. For instance, in the presented sample we started from a questionnaire with 28 items and, applying the principal component analysis we identified 7 principal components--or main items--fact that simplifies significantly the further data statistical analysis.

  2. Transforming Graph Data for Statistical Relational Learning

    DTIC Science & Technology

    2012-10-01

    Jordan, 2003), PLSA (Hofmann, 1999), ? Classification via RMN (Taskar et al., 2003) or SVM (Hasan, Chaoji, Salem , & Zaki, 2006) ? Hierarchical...dimensionality reduction methods such as Principal 407 Rossi, McDowell, Aha, & Neville Component Analysis (PCA), Principal Factor Analysis ( PFA ), and...clustering algorithm. Journal of the Royal Statistical Society. Series C, Applied statistics, 28, 100–108. Hasan, M. A., Chaoji, V., Salem , S., & Zaki, M

  3. Relationships between Association of Research Libraries (ARL) Statistics and Bibliometric Indicators: A Principal Components Analysis

    ERIC Educational Resources Information Center

    Hendrix, Dean

    2010-01-01

    This study analyzed 2005-2006 Web of Science bibliometric data from institutions belonging to the Association of Research Libraries (ARL) and corresponding ARL statistics to find any associations between indicators from the two data sets. Principal components analysis on 36 variables from 103 universities revealed obvious associations between…

  4. Principal component regression analysis with SPSS.

    PubMed

    Liu, R X; Kuang, J; Gong, Q; Hou, X L

    2003-06-01

    The paper introduces all indices of multicollinearity diagnoses, the basic principle of principal component regression and determination of 'best' equation method. The paper uses an example to describe how to do principal component regression analysis with SPSS 10.0: including all calculating processes of the principal component regression and all operations of linear regression, factor analysis, descriptives, compute variable and bivariate correlations procedures in SPSS 10.0. The principal component regression analysis can be used to overcome disturbance of the multicollinearity. The simplified, speeded up and accurate statistical effect is reached through the principal component regression analysis with SPSS.

  5. A reduction in ag/residential signature conflict using principal components analysis of LANDSAT temporal data

    NASA Technical Reports Server (NTRS)

    Williams, D. L.; Borden, F. Y.

    1977-01-01

    Methods to accurately delineate the types of land cover in the urban-rural transition zone of metropolitan areas were considered. The application of principal components analysis to multidate LANDSAT imagery was investigated as a means of reducing the overlap between residential and agricultural spectral signatures. The statistical concepts of principal components analysis were discussed, as well as the results of this analysis when applied to multidate LANDSAT imagery of the Washington, D.C. metropolitan area.

  6. [The application of the multidimensional statistical methods in the evaluation of the influence of atmospheric pollution on the population's health].

    PubMed

    Surzhikov, V D; Surzhikov, D V

    2014-01-01

    The search and measurement of causal relationships between exposure to air pollution and health state of the population is based on the system analysis and risk assessment to improve the quality of research. With this purpose there is applied the modern statistical analysis with the use of criteria of independence, principal component analysis and discriminate function analysis. As a result of analysis out of all atmospheric pollutants there were separated four main components: for diseases of the circulatory system main principal component is implied with concentrations of suspended solids, nitrogen dioxide, carbon monoxide, hydrogen fluoride, for the respiratory diseases the main c principal component is closely associated with suspended solids, sulfur dioxide and nitrogen dioxide, charcoal black. The discriminant function was shown to be used as a measure of the level of air pollution.

  7. Predicting Teacher Job Satisfaction Based on Principals' Instructional Supervision Behaviours: A Study of Turkish Teachers

    ERIC Educational Resources Information Center

    Ilgan, Abdurrahman; Parylo, Oksana; Sungu, Hilmi

    2015-01-01

    This quantitative research examined instructional supervision behaviours of school principals as a predictor of teacher job satisfaction through the analysis of Turkish teachers' perceptions of principals' instructional supervision behaviours. There was a statistically significant difference found between the teachers' job satisfaction level and…

  8. Statistical analysis of major ion and trace element geochemistry of water, 1986-2006, at seven wells transecting the freshwater/saline-water interface of the Edwards Aquifer, San Antonio, Texas

    USGS Publications Warehouse

    Mahler, Barbara J.

    2008-01-01

    The statistical analyses taken together indicate that the geochemistry at the freshwater-zone wells is more variable than that at the transition-zone wells. The geochemical variability at the freshwater-zone wells might result from dilution of ground water by meteoric water. This is indicated by relatively constant major ion molar ratios; a preponderance of positive correlations between SC, major ions, and trace elements; and a principal components analysis in which the major ions are strongly loaded on the first principal component. Much of the variability at three of the four transition-zone wells might result from the use of different laboratory analytical methods or reporting procedures during the period of sampling. This is reflected by a lack of correlation between SC and major ion concentrations at the transition-zone wells and by a principal components analysis in which the variability is fairly evenly distributed across several principal components. The statistical analyses further indicate that, although the transition-zone wells are less well connected to surficial hydrologic conditions than the freshwater-zone wells, there is some connection but the response time is longer. 

  9. Use of multivariate statistics to identify unreliable data obtained using CASA.

    PubMed

    Martínez, Luis Becerril; Crispín, Rubén Huerta; Mendoza, Maximino Méndez; Gallegos, Oswaldo Hernández; Martínez, Andrés Aragón

    2013-06-01

    In order to identify unreliable data in a dataset of motility parameters obtained from a pilot study acquired by a veterinarian with experience in boar semen handling, but without experience in the operation of a computer assisted sperm analysis (CASA) system, a multivariate graphical and statistical analysis was performed. Sixteen boar semen samples were aliquoted then incubated with varying concentrations of progesterone from 0 to 3.33 µg/ml and analyzed in a CASA system. After standardization of the data, Chernoff faces were pictured for each measurement, and a principal component analysis (PCA) was used to reduce the dimensionality and pre-process the data before hierarchical clustering. The first twelve individual measurements showed abnormal features when Chernoff faces were drawn. PCA revealed that principal components 1 and 2 explained 63.08% of the variance in the dataset. Values of principal components for each individual measurement of semen samples were mapped to identify differences among treatment or among boars. Twelve individual measurements presented low values of principal component 1. Confidence ellipses on the map of principal components showed no statistically significant effects for treatment or boar. Hierarchical clustering realized on two first principal components produced three clusters. Cluster 1 contained evaluations of the two first samples in each treatment, each one of a different boar. With the exception of one individual measurement, all other measurements in cluster 1 were the same as observed in abnormal Chernoff faces. Unreliable data in cluster 1 are probably related to the operator inexperience with a CASA system. These findings could be used to objectively evaluate the skill level of an operator of a CASA system. This may be particularly useful in the quality control of semen analysis using CASA systems.

  10. Principal Component Analysis: Resources for an Essential Application of Linear Algebra

    ERIC Educational Resources Information Center

    Pankavich, Stephen; Swanson, Rebecca

    2015-01-01

    Principal Component Analysis (PCA) is a highly useful topic within an introductory Linear Algebra course, especially since it can be used to incorporate a number of applied projects. This method represents an essential application and extension of the Spectral Theorem and is commonly used within a variety of fields, including statistics,…

  11. Applications of Nonlinear Principal Components Analysis to Behavioral Data.

    ERIC Educational Resources Information Center

    Hicks, Marilyn Maginley

    1981-01-01

    An empirical investigation of the statistical procedure entitled nonlinear principal components analysis was conducted on a known equation and on measurement data in order to demonstrate the procedure and examine its potential usefulness. This method was suggested by R. Gnanadesikan and based on an early paper of Karl Pearson. (Author/AL)

  12. An application of principal component analysis to the clavicle and clavicle fixation devices.

    PubMed

    Daruwalla, Zubin J; Courtis, Patrick; Fitzpatrick, Clare; Fitzpatrick, David; Mullett, Hannan

    2010-03-26

    Principal component analysis (PCA) enables the building of statistical shape models of bones and joints. This has been used in conjunction with computer assisted surgery in the past. However, PCA of the clavicle has not been performed. Using PCA, we present a novel method that examines the major modes of size and three-dimensional shape variation in male and female clavicles and suggests a method of grouping the clavicle into size and shape categories. Twenty-one high-resolution computerized tomography scans of the clavicle were reconstructed and analyzed using a specifically developed statistical software package. After performing statistical shape analysis, PCA was applied to study the factors that account for anatomical variation. The first principal component representing size accounted for 70.5 percent of anatomical variation. The addition of a further three principal components accounted for almost 87 percent. Using statistical shape analysis, clavicles in males have a greater lateral depth and are longer, wider and thicker than in females. However, the sternal angle in females is larger than in males. PCA confirmed these differences between genders but also noted that men exhibit greater variance and classified clavicles into five morphological groups. This unique approach is the first that standardizes a clavicular orientation. It provides information that is useful to both, the biomedical engineer and clinician. Other applications include implant design with regard to modifying current or designing future clavicle fixation devices. Our findings support the need for further development of clavicle fixation devices and the questioning of whether gender-specific devices are necessary.

  13. Variable Neighborhood Search Heuristics for Selecting a Subset of Variables in Principal Component Analysis

    ERIC Educational Resources Information Center

    Brusco, Michael J.; Singh, Renu; Steinley, Douglas

    2009-01-01

    The selection of a subset of variables from a pool of candidates is an important problem in several areas of multivariate statistics. Within the context of principal component analysis (PCA), a number of authors have argued that subset selection is crucial for identifying those variables that are required for correct interpretation of the…

  14. Data analysis techniques

    NASA Technical Reports Server (NTRS)

    Park, Steve

    1990-01-01

    A large and diverse number of computational techniques are routinely used to process and analyze remotely sensed data. These techniques include: univariate statistics; multivariate statistics; principal component analysis; pattern recognition and classification; other multivariate techniques; geometric correction; registration and resampling; radiometric correction; enhancement; restoration; Fourier analysis; and filtering. Each of these techniques will be considered, in order.

  15. The potential of statistical shape modelling for geometric morphometric analysis of human teeth in archaeological research

    PubMed Central

    Fernee, Christianne; Browne, Martin; Zakrzewski, Sonia

    2017-01-01

    This paper introduces statistical shape modelling (SSM) for use in osteoarchaeology research. SSM is a full field, multi-material analytical technique, and is presented as a supplementary geometric morphometric (GM) tool. Lower mandibular canines from two archaeological populations and one modern population were sampled, digitised using micro-CT, aligned, registered to a baseline and statistically modelled using principal component analysis (PCA). Sample material properties were incorporated as a binary enamel/dentin parameter. Results were assessed qualitatively and quantitatively using anatomical landmarks. Finally, the technique’s application was demonstrated for inter-sample comparison through analysis of the principal component (PC) weights. It was found that SSM could provide high detail qualitative and quantitative insight with respect to archaeological inter- and intra-sample variability. This technique has value for archaeological, biomechanical and forensic applications including identification, finite element analysis (FEA) and reconstruction from partial datasets. PMID:29216199

  16. Ripening-dependent metabolic changes in the volatiles of pineapple (Ananas comosus (L.) Merr.) fruit: II. Multivariate statistical profiling of pineapple aroma compounds based on comprehensive two-dimensional gas chromatography-mass spectrometry.

    PubMed

    Steingass, Christof Björn; Jutzi, Manfred; Müller, Jenny; Carle, Reinhold; Schmarr, Hans-Georg

    2015-03-01

    Ripening-dependent changes of pineapple volatiles were studied in a nontargeted profiling analysis. Volatiles were isolated via headspace solid phase microextraction and analyzed by comprehensive 2D gas chromatography and mass spectrometry (HS-SPME-GC×GC-qMS). Profile patterns presented in the contour plots were evaluated applying image processing techniques and subsequent multivariate statistical data analysis. Statistical methods comprised unsupervised hierarchical cluster analysis (HCA) and principal component analysis (PCA) to classify the samples. Supervised partial least squares discriminant analysis (PLS-DA) and partial least squares (PLS) regression were applied to discriminate different ripening stages and describe the development of volatiles during postharvest storage, respectively. Hereby, substantial chemical markers allowing for class separation were revealed. The workflow permitted the rapid distinction between premature green-ripe pineapples and postharvest-ripened sea-freighted fruits. Volatile profiles of fully ripe air-freighted pineapples were similar to those of green-ripe fruits postharvest ripened for 6 days after simulated sea freight export, after PCA with only two principal components. However, PCA considering also the third principal component allowed differentiation between air-freighted fruits and the four progressing postharvest maturity stages of sea-freighted pineapples.

  17. Visualization of time series statistical data by shape analysis (GDP ratio changes among Asia countries)

    NASA Astrophysics Data System (ADS)

    Shirota, Yukari; Hashimoto, Takako; Fitri Sari, Riri

    2018-03-01

    It has been very significant to visualize time series big data. In the paper we shall discuss a new analysis method called “statistical shape analysis” or “geometry driven statistics” on time series statistical data in economics. In the paper, we analyse the agriculture, value added and industry, value added (percentage of GDP) changes from 2000 to 2010 in Asia. We handle the data as a set of landmarks on a two-dimensional image to see the deformation using the principal components. The point of the analysis method is the principal components of the given formation which are eigenvectors of its bending energy matrix. The local deformation can be expressed as the set of non-Affine transformations. The transformations give us information about the local differences between in 2000 and in 2010. Because the non-Affine transformation can be decomposed into a set of partial warps, we present the partial warps visually. The statistical shape analysis is widely used in biology but, in economics, no application can be found. In the paper, we investigate its potential to analyse the economic data.

  18. Application of principal component analysis (PCA) as a sensory assessment tool for fermented food products.

    PubMed

    Ghosh, Debasree; Chattopadhyay, Parimal

    2012-06-01

    The objective of the work was to use the method of quantitative descriptive analysis (QDA) to describe the sensory attributes of the fermented food products prepared with the incorporation of lactic cultures. Panellists were selected and trained to evaluate various attributes specially color and appearance, body texture, flavor, overall acceptability and acidity of the fermented food products like cow milk curd and soymilk curd, idli, sauerkraut and probiotic ice cream. Principal component analysis (PCA) identified the six significant principal components that accounted for more than 90% of the variance in the sensory attribute data. Overall product quality was modelled as a function of principal components using multiple least squares regression (R (2) = 0.8). The result from PCA was statistically analyzed by analysis of variance (ANOVA). These findings demonstrate the utility of quantitative descriptive analysis for identifying and measuring the fermented food product attributes that are important for consumer acceptability.

  19. The Influence Function of Principal Component Analysis by Self-Organizing Rule.

    PubMed

    Higuchi; Eguchi

    1998-07-28

    This article is concerned with a neural network approach to principal component analysis (PCA). An algorithm for PCA by the self-organizing rule has been proposed and its robustness observed through the simulation study by Xu and Yuille (1995). In this article, the robustness of the algorithm against outliers is investigated by using the theory of influence function. The influence function of the principal component vector is given in an explicit form. Through this expression, the method is shown to be robust against any directions orthogonal to the principal component vector. In addition, a statistic generated by the self-organizing rule is proposed to assess the influence of data in PCA.

  20. Parallel auto-correlative statistics with VTK.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pebay, Philippe Pierre; Bennett, Janine Camille

    2013-08-01

    This report summarizes existing statistical engines in VTK and presents both the serial and parallel auto-correlative statistics engines. It is a sequel to [PT08, BPRT09b, PT09, BPT09, PT10] which studied the parallel descriptive, correlative, multi-correlative, principal component analysis, contingency, k-means, and order statistics engines. The ease of use of the new parallel auto-correlative statistics engine is illustrated by the means of C++ code snippets and algorithm verification is provided. This report justifies the design of the statistics engines with parallel scalability in mind, and provides scalability and speed-up analysis results for the autocorrelative statistics engine.

  1. The Higher Education System in Israel: Statistical Abstract and Analysis.

    ERIC Educational Resources Information Center

    Herskovic, Shlomo

    This edition of a statistical abstract published every few years on the higher education system in Israel presents the most recent data available through 1990-91. The data were gathered through the cooperation of the Central Bureau of Statistics and institutions of higher education. Chapter 1 presents a summary of principal findings covering the…

  2. Psychometric evaluation of the Persian version of the Templer's Death Anxiety Scale in cancer patients.

    PubMed

    Soleimani, Mohammad Ali; Yaghoobzadeh, Ameneh; Bahrami, Nasim; Sharif, Saeed Pahlevan; Sharif Nia, Hamid

    2016-10-01

    In this study, 398 Iranian cancer patients completed the 15-item Templer's Death Anxiety Scale (TDAS). Tests of internal consistency, principal components analysis, and confirmatory factor analysis were conducted to assess the internal consistency and factorial validity of the Persian TDAS. The construct reliability statistic and average variance extracted were also calculated to measure construct reliability, convergent validity, and discriminant validity. Principal components analysis indicated a 3-component solution, which was generally supported in the confirmatory analysis. However, acceptable cutoffs for construct reliability, convergent validity, and discriminant validity were not fulfilled for the three subscales that were derived from the principal component analysis. This study demonstrated both the advantages and potential limitations of using the TDAS with Persian-speaking cancer patients.

  3. Statistical Inference for Porous Materials using Persistent Homology.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Moon, Chul; Heath, Jason E.; Mitchell, Scott A.

    2017-12-01

    We propose a porous materials analysis pipeline using persistent homology. We rst compute persistent homology of binarized 3D images of sampled material subvolumes. For each image we compute sets of homology intervals, which are represented as summary graphics called persistence diagrams. We convert persistence diagrams into image vectors in order to analyze the similarity of the homology of the material images using the mature tools for image analysis. Each image is treated as a vector and we compute its principal components to extract features. We t a statistical model using the loadings of principal components to estimate material porosity, permeability,more » anisotropy, and tortuosity. We also propose an adaptive version of the structural similarity index (SSIM), a similarity metric for images, as a measure to determine the statistical representative elementary volumes (sREV) for persistence homology. Thus we provide a capability for making a statistical inference of the uid ow and transport properties of porous materials based on their geometry and connectivity.« less

  4. A statistically derived index for classifying East Coast fever reactions in cattle challenged with Theileria parva under experimental conditions.

    PubMed

    Rowlands, G J; Musoke, A J; Morzaria, S P; Nagda, S M; Ballingall, K T; McKeever, D J

    2000-04-01

    A statistically derived disease reaction index based on parasitological, clinical and haematological measurements observed in 309 5 to 8-month-old Boran cattle following laboratory challenge with Theileria parva is described. Principal component analysis was applied to 13 measures including first appearance of schizonts, first appearance of piroplasms and first occurrence of pyrexia, together with the duration and severity of these symptoms, and white blood cell count. The first principal component, which was based on approximately equal contributions of the 13 variables, provided the definition for the disease reaction index, defined on a scale of 0-10. As well as providing a more objective measure of the severity of the reaction, the continuous nature of the index score enables more powerful statistical analysis of the data compared with that which has been previously possible through clinically derived categories of non-, mild, moderate and severe reactions.

  5. 76 FR 27563 - Margin and Capital Requirements for Covered Swap Entities

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-05-11

    .... Board: Sean D. Campbell, Deputy Associate Director, Division of Research and Statistics, (202) 452-3761, Michael Gibson, Senior Associate Director, Division of Research and Statistics, (202) 452- 2495, or Jeremy..., DC 20429. FHFA: Robert Collender, Principal Policy Analyst, Office of Policy Analysis and Research...

  6. Climate drivers on malaria transmission in Arunachal Pradesh, India.

    PubMed

    Upadhyayula, Suryanaryana Murty; Mutheneni, Srinivasa Rao; Chenna, Sumana; Parasaram, Vaideesh; Kadiri, Madhusudhan Rao

    2015-01-01

    The present study was conducted during the years 2006 to 2012 and provides information on prevalence of malaria and its regulation with effect to various climatic factors in East Siang district of Arunachal Pradesh, India. Correlation analysis, Principal Component Analysis and Hotelling's T² statistics models are adopted to understand the effect of weather variables on malaria transmission. The epidemiological study shows that the prevalence of malaria is mostly caused by the parasite Plasmodium vivax followed by Plasmodium falciparum. It is noted that, the intensity of malaria cases declined gradually from the year 2006 to 2012. The transmission of malaria observed was more during the rainy season, as compared to summer and winter seasons. Further, the data analysis study with Principal Component Analysis and Hotelling's T² statistic has revealed that the climatic variables such as temperature and rainfall are the most influencing factors for the high rate of malaria transmission in East Siang district of Arunachal Pradesh.

  7. Principal component analysis and neurocomputing-based models for total ozone concentration over different urban regions of India

    NASA Astrophysics Data System (ADS)

    Chattopadhyay, Goutami; Chattopadhyay, Surajit; Chakraborthy, Parthasarathi

    2012-07-01

    The present study deals with daily total ozone concentration time series over four metro cities of India namely Kolkata, Mumbai, Chennai, and New Delhi in the multivariate environment. Using the Kaiser-Meyer-Olkin measure, it is established that the data set under consideration are suitable for principal component analysis. Subsequently, by introducing rotated component matrix for the principal components, the predictors suitable for generating artificial neural network (ANN) for daily total ozone prediction are identified. The multicollinearity is removed in this way. Models of ANN in the form of multilayer perceptron trained through backpropagation learning are generated for all of the study zones, and the model outcomes are assessed statistically. Measuring various statistics like Pearson correlation coefficients, Willmott's indices, percentage errors of prediction, and mean absolute errors, it is observed that for Mumbai and Kolkata the proposed ANN model generates very good predictions. The results are supported by the linearly distributed coordinates in the scatterplots.

  8. Time-oriented hierarchical method for computation of principal components using subspace learning algorithm.

    PubMed

    Jankovic, Marko; Ogawa, Hidemitsu

    2004-10-01

    Principal Component Analysis (PCA) and Principal Subspace Analysis (PSA) are classic techniques in statistical data analysis, feature extraction and data compression. Given a set of multivariate measurements, PCA and PSA provide a smaller set of "basis vectors" with less redundancy, and a subspace spanned by them, respectively. Artificial neurons and neural networks have been shown to perform PSA and PCA when gradient ascent (descent) learning rules are used, which is related to the constrained maximization (minimization) of statistical objective functions. Due to their low complexity, such algorithms and their implementation in neural networks are potentially useful in cases of tracking slow changes of correlations in the input data or in updating eigenvectors with new samples. In this paper we propose PCA learning algorithm that is fully homogeneous with respect to neurons. The algorithm is obtained by modification of one of the most famous PSA learning algorithms--Subspace Learning Algorithm (SLA). Modification of the algorithm is based on Time-Oriented Hierarchical Method (TOHM). The method uses two distinct time scales. On a faster time scale PSA algorithm is responsible for the "behavior" of all output neurons. On a slower scale, output neurons will compete for fulfillment of their "own interests". On this scale, basis vectors in the principal subspace are rotated toward the principal eigenvectors. At the end of the paper it will be briefly analyzed how (or why) time-oriented hierarchical method can be used for transformation of any of the existing neural network PSA method, into PCA method.

  9. Considering Horn's Parallel Analysis from a Random Matrix Theory Point of View.

    PubMed

    Saccenti, Edoardo; Timmerman, Marieke E

    2017-03-01

    Horn's parallel analysis is a widely used method for assessing the number of principal components and common factors. We discuss the theoretical foundations of parallel analysis for principal components based on a covariance matrix by making use of arguments from random matrix theory. In particular, we show that (i) for the first component, parallel analysis is an inferential method equivalent to the Tracy-Widom test, (ii) its use to test high-order eigenvalues is equivalent to the use of the joint distribution of the eigenvalues, and thus should be discouraged, and (iii) a formal test for higher-order components can be obtained based on a Tracy-Widom approximation. We illustrate the performance of the two testing procedures using simulated data generated under both a principal component model and a common factors model. For the principal component model, the Tracy-Widom test performs consistently in all conditions, while parallel analysis shows unpredictable behavior for higher-order components. For the common factor model, including major and minor factors, both procedures are heuristic approaches, with variable performance. We conclude that the Tracy-Widom procedure is preferred over parallel analysis for statistically testing the number of principal components based on a covariance matrix.

  10. Principals' Time, Tasks, and Professional Development: An Analysis of Schools and Staffing Survey Data. REL 2017-201

    ERIC Educational Resources Information Center

    Lavigne, Heather J.; Shakman, Karen; Zweig, Jacqueline; Greller, Sara L.

    2016-01-01

    This study describes how principals reported spending their time and what professional development they reported participating in, based on data collected through the Schools and Staffing Survey by the National Center for Education Statistics during the 2011/12 school year. The study analyzes schools by grade level, poverty level, and within…

  11. Analysis of satellite data on energetic particles of ionospheric origin

    NASA Technical Reports Server (NTRS)

    Sharp, R. D.; Johnson, R. G.; Shelley, E. G.

    1976-01-01

    The principal result of this program has been the completion of a detailed statistical study of the properties of precipitating O(+) and H(+) ions during two principal magnetic storms. The results of the analysis of selected data of ion mass spectrometer experiment on satellites are given with emphasis on the morphology of the O(+) ions of ionospheric origin with energies in the 0.7 les than or equal to E less than or equal to 12 keV range that were discovered with this experiment.

  12. Instructional leadership in elementary science: How are school leaders positioned to lead in a next generation science standards era?

    NASA Astrophysics Data System (ADS)

    Winn, Kathleen Mary

    The Next Generation Science Standards (NGSS) are the newest K-12 science content standards created by a coalition of educators, scientists, and researchers available for adoption by states and schools. Principals are important actors during policy implementation especially since principals are charged with assuming the role of an instructional leader for their teachers in all subject areas. Science poses a unique challenge to the elementary curricular landscape because traditionally, elementary teachers report low levels of self-efficacy in the subject. Support in this area therefore becomes important for a successful integration of a new science education agenda. This study analyzed self-reported survey data from public elementary principals (N=667) to address the following three research questions: (1) What type of science backgrounds do elementary principals have? (2) What indicators predict if elementary principals will engage in instructional leadership behaviors in science? (3) Does self-efficacy mediate the relationship between science background and a capacity for instructional leadership in science? The survey data were analyzed quantitatively. Descriptive statistics address the first research question and inferential statistics (hierarchal regression analysis and a mediation analysis) answer the second and third research questions.The sample data show that about 21% of elementary principals have a formal science degree and 26% have a degree in a STEM field. Most principals have not had recent experience teaching science, nor were they every exclusively a science teacher. The analyses suggests that demographic, experiential, and self-efficacy variables predict instructional leadership practices in science.

  13. Priority of VHS Development Based in Potential Area using Principal Component Analysis

    NASA Astrophysics Data System (ADS)

    Meirawan, D.; Ana, A.; Saripudin, S.

    2018-02-01

    The current condition of VHS is still inadequate in quality, quantity and relevance. The purpose of this research is to analyse the development of VHS based on the development of regional potential by using principal component analysis (PCA) in Bandung, Indonesia. This study used descriptive qualitative data analysis using the principle of secondary data reduction component. The method used is Principal Component Analysis (PCA) analysis with Minitab Statistics Software tool. The results of this study indicate the value of the lowest requirement is a priority of the construction of development VHS with a program of majors in accordance with the development of regional potential. Based on the PCA score found that the main priority in the development of VHS in Bandung is in Saguling, which has the lowest PCA value of 416.92 in area 1, Cihampelas with the lowest PCA value in region 2 and Padalarang with the lowest PCA value.

  14. ASCS online fault detection and isolation based on an improved MPCA

    NASA Astrophysics Data System (ADS)

    Peng, Jianxin; Liu, Haiou; Hu, Yuhui; Xi, Junqiang; Chen, Huiyan

    2014-09-01

    Multi-way principal component analysis (MPCA) has received considerable attention and been widely used in process monitoring. A traditional MPCA algorithm unfolds multiple batches of historical data into a two-dimensional matrix and cut the matrix along the time axis to form subspaces. However, low efficiency of subspaces and difficult fault isolation are the common disadvantages for the principal component model. This paper presents a new subspace construction method based on kernel density estimation function that can effectively reduce the storage amount of the subspace information. The MPCA model and the knowledge base are built based on the new subspace. Then, fault detection and isolation with the squared prediction error (SPE) statistic and the Hotelling ( T 2) statistic are also realized in process monitoring. When a fault occurs, fault isolation based on the SPE statistic is achieved by residual contribution analysis of different variables. For fault isolation of subspace based on the T 2 statistic, the relationship between the statistic indicator and state variables is constructed, and the constraint conditions are presented to check the validity of fault isolation. Then, to improve the robustness of fault isolation to unexpected disturbances, the statistic method is adopted to set the relation between single subspace and multiple subspaces to increase the corrective rate of fault isolation. Finally fault detection and isolation based on the improved MPCA is used to monitor the automatic shift control system (ASCS) to prove the correctness and effectiveness of the algorithm. The research proposes a new subspace construction method to reduce the required storage capacity and to prove the robustness of the principal component model, and sets the relationship between the state variables and fault detection indicators for fault isolation.

  15. [A study of Boletus bicolor from different areas using Fourier transform infrared spectrometry].

    PubMed

    Zhou, Zai-Jin; Liu, Gang; Ren, Xian-Pei

    2010-04-01

    It is hard to differentiate the same species of wild growing mushrooms from different areas by macromorphological features. In this paper, Fourier transform infrared (FTIR) spectroscopy combined with principal component analysis was used to identify 58 samples of boletus bicolor from five different areas. Based on the fingerprint infrared spectrum of boletus bicolor samples, principal component analysis was conducted on 58 boletus bicolor spectra in the range of 1 350-750 cm(-1) using the statistical software SPSS 13.0. According to the result, the accumulated contributing ratio of the first three principal components accounts for 88.87%. They included almost all the information of samples. The two-dimensional projection plot using first and second principal component is a satisfactory clustering effect for the classification and discrimination of boletus bicolor. All boletus bicolor samples were divided into five groups with a classification accuracy of 98.3%. The study demonstrated that wild growing boletus bicolor at species level from different areas can be identified by FTIR spectra combined with principal components analysis.

  16. Principal Component Analysis in the Spectral Analysis of the Dynamic Laser Speckle Patterns

    NASA Astrophysics Data System (ADS)

    Ribeiro, K. M.; Braga, R. A., Jr.; Horgan, G. W.; Ferreira, D. D.; Safadi, T.

    2014-02-01

    Dynamic laser speckle is a phenomenon that interprets an optical patterns formed by illuminating a surface under changes with coherent light. Therefore, the dynamic change of the speckle patterns caused by biological material is known as biospeckle. Usually, these patterns of optical interference evolving in time are analyzed by graphical or numerical methods, and the analysis in frequency domain has also been an option, however involving large computational requirements which demands new approaches to filter the images in time. Principal component analysis (PCA) works with the statistical decorrelation of data and it can be used as a data filtering. In this context, the present work evaluated the PCA technique to filter in time the data from the biospeckle images aiming the reduction of time computer consuming and improving the robustness of the filtering. It was used 64 images of biospeckle in time observed in a maize seed. The images were arranged in a data matrix and statistically uncorrelated by PCA technique, and the reconstructed signals were analyzed using the routine graphical and numerical methods to analyze the biospeckle. Results showed the potential of the PCA tool in filtering the dynamic laser speckle data, with the definition of markers of principal components related to the biological phenomena and with the advantage of fast computational processing.

  17. Sparse approximation of currents for statistics on curves and surfaces.

    PubMed

    Durrleman, Stanley; Pennec, Xavier; Trouvé, Alain; Ayache, Nicholas

    2008-01-01

    Computing, processing, visualizing statistics on shapes like curves or surfaces is a real challenge with many applications ranging from medical image analysis to computational geometry. Modelling such geometrical primitives with currents avoids feature-based approach as well as point-correspondence method. This framework has been proved to be powerful to register brain surfaces or to measure geometrical invariants. However, if the state-of-the-art methods perform efficiently pairwise registrations, new numerical schemes are required to process groupwise statistics due to an increasing complexity when the size of the database is growing. Statistics such as mean and principal modes of a set of shapes often have a heavy and highly redundant representation. We propose therefore to find an adapted basis on which mean and principal modes have a sparse decomposition. Besides the computational improvement, this sparse representation offers a way to visualize and interpret statistics on currents. Experiments show the relevance of the approach on 34 sets of 70 sulcal lines and on 50 sets of 10 meshes of deep brain structures.

  18. Proposal for a biometrics of the cortical surface: a statistical method for relative surface distance metrics

    NASA Astrophysics Data System (ADS)

    Bookstein, Fred L.

    1995-08-01

    Recent advances in computational geometry have greatly extended the range of neuroanatomical questions that can be approached by rigorous quantitative methods. One of the major current challenges in this area is to describe the variability of human cortical surface form and its implications for individual differences in neurophysiological functioning. Existing techniques for representation of stochastically invaginated surfaces do not conduce to the necessary parametric statistical summaries. In this paper, following a hint from David Van Essen and Heather Drury, I sketch a statistical method customized for the constraints of this complex data type. Cortical surface form is represented by its Riemannian metric tensor and averaged according to parameters of a smooth averaged surface. Sulci are represented by integral trajectories of the smaller principal strains of this metric, and their statistics follow the statistics of that relative metric. The diagrams visualizing this tensor analysis look like alligator leather but summarize all aspects of cortical surface form in between the principal sulci, the reliable ones; no flattening is required.

  19. Obscure phenomena in statistical analysis of quantitative structure-activity relationships. Part 1: Multicollinearity of physicochemical descriptors.

    PubMed

    Mager, P P; Rothe, H

    1990-10-01

    Multicollinearity of physicochemical descriptors leads to serious consequences in quantitative structure-activity relationship (QSAR) analysis, such as incorrect estimators and test statistics of regression coefficients of the ordinary least-squares (OLS) model applied usually to QSARs. Beside the diagnosis of the known simple collinearity, principal component regression analysis (PCRA) also allows the diagnosis of various types of multicollinearity. Only if the absolute values of PCRA estimators are order statistics that decrease monotonically, the effects of multicollinearity can be circumvented. Otherwise, obscure phenomena may be observed, such as good data recognition but low predictive model power of a QSAR model.

  20. Towards Solving the Mixing Problem in the Decomposition of Geophysical Time Series by Independent Component Analysis

    NASA Technical Reports Server (NTRS)

    Aires, Filipe; Rossow, William B.; Chedin, Alain; Hansen, James E. (Technical Monitor)

    2000-01-01

    The use of the Principal Component Analysis technique for the analysis of geophysical time series has been questioned in particular for its tendency to extract components that mix several physical phenomena even when the signal is just their linear sum. We demonstrate with a data simulation experiment that the Independent Component Analysis, a recently developed technique, is able to solve this problem. This new technique requires the statistical independence of components, a stronger constraint, that uses higher-order statistics, instead of the classical decorrelation a weaker constraint, that uses only second-order statistics. Furthermore, ICA does not require additional a priori information such as the localization constraint used in Rotational Techniques.

  1. 75 FR 61136 - Notice of Proposed Information Collection Requests

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-10-04

    ... EdFacts data as well as data from surveys of school principals and special education designees about their school improvement practices. The study will use descriptive statistics and regression analysis to...

  2. On the Use of Principal Component and Spectral Density Analysis to Evaluate the Community Multiscale Air Quality (CMAQ) Model

    EPA Science Inventory

    A 5 year (2002-2006) simulation of CMAQ covering the eastern United States is evaluated using principle component analysis in order to identify and characterize statistically significant patterns of model bias. Such analysis is useful in that in can identify areas of poor model ...

  3. Regional Morphology Analysis Package (RMAP): Empirical Orthogonal Function Analysis, Background and Examples

    DTIC Science & Technology

    2007-10-01

    1984. Complex principal component analysis : Theory and examples. Journal of Climate and Applied Meteorology 23: 1660-1673. Hotelling, H. 1933...Sediments 99. ASCE: 2,566-2,581. Von Storch, H., and A. Navarra. 1995. Analysis of climate variability. Applications of statistical techniques. Berlin...ERDC TN-SWWRP-07-9 October 2007 Regional Morphology Empirical Analysis Package (RMAP): Orthogonal Function Analysis , Background and Examples by

  4. Evaluation of Meterorite Amono Acid Analysis Data Using Multivariate Techniques

    NASA Technical Reports Server (NTRS)

    McDonald, G.; Storrie-Lombardi, M.; Nealson, K.

    1999-01-01

    The amino acid distributions in the Murchison carbonaceous chondrite, Mars meteorite ALH84001, and ice from the Allan Hills region of Antarctica are shown, using a multivariate technique known as Principal Component Analysis (PCA), to be statistically distinct from the average amino acid compostion of 101 terrestrial protein superfamilies.

  5. A statistical evaluation of spectral fingerprinting methods using analysis of variance and principal component analysis

    USDA-ARS?s Scientific Manuscript database

    Six methods were compared with respect to spectral fingerprinting of a well-characterized series of broccoli samples. Spectral fingerprints were acquired for finely-powdered solid samples using Fourier transform-infrared (IR) and Fourier transform-near infrared (NIR) spectrometry and for aqueous met...

  6. Multivariate Statistical Analysis: a tool for groundwater quality assessment in the hidrogeologic region of the Ring of Cenotes, Yucatan, Mexico.

    NASA Astrophysics Data System (ADS)

    Ye, M.; Pacheco Castro, R. B.; Pacheco Avila, J.; Cabrera Sansores, A.

    2014-12-01

    The karstic aquifer of Yucatan is a vulnerable and complex system. The first fifteen meters of this aquifer have been polluted, due to this the protection of this resource is important because is the only source of potable water of the entire State. Through the assessment of groundwater quality we can gain some knowledge about the main processes governing water chemistry as well as spatial patterns which are important to establish protection zones. In this work multivariate statistical techniques are used to assess the groundwater quality of the supply wells (30 to 40 meters deep) in the hidrogeologic region of the Ring of Cenotes, located in Yucatan, Mexico. Cluster analysis and principal component analysis are applied in groundwater chemistry data of the study area. Results of principal component analysis show that the main sources of variation in the data are due sea water intrusion and the interaction of the water with the carbonate rocks of the system and some pollution processes. The cluster analysis shows that the data can be divided in four clusters. The spatial distribution of the clusters seems to be random, but is consistent with sea water intrusion and pollution with nitrates. The overall results show that multivariate statistical analysis can be successfully applied in the groundwater quality assessment of this karstic aquifer.

  7. Relevant principal component analysis applied to the characterisation of Portuguese heather honey.

    PubMed

    Martins, Rui C; Lopes, Victor V; Valentão, Patrícia; Carvalho, João C M F; Isabel, Paulo; Amaral, Maria T; Batista, Maria T; Andrade, Paula B; Silva, Branca M

    2008-01-01

    The main purpose of this study was the characterisation of 'Serra da Lousã' heather honey by using novel statistical methodology, relevant principal component analysis, in order to assess the correlations between production year, locality and composition. Herein, we also report its chemical composition in terms of sugars, glycerol and ethanol, and physicochemical parameters. Sugars profiles from 'Serra da Lousã' heather and 'Terra Quente de Trás-os-Montes' lavender honeys were compared and allowed the discrimination: 'Serra da Lousã' honeys do not contain sucrose, generally exhibit lower contents of turanose, trehalose and maltose and higher contents of fructose and glucose. Different localities from 'Serra da Lousã' provided groups of samples with high and low glycerol contents. Glycerol and ethanol contents were revealed to be independent of the sugars profiles. These data and statistical models can be very useful in the comparison and detection of adulterations during the quality control analysis of 'Serra da Lousã' honey.

  8. Joint principal trend analysis for longitudinal high-dimensional data.

    PubMed

    Zhang, Yuping; Ouyang, Zhengqing

    2018-06-01

    We consider a research scenario motivated by integrating multiple sources of information for better knowledge discovery in diverse dynamic biological processes. Given two longitudinal high-dimensional datasets for a group of subjects, we want to extract shared latent trends and identify relevant features. To solve this problem, we present a new statistical method named as joint principal trend analysis (JPTA). We demonstrate the utility of JPTA through simulations and applications to gene expression data of the mammalian cell cycle and longitudinal transcriptional profiling data in response to influenza viral infections. © 2017, The International Biometric Society.

  9. Statistical tables and charts showing geochemical variation in the Mesoproterozoic Big Creek, Apple Creek, and Gunsight formations, Lemhi group, Salmon River Mountains and Lemhi Range, central Idaho

    USGS Publications Warehouse

    Lindsey, David A.; Tysdal, Russell G.; Taggart, Joseph E.

    2002-01-01

    The principal purpose of this report is to provide a reference archive for results of a statistical analysis of geochemical data for metasedimentary rocks of Mesoproterozoic age of the Salmon River Mountains and Lemhi Range, central Idaho. Descriptions of geochemical data sets, statistical methods, rationale for interpretations, and references to the literature are provided. Three methods of analysis are used: R-mode factor analysis of major oxide and trace element data for identifying petrochemical processes, analysis of variance for effects of rock type and stratigraphic position on chemical composition, and major-oxide ratio plots for comparison with the chemical composition of common clastic sedimentary rocks.

  10. Independent component analysis for automatic note extraction from musical trills

    NASA Astrophysics Data System (ADS)

    Brown, Judith C.; Smaragdis, Paris

    2004-05-01

    The method of principal component analysis, which is based on second-order statistics (or linear independence), has long been used for redundancy reduction of audio data. The more recent technique of independent component analysis, enforcing much stricter statistical criteria based on higher-order statistical independence, is introduced and shown to be far superior in separating independent musical sources. This theory has been applied to piano trills and a database of trill rates was assembled from experiments with a computer-driven piano, recordings of a professional pianist, and commercially available compact disks. The method of independent component analysis has thus been shown to be an outstanding, effective means of automatically extracting interesting musical information from a sea of redundant data.

  11. Source Evaluation and Trace Metal Contamination in Benthic Sediments from Equatorial Ecosystems Using Multivariate Statistical Techniques

    PubMed Central

    Benson, Nsikak U.; Asuquo, Francis E.; Williams, Akan B.; Essien, Joseph P.; Ekong, Cyril I.; Akpabio, Otobong; Olajire, Abaas A.

    2016-01-01

    Trace metals (Cd, Cr, Cu, Ni and Pb) concentrations in benthic sediments were analyzed through multi-step fractionation scheme to assess the levels and sources of contamination in estuarine, riverine and freshwater ecosystems in Niger Delta (Nigeria). The degree of contamination was assessed using the individual contamination factors (ICF) and global contamination factor (GCF). Multivariate statistical approaches including principal component analysis (PCA), cluster analysis and correlation test were employed to evaluate the interrelationships and associated sources of contamination. The spatial distribution of metal concentrations followed the pattern Pb>Cu>Cr>Cd>Ni. Ecological risk index by ICF showed significant potential mobility and bioavailability for Cu, Cu and Ni. The ICF contamination trend in the benthic sediments at all studied sites was Cu>Cr>Ni>Cd>Pb. The principal component and agglomerative clustering analyses indicate that trace metals contamination in the ecosystems was influenced by multiple pollution sources. PMID:27257934

  12. Online signature recognition using principal component analysis and artificial neural network

    NASA Astrophysics Data System (ADS)

    Hwang, Seung-Jun; Park, Seung-Je; Baek, Joong-Hwan

    2016-12-01

    In this paper, we propose an algorithm for on-line signature recognition using fingertip point in the air from the depth image acquired by Kinect. We extract 10 statistical features from X, Y, Z axis, which are invariant to changes in shifting and scaling of the signature trajectories in three-dimensional space. Artificial neural network is adopted to solve the complex signature classification problem. 30 dimensional features are converted into 10 principal components using principal component analysis, which is 99.02% of total variances. We implement the proposed algorithm and test to actual on-line signatures. In experiment, we verify the proposed method is successful to classify 15 different on-line signatures. Experimental result shows 98.47% of recognition rate when using only 10 feature vectors.

  13. Information extraction from multivariate images

    NASA Technical Reports Server (NTRS)

    Park, S. K.; Kegley, K. A.; Schiess, J. R.

    1986-01-01

    An overview of several multivariate image processing techniques is presented, with emphasis on techniques based upon the principal component transformation (PCT). Multiimages in various formats have a multivariate pixel value, associated with each pixel location, which has been scaled and quantized into a gray level vector, and the bivariate of the extent to which two images are correlated. The PCT of a multiimage decorrelates the multiimage to reduce its dimensionality and reveal its intercomponent dependencies if some off-diagonal elements are not small, and for the purposes of display the principal component images must be postprocessed into multiimage format. The principal component analysis of a multiimage is a statistical analysis based upon the PCT whose primary application is to determine the intrinsic component dimensionality of the multiimage. Computational considerations are also discussed.

  14. Application of principal component regression and partial least squares regression in ultraviolet spectrum water quality detection

    NASA Astrophysics Data System (ADS)

    Li, Jiangtong; Luo, Yongdao; Dai, Honglin

    2018-01-01

    Water is the source of life and the essential foundation of all life. With the development of industrialization, the phenomenon of water pollution is becoming more and more frequent, which directly affects the survival and development of human. Water quality detection is one of the necessary measures to protect water resources. Ultraviolet (UV) spectral analysis is an important research method in the field of water quality detection, which partial least squares regression (PLSR) analysis method is becoming predominant technology, however, in some special cases, PLSR's analysis produce considerable errors. In order to solve this problem, the traditional principal component regression (PCR) analysis method was improved by using the principle of PLSR in this paper. The experimental results show that for some special experimental data set, improved PCR analysis method performance is better than PLSR. The PCR and PLSR is the focus of this paper. Firstly, the principal component analysis (PCA) is performed by MATLAB to reduce the dimensionality of the spectral data; on the basis of a large number of experiments, the optimized principal component is extracted by using the principle of PLSR, which carries most of the original data information. Secondly, the linear regression analysis of the principal component is carried out with statistic package for social science (SPSS), which the coefficients and relations of principal components can be obtained. Finally, calculating a same water spectral data set by PLSR and improved PCR, analyzing and comparing two results, improved PCR and PLSR is similar for most data, but improved PCR is better than PLSR for data near the detection limit. Both PLSR and improved PCR can be used in Ultraviolet spectral analysis of water, but for data near the detection limit, improved PCR's result better than PLSR.

  15. Inquiring the Most Critical Teacher's Technology Education Competences in the Highest Efficient Technology Education Learning Organization

    ERIC Educational Resources Information Center

    Yung-Kuan, Chan; Hsieh, Ming-Yuan; Lee, Chin-Feng; Huang, Chih-Cheng; Ho, Li-Chih

    2017-01-01

    Under the hyper-dynamic education situation, this research, in order to comprehensively explore the interplays between Teacher Competence Demands (TCD) and Learning Organization Requests (LOR), cross-employs the data refined method of Descriptive Statistics (DS) method and Analysis of Variance (ANOVA) and Principal Components Analysis (PCA)…

  16. Principle Component Analysis with Incomplete Data: A simulation of R pcaMethods package in Constructing an Environmental Quality Index with Missing Data

    EPA Science Inventory

    Missing data is a common problem in the application of statistical techniques. In principal component analysis (PCA), a technique for dimensionality reduction, incomplete data points are either discarded or imputed using interpolation methods. Such approaches are less valid when ...

  17. Proceedings of the NASA Symposium on Mathematical Pattern Recognition and Image Analysis

    NASA Technical Reports Server (NTRS)

    Guseman, L. F., Jr.

    1983-01-01

    The application of mathematical and statistical analyses techniques to imagery obtained by remote sensors is described by Principal Investigators. Scene-to-map registration, geometric rectification, and image matching are among the pattern recognition aspects discussed.

  18. Statistical interpretation of chromatic indicators in correlation to phytochemical profile of a sulfur dioxide-free mulberry (Morus nigra) wine submitted to non-thermal maturation processes.

    PubMed

    Tchabo, William; Ma, Yongkun; Kwaw, Emmanuel; Zhang, Haining; Xiao, Lulu; Apaliya, Maurice T

    2018-01-15

    The four different methods of color measurement of wine proposed by Boulton, Giusti, Glories and Commission International de l'Eclairage (CIE) were applied to assess the statistical relationship between the phytochemical profile and chromatic characteristics of sulfur dioxide-free mulberry (Morus nigra) wine submitted to non-thermal maturation processes. The alteration in chromatic properties and phenolic composition of non-thermal aged mulberry wine were examined, aided by the used of Pearson correlation, cluster and principal component analysis. The results revealed a positive effect of non-thermal processes on phytochemical families of wines. From Pearson correlation analysis relationships between chromatic indexes and flavonols as well as anthocyanins were established. Cluster analysis highlighted similarities between Boulton and Giusti parameters, as well as Glories and CIE parameters in the assessment of chromatic properties of wines. Finally, principal component analysis was able to discriminate wines subjected to different maturation techniques on the basis of their chromatic and phenolics characteristics. Copyright © 2017. Published by Elsevier Ltd.

  19. Chemometric and multivariate statistical analysis of time-of-flight secondary ion mass spectrometry spectra from complex Cu-Fe sulfides.

    PubMed

    Kalegowda, Yogesh; Harmer, Sarah L

    2012-03-20

    Time-of-flight secondary ion mass spectrometry (TOF-SIMS) spectra of mineral samples are complex, comprised of large mass ranges and many peaks. Consequently, characterization and classification analysis of these systems is challenging. In this study, different chemometric and statistical data evaluation methods, based on monolayer sensitive TOF-SIMS data, have been tested for the characterization and classification of copper-iron sulfide minerals (chalcopyrite, chalcocite, bornite, and pyrite) at different flotation pulp conditions (feed, conditioned feed, and Eh modified). The complex mass spectral data sets were analyzed using the following chemometric and statistical techniques: principal component analysis (PCA); principal component-discriminant functional analysis (PC-DFA); soft independent modeling of class analogy (SIMCA); and k-Nearest Neighbor (k-NN) classification. PCA was found to be an important first step in multivariate analysis, providing insight into both the relative grouping of samples and the elemental/molecular basis for those groupings. For samples exposed to oxidative conditions (at Eh ~430 mV), each technique (PCA, PC-DFA, SIMCA, and k-NN) was found to produce excellent classification. For samples at reductive conditions (at Eh ~ -200 mV SHE), k-NN and SIMCA produced the most accurate classification. Phase identification of particles that contain the same elements but a different crystal structure in a mixed multimetal mineral system has been achieved.

  20. The Job Dimensions Underlying the Job Elements of the Position Analysis Questionnaire (PAQ) (Form B). Report No. 4.

    ERIC Educational Resources Information Center

    Marquardt, Lloyd D.; McCormick, Ernest J.

    This study was concerned with the identification of the job dimension underlying the job elements of the Position Analysis Questionnaire (PAQ), Form B. The PAQ is a structured job analysis instrument consisting of 187 worker-oriented job elements which are divided into six a priori major divisions. The statistical procedure of principal components…

  1. Application of multivariable statistical techniques in plant-wide WWTP control strategies analysis.

    PubMed

    Flores, X; Comas, J; Roda, I R; Jiménez, L; Gernaey, K V

    2007-01-01

    The main objective of this paper is to present the application of selected multivariable statistical techniques in plant-wide wastewater treatment plant (WWTP) control strategies analysis. In this study, cluster analysis (CA), principal component analysis/factor analysis (PCA/FA) and discriminant analysis (DA) are applied to the evaluation matrix data set obtained by simulation of several control strategies applied to the plant-wide IWA Benchmark Simulation Model No 2 (BSM2). These techniques allow i) to determine natural groups or clusters of control strategies with a similar behaviour, ii) to find and interpret hidden, complex and casual relation features in the data set and iii) to identify important discriminant variables within the groups found by the cluster analysis. This study illustrates the usefulness of multivariable statistical techniques for both analysis and interpretation of the complex multicriteria data sets and allows an improved use of information for effective evaluation of control strategies.

  2. Incorporating principal component analysis into air quality model evaluation

    EPA Science Inventory

    The efficacy of standard air quality model evaluation techniques is becoming compromised as the simulation periods continue to lengthen in response to ever increasing computing capacity. Accordingly, the purpose of this paper is to demonstrate a statistical approach called Princi...

  3. Dimensionality of Community Satisfaction.

    ERIC Educational Resources Information Center

    Williams, R. Gary; Knop, Edward

    Using factor analysis (both principal factor solutions and rotated factor solutions) to search for underlying statistical commonality among 12 indicators of community satisfaction in 6 Colorado communities, the research explores various dimensions of community satisfaction that may be important across different communities. The communities under…

  4. Investigating elementary principals' science beliefs and knowledge and its relationship to students' science outcomes

    NASA Astrophysics Data System (ADS)

    Khan, Uzma Zafar

    The aim of this quantitative study was to investigate elementary principals' beliefs about reformed science teaching and learning, science subject matter knowledge, and how these factors relate to fourth grade students' superior science outcomes. Online survey methodology was used for data collection and included a demographic questionnaire and two survey instruments: the K-4 Physical Science Misconceptions Oriented Science Assessment Resources for Teachers (MOSART) and the Beliefs About Reformed Science Teaching and Learning (BARSTL). Hierarchical multiple regression analysis was used to assess the separate and collective contributions of background variables such as principals' personal and school characteristics, principals' science teaching and learning beliefs, and principals' science knowledge on students' superior science outcomes. Mediation analysis was also used to explore whether principals' science knowledge mediated the relationship between their beliefs about science teaching and learning and students' science outcomes. Findings indicated that principals' science beliefs and knowledge do not contribute to predicting students' superior science scores. Fifty-two percent of the variance in percentage of students with superior science scores was explained by school characteristics with free or reduced price lunch and school type as the only significant individual predictors. Furthermore, principals' science knowledge did not mediate the relationship between their science beliefs and students' science outcomes. There was no statistically significant variation among the variables. The data failed to support the proposed mediation model of the study. Implications for future research are discussed.

  5. Differential principal component analysis of ChIP-seq.

    PubMed

    Ji, Hongkai; Li, Xia; Wang, Qian-fei; Ning, Yang

    2013-04-23

    We propose differential principal component analysis (dPCA) for analyzing multiple ChIP-sequencing datasets to identify differential protein-DNA interactions between two biological conditions. dPCA integrates unsupervised pattern discovery, dimension reduction, and statistical inference into a single framework. It uses a small number of principal components to summarize concisely the major multiprotein synergistic differential patterns between the two conditions. For each pattern, it detects and prioritizes differential genomic loci by comparing the between-condition differences with the within-condition variation among replicate samples. dPCA provides a unique tool for efficiently analyzing large amounts of ChIP-sequencing data to study dynamic changes of gene regulation across different biological conditions. We demonstrate this approach through analyses of differential chromatin patterns at transcription factor binding sites and promoters as well as allele-specific protein-DNA interactions.

  6. PM10 and gaseous pollutants trends from air quality monitoring networks in Bari province: principal component analysis and absolute principal component scores on a two years and half data set

    PubMed Central

    2014-01-01

    Background The chemical composition of aerosols and particle size distributions are the most significant factors affecting air quality. In particular, the exposure to finer particles can cause short and long-term effects on human health. In the present paper PM10 (particulate matter with aerodynamic diameter lower than 10 μm), CO, NOx (NO and NO2), Benzene and Toluene trends monitored in six monitoring stations of Bari province are shown. The data set used was composed by bi-hourly means for all parameters (12 bi-hourly means per day for each parameter) and it’s referred to the period of time from January 2005 and May 2007. The main aim of the paper is to provide a clear illustration of how large data sets from monitoring stations can give information about the number and nature of the pollutant sources, and mainly to assess the contribution of the traffic source to PM10 concentration level by using multivariate statistical techniques such as Principal Component Analysis (PCA) and Absolute Principal Component Scores (APCS). Results Comparing the night and day mean concentrations (per day) for each parameter it has been pointed out that there is a different night and day behavior for some parameters such as CO, Benzene and Toluene than PM10. This suggests that CO, Benzene and Toluene concentrations are mainly connected with transport systems, whereas PM10 is mostly influenced by different factors. The statistical techniques identified three recurrent sources, associated with vehicular traffic and particulate transport, covering over 90% of variance. The contemporaneous analysis of gas and PM10 has allowed underlining the differences between the sources of these pollutants. Conclusions The analysis of the pollutant trends from large data set and the application of multivariate statistical techniques such as PCA and APCS can give useful information about air quality and pollutant’s sources. These knowledge can provide useful advices to environmental policies in order to reach the WHO recommended levels. PMID:24555534

  7. SIMREL: Software for Coefficient Alpha and Its Confidence Intervals with Monte Carlo Studies

    ERIC Educational Resources Information Center

    Yurdugul, Halil

    2009-01-01

    This article describes SIMREL, a software program designed for the simulation of alpha coefficients and the estimation of its confidence intervals. SIMREL runs on two alternatives. In the first one, if SIMREL is run for a single data file, it performs descriptive statistics, principal components analysis, and variance analysis of the item scores…

  8. Understanding the Relationship between School-Based Management, Emotional Intelligence and Performance of Religious Upper Secondary School Principals in Banten Province

    ERIC Educational Resources Information Center

    Muslihah, Oleh Eneng

    2015-01-01

    The research examines the correlation between the understanding of school-based management, emotional intelligences and headmaster performance. Data was collected, using quantitative methods. The statistical analysis used was the Pearson Correlation, and multivariate regression analysis. The results of this research suggest firstly that there is…

  9. Correcting for population structure and kinship using the linear mixed model: theory and extensions.

    PubMed

    Hoffman, Gabriel E

    2013-01-01

    Population structure and kinship are widespread confounding factors in genome-wide association studies (GWAS). It has been standard practice to include principal components of the genotypes in a regression model in order to account for population structure. More recently, the linear mixed model (LMM) has emerged as a powerful method for simultaneously accounting for population structure and kinship. The statistical theory underlying the differences in empirical performance between modeling principal components as fixed versus random effects has not been thoroughly examined. We undertake an analysis to formalize the relationship between these widely used methods and elucidate the statistical properties of each. Moreover, we introduce a new statistic, effective degrees of freedom, that serves as a metric of model complexity and a novel low rank linear mixed model (LRLMM) to learn the dimensionality of the correction for population structure and kinship, and we assess its performance through simulations. A comparison of the results of LRLMM and a standard LMM analysis applied to GWAS data from the Multi-Ethnic Study of Atherosclerosis (MESA) illustrates how our theoretical results translate into empirical properties of the mixed model. Finally, the analysis demonstrates the ability of the LRLMM to substantially boost the strength of an association for HDL cholesterol in Europeans.

  10. MORPHOLOGICAL VARIATION IN HATCHLING AMERICAN ALLIGATORS (ALLIGATOR MISSISSIPPIENSIS) FROM THREE FLORIDA LAKES

    EPA Science Inventory

    Morphological variation of 508 hatchling alligators from three lakes in north central Florida (Lakes Woodruff, Apopka, and Orange) was analyzed using multivariate statistics. Morphological variation was found among clutches as well as among lakes. Principal components analysis wa...

  11. SOME STATISTICAL TOOLS FOR EVALUATING COMPUTER SIMULATIONS: A DATA ANALYSIS. (R825381)

    EPA Science Inventory

    The perspectives, information and conclusions conveyed in research project abstracts, progress reports, final reports, journal abstracts and journal publications convey the viewpoints of the principal investigator and may not represent the views and policies of ORD and EPA. Concl...

  12. Comparative study on fast classification of brick samples by combination of principal component analysis and linear discriminant analysis using stand-off and table-top laser-induced breakdown spectroscopy

    NASA Astrophysics Data System (ADS)

    Vítková, Gabriela; Prokeš, Lubomír; Novotný, Karel; Pořízka, Pavel; Novotný, Jan; Všianský, Dalibor; Čelko, Ladislav; Kaiser, Jozef

    2014-11-01

    Focusing on historical aspect, during archeological excavation or restoration works of buildings or different structures built from bricks it is important to determine, preferably in-situ and in real-time, the locality of bricks origin. Fast classification of bricks on the base of Laser-Induced Breakdown Spectroscopy (LIBS) spectra is possible using multivariate statistical methods. Combination of principal component analysis (PCA) and linear discriminant analysis (LDA) was applied in this case. LIBS was used to classify altogether the 29 brick samples from 7 different localities. Realizing comparative study using two different LIBS setups - stand-off and table-top it is shown that stand-off LIBS has a big potential for archeological in-field measurements.

  13. A novel principal component analysis for spatially misaligned multivariate air pollution data.

    PubMed

    Jandarov, Roman A; Sheppard, Lianne A; Sampson, Paul D; Szpiro, Adam A

    2017-01-01

    We propose novel methods for predictive (sparse) PCA with spatially misaligned data. These methods identify principal component loading vectors that explain as much variability in the observed data as possible, while also ensuring the corresponding principal component scores can be predicted accurately by means of spatial statistics at locations where air pollution measurements are not available. This will make it possible to identify important mixtures of air pollutants and to quantify their health effects in cohort studies, where currently available methods cannot be used. We demonstrate the utility of predictive (sparse) PCA in simulated data and apply the approach to annual averages of particulate matter speciation data from national Environmental Protection Agency (EPA) regulatory monitors.

  14. Principal component analysis of the cytokine and chemokine response to human traumatic brain injury.

    PubMed

    Helmy, Adel; Antoniades, Chrystalina A; Guilfoyle, Mathew R; Carpenter, Keri L H; Hutchinson, Peter J

    2012-01-01

    There is a growing realisation that neuro-inflammation plays a fundamental role in the pathology of Traumatic Brain Injury (TBI). This has led to the search for biomarkers that reflect these underlying inflammatory processes using techniques such as cerebral microdialysis. The interpretation of such biomarker data has been limited by the statistical methods used. When analysing data of this sort the multiple putative interactions between mediators need to be considered as well as the timing of production and high degree of statistical co-variance in levels of these mediators. Here we present a cytokine and chemokine dataset from human brain following human traumatic brain injury and use principal component analysis and partial least squares discriminant analysis to demonstrate the pattern of production following TBI, distinct phases of the humoral inflammatory response and the differing patterns of response in brain and in peripheral blood. This technique has the added advantage of making no assumptions about the Relative Recovery (RR) of microdialysis derived parameters. Taken together these techniques can be used in complex microdialysis datasets to summarise the data succinctly and generate hypotheses for future study.

  15. Groundwater quality assessment of urban Bengaluru using multivariate statistical techniques

    NASA Astrophysics Data System (ADS)

    Gulgundi, Mohammad Shahid; Shetty, Amba

    2018-03-01

    Groundwater quality deterioration due to anthropogenic activities has become a subject of prime concern. The objective of the study was to assess the spatial and temporal variations in groundwater quality and to identify the sources in the western half of the Bengaluru city using multivariate statistical techniques. Water quality index rating was calculated for pre and post monsoon seasons to quantify overall water quality for human consumption. The post-monsoon samples show signs of poor quality in drinking purpose compared to pre-monsoon. Cluster analysis (CA), principal component analysis (PCA) and discriminant analysis (DA) were applied to the groundwater quality data measured on 14 parameters from 67 sites distributed across the city. Hierarchical cluster analysis (CA) grouped the 67 sampling stations into two groups, cluster 1 having high pollution and cluster 2 having lesser pollution. Discriminant analysis (DA) was applied to delineate the most meaningful parameters accounting for temporal and spatial variations in groundwater quality of the study area. Temporal DA identified pH as the most important parameter, which discriminates between water quality in the pre-monsoon and post-monsoon seasons and accounts for 72% seasonal assignation of cases. Spatial DA identified Mg, Cl and NO3 as the three most important parameters discriminating between two clusters and accounting for 89% spatial assignation of cases. Principal component analysis was applied to the dataset obtained from the two clusters, which evolved three factors in each cluster, explaining 85.4 and 84% of the total variance, respectively. Varifactors obtained from principal component analysis showed that groundwater quality variation is mainly explained by dissolution of minerals from rock water interactions in the aquifer, effect of anthropogenic activities and ion exchange processes in water.

  16. Rotation of EOFs by the Independent Component Analysis: Towards A Solution of the Mixing Problem in the Decomposition of Geophysical Time Series

    NASA Technical Reports Server (NTRS)

    Aires, Filipe; Rossow, William B.; Chedin, Alain; Hansen, James E. (Technical Monitor)

    2001-01-01

    The Independent Component Analysis is a recently developed technique for component extraction. This new method requires the statistical independence of the extracted components, a stronger constraint that uses higher-order statistics, instead of the classical decorrelation, a weaker constraint that uses only second-order statistics. This technique has been used recently for the analysis of geophysical time series with the goal of investigating the causes of variability in observed data (i.e. exploratory approach). We demonstrate with a data simulation experiment that, if initialized with a Principal Component Analysis, the Independent Component Analysis performs a rotation of the classical PCA (or EOF) solution. This rotation uses no localization criterion like other Rotation Techniques (RT), only the global generalization of decorrelation by statistical independence is used. This rotation of the PCA solution seems to be able to solve the tendency of PCA to mix several physical phenomena, even when the signal is just their linear sum.

  17. Classification Techniques for Multivariate Data Analysis.

    DTIC Science & Technology

    1980-03-28

    analysis among biologists, botanists, and ecologists, while some social scientists may refer "typology". Other frequently encountered terms are pattern...the determinantal equation: lB -XW 0 (42) 49 The solutions X. are the eigenvalues of the matrix W-1 B 1 as in discriminant analysis. There are t non...Statistical Package for Social Sciences (SPSS) (14) subprogram FACTOR was used for the principal components analysis. It is designed both for the factor

  18. A Genealogical Interpretation of Principal Components Analysis

    PubMed Central

    McVean, Gil

    2009-01-01

    Principal components analysis, PCA, is a statistical method commonly used in population genetics to identify structure in the distribution of genetic variation across geographical location and ethnic background. However, while the method is often used to inform about historical demographic processes, little is known about the relationship between fundamental demographic parameters and the projection of samples onto the primary axes. Here I show that for SNP data the projection of samples onto the principal components can be obtained directly from considering the average coalescent times between pairs of haploid genomes. The result provides a framework for interpreting PCA projections in terms of underlying processes, including migration, geographical isolation, and admixture. I also demonstrate a link between PCA and Wright's fst and show that SNP ascertainment has a largely simple and predictable effect on the projection of samples. Using examples from human genetics, I discuss the application of these results to empirical data and the implications for inference. PMID:19834557

  19. Asymptotics of empirical eigenstructure for high dimensional spiked covariance.

    PubMed

    Wang, Weichen; Fan, Jianqing

    2017-06-01

    We derive the asymptotic distributions of the spiked eigenvalues and eigenvectors under a generalized and unified asymptotic regime, which takes into account the magnitude of spiked eigenvalues, sample size, and dimensionality. This regime allows high dimensionality and diverging eigenvalues and provides new insights into the roles that the leading eigenvalues, sample size, and dimensionality play in principal component analysis. Our results are a natural extension of those in Paul (2007) to a more general setting and solve the rates of convergence problems in Shen et al. (2013). They also reveal the biases of estimating leading eigenvalues and eigenvectors by using principal component analysis, and lead to a new covariance estimator for the approximate factor model, called shrinkage principal orthogonal complement thresholding (S-POET), that corrects the biases. Our results are successfully applied to outstanding problems in estimation of risks of large portfolios and false discovery proportions for dependent test statistics and are illustrated by simulation studies.

  20. Asymptotics of empirical eigenstructure for high dimensional spiked covariance

    PubMed Central

    Wang, Weichen

    2017-01-01

    We derive the asymptotic distributions of the spiked eigenvalues and eigenvectors under a generalized and unified asymptotic regime, which takes into account the magnitude of spiked eigenvalues, sample size, and dimensionality. This regime allows high dimensionality and diverging eigenvalues and provides new insights into the roles that the leading eigenvalues, sample size, and dimensionality play in principal component analysis. Our results are a natural extension of those in Paul (2007) to a more general setting and solve the rates of convergence problems in Shen et al. (2013). They also reveal the biases of estimating leading eigenvalues and eigenvectors by using principal component analysis, and lead to a new covariance estimator for the approximate factor model, called shrinkage principal orthogonal complement thresholding (S-POET), that corrects the biases. Our results are successfully applied to outstanding problems in estimation of risks of large portfolios and false discovery proportions for dependent test statistics and are illustrated by simulation studies. PMID:28835726

  1. PANDA-view: An easy-to-use tool for statistical analysis and visualization of quantitative proteomics data.

    PubMed

    Chang, Cheng; Xu, Kaikun; Guo, Chaoping; Wang, Jinxia; Yan, Qi; Zhang, Jian; He, Fuchu; Zhu, Yunping

    2018-05-22

    Compared with the numerous software tools developed for identification and quantification of -omics data, there remains a lack of suitable tools for both downstream analysis and data visualization. To help researchers better understand the biological meanings in their -omics data, we present an easy-to-use tool, named PANDA-view, for both statistical analysis and visualization of quantitative proteomics data and other -omics data. PANDA-view contains various kinds of analysis methods such as normalization, missing value imputation, statistical tests, clustering and principal component analysis, as well as the most commonly-used data visualization methods including an interactive volcano plot. Additionally, it provides user-friendly interfaces for protein-peptide-spectrum representation of the quantitative proteomics data. PANDA-view is freely available at https://sourceforge.net/projects/panda-view/. 1987ccpacer@163.com and zhuyunping@gmail.com. Supplementary data are available at Bioinformatics online.

  2. A climatology of total ozone mapping spectrometer data using rotated principal component analysis

    NASA Astrophysics Data System (ADS)

    Eder, Brian K.; Leduc, Sharon K.; Sickles, Joseph E.

    1999-02-01

    The spatial and temporal variability of total column ozone (Ω) obtained from the total ozone mapping spectrometer (TOMS version 7.0) during the period 1980-1992 was examined through the use of a multivariate statistical technique called rotated principal component analysis. Utilization of Kaiser's varimax orthogonal rotation led to the identification of 14, mostly contiguous subregions that together accounted for more than 70% of the total Ω variance. Each subregion displayed statistically unique Ω characteristics that were further examined through time series and spectral density analyses, revealing significant periodicities on semiannual, annual, quasi-biennial, and longer term time frames. This analysis facilitated identification of the probable mechanisms responsible for the variability of Ω within the 14 homogeneous subregions. The mechanisms were either dynamical in nature (i.e., advection associated with baroclinic waves, the quasi-biennial oscillation, or El Niño-Southern Oscillation) or photochemical in nature (i.e., production of odd oxygen (O or O3) associated with the annual progression of the Sun). The analysis has also revealed that the influence of a data retrieval artifact, found in equatorial latitudes of version 6.0 of the TOMS data, has been reduced in version 7.0.

  3. Integrated GIS and multivariate statistical analysis for regional scale assessment of heavy metal soil contamination: A critical review.

    PubMed

    Hou, Deyi; O'Connor, David; Nathanail, Paul; Tian, Li; Ma, Yan

    2017-12-01

    Heavy metal soil contamination is associated with potential toxicity to humans or ecotoxicity. Scholars have increasingly used a combination of geographical information science (GIS) with geostatistical and multivariate statistical analysis techniques to examine the spatial distribution of heavy metals in soils at a regional scale. A review of such studies showed that most soil sampling programs were based on grid patterns and composite sampling methodologies. Many programs intended to characterize various soil types and land use types. The most often used sampling depth intervals were 0-0.10 m, or 0-0.20 m, below surface; and the sampling densities used ranged from 0.0004 to 6.1 samples per km 2 , with a median of 0.4 samples per km 2 . The most widely used spatial interpolators were inverse distance weighted interpolation and ordinary kriging; and the most often used multivariate statistical analysis techniques were principal component analysis and cluster analysis. The review also identified several determining and correlating factors in heavy metal distribution in soils, including soil type, soil pH, soil organic matter, land use type, Fe, Al, and heavy metal concentrations. The major natural and anthropogenic sources of heavy metals were found to derive from lithogenic origin, roadway and transportation, atmospheric deposition, wastewater and runoff from industrial and mining facilities, fertilizer application, livestock manure, and sewage sludge. This review argues that the full potential of integrated GIS and multivariate statistical analysis for assessing heavy metal distribution in soils on a regional scale has not yet been fully realized. It is proposed that future research be conducted to map multivariate results in GIS to pinpoint specific anthropogenic sources, to analyze temporal trends in addition to spatial patterns, to optimize modeling parameters, and to expand the use of different multivariate analysis tools beyond principal component analysis (PCA) and cluster analysis (CA). Copyright © 2017 Elsevier Ltd. All rights reserved.

  4. Non-rigid image registration using a statistical spline deformation model.

    PubMed

    Loeckx, Dirk; Maes, Frederik; Vandermeulen, Dirk; Suetens, Paul

    2003-07-01

    We propose a statistical spline deformation model (SSDM) as a method to solve non-rigid image registration. Within this model, the deformation is expressed using a statistically trained B-spline deformation mesh. The model is trained by principal component analysis of a training set. This approach allows to reduce the number of degrees of freedom needed for non-rigid registration by only retaining the most significant modes of variation observed in the training set. User-defined transformation components, like affine modes, are merged with the principal components into a unified framework. Optimization proceeds along the transformation components rather then along the individual spline coefficients. The concept of SSDM's is applied to the temporal registration of thorax CR-images using pattern intensity as the registration measure. Our results show that, using 30 training pairs, a reduction of 33% is possible in the number of degrees of freedom without deterioration of the result. The same accuracy as without SSDM's is still achieved after a reduction up to 66% of the degrees of freedom.

  5. Real-time detection of organic contamination events in water distribution systems by principal components analysis of ultraviolet spectral data.

    PubMed

    Zhang, Jian; Hou, Dibo; Wang, Ke; Huang, Pingjie; Zhang, Guangxin; Loáiciga, Hugo

    2017-05-01

    The detection of organic contaminants in water distribution systems is essential to protect public health from potential harmful compounds resulting from accidental spills or intentional releases. Existing methods for detecting organic contaminants are based on quantitative analyses such as chemical testing and gas/liquid chromatography, which are time- and reagent-consuming and involve costly maintenance. This study proposes a novel procedure based on discrete wavelet transform and principal component analysis for detecting organic contamination events from ultraviolet spectral data. Firstly, the spectrum of each observation is transformed using discrete wavelet with a coiflet mother wavelet to capture the abrupt change along the wavelength. Principal component analysis is then employed to approximate the spectra based on capture and fusion features. The significant value of Hotelling's T 2 statistics is calculated and used to detect outliers. An alarm of contamination event is triggered by sequential Bayesian analysis when the outliers appear continuously in several observations. The effectiveness of the proposed procedure is tested on-line using a pilot-scale setup and experimental data.

  6. A Divergence Statistics Extension to VTK for Performance Analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pebay, Philippe Pierre; Bennett, Janine Camille

    This report follows the series of previous documents ([PT08, BPRT09b, PT09, BPT09, PT10, PB13], where we presented the parallel descriptive, correlative, multi-correlative, principal component analysis, contingency, k -means, order and auto-correlative statistics engines which we developed within the Visualization Tool Kit ( VTK ) as a scalable, parallel and versatile statistics package. We now report on a new engine which we developed for the calculation of divergence statistics, a concept which we hereafter explain and whose main goal is to quantify the discrepancy, in a stasticial manner akin to measuring a distance, between an observed empirical distribution and a theoretical,more » "ideal" one. The ease of use of the new diverence statistics engine is illustrated by the means of C++ code snippets. Although this new engine does not yet have a parallel implementation, it has already been applied to HPC performance analysis, of which we provide an example.« less

  7. New software for statistical analysis of Cambridge Structural Database data

    PubMed Central

    Sykes, Richard A.; McCabe, Patrick; Allen, Frank H.; Battle, Gary M.; Bruno, Ian J.; Wood, Peter A.

    2011-01-01

    A collection of new software tools is presented for the analysis of geometrical, chemical and crystallographic data from the Cambridge Structural Database (CSD). This software supersedes the program Vista. The new functionality is integrated into the program Mercury in order to provide statistical, charting and plotting options alongside three-dimensional structural visualization and analysis. The integration also permits immediate access to other information about specific CSD entries through the Mercury framework, a common requirement in CSD data analyses. In addition, the new software includes a range of more advanced features focused towards structural analysis such as principal components analysis, cone-angle correction in hydrogen-bond analyses and the ability to deal with topological symmetry that may be exhibited in molecular search fragments. PMID:22477784

  8. The Job Dimensions Underlying the Job Elements of the Position Analysis Questionnaire (PAQ) (Form B).

    DTIC Science & Technology

    The study was concerned with the identification of the job dimension underlying the job elements of the Position Analysis Questionnaire ( PAQ ), Form B...The PAQ is a structured job analysis instrument consisting of 187 worker-oriented job elements which are divided into six a priori major divisions...The statistical procedure of principal components analysis was used to identify the job dimensions of the PAQ . Forty-five job dimensions were

  9. Statistical techniques applied to aerial radiometric surveys (STAARS): principal components analysis user's manual. [NURE program

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Koch, C.D.; Pirkle, F.L.; Schmidt, J.S.

    1981-01-01

    A Principal Components Analysis (PCA) has been written to aid in the interpretation of multivariate aerial radiometric data collected by the US Department of Energy (DOE) under the National Uranium Resource Evaluation (NURE) program. The variations exhibited by these data have been reduced and classified into a number of linear combinations by using the PCA program. The PCA program then generates histograms and outlier maps of the individual variates. Black and white plots can be made on a Calcomp plotter by the application of follow-up programs. All programs referred to in this guide were written for a DEC-10. From thismore » analysis a geologist may begin to interpret the data structure. Insight into geological processes underlying the data may be obtained.« less

  10. Rapid differentiation of Chinese hop varieties (Humulus lupulus) using volatile fingerprinting by HS-SPME-GC-MS combined with multivariate statistical analysis.

    PubMed

    Liu, Zechang; Wang, Liping; Liu, Yumei

    2018-01-18

    Hops impart flavor to beer, with the volatile components characterizing the various hop varieties and qualities. Fingerprinting, especially flavor fingerprinting, is often used to identify 'flavor products' because inconsistencies in the description of flavor may lead to an incorrect definition of beer quality. Compared to flavor fingerprinting, volatile fingerprinting is simpler and easier. We performed volatile fingerprinting using head space-solid phase micro-extraction gas chromatography-mass spectrometry combined with similarity analysis and principal component analysis (PCA) for evaluating and distinguishing between three major Chinese hops. Eighty-four volatiles were identified, which were classified into seven categories. Volatile fingerprinting based on similarity analysis did not yield any obvious result. By contrast, hop varieties and qualities were identified using volatile fingerprinting based on PCA. The potential variables explained the variance in the three hop varieties. In addition, the dendrogram and principal component score plot described the differences and classifications of hops. Volatile fingerprinting plus multivariate statistical analysis can rapidly differentiate between the different varieties and qualities of the three major Chinese hops. Furthermore, this method can be used as a reference in other fields. © 2018 Society of Chemical Industry. © 2018 Society of Chemical Industry.

  11. A new strategy for statistical analysis-based fingerprint establishment: Application to quality assessment of Semen sojae praeparatum.

    PubMed

    Guo, Hui; Zhang, Zhen; Yao, Yuan; Liu, Jialin; Chang, Ruirui; Liu, Zhao; Hao, Hongyuan; Huang, Taohong; Wen, Jun; Zhou, Tingting

    2018-08-30

    Semen sojae praeparatum with homology of medicine and food is a famous traditional Chinese medicine. A simple and effective quality fingerprint analysis, coupled with chemometrics methods, was developed for quality assessment of Semen sojae praeparatum. First, similarity analysis (SA) and hierarchical clusting analysis (HCA) were applied to select the qualitative markers, which obviously influence the quality of Semen sojae praeparatum. 21 chemicals were selected and characterized by high resolution ion trap/time-of-flight mass spectrometry (LC-IT-TOF-MS). Subsequently, principal components analysis (PCA) and orthogonal partial least squares discriminant analysis (OPLS-DA) were conducted to select the quantitative markers of Semen sojae praeparatum samples from different origins. Moreover, 11 compounds with statistical significance were determined quantitatively, which provided an accurate and informative data for quality evaluation. This study proposes a new strategy for "statistic analysis-based fingerprint establishment", which would be a valuable reference for further study. Copyright © 2018 Elsevier Ltd. All rights reserved.

  12. Estimating the Regional Economic Significance of Airports

    DTIC Science & Technology

    1992-09-01

    following three options for estimating induced impacts: the economic base model , an econometric model , and a regional input-output model . One approach to...limitations, however, the economic base model has been widely used for regional economic analysis. A second approach is to develop an econometric model of...analysis is the principal statistical tool used to estimate the economic relationships. Regional econometric models are capable of estimating a single

  13. Interpersonal differentiation within depression diagnosis: relating interpersonal subgroups to symptom load and the quality of the early therapeutic alliance.

    PubMed

    Grosse Holtforth, Martin; Altenstein, David; Krieger, Tobias; Flückiger, Christoph; Wright, Aidan G C; Caspar, Franz

    2014-01-01

    We examined interpersonal problems in psychotherapy outpatients with a principal diagnosis of a depressive disorder in routine care (n=361). These patients were compared to a normative non-clinical sample and to outpatients with other principal diagnoses (n=959). Furthermore, these patients were statistically assigned to interpersonally defined subgroups that were compared regarding symptoms and the quality of the early alliance. The sample of depressive patients reported higher levels of interpersonal problems than the normative sample and the sample of outpatients without a principal diagnosis of depression. Latent Class Analysis identified eight distinct interpersonal subgroups, which differed regarding self-reported symptom load and the quality of the early alliance. However, therapists' alliance ratings did not differentiate between the groups. This interpersonal differentiation within the group of patients with a principal diagnosis of depression may add to a personalized psychotherapy based on interpersonal profiles.

  14. [Analysis of the movement of long axis and the distribution of principal stress in abutment tooth retained by conical telescope].

    PubMed

    Lin, Ying-he; Man, Yi; Qu, Yi-li; Guan, Dong-hua; Lu, Xuan; Wei, Na

    2006-01-01

    To study the movement of long axis and the distribution of principal stress in the abutment teeth in removable partial denture which is retained by use of conical telescope. An ideal three dimensional finite element model was constructed by using SCT image reconstruction technique, self-programming and ANSYS software. The static loads were applied. The displacement of the long axis and the distribution of the principal stress in the abutment teeth was analyzed. There is no statistic difference of displacenat and stress distribution among different three-dimensional finite element models. Generally, the abutment teeth move along the long axis itself. Similar stress distribution was observed in each three-dimensional finite element model. The maximal principal compressive stress was observed at the distal cervix of the second premolar. The abutment teeth can be well protected by use of conical telescope.

  15. Modeling vertebrate diversity in Oregon using satellite imagery

    NASA Astrophysics Data System (ADS)

    Cablk, Mary Elizabeth

    Vertebrate diversity was modeled for the state of Oregon using a parametric approach to regression tree analysis. This exploratory data analysis effectively modeled the non-linear relationships between vertebrate richness and phenology, terrain, and climate. Phenology was derived from time-series NOAA-AVHRR satellite imagery for the year 1992 using two methods: principal component analysis and derivation of EROS data center greenness metrics. These two measures of spatial and temporal vegetation condition incorporated the critical temporal element in this analysis. The first three principal components were shown to contain spatial and temporal information about the landscape and discriminated phenologically distinct regions in Oregon. Principal components 2 and 3, 6 greenness metrics, elevation, slope, aspect, annual precipitation, and annual seasonal temperature difference were investigated as correlates to amphibians, birds, all vertebrates, reptiles, and mammals. Variation explained for each regression tree by taxa were: amphibians (91%), birds (67%), all vertebrates (66%), reptiles (57%), and mammals (55%). Spatial statistics were used to quantify the pattern of each taxa and assess validity of resulting predictions from regression tree models. Regression tree analysis was relatively robust against spatial autocorrelation in the response data and graphical results indicated models were well fit to the data.

  16. Statistical assessment of normal mitral annular geometry using automated three-dimensional echocardiographic analysis.

    PubMed

    Pouch, Alison M; Vergnat, Mathieu; McGarvey, Jeremy R; Ferrari, Giovanni; Jackson, Benjamin M; Sehgal, Chandra M; Yushkevich, Paul A; Gorman, Robert C; Gorman, Joseph H

    2014-01-01

    The basis of mitral annuloplasty ring design has progressed from qualitative surgical intuition to experimental and theoretical analysis of annular geometry with quantitative imaging techniques. In this work, we present an automated three-dimensional (3D) echocardiographic image analysis method that can be used to statistically assess variability in normal mitral annular geometry to support advancement in annuloplasty ring design. Three-dimensional patient-specific models of the mitral annulus were automatically generated from 3D echocardiographic images acquired from subjects with normal mitral valve structure and function. Geometric annular measurements including annular circumference, annular height, septolateral diameter, intercommissural width, and the annular height to intercommissural width ratio were automatically calculated. A mean 3D annular contour was computed, and principal component analysis was used to evaluate variability in normal annular shape. The following mean ± standard deviations were obtained from 3D echocardiographic image analysis: annular circumference, 107.0 ± 14.6 mm; annular height, 7.6 ± 2.8 mm; septolateral diameter, 28.5 ± 3.7 mm; intercommissural width, 33.0 ± 5.3 mm; and annular height to intercommissural width ratio, 22.7% ± 6.9%. Principal component analysis indicated that shape variability was primarily related to overall annular size, with more subtle variation in the skewness and height of the anterior annular peak, independent of annular diameter. Patient-specific 3D echocardiographic-based modeling of the human mitral valve enables statistical analysis of physiologically normal mitral annular geometry. The tool can potentially lead to the development of a new generation of annuloplasty rings that restore the diseased mitral valve annulus back to a truly normal geometry. Copyright © 2014 The Society of Thoracic Surgeons. Published by Elsevier Inc. All rights reserved.

  17. A Framework for Establishing Standard Reference Scale of Texture by Multivariate Statistical Analysis Based on Instrumental Measurement and Sensory Evaluation.

    PubMed

    Zhi, Ruicong; Zhao, Lei; Xie, Nan; Wang, Houyin; Shi, Bolin; Shi, Jingye

    2016-01-13

    A framework of establishing standard reference scale (texture) is proposed by multivariate statistical analysis according to instrumental measurement and sensory evaluation. Multivariate statistical analysis is conducted to rapidly select typical reference samples with characteristics of universality, representativeness, stability, substitutability, and traceability. The reasonableness of the framework method is verified by establishing standard reference scale of texture attribute (hardness) with Chinese well-known food. More than 100 food products in 16 categories were tested using instrumental measurement (TPA test), and the result was analyzed with clustering analysis, principal component analysis, relative standard deviation, and analysis of variance. As a result, nine kinds of foods were determined to construct the hardness standard reference scale. The results indicate that the regression coefficient between the estimated sensory value and the instrumentally measured value is significant (R(2) = 0.9765), which fits well with Stevens's theory. The research provides reliable a theoretical basis and practical guide for quantitative standard reference scale establishment on food texture characteristics.

  18. [Statistical analysis of German radiologic periodicals: developmental trends in the last 10 years].

    PubMed

    Golder, W

    1999-09-01

    To identify which statistical tests are applied in German radiological publications, to what extent their use has changed during the last decade, and which factors might be responsible for this development. The major articles published in "ROFO" and "DER RADIOLOGE" during 1988, 1993 and 1998 were reviewed for statistical content. The contributions were classified by principal focus and radiological subspecialty. The methods used were assigned to descriptive, basal and advanced statistics. Sample size, significance level and power were established. The use of experts' assistance was monitored. Finally, we calculated the so-called cumulative accessibility of the publications. 525 contributions were found to be eligible. In 1988, 87% used descriptive statistics only, 12.5% basal, and 0.5% advanced statistics. The corresponding figures in 1993 and 1998 are 62 and 49%, 32 and 41%, and 6 and 10%, respectively. Statistical techniques were most likely to be used in research on musculoskeletal imaging and articles dedicated to MRI. Six basic categories of statistical methods account for the complete statistical analysis appearing in 90% of the articles. ROC analysis is the single most common advanced technique. Authors make increasingly use of statistical experts' opinion and programs. During the last decade, the use of statistical methods in German radiological journals has fundamentally improved, both quantitatively and qualitatively. Presently, advanced techniques account for 20% of the pertinent statistical tests. This development seems to be promoted by the increasing availability of statistical analysis software.

  19. Statistical methods and regression analysis of stratospheric ozone and meteorological variables in Isfahan

    NASA Astrophysics Data System (ADS)

    Hassanzadeh, S.; Hosseinibalam, F.; Omidvari, M.

    2008-04-01

    Data of seven meteorological variables (relative humidity, wet temperature, dry temperature, maximum temperature, minimum temperature, ground temperature and sun radiation time) and ozone values have been used for statistical analysis. Meteorological variables and ozone values were analyzed using both multiple linear regression and principal component methods. Data for the period 1999-2004 are analyzed jointly using both methods. For all periods, temperature dependent variables were highly correlated, but were all negatively correlated with relative humidity. Multiple regression analysis was used to fit the meteorological variables using the meteorological variables as predictors. A variable selection method based on high loading of varimax rotated principal components was used to obtain subsets of the predictor variables to be included in the linear regression model of the meteorological variables. In 1999, 2001 and 2002 one of the meteorological variables was weakly influenced predominantly by the ozone concentrations. However, the model did not predict that the meteorological variables for the year 2000 were not influenced predominantly by the ozone concentrations that point to variation in sun radiation. This could be due to other factors that were not explicitly considered in this study.

  20. Wheat crown rot pathogens Fusarium graminearum and F. pseudograminearum lack specialization.

    PubMed

    Chakraborty, Sukumar; Obanor, Friday; Westecott, Rhyannyn; Abeywickrama, Krishanthi

    2010-10-01

    This article reports a lack of pathogenic specialization among Australian Fusarium graminearum and F. pseudograminearum causing crown rot (CR) of wheat using analysis of variance (ANOVA), principal component and biplot analysis, Kendall's coefficient of concordance (W), and κ statistics. Overall, F. pseudograminearum was more aggressive than F. graminearum, supporting earlier delineation of the crown-infecting group as a new species. Although significant wheat line-pathogen isolate interaction in ANOVA suggested putative specialization when seedlings of 60 wheat lines were inoculated with 4 pathogen isolates or 26 wheat lines were inoculated with 10 isolates, significant W and κ showed agreement in rank order of wheat lines, indicating a lack of specialization. The first principal component representing nondifferential aggressiveness explained a large part (up to 65%) of the variation in CR severity. The differential components were small and more pronounced in seedlings than in adult plants. By maximizing variance on the first two principal components, biplots were useful for highlighting the association between isolates and wheat lines. A key finding of this work is that a range of analytical tools are needed to explore pathogenic specialization, and a statistically significant interaction in an ANOVA cannot be taken as conclusive evidence of specialization. With no highly resistant wheat cultivars, Fusarium isolates mostly differ in aggressiveness; however, specialization may appear as more resistant cultivars become widespread.

  1. Dihedral angle principal component analysis of molecular dynamics simulations.

    PubMed

    Altis, Alexandros; Nguyen, Phuong H; Hegger, Rainer; Stock, Gerhard

    2007-06-28

    It has recently been suggested by Mu et al. [Proteins 58, 45 (2005)] to use backbone dihedral angles instead of Cartesian coordinates in a principal component analysis of molecular dynamics simulations. Dihedral angles may be advantageous because internal coordinates naturally provide a correct separation of internal and overall motion, which was found to be essential for the construction and interpretation of the free energy landscape of a biomolecule undergoing large structural rearrangements. To account for the circular statistics of angular variables, a transformation from the space of dihedral angles {phi(n)} to the metric coordinate space {x(n)=cos phi(n),y(n)=sin phi(n)} was employed. To study the validity and the applicability of the approach, in this work the theoretical foundations underlying the dihedral angle principal component analysis (dPCA) are discussed. It is shown that the dPCA amounts to a one-to-one representation of the original angle distribution and that its principal components can readily be characterized by the corresponding conformational changes of the peptide. Furthermore, a complex version of the dPCA is introduced, in which N angular variables naturally lead to N eigenvalues and eigenvectors. Applying the methodology to the construction of the free energy landscape of decaalanine from a 300 ns molecular dynamics simulation, a critical comparison of the various methods is given.

  2. Dihedral angle principal component analysis of molecular dynamics simulations

    NASA Astrophysics Data System (ADS)

    Altis, Alexandros; Nguyen, Phuong H.; Hegger, Rainer; Stock, Gerhard

    2007-06-01

    It has recently been suggested by Mu et al. [Proteins 58, 45 (2005)] to use backbone dihedral angles instead of Cartesian coordinates in a principal component analysis of molecular dynamics simulations. Dihedral angles may be advantageous because internal coordinates naturally provide a correct separation of internal and overall motion, which was found to be essential for the construction and interpretation of the free energy landscape of a biomolecule undergoing large structural rearrangements. To account for the circular statistics of angular variables, a transformation from the space of dihedral angles {φn} to the metric coordinate space {xn=cosφn,yn=sinφn} was employed. To study the validity and the applicability of the approach, in this work the theoretical foundations underlying the dihedral angle principal component analysis (dPCA) are discussed. It is shown that the dPCA amounts to a one-to-one representation of the original angle distribution and that its principal components can readily be characterized by the corresponding conformational changes of the peptide. Furthermore, a complex version of the dPCA is introduced, in which N angular variables naturally lead to N eigenvalues and eigenvectors. Applying the methodology to the construction of the free energy landscape of decaalanine from a 300ns molecular dynamics simulation, a critical comparison of the various methods is given.

  3. Machine learning of frustrated classical spin models. I. Principal component analysis

    NASA Astrophysics Data System (ADS)

    Wang, Ce; Zhai, Hui

    2017-10-01

    This work aims at determining whether artificial intelligence can recognize a phase transition without prior human knowledge. If this were successful, it could be applied to, for instance, analyzing data from the quantum simulation of unsolved physical models. Toward this goal, we first need to apply the machine learning algorithm to well-understood models and see whether the outputs are consistent with our prior knowledge, which serves as the benchmark for this approach. In this work, we feed the computer data generated by the classical Monte Carlo simulation for the X Y model in frustrated triangular and union jack lattices, which has two order parameters and exhibits two phase transitions. We show that the outputs of the principal component analysis agree very well with our understanding of different orders in different phases, and the temperature dependences of the major components detect the nature and the locations of the phase transitions. Our work offers promise for using machine learning techniques to study sophisticated statistical models, and our results can be further improved by using principal component analysis with kernel tricks and the neural network method.

  4. PAH Baselines for Amazonic Surficial Sediments: A Case of Study in Guajará Bay and Guamá River (Northern Brazil).

    PubMed

    Rodrigues, Camila Carneiro Dos Santos; Santos, Ewerton; Ramos, Brunalisa Silva; Damasceno, Flaviana Cardoso; Correa, José Augusto Martins

    2018-06-01

    The 16 priority PAH were determined in sediment samples from the insular zone of Guajará Bay and Guamá River (Southern Amazon River mouth). Low hydrocarbon levels were observed and naphthalene was the most representative PAH. The low molecular weight PAH represented 51% of the total PAH. Statistical analysis showed that the sampling sites are not significantly different. Source analysis by PAH ratios and principal component analysis revealed that PAH are primary from a few rate of fossil fuel combustion, mainly related to the local small community activity. All samples presented no biological stress or damage potencial according to the sediment quality guidelines. This study discuss baselines for PAH in surface sediments from Amazonic aquatic systems based on source determination by PAH ratios and principal component analysis, sediment quality guidelines and through comparison with previous studies data.

  5. On the application of the Principal Component Analysis for an efficient climate downscaling of surface wind fields

    NASA Astrophysics Data System (ADS)

    Chavez, Roberto; Lozano, Sergio; Correia, Pedro; Sanz-Rodrigo, Javier; Probst, Oliver

    2013-04-01

    With the purpose of efficiently and reliably generating long-term wind resource maps for the wind energy industry, the application and verification of a statistical methodology for the climate downscaling of wind fields at surface level is presented in this work. This procedure is based on the combination of the Monte Carlo and the Principal Component Analysis (PCA) statistical methods. Firstly the Monte Carlo method is used to create a huge number of daily-based annual time series, so called climate representative years, by the stratified sampling of a 33-year-long time series corresponding to the available period of the NCAR/NCEP global reanalysis data set (R-2). Secondly the representative years are evaluated such that the best set is chosen according to its capability to recreate the Sea Level Pressure (SLP) temporal and spatial fields from the R-2 data set. The measure of this correspondence is based on the Euclidean distance between the Empirical Orthogonal Functions (EOF) spaces generated by the PCA (Principal Component Analysis) decomposition of the SLP fields from both the long-term and the representative year data sets. The methodology was verified by comparing the selected 365-days period against a 9-year period of wind fields generated by dynamical downscaling the Global Forecast System data with the mesoscale model SKIRON for the Iberian Peninsula. These results showed that, compared to the traditional method of dynamical downscaling any random 365-days period, the error in the average wind velocity by the PCA's representative year was reduced by almost 30%. Moreover the Mean Absolute Errors (MAE) in the monthly and daily wind profiles were also reduced by almost 25% along all SKIRON grid points. These results showed also that the methodology presented maximum error values in the wind speed mean of 0.8 m/s and maximum MAE in the monthly curves of 0.7 m/s. Besides the bulk numbers, this work shows the spatial distribution of the errors across the Iberian domain and additional wind statistics such as the velocity and directional frequency. Additional repetitions were performed to prove the reliability and robustness of this kind-of statistical-dynamical downscaling method.

  6. A new statistic for identifying batch effects in high-throughput genomic data that uses guided principal component analysis.

    PubMed

    Reese, Sarah E; Archer, Kellie J; Therneau, Terry M; Atkinson, Elizabeth J; Vachon, Celine M; de Andrade, Mariza; Kocher, Jean-Pierre A; Eckel-Passow, Jeanette E

    2013-11-15

    Batch effects are due to probe-specific systematic variation between groups of samples (batches) resulting from experimental features that are not of biological interest. Principal component analysis (PCA) is commonly used as a visual tool to determine whether batch effects exist after applying a global normalization method. However, PCA yields linear combinations of the variables that contribute maximum variance and thus will not necessarily detect batch effects if they are not the largest source of variability in the data. We present an extension of PCA to quantify the existence of batch effects, called guided PCA (gPCA). We describe a test statistic that uses gPCA to test whether a batch effect exists. We apply our proposed test statistic derived using gPCA to simulated data and to two copy number variation case studies: the first study consisted of 614 samples from a breast cancer family study using Illumina Human 660 bead-chip arrays, whereas the second case study consisted of 703 samples from a family blood pressure study that used Affymetrix SNP Array 6.0. We demonstrate that our statistic has good statistical properties and is able to identify significant batch effects in two copy number variation case studies. We developed a new statistic that uses gPCA to identify whether batch effects exist in high-throughput genomic data. Although our examples pertain to copy number data, gPCA is general and can be used on other data types as well. The gPCA R package (Available via CRAN) provides functionality and data to perform the methods in this article. reesese@vcu.edu

  7. Some new mathematical methods for variational objective analysis

    NASA Technical Reports Server (NTRS)

    Wahba, Grace; Johnson, Donald R.

    1994-01-01

    Numerous results were obtained relevant to remote sensing, variational objective analysis, and data assimilation. A list of publications relevant in whole or in part is attached. The principal investigator gave many invited lectures, disseminating the results to the meteorological community as well as the statistical community. A list of invited lectures at meetings is attached, as well as a list of departmental colloquia at various universities and institutes.

  8. A Psychometric Investigation of the Marlowe-Crowne Social Desirability Scale Using Rasch Measurement

    ERIC Educational Resources Information Center

    Seol, Hyunsoo

    2007-01-01

    The author used Rasch measurement to examine the reliability and validity of 382 Korean university students' scores on the Marlowe-Crowne Social Desirability Scale (MCSDS; D. P. Crowne and D. Marlowe, 1960). Results revealed that item-fit statistics and principal component analysis with standardized residuals provide evidence of MCSDS'…

  9. Raman signatures of ferroic domain walls captured by principal component analysis.

    PubMed

    Nataf, G F; Barrett, N; Kreisel, J; Guennou, M

    2018-01-24

    Ferroic domain walls are currently investigated by several state-of-the art techniques in order to get a better understanding of their distinct, functional properties. Here, principal component analysis (PCA) of Raman maps is used to study ferroelectric domain walls (DWs) in LiNbO 3 and ferroelastic DWs in NdGaO 3 . It is shown that PCA allows us to quickly and reliably identify small Raman peak variations at ferroelectric DWs and that the value of a peak shift can be deduced-accurately and without a priori-from a first order Taylor expansion of the spectra. The ability of PCA to separate the contribution of ferroelastic domains and DWs to Raman spectra is emphasized. More generally, our results provide a novel route for the statistical analysis of any property mapped across a DW.

  10. Principal variance component analysis of crop composition data: a case study on herbicide-tolerant cotton.

    PubMed

    Harrison, Jay M; Howard, Delia; Malven, Marianne; Halls, Steven C; Culler, Angela H; Harrigan, George G; Wolfinger, Russell D

    2013-07-03

    Compositional studies on genetically modified (GM) and non-GM crops have consistently demonstrated that their respective levels of key nutrients and antinutrients are remarkably similar and that other factors such as germplasm and environment contribute more to compositional variability than transgenic breeding. We propose that graphical and statistical approaches that can provide meaningful evaluations of the relative impact of different factors to compositional variability may offer advantages over traditional frequentist testing. A case study on the novel application of principal variance component analysis (PVCA) in a compositional assessment of herbicide-tolerant GM cotton is presented. Results of the traditional analysis of variance approach confirmed the compositional equivalence of the GM and non-GM cotton. The multivariate approach of PVCA provided further information on the impact of location and germplasm on compositional variability relative to GM.

  11. The U.S. geological survey rass-statpac system for management and statistical reduction of geochemical data

    USGS Publications Warehouse

    VanTrump, G.; Miesch, A.T.

    1977-01-01

    RASS is an acronym for Rock Analysis Storage System and STATPAC, for Statistical Package. The RASS and STATPAC computer programs are integrated into the RASS-STATPAC system for the management and statistical reduction of geochemical data. The system, in its present form, has been in use for more than 9 yr by scores of U.S. Geological Survey geologists, geochemists, and other scientists engaged in a broad range of geologic and geochemical investigations. The principal advantage of the system is the flexibility afforded the user both in data searches and retrievals and in the manner of statistical treatment of data. The statistical programs provide for most types of statistical reduction normally used in geochemistry and petrology, but also contain bridges to other program systems for statistical processing and automatic plotting. ?? 1977.

  12. Extracting chemical information from high-resolution Kβ X-ray emission spectroscopy

    NASA Astrophysics Data System (ADS)

    Limandri, S.; Robledo, J.; Tirao, G.

    2018-06-01

    High-resolution X-ray emission spectroscopy allows studying the chemical environment of a wide variety of materials. Chemical information can be obtained by fitting the X-ray spectra and observing the behavior of some spectral features. Spectral changes can also be quantified by means of statistical parameters calculated by considering the spectrum as a probability distribution. Another possibility is to perform statistical multivariate analysis, such as principal component analysis. In this work the performance of these procedures for extracting chemical information in X-ray emission spectroscopy spectra for mixtures of Mn2+ and Mn4+ oxides are studied. A detail analysis of the parameters obtained, as well as the associated uncertainties is shown. The methodologies are also applied for Mn oxidation state characterization of double perovskite oxides Ba1+xLa1-xMnSbO6 (with 0 ≤ x ≤ 0.7). The results show that statistical parameters and multivariate analysis are the most suitable for the analysis of this kind of spectra.

  13. A Primer on Bayesian Analysis for Experimental Psychopathologists

    PubMed Central

    Krypotos, Angelos-Miltiadis; Blanken, Tessa F.; Arnaudova, Inna; Matzke, Dora; Beckers, Tom

    2016-01-01

    The principal goals of experimental psychopathology (EPP) research are to offer insights into the pathogenic mechanisms of mental disorders and to provide a stable ground for the development of clinical interventions. The main message of the present article is that those goals are better served by the adoption of Bayesian statistics than by the continued use of null-hypothesis significance testing (NHST). In the first part of the article we list the main disadvantages of NHST and explain why those disadvantages limit the conclusions that can be drawn from EPP research. Next, we highlight the advantages of Bayesian statistics. To illustrate, we then pit NHST and Bayesian analysis against each other using an experimental data set from our lab. Finally, we discuss some challenges when adopting Bayesian statistics. We hope that the present article will encourage experimental psychopathologists to embrace Bayesian statistics, which could strengthen the conclusions drawn from EPP research. PMID:28748068

  14. Improved Statistical Fault Detection Technique and Application to Biological Phenomena Modeled by S-Systems.

    PubMed

    Mansouri, Majdi; Nounou, Mohamed N; Nounou, Hazem N

    2017-09-01

    In our previous work, we have demonstrated the effectiveness of the linear multiscale principal component analysis (PCA)-based moving window (MW)-generalized likelihood ratio test (GLRT) technique over the classical PCA and multiscale principal component analysis (MSPCA)-based GLRT methods. The developed fault detection algorithm provided optimal properties by maximizing the detection probability for a particular false alarm rate (FAR) with different values of windows, and however, most real systems are nonlinear, which make the linear PCA method not able to tackle the issue of non-linearity to a great extent. Thus, in this paper, first, we apply a nonlinear PCA to obtain an accurate principal component of a set of data and handle a wide range of nonlinearities using the kernel principal component analysis (KPCA) model. The KPCA is among the most popular nonlinear statistical methods. Second, we extend the MW-GLRT technique to one that utilizes exponential weights to residuals in the moving window (instead of equal weightage) as it might be able to further improve fault detection performance by reducing the FAR using exponentially weighed moving average (EWMA). The developed detection method, which is called EWMA-GLRT, provides improved properties, such as smaller missed detection and FARs and smaller average run length. The idea behind the developed EWMA-GLRT is to compute a new GLRT statistic that integrates current and previous data information in a decreasing exponential fashion giving more weight to the more recent data. This provides a more accurate estimation of the GLRT statistic and provides a stronger memory that will enable better decision making with respect to fault detection. Therefore, in this paper, a KPCA-based EWMA-GLRT method is developed and utilized in practice to improve fault detection in biological phenomena modeled by S-systems and to enhance monitoring process mean. The idea behind a KPCA-based EWMA-GLRT fault detection algorithm is to combine the advantages brought forward by the proposed EWMA-GLRT fault detection chart with the KPCA model. Thus, it is used to enhance fault detection of the Cad System in E. coli model through monitoring some of the key variables involved in this model such as enzymes, transport proteins, regulatory proteins, lysine, and cadaverine. The results demonstrate the effectiveness of the proposed KPCA-based EWMA-GLRT method over Q , GLRT, EWMA, Shewhart, and moving window-GLRT methods. The detection performance is assessed and evaluated in terms of FAR, missed detection rates, and average run length (ARL 1 ) values.

  15. Assessing prescription drug abuse using functional principal component analysis (FPCA) of wastewater data.

    PubMed

    Salvatore, Stefania; Røislien, Jo; Baz-Lomba, Jose A; Bramness, Jørgen G

    2017-03-01

    Wastewater-based epidemiology is an alternative method for estimating the collective drug use in a community. We applied functional data analysis, a statistical framework developed for analysing curve data, to investigate weekly temporal patterns in wastewater measurements of three prescription drugs with known abuse potential: methadone, oxazepam and methylphenidate, comparing them to positive and negative control drugs. Sewage samples were collected in February 2014 from a wastewater treatment plant in Oslo, Norway. The weekly pattern of each drug was extracted by fitting of generalized additive models, using trigonometric functions to model the cyclic behaviour. From the weekly component, the main temporal features were then extracted using functional principal component analysis. Results are presented through the functional principal components (FPCs) and corresponding FPC scores. Clinically, the most important weekly feature of the wastewater-based epidemiology data was the second FPC, representing the difference between average midweek level and a peak during the weekend, representing possible recreational use of a drug in the weekend. Estimated scores on this FPC indicated recreational use of methylphenidate, with a high weekend peak, but not for methadone and oxazepam. The functional principal component analysis uncovered clinically important temporal features of the weekly patterns of the use of prescription drugs detected from wastewater analysis. This may be used as a post-marketing surveillance method to monitor prescription drugs with abuse potential. Copyright © 2016 John Wiley & Sons, Ltd. Copyright © 2016 John Wiley & Sons, Ltd.

  16. Classifying Facial Actions

    PubMed Central

    Donato, Gianluca; Bartlett, Marian Stewart; Hager, Joseph C.; Ekman, Paul; Sejnowski, Terrence J.

    2010-01-01

    The Facial Action Coding System (FACS) [23] is an objective method for quantifying facial movement in terms of component actions. This system is widely used in behavioral investigations of emotion, cognitive processes, and social interaction. The coding is presently performed by highly trained human experts. This paper explores and compares techniques for automatically recognizing facial actions in sequences of images. These techniques include analysis of facial motion through estimation of optical flow; holistic spatial analysis, such as principal component analysis, independent component analysis, local feature analysis, and linear discriminant analysis; and methods based on the outputs of local filters, such as Gabor wavelet representations and local principal components. Performance of these systems is compared to naive and expert human subjects. Best performances were obtained using the Gabor wavelet representation and the independent component representation, both of which achieved 96 percent accuracy for classifying 12 facial actions of the upper and lower face. The results provide converging evidence for the importance of using local filters, high spatial frequencies, and statistical independence for classifying facial actions. PMID:21188284

  17. Principal component analysis and analysis of variance on the effects of Entellan New on the Raman spectra of fibers.

    PubMed

    Yu, Marcia M L; Sandercock, P Mark L

    2012-01-01

    During the forensic examination of textile fibers, fibers are usually mounted on glass slides for visual inspection and identification under the microscope. One method that has the capability to accurately identify single textile fibers without subsequent demounting is Raman microspectroscopy. The effect of the mountant Entellan New on the Raman spectra of fibers was investigated to determine if it is suitable for fiber analysis. Raman spectra of synthetic fibers mounted in three different ways were collected and subjected to multivariate analysis. Principal component analysis score plots revealed that while spectra from different fiber classes formed distinct groups, fibers of the same class formed a single group regardless of the mounting method. The spectra of bare fibers and those mounted in Entellan New were found to be statistically indistinguishable by analysis of variance calculations. These results demonstrate that fibers mounted in Entellan New may be identified directly by Raman microspectroscopy without further sample preparation. © 2011 American Academy of Forensic Sciences.

  18. Principal component analysis of Raman spectra for TiO2 nanoparticle characterization

    NASA Astrophysics Data System (ADS)

    Ilie, Alina Georgiana; Scarisoareanu, Monica; Morjan, Ion; Dutu, Elena; Badiceanu, Maria; Mihailescu, Ion

    2017-09-01

    The Raman spectra of anatase/rutile mixed phases of Sn doped TiO2 nanoparticles and undoped TiO2 nanoparticles, synthesised by laser pyrolysis, with nanocrystallite dimensions varying from 8 to 28 nm, was simultaneously processed with a self-written software that applies Principal Component Analysis (PCA) on the measured spectrum to verify the possibility of objective auto-characterization of nanoparticles from their vibrational modes. The photo-excited process of Raman scattering is very sensible to the material characteristics, especially in the case of nanomaterials, where more properties become relevant for the vibrational behaviour. We used PCA, a statistical procedure that performs eigenvalue decomposition of descriptive data covariance, to automatically analyse the sample's measured Raman spectrum, and to interfere the correlation between nanoparticle dimensions, tin and carbon concentration, and their Principal Component values (PCs). This type of application can allow an approximation of the crystallite size, or tin concentration, only by measuring the Raman spectrum of the sample. The study of loadings of the principal components provides information of the way the vibrational modes are affected by the nanoparticle features and the spectral area relevant for the classification.

  19. RepExplore: addressing technical replicate variance in proteomics and metabolomics data analysis.

    PubMed

    Glaab, Enrico; Schneider, Reinhard

    2015-07-01

    High-throughput omics datasets often contain technical replicates included to account for technical sources of noise in the measurement process. Although summarizing these replicate measurements by using robust averages may help to reduce the influence of noise on downstream data analysis, the information on the variance across the replicate measurements is lost in the averaging process and therefore typically disregarded in subsequent statistical analyses.We introduce RepExplore, a web-service dedicated to exploit the information captured in the technical replicate variance to provide more reliable and informative differential expression and abundance statistics for omics datasets. The software builds on previously published statistical methods, which have been applied successfully to biomedical omics data but are difficult to use without prior experience in programming or scripting. RepExplore facilitates the analysis by providing a fully automated data processing and interactive ranking tables, whisker plot, heat map and principal component analysis visualizations to interpret omics data and derived statistics. Freely available at http://www.repexplore.tk enrico.glaab@uni.lu Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.

  20. Reconstruction of spatio-temporal temperature from sparse historical records using robust probabilistic principal component regression

    USGS Publications Warehouse

    Tipton, John; Hooten, Mevin B.; Goring, Simon

    2017-01-01

    Scientific records of temperature and precipitation have been kept for several hundred years, but for many areas, only a shorter record exists. To understand climate change, there is a need for rigorous statistical reconstructions of the paleoclimate using proxy data. Paleoclimate proxy data are often sparse, noisy, indirect measurements of the climate process of interest, making each proxy uniquely challenging to model statistically. We reconstruct spatially explicit temperature surfaces from sparse and noisy measurements recorded at historical United States military forts and other observer stations from 1820 to 1894. One common method for reconstructing the paleoclimate from proxy data is principal component regression (PCR). With PCR, one learns a statistical relationship between the paleoclimate proxy data and a set of climate observations that are used as patterns for potential reconstruction scenarios. We explore PCR in a Bayesian hierarchical framework, extending classical PCR in a variety of ways. First, we model the latent principal components probabilistically, accounting for measurement error in the observational data. Next, we extend our method to better accommodate outliers that occur in the proxy data. Finally, we explore alternatives to the truncation of lower-order principal components using different regularization techniques. One fundamental challenge in paleoclimate reconstruction efforts is the lack of out-of-sample data for predictive validation. Cross-validation is of potential value, but is computationally expensive and potentially sensitive to outliers in sparse data scenarios. To overcome the limitations that a lack of out-of-sample records presents, we test our methods using a simulation study, applying proper scoring rules including a computationally efficient approximation to leave-one-out cross-validation using the log score to validate model performance. The result of our analysis is a spatially explicit reconstruction of spatio-temporal temperature from a very sparse historical record.

  1. Let's Keep Our Quality School Principals on the Job

    ERIC Educational Resources Information Center

    Norton, M. Scott

    2003-01-01

    Research studies strongly support the fact that the leadership of the school principal impacts directly on the climate of the school and, in turn, on student achievement. National statistics relating to principal turnover and dwindling supplies of qualified replacements show clearly that principal turnover has reached crisis proportions.…

  2. Multivariate analysis of chromatographic retention data as a supplementary means for grouping structurally related compounds.

    PubMed

    Fasoula, S; Zisi, Ch; Sampsonidis, I; Virgiliou, Ch; Theodoridis, G; Gika, H; Nikitas, P; Pappa-Louisi, A

    2015-03-27

    In the present study a series of 45 metabolite standards belonging to four chemically similar metabolite classes (sugars, amino acids, nucleosides and nucleobases, and amines) was subjected to LC analysis on three HILIC columns under 21 different gradient conditions with the aim to explore whether the retention properties of these analytes are determined from the chemical group they belong. Two multivariate techniques, principal component analysis (PCA) and discriminant analysis (DA), were used for statistical evaluation of the chromatographic data and extraction similarities between chemically related compounds. The total variance explained by the first two principal components of PCA was found to be about 98%, whereas both statistical analyses indicated that all analytes are successfully grouped in four clusters of chemical structure based on the retention obtained in four or at least three chromatographic runs, which, however should be performed on two different HILIC columns. Moreover, leave-one-out cross-validation of the above retention data set showed that the chemical group in which an analyte belongs can be 95.6% correctly predicted when the analyte is subjected to LC analysis under the same four or three experimental conditions as the all set of analytes was run beforehand. That, in turn, may assist with disambiguation of analyte identification in complex biological extracts. Copyright © 2015 Elsevier B.V. All rights reserved.

  3. Detecting subtle hydrochemical anomalies with multivariate statistics: an example from homogeneous groundwaters in the Great Artesian Basin, Australia

    NASA Astrophysics Data System (ADS)

    O'Shea, Bethany; Jankowski, Jerzy

    2006-12-01

    The major ion composition of Great Artesian Basin groundwater in the lower Namoi River valley is relatively homogeneous in chemical composition. Traditional graphical techniques have been combined with multivariate statistical methods to determine whether subtle differences in the chemical composition of these waters can be delineated. Hierarchical cluster analysis and principal components analysis were successful in delineating minor variations within the groundwaters of the study area that were not visually identified in the graphical techniques applied. Hydrochemical interpretation allowed geochemical processes to be identified in each statistically defined water type and illustrated how these groundwaters differ from one another. Three main geochemical processes were identified in the groundwaters: ion exchange, precipitation, and mixing between waters from different sources. Both statistical methods delineated an anomalous sample suspected of being influenced by magmatic CO2 input. The use of statistical methods to complement traditional graphical techniques for waters appearing homogeneous is emphasized for all investigations of this type. Copyright

  4. Measuring Principals' Effectiveness: Results from New Jersey's Principal Evaluation Pilot. REL 2015-089

    ERIC Educational Resources Information Center

    Ross, Christine; Herrmann, Mariesa; Angus, Megan Hague

    2015-01-01

    The purpose of this study was to describe the measures used to evaluate principals in New Jersey in the first (pilot) year of the new principal evaluation system and examine three of the statistical properties of the measures: their variation among principals, their year-to-year stability, and the associations between these measures and the…

  5. Principal Curves on Riemannian Manifolds.

    PubMed

    Hauberg, Soren

    2016-09-01

    Euclidean statistics are often generalized to Riemannian manifolds by replacing straight-line interpolations with geodesic ones. While these Riemannian models are familiar-looking, they are restricted by the inflexibility of geodesics, and they rely on constructions which are optimal only in Euclidean domains. We consider extensions of Principal Component Analysis (PCA) to Riemannian manifolds. Classic Riemannian approaches seek a geodesic curve passing through the mean that optimizes a criteria of interest. The requirements that the solution both is geodesic and must pass through the mean tend to imply that the methods only work well when the manifold is mostly flat within the support of the generating distribution. We argue that instead of generalizing linear Euclidean models, it is more fruitful to generalize non-linear Euclidean models. Specifically, we extend the classic Principal Curves from Hastie & Stuetzle to data residing on a complete Riemannian manifold. We show that for elliptical distributions in the tangent of spaces of constant curvature, the standard principal geodesic is a principal curve. The proposed model is simple to compute and avoids many of the pitfalls of traditional geodesic approaches. We empirically demonstrate the effectiveness of the Riemannian principal curves on several manifolds and datasets.

  6. Materials Approach to Dissecting Surface Responses in the Attachment Stages of Biofouling Organisms

    DTIC Science & Technology

    2016-04-25

    their settlement behavior in regards to the coating surfaces. 5) Multivariate statistical analysis was used to examine the effect (if any) of the...applied to glass rods and were deployed in the field to evaluate settlement preferences. Canonical Analysis of Principal Coordinates were applied to...the influence of coating surface properties on the patterns in settlement observed in the field in the extension of this work over the coming year

  7. Detecting most influencing courses on students grades using block PCA

    NASA Astrophysics Data System (ADS)

    Othman, Osama H.; Gebril, Rami Salah

    2014-12-01

    One of the modern solutions adopted in dealing with the problem of large number of variables in statistical analyses is the Block Principal Component Analysis (Block PCA). This modified technique can be used to reduce the vertical dimension (variables) of the data matrix Xn×p by selecting a smaller number of variables, (say m) containing most of the statistical information. These selected variables can then be employed in further investigations and analyses. Block PCA is an adapted multistage technique of the original PCA. It involves the application of Cluster Analysis (CA) and variable selection throughout sub principal components scores (PC's). The application of Block PCA in this paper is a modified version of the original work of Liu et al (2002). The main objective was to apply PCA on each group of variables, (established using cluster analysis), instead of involving the whole large pack of variables which was proved to be unreliable. In this work, the Block PCA is used to reduce the size of a huge data matrix ((n = 41) × (p = 251)) consisting of Grade Point Average (GPA) of the students in 251 courses (variables) in the faculty of science in Benghazi University. In other words, we are constructing a smaller analytical data matrix of the GPA's of the students with less variables containing most variation (statistical information) in the original database. By applying the Block PCA, (12) courses were found to `absorb' most of the variation or influence from the original data matrix, and hence worth to be keep for future statistical exploring and analytical studies. In addition, the course Independent Study (Math.) was found to be the most influencing course on students GPA among the 12 selected courses.

  8. Investigation of domain walls in PPLN by confocal raman microscopy and PCA analysis

    NASA Astrophysics Data System (ADS)

    Shur, Vladimir Ya.; Zelenovskiy, Pavel; Bourson, Patrice

    2017-07-01

    Confocal Raman microscopy (CRM) is a powerful tool for investigation of ferroelectric domains. Mechanical stresses and electric fields existed in the vicinity of neutral and charged domain walls modify frequency, intensity and width of spectral lines [1], thus allowing to visualize micro- and nanodomain structures both at the surface and in the bulk of the crystal [2,3]. Stresses and fields are naturally coupled in ferroelectrics due to inverse piezoelectric effect and hardly can be separated in Raman spectra. PCA is a powerful statistical method for analysis of large data matrix providing a set of orthogonal variables, called principal components (PCs). PCA is widely used for classification of experimental data, for example, in crystallization experiments, for detection of small amounts of components in solid mixtures etc. [4,5]. In Raman spectroscopy PCA was applied for analysis of phase transitions and provided critical pressure with good accuracy [6]. In the present work we for the first time applied Principal Component Analysis (PCA) method for analysis of Raman spectra measured in periodically poled lithium niobate (PPLN). We found that principal components demonstrate different sensitivity to mechanical stresses and electric fields in the vicinity of the domain walls. This allowed us to separately visualize spatial distribution of fields and electric fields at the surface and in the bulk of PPLN.

  9. Screening molecular associations with lipid membranes using natural abundance 13C cross-polarization magic-angle spinning NMR and principal component analysis.

    PubMed

    Middleton, David A; Hughes, Eleri; Madine, Jillian

    2004-08-11

    We describe an NMR approach for detecting the interactions between phospholipid membranes and proteins, peptides, or small molecules. First, 1H-13C dipolar coupling profiles are obtained from hydrated lipid samples at natural isotope abundance using cross-polarization magic-angle spinning NMR methods. Principal component analysis of dipolar coupling profiles for synthetic lipid membranes in the presence of a range of biologically active additives reveals clusters that relate to different modes of interaction of the additives with the lipid bilayer. Finally, by representing profiles from multiple samples in the form of contour plots, it is possible to reveal statistically significant changes in dipolar couplings, which reflect perturbations in the lipid molecules at the membrane surface or within the hydrophobic interior.

  10. Variability in source sediment contributions by applying different statistic test for a Pyrenean catchment.

    PubMed

    Palazón, L; Navas, A

    2017-06-01

    Information on sediment contribution and transport dynamics from the contributing catchments is needed to develop management plans to tackle environmental problems related with effects of fine sediment as reservoir siltation. In this respect, the fingerprinting technique is an indirect technique known to be valuable and effective for sediment source identification in river catchments. Large variability in sediment delivery was found in previous studies in the Barasona catchment (1509 km 2 , Central Spanish Pyrenees). Simulation results with SWAT and fingerprinting approaches identified badlands and agricultural uses as the main contributors to sediment supply in the reservoir. In this study the <63 μm sediment fraction from the surface reservoir sediments (2 cm) are investigated following the fingerprinting procedure to assess how the use of different statistical procedures affects the amounts of source contributions. Three optimum composite fingerprints were selected to discriminate between source contributions based in land uses/land covers from the same dataset by the application of (1) discriminant function analysis; and its combination (as second step) with (2) Kruskal-Wallis H-test and (3) principal components analysis. Source contribution results were different between assessed options with the greatest differences observed for option using #3, including the two step process: principal components analysis and discriminant function analysis. The characteristics of the solutions by the applied mixing model and the conceptual understanding of the catchment showed that the most reliable solution was achieved using #2, the two step process of Kruskal-Wallis H-test and discriminant function analysis. The assessment showed the importance of the statistical procedure used to define the optimum composite fingerprint for sediment fingerprinting applications. Copyright © 2016 Elsevier Ltd. All rights reserved.

  11. The DSM and the Dangerous School Child

    ERIC Educational Resources Information Center

    Bailey, Simon

    2010-01-01

    The history of ADHD is in part a history of children who have not fitted in at school. Yet until recently, surges in diagnostic levels had not prompted a questioning of the school's complicity in the trend. Through an analysis of the Diagnostic and Statistical Manual for Mental Disorders (DSM), the principal diagnostic guideline relating to ADHD,…

  12. A PRINCIPAL COMPONENT ANALYSIS OF THE CLEAN AIR STATUS AND TRENDS NETWORK (CASTNET) AIR CONCENTRATION DATA

    EPA Science Inventory

    The spatial and temporal variability of ambient air concentrations of SO2, SO42-, NO3, HNO3, and NH4+ obtained from EPA's CASTNet was examined using an objective, statistically based technique...

  13. Pathway analysis with next-generation sequencing data.

    PubMed

    Zhao, Jinying; Zhu, Yun; Boerwinkle, Eric; Xiong, Momiao

    2015-04-01

    Although pathway analysis methods have been developed and successfully applied to association studies of common variants, the statistical methods for pathway-based association analysis of rare variants have not been well developed. Many investigators observed highly inflated false-positive rates and low power in pathway-based tests of association of rare variants. The inflated false-positive rates and low true-positive rates of the current methods are mainly due to their lack of ability to account for gametic phase disequilibrium. To overcome these serious limitations, we develop a novel statistic that is based on the smoothed functional principal component analysis (SFPCA) for pathway association tests with next-generation sequencing data. The developed statistic has the ability to capture position-level variant information and account for gametic phase disequilibrium. By intensive simulations, we demonstrate that the SFPCA-based statistic for testing pathway association with either rare or common or both rare and common variants has the correct type 1 error rates. Also the power of the SFPCA-based statistic and 22 additional existing statistics are evaluated. We found that the SFPCA-based statistic has a much higher power than other existing statistics in all the scenarios considered. To further evaluate its performance, the SFPCA-based statistic is applied to pathway analysis of exome sequencing data in the early-onset myocardial infarction (EOMI) project. We identify three pathways significantly associated with EOMI after the Bonferroni correction. In addition, our preliminary results show that the SFPCA-based statistic has much smaller P-values to identify pathway association than other existing methods.

  14. Short communication: Discrimination between retail bovine milks with different fat contents using chemometrics and fatty acid profiling.

    PubMed

    Vargas-Bello-Pérez, Einar; Toro-Mujica, Paula; Enriquez-Hidalgo, Daniel; Fellenberg, María Angélica; Gómez-Cortés, Pilar

    2017-06-01

    We used a multivariate chemometric approach to differentiate or associate retail bovine milks with different fat contents and non-dairy beverages, using fatty acid profiles and statistical analysis. We collected samples of bovine milk (whole, semi-skim, and skim; n = 62) and non-dairy beverages (n = 27), and we analyzed them using gas-liquid chromatography. Principal component analysis of the fatty acid data yielded 3 significant principal components, which accounted for 72% of the total variance in the data set. Principal component 1 was related to saturated fatty acids (C4:0, C6:0, C8:0, C12:0, C14:0, C17:0, and C18:0) and monounsaturated fatty acids (C14:1 cis-9, C16:1 cis-9, C17:1 cis-9, and C18:1 trans-11); whole milk samples were clearly differentiated from the rest using this principal component. Principal component 2 differentiated semi-skim milk samples by n-3 fatty acid content (C20:3n-3, C20:5n-3, and C22:6n-3). Principal component 3 was related to C18:2 trans-9,trans-12 and C20:4n-6, and its lower scores were observed in skim milk and non-dairy beverages. A cluster analysis yielded 3 groups: group 1 consisted of only whole milk samples, group 2 was represented mainly by semi-skim milks, and group 3 included skim milk and non-dairy beverages. Overall, the present study showed that a multivariate chemometric approach is a useful tool for differentiating or associating retail bovine milks and non-dairy beverages using their fatty acid profile. Copyright © 2017 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  15. Multivariate frequency domain analysis of protein dynamics

    NASA Astrophysics Data System (ADS)

    Matsunaga, Yasuhiro; Fuchigami, Sotaro; Kidera, Akinori

    2009-03-01

    Multivariate frequency domain analysis (MFDA) is proposed to characterize collective vibrational dynamics of protein obtained by a molecular dynamics (MD) simulation. MFDA performs principal component analysis (PCA) for a bandpass filtered multivariate time series using the multitaper method of spectral estimation. By applying MFDA to MD trajectories of bovine pancreatic trypsin inhibitor, we determined the collective vibrational modes in the frequency domain, which were identified by their vibrational frequencies and eigenvectors. At near zero temperature, the vibrational modes determined by MFDA agreed well with those calculated by normal mode analysis. At 300 K, the vibrational modes exhibited characteristic features that were considerably different from the principal modes of the static distribution given by the standard PCA. The influences of aqueous environments were discussed based on two different sets of vibrational modes, one derived from a MD simulation in water and the other from a simulation in vacuum. Using the varimax rotation, an algorithm of the multivariate statistical analysis, the representative orthogonal set of eigenmodes was determined at each vibrational frequency.

  16. Low-Dimensional Statistics of Anatomical Variability via Compact Representation of Image Deformations.

    PubMed

    Zhang, Miaomiao; Wells, William M; Golland, Polina

    2016-10-01

    Using image-based descriptors to investigate clinical hypotheses and therapeutic implications is challenging due to the notorious "curse of dimensionality" coupled with a small sample size. In this paper, we present a low-dimensional analysis of anatomical shape variability in the space of diffeomorphisms and demonstrate its benefits for clinical studies. To combat the high dimensionality of the deformation descriptors, we develop a probabilistic model of principal geodesic analysis in a bandlimited low-dimensional space that still captures the underlying variability of image data. We demonstrate the performance of our model on a set of 3D brain MRI scans from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database. Our model yields a more compact representation of group variation at substantially lower computational cost than models based on the high-dimensional state-of-the-art approaches such as tangent space PCA (TPCA) and probabilistic principal geodesic analysis (PPGA).

  17. Forensic age estimation by morphometric analysis of the manubrium from 3D MR images.

    PubMed

    Martínez Vera, Naira P; Höller, Johannes; Widek, Thomas; Neumayer, Bernhard; Ehammer, Thomas; Urschler, Martin

    2017-08-01

    Forensic age estimation research based on skeletal structures focuses on patterns of growth and development using different bones. In this work, our aim was to study growth-related evolution of the manubrium in living adolescents and young adults using magnetic resonance imaging (MRI), which is an image acquisition modality that does not involve ionizing radiation. In a first step, individual manubrium and subject features were correlated with age, which confirmed a statistically significant change of manubrium volume (M vol :p<0.01, R 2 ¯=0.50) and surface area (M sur :p<0.01, R 2 ¯=0.53) for the studied age range. Additionally, shapes of the manubria were for the first time investigated using principal component analysis. The decomposition of the data in principal components allowed to analyse the contribution of each component to total shape variation. With 13 principal components, ∼96% of shape variation could be described (M shp :p<0.01, R 2 ¯=0.60). Multiple linear regression analysis modelled the relationship between the statistically best correlated variables and age. Models including manubrium shape, volume or surface area divided by the height of the subject (Y∼M shp M sur /S h :p<0.01, R 2 ¯=0.71; Y∼M shp M vol /S h :p<0.01, R 2 ¯=0.72) presented a standard error of estimate of two years. In order to estimate the accuracy of these two manubrium-based age estimation models, cross validation experiments predicting age on held-out test sets were performed. Median absolute difference of predicted and known chronological age was 1.18 years for the best performing model (Y∼M shp M sur /S h :p<0.01, R p 2 =0.67). In conclusion, despite limitations in determining legal majority age, manubrium morphometry analysis presented statistically significant results for skeletal age estimation, which indicates that this bone structure may be considered as a new candidate in multi-factorial MRI-based age estimation. Copyright © 2017 Elsevier B.V. All rights reserved.

  18. Evidence of tampering in watermark identification

    NASA Astrophysics Data System (ADS)

    McLauchlan, Lifford; Mehrübeoglu, Mehrübe

    2009-08-01

    In this work, watermarks are embedded in digital images in the discrete wavelet transform (DWT) domain. Principal component analysis (PCA) is performed on the DWT coefficients. Next higher order statistics based on the principal components and the eigenvalues are determined for different sets of images. Feature sets are analyzed for different types of attacks in m dimensional space. The results demonstrate the separability of the features for the tampered digital copies. Different feature sets are studied to determine more effective tamper evident feature sets. The digital forensics, the probable manipulation(s) or modification(s) performed on the digital information can be identified using the described technique.

  19. Principal components of wrist circumduction from electromagnetic surgical tracking.

    PubMed

    Rasquinha, Brian J; Rainbow, Michael J; Zec, Michelle L; Pichora, David R; Ellis, Randy E

    2017-02-01

    An electromagnetic (EM) surgical tracking system was used for a functionally calibrated kinematic analysis of wrist motion. Circumduction motions were tested for differences in subject gender and for differences in the sense of the circumduction as clockwise or counter-clockwise motion. Twenty subjects were instrumented for EM tracking. Flexion-extension motion was used to identify the functional axis. Subjects performed unconstrained wrist circumduction in a clockwise and counter-clockwise sense. Data were decomposed into orthogonal flexion-extension motions and radial-ulnar deviation motions. PCA was used to concisely represent motions. Nonparametric Wilcoxon tests were used to distinguish the groups. Flexion-extension motions were projected onto a direction axis with a root-mean-square error of [Formula: see text]. Using the first three principal components, there was no statistically significant difference in gender (all [Formula: see text]). For motion sense, radial-ulnar deviation distinguished the sense of circumduction in the first principal component ([Formula: see text]) and in the third principal component ([Formula: see text]); flexion-extension distinguished the sense in the second principal component ([Formula: see text]). The clockwise sense of circumduction could be distinguished by a multifactorial combination of components; there were no gender differences in this small population. These data constitute a baseline for normal wrist circumduction. The multifactorial PCA findings suggest that a higher-dimensional method, such as manifold analysis, may be a more concise way of representing circumduction in human joints.

  20. Describing patterns of weight changes using principal components analysis: results from the Action for Health in Diabetes (Look AHEAD) research group.

    PubMed

    Espeland, Mark A; Bray, George A; Neiberg, Rebecca; Rejeski, W Jack; Knowler, William C; Lang, Wei; Cheskin, Lawrence J; Williamson, Don; Lewis, C Beth; Wing, Rena

    2009-10-01

    To demonstrate how principal components analysis can be used to describe patterns of weight changes in response to an intensive lifestyle intervention. Principal components analysis was applied to monthly percent weight changes measured on 2,485 individuals enrolled in the lifestyle arm of the Action for Health in Diabetes (Look AHEAD) clinical trial. These individuals were 45 to 75 years of age, with type 2 diabetes and body mass indices greater than 25 kg/m(2). Associations between baseline characteristics and weight loss patterns were described using analyses of variance. Three components collectively accounted for 97.0% of total intrasubject variance: a gradually decelerating weight loss (88.8%), early versus late weight loss (6.6%), and a mid-year trough (1.6%). In agreement with previous reports, each of the baseline characteristics we examined had statistically significant relationships with weight loss patterns. As examples, males tended to have a steeper trajectory of percent weight loss and to lose weight more quickly than women. Individuals with higher hemoglobin A(1c) (glycosylated hemoglobin; HbA(1c)) tended to have a flatter trajectory of percent weight loss and to have mid-year troughs in weight loss compared to those with lower HbA(1c). Principal components analysis provided a coherent description of characteristic patterns of weight changes and is a useful vehicle for identifying their correlates and potentially for predicting weight control outcomes.

  1. Breast Shape Analysis With Curvature Estimates and Principal Component Analysis for Cosmetic and Reconstructive Breast Surgery.

    PubMed

    Catanuto, Giuseppe; Taher, Wafa; Rocco, Nicola; Catalano, Francesca; Allegra, Dario; Milotta, Filippo Luigi Maria; Stanco, Filippo; Gallo, Giovanni; Nava, Maurizio Bruno

    2018-03-20

    Breast shape is defined utilizing mainly qualitative assessment (full, flat, ptotic) or estimates, such as volume or distances between reference points, that cannot describe it reliably. We will quantitatively describe breast shape with two parameters derived from a statistical methodology denominated principal component analysis (PCA). We created a heterogeneous dataset of breast shapes acquired with a commercial infrared 3-dimensional scanner on which PCA was performed. We plotted on a Cartesian plane the two highest values of PCA for each breast (principal components 1 and 2). Testing of the methodology on a preoperative and postoperative surgical case and test-retest was performed by two operators. The first two principal components derived from PCA are able to characterize the shape of the breast included in the dataset. The test-retest demonstrated that different operators are able to obtain very similar values of PCA. The system is also able to identify major changes in the preoperative and postoperative stages of a two-stage reconstruction. Even minor changes were correctly detected by the system. This methodology can reliably describe the shape of a breast. An expert operator and a newly trained operator can reach similar results in a test/re-testing validation. Once developed and after further validation, this methodology could be employed as a good tool for outcome evaluation, auditing, and benchmarking.

  2. Real-time monitoring of a coffee roasting process with near infrared spectroscopy using multivariate statistical analysis: A feasibility study.

    PubMed

    Catelani, Tiago A; Santos, João Rodrigo; Páscoa, Ricardo N M J; Pezza, Leonardo; Pezza, Helena R; Lopes, João A

    2018-03-01

    This work proposes the use of near infrared (NIR) spectroscopy in diffuse reflectance mode and multivariate statistical process control (MSPC) based on principal component analysis (PCA) for real-time monitoring of the coffee roasting process. The main objective was the development of a MSPC methodology able to early detect disturbances to the roasting process resourcing to real-time acquisition of NIR spectra. A total of fifteen roasting batches were defined according to an experimental design to develop the MSPC models. This methodology was tested on a set of five batches where disturbances of different nature were imposed to simulate real faulty situations. Some of these batches were used to optimize the model while the remaining was used to test the methodology. A modelling strategy based on a time sliding window provided the best results in terms of distinguishing batches with and without disturbances, resourcing to typical MSPC charts: Hotelling's T 2 and squared predicted error statistics. A PCA model encompassing a time window of four minutes with three principal components was able to efficiently detect all disturbances assayed. NIR spectroscopy combined with the MSPC approach proved to be an adequate auxiliary tool for coffee roasters to detect faults in a conventional roasting process in real-time. Copyright © 2017 Elsevier B.V. All rights reserved.

  3. From Principal Component to Direct Coupling Analysis of Coevolution in Proteins: Low-Eigenvalue Modes are Needed for Structure Prediction

    PubMed Central

    Cocco, Simona; Monasson, Remi; Weigt, Martin

    2013-01-01

    Various approaches have explored the covariation of residues in multiple-sequence alignments of homologous proteins to extract functional and structural information. Among those are principal component analysis (PCA), which identifies the most correlated groups of residues, and direct coupling analysis (DCA), a global inference method based on the maximum entropy principle, which aims at predicting residue-residue contacts. In this paper, inspired by the statistical physics of disordered systems, we introduce the Hopfield-Potts model to naturally interpolate between these two approaches. The Hopfield-Potts model allows us to identify relevant ‘patterns’ of residues from the knowledge of the eigenmodes and eigenvalues of the residue-residue correlation matrix. We show how the computation of such statistical patterns makes it possible to accurately predict residue-residue contacts with a much smaller number of parameters than DCA. This dimensional reduction allows us to avoid overfitting and to extract contact information from multiple-sequence alignments of reduced size. In addition, we show that low-eigenvalue correlation modes, discarded by PCA, are important to recover structural information: the corresponding patterns are highly localized, that is, they are concentrated in few sites, which we find to be in close contact in the three-dimensional protein fold. PMID:23990764

  4. Anomaly Detection in Gamma-Ray Vehicle Spectra with Principal Components Analysis and Mahalanobis Distances

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tardiff, Mark F.; Runkle, Robert C.; Anderson, K. K.

    2006-01-23

    The goal of primary radiation monitoring in support of routine screening and emergency response is to detect characteristics in vehicle radiation signatures that indicate the presence of potential threats. Two conceptual approaches to analyzing gamma-ray spectra for threat detection are isotope identification and anomaly detection. While isotope identification is the time-honored method, an emerging technique is anomaly detection that uses benign vehicle gamma ray signatures to define an expectation of the radiation signature for vehicles that do not pose a threat. Newly acquired spectra are then compared to this expectation using statistical criteria that reflect acceptable false alarm rates andmore » probabilities of detection. The gamma-ray spectra analyzed here were collected at a U.S. land Port of Entry (POE) using a NaI-based radiation portal monitor (RPM). The raw data were analyzed to develop a benign vehicle expectation by decimating the original pulse-height channels to 35 energy bins, extracting composite variables via principal components analysis (PCA), and estimating statistically weighted distances from the mean vehicle spectrum with the mahalanobis distance (MD) metric. This paper reviews the methods used to establish the anomaly identification criteria and presents a systematic analysis of the response of the combined PCA and MD algorithm to modeled mono-energetic gamma-ray sources.« less

  5. Which statistics should tropical biologists learn?

    PubMed

    Loaiza Velásquez, Natalia; González Lutz, María Isabel; Monge-Nájera, Julián

    2011-09-01

    Tropical biologists study the richest and most endangered biodiversity in the planet, and in these times of climate change and mega-extinctions, the need for efficient, good quality research is more pressing than in the past. However, the statistical component in research published by tropical authors sometimes suffers from poor quality in data collection; mediocre or bad experimental design and a rigid and outdated view of data analysis. To suggest improvements in their statistical education, we listed all the statistical tests and other quantitative analyses used in two leading tropical journals, the Revista de Biología Tropical and Biotropica, during a year. The 12 most frequent tests in the articles were: Analysis of Variance (ANOVA), Chi-Square Test, Student's T Test, Linear Regression, Pearson's Correlation Coefficient, Mann-Whitney U Test, Kruskal-Wallis Test, Shannon's Diversity Index, Tukey's Test, Cluster Analysis, Spearman's Rank Correlation Test and Principal Component Analysis. We conclude that statistical education for tropical biologists must abandon the old syllabus based on the mathematical side of statistics and concentrate on the correct selection of these and other procedures and tests, on their biological interpretation and on the use of reliable and friendly freeware. We think that their time will be better spent understanding and protecting tropical ecosystems than trying to learn the mathematical foundations of statistics: in most cases, a well designed one-semester course should be enough for their basic requirements.

  6. Prediction of Knee Joint Contact Forces From External Measures Using Principal Component Prediction and Reconstruction.

    PubMed

    Saliba, Christopher M; Clouthier, Allison L; Brandon, Scott C E; Rainbow, Michael J; Deluzio, Kevin J

    2018-05-29

    Abnormal loading of the knee joint contributes to the pathogenesis of knee osteoarthritis. Gait retraining is a non-invasive intervention that aims to reduce knee loads by providing audible, visual, or haptic feedback of gait parameters. The computational expense of joint contact force prediction has limited real-time feedback to surrogate measures of the contact force, such as the knee adduction moment. We developed a method to predict knee joint contact forces using motion analysis and a statistical regression model that can be implemented in near real-time. Gait waveform variables were deconstructed using principal component analysis and a linear regression was used to predict the principal component scores of the contact force waveforms. Knee joint contact force waveforms were reconstructed using the predicted scores. We tested our method using a heterogenous population of asymptomatic controls and subjects with knee osteoarthritis. The reconstructed contact force waveforms had mean (SD) RMS differences of 0.17 (0.05) bodyweight compared to the contact forces predicted by a musculoskeletal model. Our method successfully predicted subject-specific shape features of contact force waveforms and is a potentially powerful tool in biofeedback and clinical gait analysis.

  7. Nutrition Education in Public Elementary and Secondary Schools. Statistical Analysis Report. Fast Response Survey System (FRSS).

    ERIC Educational Resources Information Center

    Celebuski, Carin; Farris, Elizabeth

    This report presents the findings from the "Nutrition Education in Public Schools, K-12" survey that was designed to provide data on the status of nutrition education in U.S. public schools. Questionnaires were sent to 1,000 school principals of a nationally representative sample of U.S. elementary, middle, and high schools. The survey…

  8. Q-mode versus R-mode principal component analysis for linear discriminant analysis (LDA)

    NASA Astrophysics Data System (ADS)

    Lee, Loong Chuen; Liong, Choong-Yeun; Jemain, Abdul Aziz

    2017-05-01

    Many literature apply Principal Component Analysis (PCA) as either preliminary visualization or variable con-struction methods or both. Focus of PCA can be on the samples (R-mode PCA) or variables (Q-mode PCA). Traditionally, R-mode PCA has been the usual approach to reduce high-dimensionality data before the application of Linear Discriminant Analysis (LDA), to solve classification problems. Output from PCA composed of two new matrices known as loadings and scores matrices. Each matrix can then be used to produce a plot, i.e. loadings plot aids identification of important variables whereas scores plot presents spatial distribution of samples on new axes that are also known as Principal Components (PCs). Fundamentally, the scores matrix always be the input variables for building classification model. A recent paper uses Q-mode PCA but the focus of analysis was not on the variables but instead on the samples. As a result, the authors have exchanged the use of both loadings and scores plots in which clustering of samples was studied using loadings plot whereas scores plot has been used to identify important manifest variables. Therefore, the aim of this study is to statistically validate the proposed practice. Evaluation is based on performance of external error obtained from LDA models according to number of PCs. On top of that, bootstrapping was also conducted to evaluate the external error of each of the LDA models. Results show that LDA models produced by PCs from R-mode PCA give logical performance and the matched external error are also unbiased whereas the ones produced with Q-mode PCA show the opposites. With that, we concluded that PCs produced from Q-mode is not statistically stable and thus should not be applied to problems of classifying samples, but variables. We hope this paper will provide some insights on the disputable issues.

  9. Statistical learning and selective inference.

    PubMed

    Taylor, Jonathan; Tibshirani, Robert J

    2015-06-23

    We describe the problem of "selective inference." This addresses the following challenge: Having mined a set of data to find potential associations, how do we properly assess the strength of these associations? The fact that we have "cherry-picked"--searched for the strongest associations--means that we must set a higher bar for declaring significant the associations that we see. This challenge becomes more important in the era of big data and complex statistical modeling. The cherry tree (dataset) can be very large and the tools for cherry picking (statistical learning methods) are now very sophisticated. We describe some recent new developments in selective inference and illustrate their use in forward stepwise regression, the lasso, and principal components analysis.

  10. Analysis and assessment on heavy metal sources in the coastal soils developed from alluvial deposits using multivariate statistical methods.

    PubMed

    Li, Jinling; He, Ming; Han, Wei; Gu, Yifan

    2009-05-30

    An investigation on heavy metal sources, i.e., Cu, Zn, Ni, Pb, Cr, and Cd in the coastal soils of Shanghai, China, was conducted using multivariate statistical methods (principal component analysis, clustering analysis, and correlation analysis). All the results of the multivariate analysis showed that: (i) Cu, Ni, Pb, and Cd had anthropogenic sources (e.g., overuse of chemical fertilizers and pesticides, industrial and municipal discharges, animal wastes, sewage irrigation, etc.); (ii) Zn and Cr were associated with parent materials and therefore had natural sources (e.g., the weathering process of parent materials and subsequent pedo-genesis due to the alluvial deposits). The effect of heavy metals in the soils was greatly affected by soil formation, atmospheric deposition, and human activities. These findings provided essential information on the possible sources of heavy metals, which would contribute to the monitoring and assessment process of agricultural soils in worldwide regions.

  11. Identification of the isomers using principal component analysis (PCA) method

    NASA Astrophysics Data System (ADS)

    Kepceoǧlu, Abdullah; Gündoǧdu, Yasemin; Ledingham, Kenneth William David; Kilic, Hamdi Sukur

    2016-03-01

    In this work, we have carried out a detailed statistical analysis for experimental data of mass spectra from xylene isomers. Principle Component Analysis (PCA) was used to identify the isomers which cannot be distinguished using conventional statistical methods for interpretation of their mass spectra. Experiments have been carried out using a linear TOF-MS coupled to a femtosecond laser system as an energy source for the ionisation processes. We have performed experiments and collected data which has been analysed and interpreted using PCA as a multivariate analysis of these spectra. This demonstrates the strength of the method to get an insight for distinguishing the isomers which cannot be identified using conventional mass analysis obtained through dissociative ionisation processes on these molecules. The PCA results dependending on the laser pulse energy and the background pressure in the spectrometers have been presented in this work.

  12. Rapid analysis of pharmaceutical drugs using LIBS coupled with multivariate analysis.

    PubMed

    Tiwari, P K; Awasthi, S; Kumar, R; Anand, R K; Rai, P K; Rai, A K

    2018-02-01

    Type 2 diabetes drug tablets containing voglibose having dose strengths of 0.2 and 0.3 mg of various brands have been examined, using laser-induced breakdown spectroscopy (LIBS) technique. The statistical methods such as the principal component analysis (PCA) and the partial least square regression analysis (PLSR) have been employed on LIBS spectral data for classifying and developing the calibration models of drug samples. We have developed the ratio-based calibration model applying PLSR in which relative spectral intensity ratios H/C, H/N and O/N are used. Further, the developed model has been employed to predict the relative concentration of element in unknown drug samples. The experiment has been performed in air and argon atmosphere, respectively, and the obtained results have been compared. The present model provides rapid spectroscopic method for drug analysis with high statistical significance for online control and measurement process in a wide variety of pharmaceutical industrial applications.

  13. Principal components analysis of an evaluation of the hemiplegic subject based on the Bobath approach.

    PubMed

    Corriveau, H; Arsenault, A B; Dutil, E; Lepage, Y

    1992-01-01

    An evaluation based on the Bobath approach to treatment has previously been developed and partially validated. The purpose of the present study was to verify the content validity of this evaluation with the use of a statistical approach known as principal components analysis. Thirty-eight hemiplegic subjects participated in the study. Analysis of the scores on each of six parameters (sensorium, active movements, muscle tone, reflex activity, postural reactions, and pain) was evaluated on three occasions across a 2-month period. Each time this produced three factors that contained 70% of the variation in the data set. The first component mainly reflected variations in mobility, the second mainly variations in muscle tone, and the third mainly variations in sensorium and pain. The results of such exploratory analysis highlight the fact that some of the parameters are not only important but also interrelated. These results seem to partially support the conceptual framework substantiating the Bobath approach to treatment.

  14. Statistical analysis of fNIRS data: a comprehensive review.

    PubMed

    Tak, Sungho; Ye, Jong Chul

    2014-01-15

    Functional near-infrared spectroscopy (fNIRS) is a non-invasive method to measure brain activities using the changes of optical absorption in the brain through the intact skull. fNIRS has many advantages over other neuroimaging modalities such as positron emission tomography (PET), functional magnetic resonance imaging (fMRI), or magnetoencephalography (MEG), since it can directly measure blood oxygenation level changes related to neural activation with high temporal resolution. However, fNIRS signals are highly corrupted by measurement noises and physiology-based systemic interference. Careful statistical analyses are therefore required to extract neuronal activity-related signals from fNIRS data. In this paper, we provide an extensive review of historical developments of statistical analyses of fNIRS signal, which include motion artifact correction, short source-detector separation correction, principal component analysis (PCA)/independent component analysis (ICA), false discovery rate (FDR), serially-correlated errors, as well as inference techniques such as the standard t-test, F-test, analysis of variance (ANOVA), and statistical parameter mapping (SPM) framework. In addition, to provide a unified view of various existing inference techniques, we explain a linear mixed effect model with restricted maximum likelihood (ReML) variance estimation, and show that most of the existing inference methods for fNIRS analysis can be derived as special cases. Some of the open issues in statistical analysis are also described. Copyright © 2013 Elsevier Inc. All rights reserved.

  15. Statistical analysis of aerosol species, trace gasses, and meteorology in Chicago.

    PubMed

    Binaku, Katrina; O'Brien, Timothy; Schmeling, Martina; Fosco, Tinamarie

    2013-09-01

    Both canonical correlation analysis (CCA) and principal component analysis (PCA) were applied to atmospheric aerosol and trace gas concentrations and meteorological data collected in Chicago during the summer months of 2002, 2003, and 2004. Concentrations of ammonium, calcium, nitrate, sulfate, and oxalate particulate matter, as well as, meteorological parameters temperature, wind speed, wind direction, and humidity were subjected to CCA and PCA. Ozone and nitrogen oxide mixing ratios were also included in the data set. The purpose of statistical analysis was to determine the extent of existing linear relationship(s), or lack thereof, between meteorological parameters and pollutant concentrations in addition to reducing dimensionality of the original data to determine sources of pollutants. In CCA, the first three canonical variate pairs derived were statistically significant at the 0.05 level. Canonical correlation between the first canonical variate pair was 0.821, while correlations of the second and third canonical variate pairs were 0.562 and 0.461, respectively. The first canonical variate pair indicated that increasing temperatures resulted in high ozone mixing ratios, while the second canonical variate pair showed wind speed and humidity's influence on local ammonium concentrations. No new information was uncovered in the third variate pair. Canonical loadings were also interpreted for information regarding relationships between data sets. Four principal components (PCs), expressing 77.0 % of original data variance, were derived in PCA. Interpretation of PCs suggested significant production and/or transport of secondary aerosols in the region (PC1). Furthermore, photochemical production of ozone and wind speed's influence on pollutants were expressed (PC2) along with overall measure of local meteorology (PC3). In summary, CCA and PCA results combined were successful in uncovering linear relationships between meteorology and air pollutants in Chicago and aided in determining possible pollutant sources.

  16. Quantitative analysis of fetal facial morphology using 3D ultrasound and statistical shape modeling: a feasibility study.

    PubMed

    Dall'Asta, Andrea; Schievano, Silvia; Bruse, Jan L; Paramasivam, Gowrishankar; Kaihura, Christine Tita; Dunaway, David; Lees, Christoph C

    2017-07-01

    The antenatal detection of facial dysmorphism using 3-dimensional ultrasound may raise the suspicion of an underlying genetic condition but infrequently leads to a definitive antenatal diagnosis. Despite advances in array and noninvasive prenatal testing, not all genetic conditions can be ascertained from such testing. The aim of this study was to investigate the feasibility of quantitative assessment of fetal face features using prenatal 3-dimensional ultrasound volumes and statistical shape modeling. STUDY DESIGN: Thirteen normal and 7 abnormal stored 3-dimensional ultrasound fetal face volumes were analyzed, at a median gestation of 29 +4  weeks (25 +0 to 36 +1 ). The 20 3-dimensional surface meshes generated were aligned and served as input for a statistical shape model, which computed the mean 3-dimensional face shape and 3-dimensional shape variations using principal component analysis. Ten shape modes explained more than 90% of the total shape variability in the population. While the first mode accounted for overall size differences, the second highlighted shape feature changes from an overall proportionate toward a more asymmetric face shape with a wide prominent forehead and an undersized, posteriorly positioned chin. Analysis of the Mahalanobis distance in principal component analysis shape space suggested differences between normal and abnormal fetuses (median and interquartile range distance values, 7.31 ± 5.54 for the normal group vs 13.27 ± 9.82 for the abnormal group) (P = .056). This feasibility study demonstrates that objective characterization and quantification of fetal facial morphology is possible from 3-dimensional ultrasound. This technique has the potential to assist in utero diagnosis, particularly of rare conditions in which facial dysmorphology is a feature. Copyright © 2017 Elsevier Inc. All rights reserved.

  17. The skeletal maturation status estimated by statistical shape analysis: axial images of Japanese cervical vertebra.

    PubMed

    Shin, S M; Kim, Y-I; Choi, Y-S; Yamaguchi, T; Maki, K; Cho, B-H; Park, S-B

    2015-01-01

    To evaluate axial cervical vertebral (ACV) shape quantitatively and to build a prediction model for skeletal maturation level using statistical shape analysis for Japanese individuals. The sample included 24 female and 19 male patients with hand-wrist radiographs and CBCT images. Through generalized Procrustes analysis and principal components (PCs) analysis, the meaningful PCs were extracted from each ACV shape and analysed for the estimation regression model. Each ACV shape had meaningful PCs, except for the second axial cervical vertebra. Based on these models, the smallest prediction intervals (PIs) were from the combination of the shape space PCs, age and gender. Overall, the PIs of the male group were smaller than those of the female group. There was no significant correlation between centroid size as a size factor and skeletal maturation level. Our findings suggest that the ACV maturation method, which was applied by statistical shape analysis, could confirm information about skeletal maturation in Japanese individuals as an available quantifier of skeletal maturation and could be as useful a quantitative method as the skeletal maturation index.

  18. The skeletal maturation status estimated by statistical shape analysis: axial images of Japanese cervical vertebra

    PubMed Central

    Shin, S M; Choi, Y-S; Yamaguchi, T; Maki, K; Cho, B-H; Park, S-B

    2015-01-01

    Objectives: To evaluate axial cervical vertebral (ACV) shape quantitatively and to build a prediction model for skeletal maturation level using statistical shape analysis for Japanese individuals. Methods: The sample included 24 female and 19 male patients with hand–wrist radiographs and CBCT images. Through generalized Procrustes analysis and principal components (PCs) analysis, the meaningful PCs were extracted from each ACV shape and analysed for the estimation regression model. Results: Each ACV shape had meaningful PCs, except for the second axial cervical vertebra. Based on these models, the smallest prediction intervals (PIs) were from the combination of the shape space PCs, age and gender. Overall, the PIs of the male group were smaller than those of the female group. There was no significant correlation between centroid size as a size factor and skeletal maturation level. Conclusions: Our findings suggest that the ACV maturation method, which was applied by statistical shape analysis, could confirm information about skeletal maturation in Japanese individuals as an available quantifier of skeletal maturation and could be as useful a quantitative method as the skeletal maturation index. PMID:25411713

  19. A randomised trial of adaptive pacing therapy, cognitive behaviour therapy, graded exercise, and specialist medical care for chronic fatigue syndrome (PACE): statistical analysis plan

    PubMed Central

    2013-01-01

    Background The publication of protocols by medical journals is increasingly becoming an accepted means for promoting good quality research and maximising transparency. Recently, Finfer and Bellomo have suggested the publication of statistical analysis plans (SAPs).The aim of this paper is to make public and to report in detail the planned analyses that were approved by the Trial Steering Committee in May 2010 for the principal papers of the PACE (Pacing, graded Activity, and Cognitive behaviour therapy: a randomised Evaluation) trial, a treatment trial for chronic fatigue syndrome. It illustrates planned analyses of a complex intervention trial that allows for the impact of clustering by care providers, where multiple care-providers are present for each patient in some but not all arms of the trial. Results The trial design, objectives and data collection are reported. Considerations relating to blinding, samples, adherence to the protocol, stratification, centre and other clustering effects, missing data, multiplicity and compliance are described. Descriptive, interim and final analyses of the primary and secondary outcomes are then outlined. Conclusions This SAP maximises transparency, providing a record of all planned analyses, and it may be a resource for those who are developing SAPs, acting as an illustrative example for teaching and methodological research. It is not the sum of the statistical analysis sections of the principal papers, being completed well before individual papers were drafted. Trial registration ISRCTN54285094 assigned 22 May 2003; First participant was randomised on 18 March 2005. PMID:24225069

  20. Statistical classification of hydrogeologic regions in the fractured rock area of Maryland and parts of the District of Columbia, Virginia, West Virginia, Pennsylvania, and Delaware

    USGS Publications Warehouse

    Fleming, Brandon J.; LaMotte, Andrew E.; Sekellick, Andrew J.

    2013-01-01

    Hydrogeologic regions in the fractured rock area of Maryland were classified using geographic information system tools with principal components and cluster analyses. A study area consisting of the 8-digit Hydrologic Unit Code (HUC) watersheds with rivers that flow through the fractured rock area of Maryland and bounded by the Fall Line was further subdivided into 21,431 catchments from the National Hydrography Dataset Plus. The catchments were then used as a common hydrologic unit to compile relevant climatic, topographic, and geologic variables. A principal components analysis was performed on 10 input variables, and 4 principal components that accounted for 83 percent of the variability in the original data were identified. A subsequent cluster analysis grouped the catchments based on four principal component scores into six hydrogeologic regions. Two crystalline rock hydrogeologic regions, including large parts of the Washington, D.C. and Baltimore metropolitan regions that represent over 50 percent of the fractured rock area of Maryland, are distinguished by differences in recharge, Precipitation minus Potential Evapotranspiration, sand content in soils, and groundwater contributions to streams. This classification system will provide a georeferenced digital hydrogeologic framework for future investigations of groundwater availability in the fractured rock area of Maryland.

  1. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bennett, Janine Camille; Thompson, David; Pebay, Philippe Pierre

    Statistical analysis is typically used to reduce the dimensionality of and infer meaning from data. A key challenge of any statistical analysis package aimed at large-scale, distributed data is to address the orthogonal issues of parallel scalability and numerical stability. Many statistical techniques, e.g., descriptive statistics or principal component analysis, are based on moments and co-moments and, using robust online update formulas, can be computed in an embarrassingly parallel manner, amenable to a map-reduce style implementation. In this paper we focus on contingency tables, through which numerous derived statistics such as joint and marginal probability, point-wise mutual information, information entropy,more » and {chi}{sup 2} independence statistics can be directly obtained. However, contingency tables can become large as data size increases, requiring a correspondingly large amount of communication between processors. This potential increase in communication prevents optimal parallel speedup and is the main difference with moment-based statistics (which we discussed in [1]) where the amount of inter-processor communication is independent of data size. Here we present the design trade-offs which we made to implement the computation of contingency tables in parallel. We also study the parallel speedup and scalability properties of our open source implementation. In particular, we observe optimal speed-up and scalability when the contingency statistics are used in their appropriate context, namely, when the data input is not quasi-diffuse.« less

  2. Quantitation of flavonoid constituents in citrus fruits.

    PubMed

    Kawaii, S; Tomono, Y; Katase, E; Ogawa, K; Yano, M

    1999-09-01

    Twenty-four flavonoids have been determined in 66 Citrus species and near-citrus relatives, grown in the same field and year, by means of reversed phase high-performance liquid chromatography analysis. Statistical methods have been applied to find relations among the species. The F ratios of 21 flavonoids obtained by applying ANOVA analysis are significant, indicating that a classification of the species using these variables is reasonable to pursue. Principal component analysis revealed that the distributions of Citrus species belonging to different classes were largely in accordance with Tanaka's classification system.

  3. Principal Attrition and Mobility: Results from the 2008-09 Principal Follow-Up Survey. First Look. NCES 2010-337

    ERIC Educational Resources Information Center

    Battle, Danielle

    2010-01-01

    While the National Center for Education Statistics (NCES) has conducted surveys of attrition and mobility among school teachers for two decades, little was known about similar movements among school principals. In order to inform discussions and decisions among policymakers, researchers, and parents, the 2008-09 Principal Follow-up Survey (PFS)…

  4. Contribution of botanical origin and sugar composition of honeys on the crystallization phenomenon.

    PubMed

    Escuredo, Olga; Dobre, Irina; Fernández-González, María; Seijo, M Carmen

    2014-04-15

    The present work provides information regarding the statistical relationships among the palynological characteristics, sugars (fructose, glucose, sucrose, melezitose and maltose), moisture content and sugar ratios (F+G, F/G and G/W) of 136 different honey types (including bramble, chestnut, eucalyptus, heather, acacia, lime, rape, sunflower and honeydew). Results of the statistical analyses (multiple comparison Bonferroni test, Spearman rank correlations and principal components) revealed the valuable significance of the botanical origin on the sugar ratios (F+G, F/G and G/W). Brassica napus and Helianthus annuus pollen were the variables situated near F+G and G/W ratio, while Castanea sativa, Rubus and Eucalyptus pollen were located further away, as shown in the principal component analysis. The F/G ratio of sunflower, rape and lime honeys were lower than those found for the chestnut, eucalyptus, heather, acacia and honeydew honeys (>1.4). A lower value F/G ratio and lower water content were related with a faster crystallization in the honey. Copyright © 2013 Elsevier Ltd. All rights reserved.

  5. Cross-visit tumor sub-segmentation and registration with outlier rejection for dynamic contrast-enhanced MRI time series data.

    PubMed

    Buonaccorsi, G A; Rose, C J; O'Connor, J P B; Roberts, C; Watson, Y; Jackson, A; Jayson, G C; Parker, G J M

    2010-01-01

    Clinical trials of anti-angiogenic and vascular-disrupting agents often use biomarkers derived from DCE-MRI, typically reporting whole-tumor summary statistics and so overlooking spatial parameter variations caused by tissue heterogeneity. We present a data-driven segmentation method comprising tracer-kinetic model-driven registration for motion correction, conversion from MR signal intensity to contrast agent concentration for cross-visit normalization, iterative principal components analysis for imputation of missing data and dimensionality reduction, and statistical outlier detection using the minimum covariance determinant to obtain a robust Mahalanobis distance. After applying these techniques we cluster in the principal components space using k-means. We present results from a clinical trial of a VEGF inhibitor, using time-series data selected because of problems due to motion and outlier time series. We obtained spatially-contiguous clusters that map to regions with distinct microvascular characteristics. This methodology has the potential to uncover localized effects in trials using DCE-MRI-based biomarkers.

  6. School food policies and practices: a state-wide survey of secondary school principals.

    PubMed

    French, Simone A; Story, Mary; Fulkerson, Jayne A

    2002-12-01

    To describe food-related policies and practices in secondary schools in Minnesota. Mailed anonymous survey including questions about the secondary school food environment and food-related practices and policies. Members of a statewide professional organization for secondary school principals (n = 610; response rate: 463/610 = 75%). Of the 463 surveys returned, 336 met the eligibility criteria (current position was either principal or assistant principal and school included at least one of the grades of 9 through 12). Descriptive statistics examined the prevalence of specific policies and practices. Chi2 analysis examined associations between policies and practices and school variables. Among principals, 65% believed it was important to have a nutrition policy for the high school; however, only 32% reported a policy at their school. Principals reported positive attitudes about providing a healthful school food environment, but 98% of the schools had soft drink vending machines and 77% had contracts with soft drink companies. Food sold at school fundraisers was most often candy, fruit, and cookies. Dietetics professionals who work in secondary school settings should collaborate with other key school staff members and parents to develop and implement a comprehensive school nutrition policy. Such a policy could foster a school food environment that is supportive of healthful food choices among youth.

  7. Spatial and temporal variability of hyperspectral signatures of terrain

    NASA Astrophysics Data System (ADS)

    Jones, K. F.; Perovich, D. K.; Koenig, G. G.

    2008-04-01

    Electromagnetic signatures of terrain exhibit significant spatial heterogeneity on a range of scales as well as considerable temporal variability. A statistical characterization of the spatial heterogeneity and spatial scaling algorithms of terrain electromagnetic signatures are required to extrapolate measurements to larger scales. Basic terrain elements including bare soil, grass, deciduous, and coniferous trees were studied in a quasi-laboratory setting using instrumented test sites in Hanover, NH and Yuma, AZ. Observations were made using a visible and near infrared spectroradiometer (350 - 2500 nm) and hyperspectral camera (400 - 1100 nm). Results are reported illustrating: i) several difference scenes; ii) a terrain scene time series sampled over an annual cycle; and iii) the detection of artifacts in scenes. A principal component analysis indicated that the first three principal components typically explained between 90 and 99% of the variance of the 30 to 40-channel hyperspectral images. Higher order principal components of hyperspectral images are useful for detecting artifacts in scenes.

  8. Singular spectrum analysis in nonlinear dynamics, with applications to paleoclimatic time series

    NASA Technical Reports Server (NTRS)

    Vautard, R.; Ghil, M.

    1989-01-01

    Two dimensions of a dynamical system given by experimental time series are distinguished. Statistical dimension gives a theoretical upper bound for the minimal number of degrees of freedom required to describe the attractor up to the accuracy of the data, taking into account sampling and noise problems. The dynamical dimension is the intrinsic dimension of the attractor and does not depend on the quality of the data. Singular Spectrum Analysis (SSA) provides estimates of the statistical dimension. SSA also describes the main physical phenomena reflected by the data. It gives adaptive spectral filters associated with the dominant oscillations of the system and clarifies the noise characteristics of the data. SSA is applied to four paleoclimatic records. The principal climatic oscillations and the regime changes in their amplitude are detected. About 10 degrees of freedom are statistically significant in the data. Large noise and insufficient sample length do not allow reliable estimates of the dynamical dimension.

  9. A Comparison of Analytical and Data Preprocessing Methods for Spectral Fingerprinting

    PubMed Central

    LUTHRIA, DEVANAND L.; MUKHOPADHYAY, SUDARSAN; LIN, LONG-ZE; HARNLY, JAMES M.

    2013-01-01

    Spectral fingerprinting, as a method of discriminating between plant cultivars and growing treatments for a common set of broccoli samples, was compared for six analytical instruments. Spectra were acquired for finely powdered solid samples using Fourier transform infrared (FT-IR) and Fourier transform near-infrared (NIR) spectrometry. Spectra were also acquired for unfractionated aqueous methanol extracts of the powders using molecular absorption in the ultraviolet (UV) and visible (VIS) regions and mass spectrometry with negative (MS−) and positive (MS+) ionization. The spectra were analyzed using nested one-way analysis of variance (ANOVA) and principal component analysis (PCA) to statistically evaluate the quality of discrimination. All six methods showed statistically significant differences between the cultivars and treatments. The significance of the statistical tests was improved by the judicious selection of spectral regions (IR and NIR), masses (MS+ and MS−), and derivatives (IR, NIR, UV, and VIS). PMID:21352644

  10. Baseline estimation in flame's spectra by using neural networks and robust statistics

    NASA Astrophysics Data System (ADS)

    Garces, Hugo; Arias, Luis; Rojas, Alejandro

    2014-09-01

    This work presents a baseline estimation method in flame spectra based on artificial intelligence structure as a neural network, combining robust statistics with multivariate analysis to automatically discriminate measured wavelengths belonging to continuous feature for model adaptation, surpassing restriction of measuring target baseline for training. The main contributions of this paper are: to analyze a flame spectra database computing Jolliffe statistics from Principal Components Analysis detecting wavelengths not correlated with most of the measured data corresponding to baseline; to systematically determine the optimal number of neurons in hidden layers based on Akaike's Final Prediction Error; to estimate baseline in full wavelength range sampling measured spectra; and to train an artificial intelligence structure as a Neural Network which allows to generalize the relation between measured and baseline spectra. The main application of our research is to compute total radiation with baseline information, allowing to diagnose combustion process state for optimization in early stages.

  11. Using protein-protein interactions for refining gene networks estimated from microarray data by Bayesian networks.

    PubMed

    Nariai, N; Kim, S; Imoto, S; Miyano, S

    2004-01-01

    We propose a statistical method to estimate gene networks from DNA microarray data and protein-protein interactions. Because physical interactions between proteins or multiprotein complexes are likely to regulate biological processes, using only mRNA expression data is not sufficient for estimating a gene network accurately. Our method adds knowledge about protein-protein interactions to the estimation method of gene networks under a Bayesian statistical framework. In the estimated gene network, a protein complex is modeled as a virtual node based on principal component analysis. We show the effectiveness of the proposed method through the analysis of Saccharomyces cerevisiae cell cycle data. The proposed method improves the accuracy of the estimated gene networks, and successfully identifies some biological facts.

  12. Nature of Driving Force for Protein Folding: A Result From Analyzing the Statistical Potential

    NASA Astrophysics Data System (ADS)

    Li, Hao; Tang, Chao; Wingreen, Ned S.

    1997-07-01

    In a statistical approach to protein structure analysis, Miyazawa and Jernigan derived a 20×20 matrix of inter-residue contact energies between different types of amino acids. Using the method of eigenvalue decomposition, we find that the Miyazawa-Jernigan matrix can be accurately reconstructed from its first two principal component vectors as Mij = C0+C1\\(qi+qj\\)+C2qiqj, with constant C's, and 20 q values associated with the 20 amino acids. This regularity is due to hydrophobic interactions and a force of demixing, the latter obeying Hildebrand's solubility theory of simple liquids.

  13. Evaluating filterability of different types of sludge by statistical analysis: The role of key organic compounds in extracellular polymeric substances.

    PubMed

    Xiao, Keke; Chen, Yun; Jiang, Xie; Zhou, Yan

    2017-03-01

    An investigation was conducted for 20 different types of sludge in order to identify the key organic compounds in extracellular polymeric substances (EPS) that are important in assessing variations of sludge filterability. The different types of sludge varied in initial total solids (TS) content, organic composition and pre-treatment methods. For instance, some of the sludges were pre-treated by acid, ultrasonic, thermal, alkaline, or advanced oxidation technique. The Pearson's correlation results showed significant correlations between sludge filterability and zeta potential, pH, dissolved organic carbon, protein and polysaccharide in soluble EPS (SB EPS), loosely bound EPS (LB EPS) and tightly bound EPS (TB EPS). The principal component analysis (PCA) method was used to further explore correlations between variables and similarities among EPS fractions of different types of sludge. Two principal components were extracted: principal component 1 accounted for 59.24% of total EPS variations, while principal component 2 accounted for 25.46% of total EPS variations. Dissolved organic carbon, protein and polysaccharide in LB EPS showed higher eigenvector projection values than the corresponding compounds in SB EPS and TB EPS in principal component 1. Further characterization of fractionized key organic compounds in LB EPS was conducted with size-exclusion chromatography-organic carbon detection-organic nitrogen detection (LC-OCD-OND). A numerical multiple linear regression model was established to describe relationship between organic compounds in LB EPS and sludge filterability. Copyright © 2016 Elsevier Ltd. All rights reserved.

  14. A Methodology for the Parametric Reconstruction of Non-Steady and Noisy Meteorological Time Series

    NASA Astrophysics Data System (ADS)

    Rovira, F.; Palau, J. L.; Millán, M.

    2009-09-01

    Climatic and meteorological time series often show some persistence (in time) in the variability of certain features. One could regard annual, seasonal and diurnal time variability as trivial persistence in the variability of some meteorological magnitudes (as, e.g., global radiation, air temperature above surface, etc.). In these cases, the traditional Fourier transform into frequency space will show the principal harmonics as the components with the largest amplitude. Nevertheless, meteorological measurements often show other non-steady (in time) variability. Some fluctuations in measurements (at different time scales) are driven by processes that prevail on some days (or months) of the year but disappear on others. By decomposing a time series into time-frequency space through the continuous wavelet transformation, one is able to determine both the dominant modes of variability and how those modes vary in time. This study is based on a numerical methodology to analyse non-steady principal harmonics in noisy meteorological time series. This methodology combines both the continuous wavelet transform and the development of a parametric model that includes the time evolution of the principal and the most statistically significant harmonics of the original time series. The parameterisation scheme proposed in this study consists of reproducing the original time series by means of a statistically significant finite sum of sinusoidal signals (waves), each defined by using the three usual parameters: amplitude, frequency and phase. To ensure the statistical significance of the parametric reconstruction of the original signal, we propose a standard statistical t-student analysis of the confidence level of the amplitude in the parametric spectrum for the different wave components. Once we have assured the level of significance of the different waves composing the parametric model, we can obtain the statistically significant principal harmonics (in time) of the original time series by using the Fourier transform of the modelled signal. Acknowledgements The CEAM Foundation is supported by the Generalitat Valenciana and BANCAIXA (València, Spain). This study has been partially funded by the European Commission (FP VI, Integrated Project CIRCE - No. 036961) and by the Ministerio de Ciencia e Innovación, research projects "TRANSREG” (CGL2007-65359/CLI) and "GRACCIE” (CSD2007-00067, Program CONSOLIDER-INGENIO 2010).

  15. Differential use of fresh water environments by wintering waterfowl of coastal Texas

    USGS Publications Warehouse

    White, D.H.; James, D.

    1978-01-01

    A comparative study of the environmental relationships among 14 species of wintering waterfowl was conducted at the Welder Wildlife Foundation, San Patricia County, near Sinton, Texas during the fall and early winter of 1973. Measurements of 20 environmental factors (social, vegetational, physical, and chemical) were subjected to multivariate statistical methods to determine certain niche characteristics and environmental relationships of waterfowl wintering in the aquatic community.....Each waterfowl species occupied a unique realized niche by responding to distinct combinations of environmental factors identified by principal component analysis. One percent confidence ellipses circumscribing the mean scores plotted for the first and second principal components gave an indication of relative niche width for each species. The waterfowl environments were significantly different interspecifically and water depth at feeding site and % emergent vegetation were most important in the separation. This was shown by subjecting the transformed data to multivariate analysis of variance with an associated step-down procedure. The species were distributed along a community cline extending from shallow water with abundant emergent vegetation to open deep water with little emergent vegetation of any kind. Four waterfowl subgroups were significantly separated along the cline, as indicated by one-way analysis of variance with Duncan?s multiple range test. Clumping of the bird species toward the middle of the available habitat hyperspace was shown in a plot of the principal component scores for the random samples and individual species.....Naturally occurring relationships among waterfowl were clarified using principal comcomponent analysis and related multivariate procedures. These techniques may prove useful in wetland management for particular groups of waterfowl based on habitat preferences.

  16. Preparing Principals To Lead in the New Millennium: A Response to the Leadership Crisis in American Schools.

    ERIC Educational Resources Information Center

    Chirichello, Michael

    There are about 80,000 public school principals in the United States. The Bureau of Labor Statistics estimates there will be a 10 percent increase in the employment of educational administrators of all types through 2006. The National Association of Elementary School Principals estimates that more than 40 percent of principals will retire or leave…

  17. Multivariate statistical analysis of the polyphenolic constituents in kiwifruit juices to trace fruit varieties and geographical origins.

    PubMed

    Guo, Jing; Yuan, Yahong; Dou, Pei; Yue, Tianli

    2017-10-01

    Fifty-one kiwifruit juice samples of seven kiwifruit varieties from five regions in China were analyzed to determine their polyphenols contents and to trace fruit varieties and geographical origins by multivariate statistical analysis. Twenty-one polyphenols belonging to four compound classes were determined by ultra-high-performance liquid chromatography coupled with ultra-high-resolution TOF mass spectrometry. (-)-Epicatechin, (+)-catechin, procyanidin B1 and caffeic acid derivatives were the predominant phenolic compounds in the juices. Principal component analysis (PCA) allowed a clear separation of the juices according to kiwifruit varieties. Stepwise linear discriminant analysis (SLDA) yielded satisfactory categorization of samples, provided 100% success rate according to kiwifruit varieties and 92.2% success rate according to geographical origins. The result showed that polyphenolic profiles of kiwifruit juices contain enough information to trace fruit varieties and geographical origins. Copyright © 2017 Elsevier Ltd. All rights reserved.

  18. Principal Curves and Surfaces

    DTIC Science & Technology

    1984-11-01

    welL The subipace is found by using the usual linear eigenv’ctor solution in th3 new enlarged space. This technique was first suggested by Gnanadesikan ...Wilk (1966, 1968), and a good description can be found in Gnanadesikan (1977). They suggested using polynomial functions’ of the original p co...Heidelberg, Springer Ver- lag. Gnanadesikan , R. (1977), Methods for Statistical Data Analysis of Multivariate Observa- tions, Wiley, New York

  19. The Global Oscillation Network Group site survey. 1: Data collection and analysis methods

    NASA Technical Reports Server (NTRS)

    Hill, Frank; Fischer, George; Grier, Jennifer; Leibacher, John W.; Jones, Harrison B.; Jones, Patricia P.; Kupke, Renate; Stebbins, Robin T.

    1994-01-01

    The Global Oscillation Network Group (GONG) Project is planning to place a set of instruments around the world to observe solar oscillations as continuously as possible for at least three years. The Project has now chosen the sites that will comprise the network. This paper describes the methods of data collection and analysis that were used to make this decision. Solar irradiance data were collected with a one-minute cadence at fifteen sites around the world and analyzed to produce statistics of cloud cover, atmospheric extinction, and transparency power spectra at the individual sites. Nearly 200 reasonable six-site networks were assembled from the individual stations, and a set of statistical measures of the performance of the networks was analyzed using a principal component analysis. An accompanying paper presents the results of the survey.

  20. Computerized system for assessing heart rate variability.

    PubMed

    Frigy, A; Incze, A; Brânzaniuc, E; Cotoi, S

    1996-01-01

    The principal theoretical, methodological and clinical aspects of heart rate variability (HRV) analysis are reviewed. This method has been developed over the last 10 years as a useful noninvasive method of measuring the activity of the autonomic nervous system. The main components and the functioning of the computerized rhythm-analyzer system developed by our team are presented. The system is able to perform short-term (maximum 20 minutes) time domain HRV analysis and statistical analysis of the ventricular rate in any rhythm, particularly in atrial fibrillation. The performances of our system are demonstrated by using the graphics (RR histograms, delta RR histograms, RR scattergrams) and the statistical parameters resulted from the processing of three ECG recordings. These recordings are obtained from a normal subject, from a patient with advanced heart failure, and from a patient with atrial fibrillation.

  1. Integration of statistical and physiological analyses of adaptation of near-isogenic barley lines.

    PubMed

    Romagosa, I; Fox, P N; García Del Moral, L F; Ramos, J M; García Del Moral, B; Roca de Togores, F; Molina-Cano, J L

    1993-08-01

    Seven near-isogenic barley lines, differing for three independent mutant genes, were grown in 15 environments in Spain. Genotype x environment interaction (G x E) for grain yield was examined with the Additive Main Effects and Multiplicative interaction (AMMI) model. The results of this statistical analysis of multilocation yield-data were compared with a morpho-physiological characterization of the lines at two sites (Molina-Cano et al. 1990). The first two principal component axes from the AMMI analysis were strongly associated with the morpho-physiological characters. The independent but parallel discrimination among genotypes reflects genetic differences and highlights the power of the AMMI analysis as a tool to investigate G x E. Characters which appear to be positively associated with yield in the germplasm under study could be identified for some environments.

  2. Review of Statistical Learning Methods in Integrated Omics Studies (An Integrated Information Science).

    PubMed

    Zeng, Irene Sui Lan; Lumley, Thomas

    2018-01-01

    Integrated omics is becoming a new channel for investigating the complex molecular system in modern biological science and sets a foundation for systematic learning for precision medicine. The statistical/machine learning methods that have emerged in the past decade for integrated omics are not only innovative but also multidisciplinary with integrated knowledge in biology, medicine, statistics, machine learning, and artificial intelligence. Here, we review the nontrivial classes of learning methods from the statistical aspects and streamline these learning methods within the statistical learning framework. The intriguing findings from the review are that the methods used are generalizable to other disciplines with complex systematic structure, and the integrated omics is part of an integrated information science which has collated and integrated different types of information for inferences and decision making. We review the statistical learning methods of exploratory and supervised learning from 42 publications. We also discuss the strengths and limitations of the extended principal component analysis, cluster analysis, network analysis, and regression methods. Statistical techniques such as penalization for sparsity induction when there are fewer observations than the number of features and using Bayesian approach when there are prior knowledge to be integrated are also included in the commentary. For the completeness of the review, a table of currently available software and packages from 23 publications for omics are summarized in the appendix.

  3. Detecting transitions in protein dynamics using a recurrence quantification analysis based bootstrap method.

    PubMed

    Karain, Wael I

    2017-11-28

    Proteins undergo conformational transitions over different time scales. These transitions are closely intertwined with the protein's function. Numerous standard techniques such as principal component analysis are used to detect these transitions in molecular dynamics simulations. In this work, we add a new method that has the ability to detect transitions in dynamics based on the recurrences in the dynamical system. It combines bootstrapping and recurrence quantification analysis. We start from the assumption that a protein has a "baseline" recurrence structure over a given period of time. Any statistically significant deviation from this recurrence structure, as inferred from complexity measures provided by recurrence quantification analysis, is considered a transition in the dynamics of the protein. We apply this technique to a 132 ns long molecular dynamics simulation of the β-Lactamase Inhibitory Protein BLIP. We are able to detect conformational transitions in the nanosecond range in the recurrence dynamics of the BLIP protein during the simulation. The results compare favorably to those extracted using the principal component analysis technique. The recurrence quantification analysis based bootstrap technique is able to detect transitions between different dynamics states for a protein over different time scales. It is not limited to linear dynamics regimes, and can be generalized to any time scale. It also has the potential to be used to cluster frames in molecular dynamics trajectories according to the nature of their recurrence dynamics. One shortcoming for this method is the need to have large enough time windows to insure good statistical quality for the recurrence complexity measures needed to detect the transitions.

  4. Rank estimation and the multivariate analysis of in vivo fast-scan cyclic voltammetric data

    PubMed Central

    Keithley, Richard B.; Carelli, Regina M.; Wightman, R. Mark

    2010-01-01

    Principal component regression has been used in the past to separate current contributions from different neuromodulators measured with in vivo fast-scan cyclic voltammetry. Traditionally, a percent cumulative variance approach has been used to determine the rank of the training set voltammetric matrix during model development, however this approach suffers from several disadvantages including the use of arbitrary percentages and the requirement of extreme precision of training sets. Here we propose that Malinowski’s F-test, a method based on a statistical analysis of the variance contained within the training set, can be used to improve factor selection for the analysis of in vivo fast-scan cyclic voltammetric data. These two methods of rank estimation were compared at all steps in the calibration protocol including the number of principal components retained, overall noise levels, model validation as determined using a residual analysis procedure, and predicted concentration information. By analyzing 119 training sets from two different laboratories amassed over several years, we were able to gain insight into the heterogeneity of in vivo fast-scan cyclic voltammetric data and study how differences in factor selection propagate throughout the entire principal component regression analysis procedure. Visualizing cyclic voltammetric representations of the data contained in the retained and discarded principal components showed that using Malinowski’s F-test for rank estimation of in vivo training sets allowed for noise to be more accurately removed. Malinowski’s F-test also improved the robustness of our criterion for judging multivariate model validity, even though signal-to-noise ratios of the data varied. In addition, pH change was the majority noise carrier of in vivo training sets while dopamine prediction was more sensitive to noise. PMID:20527815

  5. Heavy metal contamination of agricultural soils affected by mining activities around the Ganxi River in Chenzhou, Southern China.

    PubMed

    Ma, Li; Sun, Jing; Yang, Zhaoguang; Wang, Lin

    2015-12-01

    Heavy metal contamination attracted a wide spread attention due to their strong toxicity and persistence. The Ganxi River, located in Chenzhou City, Southern China, has been severely polluted by lead/zinc ore mining activities. This work investigated the heavy metal pollution in agricultural soils around the Ganxi River. The total concentrations of heavy metals were determined by inductively coupled plasma-mass spectrometry. The potential risk associated with the heavy metals in soil was assessed by Nemerow comprehensive index and potential ecological risk index. In both methods, the study area was rated as very high risk. Multivariate statistical methods including Pearson's correlation analysis, hierarchical cluster analysis, and principal component analysis were employed to evaluate the relationships between heavy metals, as well as the correlation between heavy metals and pH, to identify the metal sources. Three distinct clusters have been observed by hierarchical cluster analysis. In principal component analysis, a total of two components were extracted to explain over 90% of the total variance, both of which were associated with anthropogenic sources.

  6. Understanding software faults and their role in software reliability modeling

    NASA Technical Reports Server (NTRS)

    Munson, John C.

    1994-01-01

    This study is a direct result of an on-going project to model the reliability of a large real-time control avionics system. In previous modeling efforts with this system, hardware reliability models were applied in modeling the reliability behavior of this system. In an attempt to enhance the performance of the adapted reliability models, certain software attributes were introduced in these models to control for differences between programs and also sequential executions of the same program. As the basic nature of the software attributes that affect software reliability become better understood in the modeling process, this information begins to have important implications on the software development process. A significant problem arises when raw attribute measures are to be used in statistical models as predictors, for example, of measures of software quality. This is because many of the metrics are highly correlated. Consider the two attributes: lines of code, LOC, and number of program statements, Stmts. In this case, it is quite obvious that a program with a high value of LOC probably will also have a relatively high value of Stmts. In the case of low level languages, such as assembly language programs, there might be a one-to-one relationship between the statement count and the lines of code. When there is a complete absence of linear relationship among the metrics, they are said to be orthogonal or uncorrelated. Usually the lack of orthogonality is not serious enough to affect a statistical analysis. However, for the purposes of some statistical analysis such as multiple regression, the software metrics are so strongly interrelated that the regression results may be ambiguous and possibly even misleading. Typically, it is difficult to estimate the unique effects of individual software metrics in the regression equation. The estimated values of the coefficients are very sensitive to slight changes in the data and to the addition or deletion of variables in the regression equation. Since most of the existing metrics have common elements and are linear combinations of these common elements, it seems reasonable to investigate the structure of the underlying common factors or components that make up the raw metrics. The technique we have chosen to use to explore this structure is a procedure called principal components analysis. Principal components analysis is a decomposition technique that may be used to detect and analyze collinearity in software metrics. When confronted with a large number of metrics measuring a single construct, it may be desirable to represent the set by some smaller number of variables that convey all, or most, of the information in the original set. Principal components are linear transformations of a set of random variables that summarize the information contained in the variables. The transformations are chosen so that the first component accounts for the maximal amount of variation of the measures of any possible linear transform; the second component accounts for the maximal amount of residual variation; and so on. The principal components are constructed so that they represent transformed scores on dimensions that are orthogonal. Through the use of principal components analysis, it is possible to have a set of highly related software attributes mapped into a small number of uncorrelated attribute domains. This definitively solves the problem of multi-collinearity in subsequent regression analysis. There are many software metrics in the literature, but principal component analysis reveals that there are few distinct sources of variation, i.e. dimensions, in this set of metrics. It would appear perfectly reasonable to characterize the measurable attributes of a program with a simple function of a small number of orthogonal metrics each of which represents a distinct software attribute domain.

  7. Principal Angle Enrichment Analysis (PAEA): Dimensionally Reduced Multivariate Gene Set Enrichment Analysis Tool

    PubMed Central

    Clark, Neil R.; Szymkiewicz, Maciej; Wang, Zichen; Monteiro, Caroline D.; Jones, Matthew R.; Ma’ayan, Avi

    2016-01-01

    Gene set analysis of differential expression, which identifies collectively differentially expressed gene sets, has become an important tool for biology. The power of this approach lies in its reduction of the dimensionality of the statistical problem and its incorporation of biological interpretation by construction. Many approaches to gene set analysis have been proposed, but benchmarking their performance in the setting of real biological data is difficult due to the lack of a gold standard. In a previously published work we proposed a geometrical approach to differential expression which performed highly in benchmarking tests and compared well to the most popular methods of differential gene expression. As reported, this approach has a natural extension to gene set analysis which we call Principal Angle Enrichment Analysis (PAEA). PAEA employs dimensionality reduction and a multivariate approach for gene set enrichment analysis. However, the performance of this method has not been assessed nor its implementation as a web-based tool. Here we describe new benchmarking protocols for gene set analysis methods and find that PAEA performs highly. The PAEA method is implemented as a user-friendly web-based tool, which contains 70 gene set libraries and is freely available to the community. PMID:26848405

  8. Principal Angle Enrichment Analysis (PAEA): Dimensionally Reduced Multivariate Gene Set Enrichment Analysis Tool.

    PubMed

    Clark, Neil R; Szymkiewicz, Maciej; Wang, Zichen; Monteiro, Caroline D; Jones, Matthew R; Ma'ayan, Avi

    2015-11-01

    Gene set analysis of differential expression, which identifies collectively differentially expressed gene sets, has become an important tool for biology. The power of this approach lies in its reduction of the dimensionality of the statistical problem and its incorporation of biological interpretation by construction. Many approaches to gene set analysis have been proposed, but benchmarking their performance in the setting of real biological data is difficult due to the lack of a gold standard. In a previously published work we proposed a geometrical approach to differential expression which performed highly in benchmarking tests and compared well to the most popular methods of differential gene expression. As reported, this approach has a natural extension to gene set analysis which we call Principal Angle Enrichment Analysis (PAEA). PAEA employs dimensionality reduction and a multivariate approach for gene set enrichment analysis. However, the performance of this method has not been assessed nor its implementation as a web-based tool. Here we describe new benchmarking protocols for gene set analysis methods and find that PAEA performs highly. The PAEA method is implemented as a user-friendly web-based tool, which contains 70 gene set libraries and is freely available to the community.

  9. Analysis of the interannual variability of tropical cyclones striking the California coast based on statistical downscaling

    NASA Astrophysics Data System (ADS)

    Mendez, F. J.; Rueda, A.; Barnard, P.; Mori, N.; Nakajo, S.; Espejo, A.; del Jesus, M.; Diez Sierra, J.; Cofino, A. S.; Camus, P.

    2016-02-01

    Hurricanes hitting California have a very low ocurrence probability due to typically cool ocean temperature and westward tracks. However, damages associated to these improbable events would be dramatic in Southern California and understanding the oceanographic and atmospheric drivers is of paramount importance for coastal risk management for present and future climates. A statistical analysis of the historical events is very difficult due to the limited resolution of atmospheric and oceanographic forcing data available. In this work, we propose a combination of: (a) statistical downscaling methods (Espejo et al, 2015); and (b) a synthetic stochastic tropical cyclone (TC) model (Nakajo et al, 2014). To build the statistical downscaling model, Y=f(X), we apply a combination of principal component analysis and the k-means classification algorithm to find representative patterns from a potential TC index derived from large-scale SST fields in Eastern Central Pacific (predictor X) and the associated tropical cyclone ocurrence (predictand Y). SST data comes from NOAA Extended Reconstructed SST V3b providing information from 1854 to 2013 on a 2.0 degree x 2.0 degree global grid. As data for the historical occurrence and paths of tropical cycloneas are scarce, we apply a stochastic TC model which is based on a Monte Carlo simulation of the joint distribution of track, minimum sea level pressure and translation speed of the historical events in the Eastern Central Pacific Ocean. Results will show the ability of the approach to explain seasonal-to-interannual variability of the predictor X, which is clearly related to El Niño Southern Oscillation. References Espejo, A., Méndez, F.J., Diez, J., Medina, R., Al-Yahyai, S. (2015) Seasonal probabilistic forecasting of tropical cyclone activity in the North Indian Ocean, Journal of Flood Risk Management, DOI: 10.1111/jfr3.12197 Nakajo, S., N. Mori, T. Yasuda, and H. Mase (2014) Global Stochastic Tropical Cyclone Model Based on Principal Component Analysis and Cluster Analysis, Journal of Applied Meteorology and Climatology, DOI: 10.1175/JAMC-D-13-08.1

  10. A Data Analytics Approach to Discovering Unique Microstructural Configurations Susceptible to Fatigue

    NASA Astrophysics Data System (ADS)

    Jha, S. K.; Brockman, R. A.; Hoffman, R. M.; Sinha, V.; Pilchak, A. L.; Porter, W. J.; Buchanan, D. J.; Larsen, J. M.; John, R.

    2018-05-01

    Principal component analysis and fuzzy c-means clustering algorithms were applied to slip-induced strain and geometric metric data in an attempt to discover unique microstructural configurations and their frequencies of occurrence in statistically representative instantiations of a titanium alloy microstructure. Grain-averaged fatigue indicator parameters were calculated for the same instantiation. The fatigue indicator parameters strongly correlated with the spatial location of the microstructural configurations in the principal components space. The fuzzy c-means clustering method identified clusters of data that varied in terms of their average fatigue indicator parameters. Furthermore, the number of points in each cluster was inversely correlated to the average fatigue indicator parameter. This analysis demonstrates that data-driven methods have significant potential for providing unbiased determination of unique microstructural configurations and their frequencies of occurrence in a given volume from the point of view of strain localization and fatigue crack initiation.

  11. Lippia origanoides chemotype differentiation based on essential oil GC-MS and principal component analysis.

    PubMed

    Stashenko, Elena E; Martínez, Jairo R; Ruíz, Carlos A; Arias, Ginna; Durán, Camilo; Salgar, William; Cala, Mónica

    2010-01-01

    Chromatographic (GC/flame ionization detection, GC/MS) and statistical analyses were applied to the study of essential oils and extracts obtained from flowers, leaves, and stems of Lippia origanoides plants, growing wild in different Colombian regions. Retention indices, mass spectra, and standard substances were used in the identification of 139 substances detected in these essential oils and extracts. Principal component analysis allowed L. origanoides classification into three chemotypes, characterized according to their essential oil major components. Alpha- and beta-phellandrenes, p-cymene, and limonene distinguished chemotype A; carvacrol and thymol were the distinctive major components of chemotypes B and C, respectively. Pinocembrin (5,7-dihydroxyflavanone) was found in L. origanoides chemotype A supercritical fluid (CO(2)) extract at a concentration of 0.83+/-0.03 mg/g of dry plant material, which makes this plant an interesting source of an important bioactive flavanone with diverse potential applications in cosmetic, food, and pharmaceutical products.

  12. The use of principal component and cluster analysis to differentiate banana peel flours based on their starch and dietary fibre components.

    PubMed

    Ramli, Saifullah; Ismail, Noryati; Alkarkhi, Abbas Fadhl Mubarek; Easa, Azhar Mat

    2010-08-01

    Banana peel flour (BPF) prepared from green or ripe Cavendish and Dream banana fruits were assessed for their total starch (TS), digestible starch (DS), resistant starch (RS), total dietary fibre (TDF), soluble dietary fibre (SDF) and insoluble dietary fibre (IDF). Principal component analysis (PCA) identified that only 1 component was responsible for 93.74% of the total variance in the starch and dietary fibre components that differentiated ripe and green banana flours. Cluster analysis (CA) applied to similar data obtained two statistically significant clusters (green and ripe bananas) to indicate difference in behaviours according to the stages of ripeness based on starch and dietary fibre components. We concluded that the starch and dietary fibre components could be used to discriminate between flours prepared from peels obtained from fruits of different ripeness. The results were also suggestive of the potential of green and ripe BPF as functional ingredients in food.

  13. The Use of Principal Component and Cluster Analysis to Differentiate Banana Peel Flours Based on Their Starch and Dietary Fibre Components

    PubMed Central

    Ramli, Saifullah; Ismail, Noryati; Alkarkhi, Abbas Fadhl Mubarek; Easa, Azhar Mat

    2010-01-01

    Banana peel flour (BPF) prepared from green or ripe Cavendish and Dream banana fruits were assessed for their total starch (TS), digestible starch (DS), resistant starch (RS), total dietary fibre (TDF), soluble dietary fibre (SDF) and insoluble dietary fibre (IDF). Principal component analysis (PCA) identified that only 1 component was responsible for 93.74% of the total variance in the starch and dietary fibre components that differentiated ripe and green banana flours. Cluster analysis (CA) applied to similar data obtained two statistically significant clusters (green and ripe bananas) to indicate difference in behaviours according to the stages of ripeness based on starch and dietary fibre components. We concluded that the starch and dietary fibre components could be used to discriminate between flours prepared from peels obtained from fruits of different ripeness. The results were also suggestive of the potential of green and ripe BPF as functional ingredients in food. PMID:24575193

  14. Short wavelength Raman spectroscopy applied to the discrimination and characterization of three cultivars of extra virgin olive oils in different maturation stages.

    PubMed

    Gouvinhas, Irene; Machado, Nelson; Carvalho, Teresa; de Almeida, José M M M; Barros, Ana I R N A

    2015-01-01

    Extra virgin olive oils produced from three cultivars on different maturation stages were characterized using Raman spectroscopy. Chemometric methods (principal component analysis, discriminant analysis, principal component regression and partial least squares regression) applied to Raman spectral data were utilized to evaluate and quantify the statistical differences between cultivars and their ripening process. The models for predicting the peroxide value and free acidity of olive oils showed good calibration and prediction values and presented high coefficients of determination (>0.933). Both the R(2), and the correlation equations between the measured chemical parameters, and the values predicted by each approach are presented; these comprehend both PCR and PLS, used to assess SNV normalized Raman data, as well as first and second derivative of the spectra. This study demonstrates that a combination of Raman spectroscopy with multivariate analysis methods can be useful to predict rapidly olive oil chemical characteristics during the maturation process. Copyright © 2014 Elsevier B.V. All rights reserved.

  15. A study of the professional development needs of Shiraz high schools' principals in the area of educational leadership.

    PubMed

    Hayat, Aliasghar; Abdollahi, Bijan; Zainabadi, Hasan Reza; Arasteh, Hamid Reza

    2015-07-01

    The increased emphasis on standards-based school accountability since the passage of the no child left behind act of 2001 is focusing critical attention on the professional development of school principals and their ability to meet the challenges of improving the student outcomes. Due to this subject, the current study examined professional development needs of Shiraz high schools principals. The statistical population consisted of 343 principals of Shiraz high schools, of whom 250 subjects were selected using Krejcie and Morgan (1978) sample size determination table. To collect the data, a questionnaire developed by Salazar (2007) was administered. This questionnaire was designed for professional development in the leadership skills/competencies and consisted of 25 items in each leadership performance domain using five-point Likert-type scales. The content validity of the questionnaire was confirmed and the Cronbach's Alpha coefficient was (a=0.78). To analyze the data, descriptive statistics and Paired-Samples t-test were used. Also, the data was analyzed through SPSS14 software. The findings showed that principals' "Importance" ratings were always higher than their "Actual proficiency" ratings. The mean score of the difference between "Importance" and "Actual proficiency" pair on "Organizing resources" was 2.11, making it the highest "need" area. The lowest need area was "Managing the organization and operational procedures" at 0.81. Also, the results showed that there was a statistically significant difference between the means of the "Importance" and the corresponding means on the "Actual proficiency" (Difference of means=1.48, t=49.38, p<0.001). Based on the obtained results, the most important professional development needs of the principals included organizing resources, resolving complex problems, understanding student development and learning, developing the vision and the mission, building team commitment, understanding measurements, evaluation and assessment strategies, facilitating the change process, solving problems and making decisions. In other words, the principals had statistically significant professional development needs in all areas of the educational leadership. Also, the results suggested that today's school principals need to grow and learn throughout their careers by ongoing professional development.

  16. 15 CFR 758.3 - Responsibilities of parties to the transaction.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... principal party in interest the exporter for EAR purposes. One writing may cover multiple transactions between the same principals. See § 748.4(a)(3) of the EAR. Note to paragraph (b): For statistical purposes.... principal party in interest. For purposes of licensing responsibility under the EAR, the U.S. agent of the...

  17. 15 CFR 758.3 - Responsibilities of parties to the transaction.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... principal party in interest the exporter for EAR purposes. One writing may cover multiple transactions between the same principals. See § 748.4(a)(3) of the EAR. Note to paragraph (b): For statistical purposes.... principal party in interest. For purposes of licensing responsibility under the EAR, the U.S. agent of the...

  18. 15 CFR 758.3 - Responsibilities of parties to the transaction.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... principal party in interest the exporter for EAR purposes. One writing may cover multiple transactions between the same principals. See § 748.4(a)(3) of the EAR. Note to paragraph (b): For statistical purposes.... principal party in interest. For purposes of licensing responsibility under the EAR, the U.S. agent of the...

  19. 15 CFR 758.3 - Responsibilities of parties to the transaction.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... principal party in interest the exporter for EAR purposes. One writing may cover multiple transactions between the same principals. See § 748.4(a)(3) of the EAR. Note to paragraph (b): For statistical purposes.... principal party in interest. For purposes of licensing responsibility under the EAR, the U.S. agent of the...

  20. 15 CFR 758.3 - Responsibilities of parties to the transaction.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... principal party in interest the exporter for EAR purposes. One writing may cover multiple transactions between the same principals. See § 748.4(a)(3) of the EAR. Note to paragraph (b): For statistical purposes.... principal party in interest. For purposes of licensing responsibility under the EAR, the U.S. agent of the...

  1. Comparing and combining biomarkers as principle surrogates for time-to-event clinical endpoints.

    PubMed

    Gabriel, Erin E; Sachs, Michael C; Gilbert, Peter B

    2015-02-10

    Principal surrogate endpoints are useful as targets for phase I and II trials. In many recent trials, multiple post-randomization biomarkers are measured. However, few statistical methods exist for comparison of or combination of biomarkers as principal surrogates, and none of these methods to our knowledge utilize time-to-event clinical endpoint information. We propose a Weibull model extension of the semi-parametric estimated maximum likelihood method that allows for the inclusion of multiple biomarkers in the same risk model as multivariate candidate principal surrogates. We propose several methods for comparing candidate principal surrogates and evaluating multivariate principal surrogates. These include the time-dependent and surrogate-dependent true and false positive fraction, the time-dependent and the integrated standardized total gain, and the cumulative distribution function of the risk difference. We illustrate the operating characteristics of our proposed methods in simulations and outline how these statistics can be used to evaluate and compare candidate principal surrogates. We use these methods to investigate candidate surrogates in the Diabetes Control and Complications Trial. Copyright © 2014 John Wiley & Sons, Ltd.

  2. Comparative forensic soil analysis of New Jersey state parks using a combination of simple techniques with multivariate statistics.

    PubMed

    Bonetti, Jennifer; Quarino, Lawrence

    2014-05-01

    This study has shown that the combination of simple techniques with the use of multivariate statistics offers the potential for the comparative analysis of soil samples. Five samples were obtained from each of twelve state parks across New Jersey in both the summer and fall seasons. Each sample was examined using particle-size distribution, pH analysis in both water and 1 M CaCl2 , and a loss on ignition technique. Data from each of the techniques were combined, and principal component analysis (PCA) and canonical discriminant analysis (CDA) were used for multivariate data transformation. Samples from different locations could be visually differentiated from one another using these multivariate plots. Hold-one-out cross-validation analysis showed error rates as low as 3.33%. Ten blind study samples were analyzed resulting in no misclassifications using Mahalanobis distance calculations and visual examinations of multivariate plots. Seasonal variation was minimal between corresponding samples, suggesting potential success in forensic applications. © 2014 American Academy of Forensic Sciences.

  3. Nonequilibrium Statistical Operator Method and Generalized Kinetic Equations

    NASA Astrophysics Data System (ADS)

    Kuzemsky, A. L.

    2018-01-01

    We consider some principal problems of nonequilibrium statistical thermodynamics in the framework of the Zubarev nonequilibrium statistical operator approach. We present a brief comparative analysis of some approaches to describing irreversible processes based on the concept of nonequilibrium Gibbs ensembles and their applicability to describing nonequilibrium processes. We discuss the derivation of generalized kinetic equations for a system in a heat bath. We obtain and analyze a damped Schrödinger-type equation for a dynamical system in a heat bath. We study the dynamical behavior of a particle in a medium taking the dissipation effects into account. We consider the scattering problem for neutrons in a nonequilibrium medium and derive a generalized Van Hove formula. We show that the nonequilibrium statistical operator method is an effective, convenient tool for describing irreversible processes in condensed matter.

  4. Analysis of select Dalbergia and trade timber using direct analysis in real time and time-of-flight mass spectrometry for CITES enforcement.

    PubMed

    Lancaster, Cady; Espinoza, Edgard

    2012-05-15

    International trade of several Dalbergia wood species is regulated by The Convention on International Trade in Endangered Species of Wild Fauna and Flora (CITES). In order to supplement morphological identification of these species, a rapid chemical method of analysis was developed. Using Direct Analysis in Real Time (DART) ionization coupled with Time-of-Flight (TOF) Mass Spectrometry (MS), selected Dalbergia and common trade species were analyzed. Each of the 13 wood species was classified using principal component analysis and linear discriminant analysis (LDA). These statistical data clusters served as reliable anchors for species identification of unknowns. Analysis of 20 or more samples from the 13 species studied in this research indicates that the DART-TOFMS results are reproducible. Statistical analysis of the most abundant ions gave good classifications that were useful for identifying unknown wood samples. DART-TOFMS and LDA analysis of 13 species of selected timber samples and the statistical classification allowed for the correct assignment of unknown wood samples. This method is rapid and can be useful when anatomical identification is difficult but needed in order to support CITES enforcement. Published 2012. This article is a US Government work and is in the public domain in the USA.

  5. Feature extraction through parallel Probabilistic Principal Component Analysis for heart disease diagnosis

    NASA Astrophysics Data System (ADS)

    Shah, Syed Muhammad Saqlain; Batool, Safeera; Khan, Imran; Ashraf, Muhammad Usman; Abbas, Syed Hussnain; Hussain, Syed Adnan

    2017-09-01

    Automatic diagnosis of human diseases are mostly achieved through decision support systems. The performance of these systems is mainly dependent on the selection of the most relevant features. This becomes harder when the dataset contains missing values for the different features. Probabilistic Principal Component Analysis (PPCA) has reputation to deal with the problem of missing values of attributes. This research presents a methodology which uses the results of medical tests as input, extracts a reduced dimensional feature subset and provides diagnosis of heart disease. The proposed methodology extracts high impact features in new projection by using Probabilistic Principal Component Analysis (PPCA). PPCA extracts projection vectors which contribute in highest covariance and these projection vectors are used to reduce feature dimension. The selection of projection vectors is done through Parallel Analysis (PA). The feature subset with the reduced dimension is provided to radial basis function (RBF) kernel based Support Vector Machines (SVM). The RBF based SVM serves the purpose of classification into two categories i.e., Heart Patient (HP) and Normal Subject (NS). The proposed methodology is evaluated through accuracy, specificity and sensitivity over the three datasets of UCI i.e., Cleveland, Switzerland and Hungarian. The statistical results achieved through the proposed technique are presented in comparison to the existing research showing its impact. The proposed technique achieved an accuracy of 82.18%, 85.82% and 91.30% for Cleveland, Hungarian and Switzerland dataset respectively.

  6. Rare earth element geochemistry of shallow carbonate outcropping strata in Saudi Arabia: Application for depositional environments prediction

    NASA Astrophysics Data System (ADS)

    Eltom, Hassan A.; Abdullatif, Osman M.; Makkawi, Mohammed H.; Eltoum, Isam-Eldin A.

    2017-03-01

    The interpretation of depositional environments provides important information to understand facies distribution and geometry. The classical approach to interpret depositional environments principally relies on the analysis of lithofacies, biofacies and stratigraphic data, among others. An alternative method, based on geochemical data (chemical element data), is advantageous because it can simply, reproducibly and efficiently interpret and refine the interpretation of the depositional environment of carbonate strata. Here we geochemically analyze and statistically model carbonate samples (n = 156) from seven sections of the Arab-D reservoir outcrop analog of central Saudi Arabia, to determine whether the elemental signatures (major, trace and rare earth elements [REEs]) can be effectively used to predict depositional environments. We find that lithofacies associations of the studied outcrop (peritidal to open marine depositional environments) possess altered REE signatures, and that this trend increases stratigraphically from bottom-to-top, which corresponds to an upward shallowing of depositional environments. The relationship between REEs and major, minor and trace elements indicates that contamination by detrital materials is the principal source of REEs, whereas redox condition, marine and diagenetic processes have minimal impact on the relative distribution of REEs in the lithofacies. In a statistical model (factor analysis and logistic regression), REEs, major and trace elements cluster together and serve as markers to differentiate between peritidal and open marine facies and to differentiate between intertidal and subtidal lithofacies within the peritidal facies. The results indicate that statistical modelling of the elemental composition of carbonate strata can be used as a quantitative method to predict depositional environments and regional paleogeography. The significance of this study lies in offering new assessments of the relationships between lithofacies and geochemical elements by using advanced statistical analysis, a method that could be used elsewhere to interpret depositional environment and refine facies models.

  7. How Many Separable Sources? Model Selection In Independent Components Analysis

    PubMed Central

    Woods, Roger P.; Hansen, Lars Kai; Strother, Stephen

    2015-01-01

    Unlike mixtures consisting solely of non-Gaussian sources, mixtures including two or more Gaussian components cannot be separated using standard independent components analysis methods that are based on higher order statistics and independent observations. The mixed Independent Components Analysis/Principal Components Analysis (mixed ICA/PCA) model described here accommodates one or more Gaussian components in the independent components analysis model and uses principal components analysis to characterize contributions from this inseparable Gaussian subspace. Information theory can then be used to select from among potential model categories with differing numbers of Gaussian components. Based on simulation studies, the assumptions and approximations underlying the Akaike Information Criterion do not hold in this setting, even with a very large number of observations. Cross-validation is a suitable, though computationally intensive alternative for model selection. Application of the algorithm is illustrated using Fisher's iris data set and Howells' craniometric data set. Mixed ICA/PCA is of potential interest in any field of scientific investigation where the authenticity of blindly separated non-Gaussian sources might otherwise be questionable. Failure of the Akaike Information Criterion in model selection also has relevance in traditional independent components analysis where all sources are assumed non-Gaussian. PMID:25811988

  8. Spectral discrimination of serum from liver cancer and liver cirrhosis using Raman spectroscopy

    NASA Astrophysics Data System (ADS)

    Yang, Tianyue; Li, Xiaozhou; Yu, Ting; Sun, Ruomin; Li, Siqi

    2011-07-01

    In this paper, Raman spectra of human serum were measured using Raman spectroscopy, then the spectra was analyzed by multivariate statistical methods of principal component analysis (PCA). Then linear discriminant analysis (LDA) was utilized to differentiate the loading score of different diseases as the diagnosing algorithm. Artificial neural network (ANN) was used for cross-validation. The diagnosis sensitivity and specificity by PCA-LDA are 88% and 79%, while that of the PCA-ANN are 89% and 95%. It can be seen that modern analyzing method is a useful tool for the analysis of serum spectra for diagnosing diseases.

  9. Influences of High Quality Army Enlistments

    DTIC Science & Technology

    1987-03-01

    The second component was formed with the Money for College and Unemployment variables. The Kaiser - Meyer - Olkin (KMO) statistics (Norusis, 1985, p.129...advertising variables were in the same component for moot of the subgroups. The Kaiser - Meyer - Olkin (1MO) values for the a6vertising variables were at...one component. The Kaiser - tMeyer- Olkin (KMO) measure of sampling adequacy indicated that principal component analysis may not be appropriate for

  10. National Indian Education Study. Part II: The Educational Experiences of Fourth- and Eighth-Grade American Indian and Alaska Native Students. Statistical Analysis Report. NCES 2007-454

    ERIC Educational Resources Information Center

    Stancavage, Frances B.; Mitchell, Julia H.; de Mello, Victor Bandeira; Gaertner, Freya E.; Spain, Angeline K.; Rahal, Michelle L.

    2006-01-01

    This report presents results from a national survey, conducted in 2005, that examined the educational experiences of American Indian/Alaska Native (AI/AN) students in grades 4 and 8, with particular emphasis on the integration of native language and culture into school and classroom activities. Students, teachers, and school principals all…

  11. Combined data preprocessing and multivariate statistical analysis characterizes fed-batch culture of mouse hybridoma cells for rational medium design.

    PubMed

    Selvarasu, Suresh; Kim, Do Yun; Karimi, Iftekhar A; Lee, Dong-Yup

    2010-10-01

    We present an integrated framework for characterizing fed-batch cultures of mouse hybridoma cells producing monoclonal antibody (mAb). This framework systematically combines data preprocessing, elemental balancing and statistical analysis technique. Initially, specific rates of cell growth, glucose/amino acid consumptions and mAb/metabolite productions were calculated via curve fitting using logistic equations, with subsequent elemental balancing of the preprocessed data indicating the presence of experimental measurement errors. Multivariate statistical analysis was then employed to understand physiological characteristics of the cellular system. The results from principal component analysis (PCA) revealed three major clusters of amino acids with similar trends in their consumption profiles: (i) arginine, threonine and serine, (ii) glycine, tyrosine, phenylalanine, methionine, histidine and asparagine, and (iii) lysine, valine and isoleucine. Further analysis using partial least square (PLS) regression identified key amino acids which were positively or negatively correlated with the cell growth, mAb production and the generation of lactate and ammonia. Based on these results, the optimal concentrations of key amino acids in the feed medium can be inferred, potentially leading to an increase in cell viability and productivity, as well as a decrease in toxic waste production. The study demonstrated how the current methodological framework using multivariate statistical analysis techniques can serve as a potential tool for deriving rational medium design strategies. Copyright © 2010 Elsevier B.V. All rights reserved.

  12. Identification of fungal phytopathogens using Fourier transform infrared-attenuated total reflection spectroscopy and advanced statistical methods

    NASA Astrophysics Data System (ADS)

    Salman, Ahmad; Lapidot, Itshak; Pomerantz, Ami; Tsror, Leah; Shufan, Elad; Moreh, Raymond; Mordechai, Shaul; Huleihel, Mahmoud

    2012-01-01

    The early diagnosis of phytopathogens is of a great importance; it could save large economical losses due to crops damaged by fungal diseases, and prevent unnecessary soil fumigation or the use of fungicides and bactericides and thus prevent considerable environmental pollution. In this study, 18 isolates of three different fungi genera were investigated; six isolates of Colletotrichum coccodes, six isolates of Verticillium dahliae and six isolates of Fusarium oxysporum. Our main goal was to differentiate these fungi samples on the level of isolates, based on their infrared absorption spectra obtained using the Fourier transform infrared-attenuated total reflection (FTIR-ATR) sampling technique. Advanced statistical and mathematical methods: principal component analysis (PCA), linear discriminant analysis (LDA), and k-means were applied to the spectra after manipulation. Our results showed significant spectral differences between the various fungi genera examined. The use of k-means enabled classification between the genera with a 94.5% accuracy, whereas the use of PCA [3 principal components (PCs)] and LDA has achieved a 99.7% success rate. However, on the level of isolates, the best differentiation results were obtained using PCA (9 PCs) and LDA for the lower wavenumber region (800-1775 cm-1), with identification success rates of 87%, 85.5%, and 94.5% for Colletotrichum, Fusarium, and Verticillium strains, respectively.

  13. Principal Component Analysis in Construction of 3D Human Knee Joint Models Using a Statistical Shape Model Method

    PubMed Central

    Tsai, Tsung-Yuan; Li, Jing-Sheng; Wang, Shaobai; Li, Pingyue; Kwon, Young-Min; Li, Guoan

    2013-01-01

    The statistical shape model (SSM) method that uses 2D images of the knee joint to predict the 3D joint surface model has been reported in literature. In this study, we constructed a SSM database using 152 human CT knee joint models, including the femur, tibia and patella and analyzed the characteristics of each principal component of the SSM. The surface models of two in vivo knees were predicted using the SSM and their 2D bi-plane fluoroscopic images. The predicted models were compared to their CT joint models. The differences between the predicted 3D knee joint surfaces and the CT image-based surfaces were 0.30 ± 0.81 mm, 0.34 ± 0.79 mm and 0.36 ± 0.59 mm for the femur, tibia and patella, respectively (average ± standard deviation). The computational time for each bone of the knee joint was within 30 seconds using a personal computer. The analysis of this study indicated that the SSM method could be a useful tool to construct 3D surface models of the knee with sub-millimeter accuracy in real time. Thus it may have a broad application in computer assisted knee surgeries that require 3D surface models of the knee. PMID:24156375

  14. Case study on prediction of remaining methane potential of landfilled municipal solid waste by statistical analysis of waste composition data.

    PubMed

    Sel, İlker; Çakmakcı, Mehmet; Özkaya, Bestamin; Suphi Altan, H

    2016-10-01

    Main objective of this study was to develop a statistical model for easier and faster Biochemical Methane Potential (BMP) prediction of landfilled municipal solid waste by analyzing waste composition of excavated samples from 12 sampling points and three waste depths representing different landfilling ages of closed and active sections of a sanitary landfill site located in İstanbul, Turkey. Results of Principal Component Analysis (PCA) were used as a decision support tool to evaluation and describe the waste composition variables. Four principal component were extracted describing 76% of data set variance. The most effective components were determined as PCB, PO, T, D, W, FM, moisture and BMP for the data set. Multiple Linear Regression (MLR) models were built by original compositional data and transformed data to determine differences. It was observed that even residual plots were better for transformed data the R(2) and Adjusted R(2) values were not improved significantly. The best preliminary BMP prediction models consisted of D, W, T and FM waste fractions for both versions of regressions. Adjusted R(2) values of the raw and transformed models were determined as 0.69 and 0.57, respectively. Copyright © 2016 Elsevier Ltd. All rights reserved.

  15. Trends in groundwater quality in principal aquifers of the United States, 1988-2012

    USGS Publications Warehouse

    Lindsey, Bruce D.; Rupert, Michael G.

    2014-01-01

    The U.S. Geological Survey (USGS) National Water-Quality Assessment (NAWQA) Program analyzed trends in groundwater quality throughout the nation for the sampling period of 1988-2012. Trends were determined for networks (sets of wells routinely monitored by the USGS) for a subset of constituents by statistical analysis of paired water-quality measurements collected on a near-decadal time scale. The data set for chloride, dissolved solids, and nitrate consisted of 1,511 wells in 67 networks, whereas the data set for methyl tert-butyl ether (MTBE) consisted of 1, 013 wells in 46 networks. The 25 principal aquifers represented by these networks account for about 75 percent of withdrawals of groundwater used for drinking-water supply for the nation. Statistically significant changes in chloride, dissolved-solids, or nitrate concentrations were found in many well networks over a decadal period. Concentrations increased significantly in 48 percent of networks for chloride, 42 percent of networks for dissolved solids, and 21 percent of networks for nitrate. Chloride, dissolved solids, and nitrate concentrations decreased significantly in 3, 3, and 10 percent of the networks, respectively. The magnitude of change in concentrations was typically small in most networks; however, the magnitude of change in networks with statistically significant increases was typically much larger than the magnitude of change in networks with statistically significant decreases. The largest increases of chloride concentrations were in urban areas in the northeastern and north central United States. The largest increases of nitrate concentrations were in networks in agricultural areas. Statistical analysis showed 42 or the 46 networks had no statistically significant changes in MTBE concentrations. The four networks with statistically significant changes in MTBE concentrations were in the northeastern United States, where MTBE was widely used. Two networks had increasing concentrations, and two networks had decreasing concentrations. Production and use of MTBE peaked in about 2000 and has been effectively banned in many areas since about 2006. The two networks that had increasing concentrations were sampled for the second time close to the peak of MTBE production, whereas the two networks that had decreasing concentrations were sampled for the second time 10 years after the peak of MTBE production.

  16. Assessing signal-to-noise in quantitative proteomics: multivariate statistical analysis in DIGE experiments.

    PubMed

    Friedman, David B

    2012-01-01

    All quantitative proteomics experiments measure variation between samples. When performing large-scale experiments that involve multiple conditions or treatments, the experimental design should include the appropriate number of individual biological replicates from each condition to enable the distinction between a relevant biological signal from technical noise. Multivariate statistical analyses, such as principal component analysis (PCA), provide a global perspective on experimental variation, thereby enabling the assessment of whether the variation describes the expected biological signal or the unanticipated technical/biological noise inherent in the system. Examples will be shown from high-resolution multivariable DIGE experiments where PCA was instrumental in demonstrating biologically significant variation as well as sample outliers, fouled samples, and overriding technical variation that would not be readily observed using standard univariate tests.

  17. Tree-space statistics and approximations for large-scale analysis of anatomical trees.

    PubMed

    Feragen, Aasa; Owen, Megan; Petersen, Jens; Wille, Mathilde M W; Thomsen, Laura H; Dirksen, Asger; de Bruijne, Marleen

    2013-01-01

    Statistical analysis of anatomical trees is hard to perform due to differences in the topological structure of the trees. In this paper we define statistical properties of leaf-labeled anatomical trees with geometric edge attributes by considering the anatomical trees as points in the geometric space of leaf-labeled trees. This tree-space is a geodesic metric space where any two trees are connected by a unique shortest path, which corresponds to a tree deformation. However, tree-space is not a manifold, and the usual strategy of performing statistical analysis in a tangent space and projecting onto tree-space is not available. Using tree-space and its shortest paths, a variety of statistical properties, such as mean, principal component, hypothesis testing and linear discriminant analysis can be defined. For some of these properties it is still an open problem how to compute them; others (like the mean) can be computed, but efficient alternatives are helpful in speeding up algorithms that use means iteratively, like hypothesis testing. In this paper, we take advantage of a very large dataset (N = 8016) to obtain computable approximations, under the assumption that the data trees parametrize the relevant parts of tree-space well. Using the developed approximate statistics, we illustrate how the structure and geometry of airway trees vary across a population and show that airway trees with Chronic Obstructive Pulmonary Disease come from a different distribution in tree-space than healthy ones. Software is available from http://image.diku.dk/aasa/software.php.

  18. Recharge in semiarid mountain environments

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gross, G.W.

    A systematic investigation of tritium activity in precipitation, surface water, springs, and ground water of the Roswell artesian basin in New Mexico, has been supplemented by hydrogeologic reconnaissance of spring systems; by various statistical correlations and spectral analysis of stream flow and water level records of observation wells; by spring discharge measurements; by stable isotope determinations (oxygen 18 and deuterium); and by numerical modeling of part of the basin. Two recharge contributions to the Principal or Carbonate Aquifer have been distinguished principally on the basis of their tritium label and aquifer response characteristics. Almost all basin waters (including deep groundmore » water) fall close to the meteoric line of hydrogen/oxygen isotope composition, and this rules out a juvenile origin or appreciable bedrock interaction.« less

  19. The Role of School Principals in the Governorate of Ma'an in Promoting Intellectual Security among Students

    ERIC Educational Resources Information Center

    Waswas, Dima; Gasaymeh, Al-Mothana M.

    2017-01-01

    This study aims at identifying the role played by school principals in the Governorate of Ma'an to strengthen intellectual security of the school students; and identifying whether there are statistically significant differences in the roles of principals attributed to the variables: gender, academic level, and years of experience in…

  20. Nature of Driving Force for Protein Folding-- A Result From Analyzing the Statistical Potential

    NASA Astrophysics Data System (ADS)

    Li, Hao; Tang, Chao; Wingreen, Ned S.

    1998-03-01

    In a statistical approach to protein structure analysis, Miyazawa and Jernigan (MJ) derived a 20× 20 matrix of inter-residue contact energies between different types of amino acids. Using the method of eigenvalue decomposition, we find that the MJ matrix can be accurately reconstructed from its first two principal component vectors as M_ij=C_0+C_1(q_i+q_j)+C2 qi q_j, with constant C's, and 20 q values associated with the 20 amino acids. This regularity is due to hydrophobic interactions and a force of demixing, the latter obeying Hildebrand's solubility theory of simple liquids.

  1. Efficient Coding and Statistically Optimal Weighting of Covariance among Acoustic Attributes in Novel Sounds

    PubMed Central

    Stilp, Christian E.; Kluender, Keith R.

    2012-01-01

    To the extent that sensorineural systems are efficient, redundancy should be extracted to optimize transmission of information, but perceptual evidence for this has been limited. Stilp and colleagues recently reported efficient coding of robust correlation (r = .97) among complex acoustic attributes (attack/decay, spectral shape) in novel sounds. Discrimination of sounds orthogonal to the correlation was initially inferior but later comparable to that of sounds obeying the correlation. These effects were attenuated for less-correlated stimuli (r = .54) for reasons that are unclear. Here, statistical properties of correlation among acoustic attributes essential for perceptual organization are investigated. Overall, simple strength of the principal correlation is inadequate to predict listener performance. Initial superiority of discrimination for statistically consistent sound pairs was relatively insensitive to decreased physical acoustic/psychoacoustic range of evidence supporting the correlation, and to more frequent presentations of the same orthogonal test pairs. However, increased range supporting an orthogonal dimension has substantial effects upon perceptual organization. Connectionist simulations and Eigenvalues from closed-form calculations of principal components analysis (PCA) reveal that perceptual organization is near-optimally weighted to shared versus unshared covariance in experienced sound distributions. Implications of reduced perceptual dimensionality for speech perception and plausible neural substrates are discussed. PMID:22292057

  2. Multi-element fingerprinting as a tool in origin authentication of four east China marine species.

    PubMed

    Guo, Lipan; Gong, Like; Yu, Yanlei; Zhang, Hong

    2013-12-01

    The contents of 25 elements in 4 types of commercial marine species from the East China Sea were determined by inductively coupled plasma mass spectrometry and atomic absorption spectrometry. The elemental composition was used to differentiate marine species according to geographical origin by multivariate statistical analysis. The results showed that principal component analysis could distinguish samples from different areas and reveal the elements which played the most important role in origin diversity. The established models by partial least squares discriminant analysis (PLS-DA) and by probabilistic neural network (PNN) can both precisely predict the origin of the marine species. Further study indicated that PLS-DA and PNN were efficacious in regional discrimination. The models from these 2 statistical methods, with an accuracy of 97.92% and 100%, respectively, could both distinguish samples from different areas without the need for species differentiation. © 2013 Institute of Food Technologists®

  3. The Different Paths in the Franchising Entrepreneurship Choice

    NASA Astrophysics Data System (ADS)

    Tomaras, Petros; Konstantopoulos, Nikolaos; Zondiros, Dimitris

    2007-12-01

    This study aims to testify the scientific veracity of the question: is the franchisees' choice on entrepreneurial start-up univocal or many-valued? Two variables are examined by registering daily activities of the entrepreneurial franchisees, as they appear by the answers given to a closed-ended questionnaire. We proceeded with a multiple variable statistical analysis (principal component analysis) of survey data collected from franchisees of a Greece-based franchise system. The results of the research indicate that among different value standards, the entrepreneurs conclude in choosing the franchising.

  4. Characterization of exopolymers of aquatic bacteria by pyrolysis-mass spectrometry

    NASA Technical Reports Server (NTRS)

    Ford, T.; Sacco, E.; Black, J.; Kelley, T.; Goodacre, R.; Berkeley, R. C.; Mitchell, R.

    1991-01-01

    Exopolymers from a diverse collection of marine and freshwater bacteria were characterized by pyrolysis-mass spectrometry (Py-MS). Py-MS provides spectra of pyrolysis fragments that are characteristic of the original material. Analysis of the spectra by multivariate statistical techniques (principal component and canonical variate analysis) separated these exopolymers into distinct groups. Py-MS clearly distinguished characteristic fragments, which may be derived from components responsible for functional differences between polymers. The importance of these distinctions and the relevance of pyrolysis information to exopolysaccharide function in aquatic bacteria is discussed.

  5. A power analysis for multivariate tests of temporal trend in species composition.

    PubMed

    Irvine, Kathryn M; Dinger, Eric C; Sarr, Daniel

    2011-10-01

    Long-term monitoring programs emphasize power analysis as a tool to determine the sampling effort necessary to effectively document ecologically significant changes in ecosystems. Programs that monitor entire multispecies assemblages require a method for determining the power of multivariate statistical models to detect trend. We provide a method to simulate presence-absence species assemblage data that are consistent with increasing or decreasing directional change in species composition within multiple sites. This step is the foundation for using Monte Carlo methods to approximate the power of any multivariate method for detecting temporal trends. We focus on comparing the power of the Mantel test, permutational multivariate analysis of variance, and constrained analysis of principal coordinates. We find that the power of the various methods we investigate is sensitive to the number of species in the community, univariate species patterns, and the number of sites sampled over time. For increasing directional change scenarios, constrained analysis of principal coordinates was as or more powerful than permutational multivariate analysis of variance, the Mantel test was the least powerful. However, in our investigation of decreasing directional change, the Mantel test was typically as or more powerful than the other models.

  6. Hydrometeorological application of an extratropical cyclone classification scheme in the southern United States

    NASA Astrophysics Data System (ADS)

    Senkbeil, J. C.; Brommer, D. M.; Comstock, I. J.; Loyd, T.

    2012-07-01

    Extratropical cyclones (ETCs) in the southern United States are often overlooked when compared with tropical cyclones in the region and ETCs in the northern United States. Although southern ETCs are significant weather events, there is currently not an operational scheme used for identifying and discussing these nameless storms. In this research, we classified 84 ETCs (1970-2009). We manually identified five distinct formation regions and seven unique ETC types using statistical classification. Statistical classification employed the use of principal components analysis and two methods of cluster analysis. Both manual and statistical storm types generally showed positive (negative) relationships with El Niño (La Niña). Manual storm types displayed precipitation swaths consistent with discrete storm tracks which further legitimizes the existence of multiple modes of southern ETCs. Statistical storm types also displayed unique precipitation intensity swaths, but these swaths were less indicative of track location. It is hoped that by classifying southern ETCs into types, that forecasters, hydrologists, and broadcast meteorologists might be able to better anticipate projected amounts of precipitation at their locations.

  7. Cloud Statistics for NASA Climate Change Studies

    NASA Technical Reports Server (NTRS)

    Wylie, Donald P.

    1999-01-01

    The Principal Investigator participated in two field experiments and developed a global data set on cirrus cloud frequency and optical depth to aid the development of numerical models of climate. Four papers were published under this grant. The accomplishments are summarized: (1) In SUCCESS (SUbsonic aircraft: Contrail & Cloud Effects Special Study) the Principal Investigator aided weather forecasters in the start of the field program. A paper also was published on the clouds studied in SUCCESS and the use of the satellite stereographic technique to distinguish cloud forms and heights of clouds. (2) In SHEBA (Surface Heat Budget in the Arctic) FIRE/ACE (Arctic Cloud Experiment) the Principal Investigator provided daily weather and cloud forecasts for four research aircraft crews, NASA's ER-2, UCAR's C-130, University of Washington's Convert 580, and the Canadian Atmospheric Environment Service's Convert 580. Approximately 105 forecasts were written. The Principal Investigator also made daily weather summaries with calculations of air trajectories for 54 flight days in the experiment. The trajectories show where the air sampled during the flights came from and will be used in future publications to discuss the origin and history of the air and clouds sampled by the aircraft. A paper discussing how well the FIRE/ACE data represent normal climatic conditions in the arctic is being prepared. (3) The Principal Investigator's web page became the source of information for weather forecasting by the scientists on the SHEBA ship. (4) Global Cirrus frequency and optical depth is a continuing analysis of global cloud cover and frequency distribution are being made from the NOAA polar orbiting weather satellites. This analysis is sensitive to cirrus clouds because of the radiative channels used. During this grant three papers were published which describe cloud frequencies, their optical properties and compare the Wisconsin FM Cloud Analysis to other global cloud data such as the International Satellite Cloud Climatology Program (ISCCP) and the Stratospheric Aerosol and Gas Experiment (SAGE). A summary of eight years of HIRS data will be published in late 1998. Important information from this study are: 1) cirrus clouds cover most of the earth, 2) they are found about 40% of the time globally, 3) in the tropics cirrus cloud frequencies are even higher, from 80-100%, 4) there is slight evidence that cirnis cloud cover is increasing in the northern hemisphere at about 0.5% per year, and 5) cirrus clouds have an average infrared transmittance of about 40% of the terrestrial radiation. (5) Global Cloud Frequency Statistics published on the Principal Investigator's web page have been used in the planning of the future CRYSTAL experiment and have been used for refinements of a global numerical model operated at the Colorado State University.

  8. Research of facial feature extraction based on MMC

    NASA Astrophysics Data System (ADS)

    Xue, Donglin; Zhao, Jiufen; Tang, Qinhong; Shi, Shaokun

    2017-07-01

    Based on the maximum margin criterion (MMC), a new algorithm of statistically uncorrelated optimal discriminant vectors and a new algorithm of orthogonal optimal discriminant vectors for feature extraction were proposed. The purpose of the maximum margin criterion is to maximize the inter-class scatter while simultaneously minimizing the intra-class scatter after the projection. Compared with original MMC method and principal component analysis (PCA) method, the proposed methods are better in terms of reducing or eliminating the statistically correlation between features and improving recognition rate. The experiment results on Olivetti Research Laboratory (ORL) face database shows that the new feature extraction method of statistically uncorrelated maximum margin criterion (SUMMC) are better in terms of recognition rate and stability. Besides, the relations between maximum margin criterion and Fisher criterion for feature extraction were revealed.

  9. Identifying Nanoscale Structure-Function Relationships Using Multimodal Atomic Force Microscopy, Dimensionality Reduction, and Regression Techniques.

    PubMed

    Kong, Jessica; Giridharagopal, Rajiv; Harrison, Jeffrey S; Ginger, David S

    2018-05-31

    Correlating nanoscale chemical specificity with operational physics is a long-standing goal of functional scanning probe microscopy (SPM). We employ a data analytic approach combining multiple microscopy modes, using compositional information in infrared vibrational excitation maps acquired via photoinduced force microscopy (PiFM) with electrical information from conductive atomic force microscopy. We study a model polymer blend comprising insulating poly(methyl methacrylate) (PMMA) and semiconducting poly(3-hexylthiophene) (P3HT). We show that PiFM spectra are different from FTIR spectra, but can still be used to identify local composition. We use principal component analysis to extract statistically significant principal components and principal component regression to predict local current and identify local polymer composition. In doing so, we observe evidence of semiconducting P3HT within PMMA aggregates. These methods are generalizable to correlated SPM data and provide a meaningful technique for extracting complex compositional information that are impossible to measure from any one technique.

  10. The Perceptions of Principals and Teachers Regarding Mental Health Providers' Impact on Student Achievement in High Poverty Schools

    ERIC Educational Resources Information Center

    Perry, Teresa

    2012-01-01

    This study examined the perceptions of principals and teachers regarding mental health provider's impact on student achievement and behavior in high poverty schools using descriptive statistics, t-test, and two-way ANOVA. Respondents in this study shared similar views concerning principal and teacher satisfaction and levels of support for the…

  11. Principal curvatures and area ratio of propagating surfaces in isotropic turbulence

    NASA Astrophysics Data System (ADS)

    Zheng, Tianhang; You, Jiaping; Yang, Yue

    2017-10-01

    We study the statistics of principal curvatures and the surface area ratio of propagating surfaces with a constant or nonconstant propagating velocity in isotropic turbulence using direct numerical simulation. Propagating surface elements initially constitute a plane to model a planar premixed flame front. When the statistics of evolving propagating surfaces reach the stationary stage, the statistical profiles of principal curvatures scaled by the Kolmogorov length scale versus the constant displacement speed scaled by the Kolmogorov velocity scale collapse at different Reynolds numbers. The magnitude of averaged principal curvatures and the number of surviving surface elements without cusp formation decrease with increasing displacement speed. In addition, the effect of surface stretch on the nonconstant displacement speed inhibits the cusp formation on surface elements at negative Markstein numbers. In order to characterize the wrinkling process of the global propagating surface, we develop a model to demonstrate that the increase of the surface area ratio is primarily due to positive Lagrangian time integrations of the area-weighted averaged tangential strain-rate term and propagation-curvature term. The difference between the negative averaged mean curvature and the positive area-weighted averaged mean curvature characterizes the cellular geometry of the global propagating surface.

  12. On the brain structure heterogeneity of autism: Parsing out acquisition site effects with significance-weighted principal component analysis.

    PubMed

    Martinez-Murcia, Francisco Jesús; Lai, Meng-Chuan; Górriz, Juan Manuel; Ramírez, Javier; Young, Adam M H; Deoni, Sean C L; Ecker, Christine; Lombardo, Michael V; Baron-Cohen, Simon; Murphy, Declan G M; Bullmore, Edward T; Suckling, John

    2017-03-01

    Neuroimaging studies have reported structural and physiological differences that could help understand the causes and development of Autism Spectrum Disorder (ASD). Many of them rely on multisite designs, with the recruitment of larger samples increasing statistical power. However, recent large-scale studies have put some findings into question, considering the results to be strongly dependent on the database used, and demonstrating the substantial heterogeneity within this clinically defined category. One major source of variance may be the acquisition of the data in multiple centres. In this work we analysed the differences found in the multisite, multi-modal neuroimaging database from the UK Medical Research Council Autism Imaging Multicentre Study (MRC AIMS) in terms of both diagnosis and acquisition sites. Since the dissimilarities between sites were higher than between diagnostic groups, we developed a technique called Significance Weighted Principal Component Analysis (SWPCA) to reduce the undesired intensity variance due to acquisition site and to increase the statistical power in detecting group differences. After eliminating site-related variance, statistically significant group differences were found, including Broca's area and the temporo-parietal junction. However, discriminative power was not sufficient to classify diagnostic groups, yielding accuracies results close to random. Our work supports recent claims that ASD is a highly heterogeneous condition that is difficult to globally characterize by neuroimaging, and therefore different (and more homogenous) subgroups should be defined to obtain a deeper understanding of ASD. Hum Brain Mapp 38:1208-1223, 2017. © 2016 Wiley Periodicals, Inc. © 2016 Wiley Periodicals, Inc.

  13. A study of the professional development needs of Shiraz high schools’ principals in the area of educational leadership

    PubMed Central

    HAYAT, ALIASGHAR; ABDOLLAHI, BIJAN; ZAINABADI, HASAN REZA; ARASTEH, HAMID REZA

    2015-01-01

    Introduction The increased emphasis on standards-based school accountability since the passage of the no child left behind act of 2001 is focusing critical attention on the professional development of school principals and their ability to meet the challenges of improving the student outcomes. Due to this subject, the current study examined professional development needs of Shiraz high schools principals. Methods The statistical population consisted of 343 principals of Shiraz high schools, of whom 250 subjects were selected using Krejcie and Morgan (1978) sample size determination table. To collect the data, a questionnaire developed by Salazar (2007) was administered. This questionnaire was designed for professional development in the leadership skills/competencies and consisted of 25 items in each leadership performance domain using five-point Likert-type scales. The content validity of the questionnaire was confirmed and the Cronbach’s Alpha coefficient was (a=0.78). To analyze the data, descriptive statistics and Paired-Samples t-test were used. Also, the data was analyzed through SPSS14 software. Results The findings showed that principals’ “Importance” ratings were always higher than their “Actual proficiency” ratings. The mean score of the difference between “Importance” and “Actual proficiency” pair on “Organizing resources” was 2.11, making it the highest “need” area. The lowest need area was “Managing the organization and operational procedures” at 0.81. Also, the results showed that there was a statistically significant difference between the means of the “Importance” and the corresponding means on the “Actual proficiency” (Difference of means=1.48, t=49.38, p<0.001). Conclusion Based on the obtained results, the most important professional development needs of the principals included organizing resources, resolving complex problems, understanding student development and learning, developing the vision and the mission, building team commitment, understanding measurements, evaluation and assessment strategies, facilitating the change process, solving problems and making decisions. In other words, the principals had statistically significant professional development needs in all areas of the educational leadership. Also, the results suggested that today’s school principals need to grow and learn throughout their careers by ongoing professional development. PMID:26269786

  14. Quality improvement of diagnosis of the electromyography data based on statistical characteristics of the measured signals

    NASA Astrophysics Data System (ADS)

    Selivanova, Karina G.; Avrunin, Oleg G.; Zlepko, Sergii M.; Romanyuk, Sergii O.; Zabolotna, Natalia I.; Kotyra, Andrzej; Komada, Paweł; Smailova, Saule

    2016-09-01

    Research and systematization of motor disorders, taking into account the clinical and neurophysiologic phenomena, are important and actual problem of neurology. The article describes a technique for decomposing surface electromyography (EMG), using Principal Component Analysis. The decomposition is achieved by a set of algorithms that uses a specially developed for analyze EMG. The accuracy was verified by calculation of Mahalanobis distance and Probability error.

  15. Water quality analysis of the Rapur area, Andhra Pradesh, South India using multivariate techniques

    NASA Astrophysics Data System (ADS)

    Nagaraju, A.; Sreedhar, Y.; Thejaswi, A.; Sayadi, Mohammad Hossein

    2017-10-01

    The groundwater samples from Rapur area were collected from different sites to evaluate the major ion chemistry. The large number of data can lead to difficulties in the integration, interpretation, and representation of the results. Two multivariate statistical methods, hierarchical cluster analysis (HCA) and factor analysis (FA), were applied to evaluate their usefulness to classify and identify geochemical processes controlling groundwater geochemistry. Four statistically significant clusters were obtained from 30 sampling stations. This has resulted two important clusters viz., cluster 1 (pH, Si, CO3, Mg, SO4, Ca, K, HCO3, alkalinity, Na, Na + K, Cl, and hardness) and cluster 2 (EC and TDS) which are released to the study area from different sources. The application of different multivariate statistical techniques, such as principal component analysis (PCA), assists in the interpretation of complex data matrices for a better understanding of water quality of a study area. From PCA, it is clear that the first factor (factor 1), accounted for 36.2% of the total variance, was high positive loading in EC, Mg, Cl, TDS, and hardness. Based on the PCA scores, four significant cluster groups of sampling locations were detected on the basis of similarity of their water quality.

  16. Chemical modeling of groundwater in the Banat Plain, southwestern Romania, with elevated As content and co-occurring species by combining diagrams and unsupervised multivariate statistical approaches.

    PubMed

    Butaciu, Sinziana; Senila, Marin; Sarbu, Costel; Ponta, Michaela; Tanaselia, Claudiu; Cadar, Oana; Roman, Marius; Radu, Emil; Sima, Mihaela; Frentiu, Tiberiu

    2017-04-01

    The study proposes a combined model based on diagrams (Gibbs, Piper, Stuyfzand Hydrogeochemical Classification System) and unsupervised statistical approaches (Cluster Analysis, Principal Component Analysis, Fuzzy Principal Component Analysis, Fuzzy Hierarchical Cross-Clustering) to describe natural enrichment of inorganic arsenic and co-occurring species in groundwater in the Banat Plain, southwestern Romania. Speciation of inorganic As (arsenite, arsenate), ion concentrations (Na + , K + , Ca 2+ , Mg 2+ , HCO 3 - , Cl - , F - , SO 4 2- , PO 4 3- , NO 3 - ), pH, redox potential, conductivity and total dissolved substances were performed. Classical diagrams provided the hydrochemical characterization, while statistical approaches were helpful to establish (i) the mechanism of naturally occurring of As and F - species and the anthropogenic one for NO 3 - , SO 4 2- , PO 4 3- and K + and (ii) classification of groundwater based on content of arsenic species. The HCO 3 - type of local groundwater and alkaline pH (8.31-8.49) were found to be responsible for the enrichment of arsenic species and occurrence of F - but by different paths. The PO 4 3- -AsO 4 3- ion exchange, water-rock interaction (silicates hydrolysis and desorption from clay) were associated to arsenate enrichment in the oxidizing aquifer. Fuzzy Hierarchical Cross-Clustering was the strongest tool for the rapid simultaneous classification of groundwaters as a function of arsenic content and hydrogeochemical characteristics. The approach indicated the Na + -F - -pH cluster as marker for groundwater with naturally elevated As and highlighted which parameters need to be monitored. A chemical conceptual model illustrating the natural and anthropogenic paths and enrichment of As and co-occurring species in the local groundwater supported by mineralogical analysis of rocks was established. Copyright © 2016 Elsevier Ltd. All rights reserved.

  17. Relating N2O emissions during biological nitrogen removal with operating conditions using multivariate statistical techniques.

    PubMed

    Vasilaki, V; Volcke, E I P; Nandi, A K; van Loosdrecht, M C M; Katsou, E

    2018-04-26

    Multivariate statistical analysis was applied to investigate the dependencies and underlying patterns between N 2 O emissions and online operational variables (dissolved oxygen and nitrogen component concentrations, temperature and influent flow-rate) during biological nitrogen removal from wastewater. The system under study was a full-scale reactor, for which hourly sensor data were available. The 15-month long monitoring campaign was divided into 10 sub-periods based on the profile of N 2 O emissions, using Binary Segmentation. The dependencies between operating variables and N 2 O emissions fluctuated according to Spearman's rank correlation. The correlation between N 2 O emissions and nitrite concentrations ranged between 0.51 and 0.78. Correlation >0.7 between N 2 O emissions and nitrate concentrations was observed at sub-periods with average temperature lower than 12 °C. Hierarchical k-means clustering and principal component analysis linked N 2 O emission peaks with precipitation events and ammonium concentrations higher than 2 mg/L, especially in sub-periods characterized by low N 2 O fluxes. Additionally, the highest ranges of measured N 2 O fluxes belonged to clusters corresponding with NO 3 -N concentration less than 1 mg/L in the upstream plug-flow reactor (middle of oxic zone), indicating slow nitrification rates. The results showed that the range of N 2 O emissions partially depends on the prior behavior of the system. The principal component analysis validated the findings from the clustering analysis and showed that ammonium, nitrate, nitrite and temperature explained a considerable percentage of the variance in the system for the majority of the sub-periods. The applied statistical methods, linked the different ranges of emissions with the system variables, provided insights on the effect of operating conditions on N 2 O emissions in each sub-period and can be integrated into N 2 O emissions data processing at wastewater treatment plants. Copyright © 2018. Published by Elsevier Ltd.

  18. Application of multivariate statistical analysis concepts for assessment of hydrogeochemistry of groundwater—a study in Suri I and II blocks of Birbhum District, West Bengal, India

    NASA Astrophysics Data System (ADS)

    Das, Shreya; Nag, S. K.

    2017-05-01

    Multivariate statistical techniques, cluster and principal component analysis were applied to the data on groundwater quality of Suri I and II Blocks of Birbhum District, West Bengal, India, to extract principal factors corresponding to the different sources of variation in the hydrochemistry as well as the main controls on the hydrochemistry. For this, bore well water samples have been collected in two phases, during Post-monsoon (November 2012) and Pre-monsoon (April 2013) from 26 sampling locations spread homogeneously over the two blocks. Excess fluoride in groundwater has been reported at two locations both in post- and in pre-monsoon sessions, with a rise observed in pre-monsoon. Localized presence of excess iron has also been observed during both sessions. The water is found to be mildly alkaline in post-monsoon but slightly acidic at some locations during pre-monsoon. Correlation and cluster analysis studies demonstrate that fluoride shares a moderately positive correlation with pH in post-monsoon and a very strong one with carbonate in pre-monsoon indicating dominance of rock water interaction and ion exchange activity in the study area. Certain locations in the study area have been reported with less than 0.6 mg/l fluoride in groundwater, leading to possibility of occurrence of severe dental caries especially in children. Low values of sulfate and phosphate in water indicate a meager chance of contamination of groundwater due to anthropogenic factors.

  19. FT-IR spectroscopy and multivariate analysis as an auxiliary tool for diagnosis of mental disorders: Bipolar and schizophrenia cases

    NASA Astrophysics Data System (ADS)

    Ogruc Ildiz, G.; Arslan, M.; Unsalan, O.; Araujo-Andrade, C.; Kurt, E.; Karatepe, H. T.; Yilmaz, A.; Yalcinkaya, O. B.; Herken, H.

    2016-01-01

    In this study, a methodology based on Fourier-transform infrared spectroscopy and principal component analysis and partial least square methods is proposed for the analysis of blood plasma samples in order to identify spectral changes correlated with some biomarkers associated with schizophrenia and bipolarity. Our main goal was to use the spectral information for the calibration of statistical models to discriminate and classify blood plasma samples belonging to bipolar and schizophrenic patients. IR spectra of 30 samples of blood plasma obtained from each, bipolar and schizophrenic patients and healthy control group were collected. The results obtained from principal component analysis (PCA) show a clear discrimination between the bipolar (BP), schizophrenic (SZ) and control group' (CG) blood samples that also give possibility to identify three main regions that show the major differences correlated with both mental disorders (biomarkers). Furthermore, a model for the classification of the blood samples was calibrated using partial least square discriminant analysis (PLS-DA), allowing the correct classification of BP, SZ and CG samples. The results obtained applying this methodology suggest that it can be used as a complimentary diagnostic tool for the detection and discrimination of these mental diseases.

  20. Quantifying Individual Brain Connectivity with Functional Principal Component Analysis for Networks.

    PubMed

    Petersen, Alexander; Zhao, Jianyang; Carmichael, Owen; Müller, Hans-Georg

    2016-09-01

    In typical functional connectivity studies, connections between voxels or regions in the brain are represented as edges in a network. Networks for different subjects are constructed at a given graph density and are summarized by some network measure such as path length. Examining these summary measures for many density values yields samples of connectivity curves, one for each individual. This has led to the adoption of basic tools of functional data analysis, most commonly to compare control and disease groups through the average curves in each group. Such group differences, however, neglect the variability in the sample of connectivity curves. In this article, the use of functional principal component analysis (FPCA) is demonstrated to enrich functional connectivity studies by providing increased power and flexibility for statistical inference. Specifically, individual connectivity curves are related to individual characteristics such as age and measures of cognitive function, thus providing a tool to relate brain connectivity with these variables at the individual level. This individual level analysis opens a new perspective that goes beyond previous group level comparisons. Using a large data set of resting-state functional magnetic resonance imaging scans, relationships between connectivity and two measures of cognitive function-episodic memory and executive function-were investigated. The group-based approach was implemented by dichotomizing the continuous cognitive variable and testing for group differences, resulting in no statistically significant findings. To demonstrate the new approach, FPCA was implemented, followed by linear regression models with cognitive scores as responses, identifying significant associations of connectivity in the right middle temporal region with both cognitive scores.

  1. ICAP - An Interactive Cluster Analysis Procedure for analyzing remotely sensed data

    NASA Technical Reports Server (NTRS)

    Wharton, S. W.; Turner, B. J.

    1981-01-01

    An Interactive Cluster Analysis Procedure (ICAP) was developed to derive classifier training statistics from remotely sensed data. ICAP differs from conventional clustering algorithms by allowing the analyst to optimize the cluster configuration by inspection, rather than by manipulating process parameters. Control of the clustering process alternates between the algorithm, which creates new centroids and forms clusters, and the analyst, who can evaluate and elect to modify the cluster structure. Clusters can be deleted, or lumped together pairwise, or new centroids can be added. A summary of the cluster statistics can be requested to facilitate cluster manipulation. The principal advantage of this approach is that it allows prior information (when available) to be used directly in the analysis, since the analyst interacts with ICAP in a straightforward manner, using basic terms with which he is more likely to be familiar. Results from testing ICAP showed that an informed use of ICAP can improve classification, as compared to an existing cluster analysis procedure.

  2. Modeling and Prediction of Monthly Total Ozone Concentrations by Use of an Artificial Neural Network Based on Principal Component Analysis

    NASA Astrophysics Data System (ADS)

    Chattopadhyay, Surajit; Chattopadhyay, Goutami

    2012-10-01

    In the work discussed in this paper we considered total ozone time series over Kolkata (22°34'10.92″N, 88°22'10.92″E), an urban area in eastern India. Using cloud cover, average temperature, and rainfall as the predictors, we developed an artificial neural network, in the form of a multilayer perceptron with sigmoid non-linearity, for prediction of monthly total ozone concentrations from values of the predictors in previous months. We also estimated total ozone from values of the predictors in the same month. Before development of the neural network model we removed multicollinearity by means of principal component analysis. On the basis of the variables extracted by principal component analysis, we developed three artificial neural network models. By rigorous statistical assessment it was found that cloud cover and rainfall can act as good predictors for monthly total ozone when they are considered as the set of input variables for the neural network model constructed in the form of a multilayer perceptron. In general, the artificial neural network has good potential for predicting and estimating monthly total ozone on the basis of the meteorological predictors. It was further observed that during pre-monsoon and winter seasons, the proposed models perform better than during and after the monsoon.

  3. An Examination of Potential Racial and Gender Bias in the Principal Version of the Interactive Computer Interview System

    ERIC Educational Resources Information Center

    DiPonio, Joseph M.

    2010-01-01

    The primary object of this study was to determine whether racial and/or gender bias were evidenced in the use of the ICIS-Principal. Specifically, will the use of the ICIS-Principal result in biased scores at a statistically significant level when rating current practicing administrators of varying gender and race. The study involved simulated…

  4. A Study of Strengths and Weaknesses of Descriptive Assessment from Principals, Teachers and Experts Points of View in Chaharmahal and Bakhteyari Primary Schools

    ERIC Educational Resources Information Center

    Sharief, Mostafa; Naderi, Mahin; Hiedari, Maryam Shoja; Roodbari, Omolbanin; Jalilvand, Mohammad Reza

    2012-01-01

    The aim of current study is to determine the strengths and weaknesses of descriptive evaluation from the viewpoint of principals, teachers and experts of Chaharmahal and Bakhtiari province. A descriptive survey was performed. Statistical population includes 208 principals, 303 teachers, and 100 executive experts of descriptive evaluation scheme in…

  5. A Study on the Attitudes of Students, Instructors, and Educational Principals to Electronic Administration of Final-Semester Examinations in Payame Noor University in Iran

    ERIC Educational Resources Information Center

    Omidian, Faranak; Nedayeh Ali, Farzaneh

    2015-01-01

    The aim of this study was to investigate the attitudes of students, instructors, and educational principals to electronic administration of final-semester examinations at undergraduate and post- graduate levels in Payame Noor University in Khuzestan. The statistical population of this study consisted of all educational principals, instructors, of…

  6. Defining the ecological hydrology of Taiwan Rivers using multivariate statistical methods

    NASA Astrophysics Data System (ADS)

    Chang, Fi-John; Wu, Tzu-Ching; Tsai, Wen-Ping; Herricks, Edwin E.

    2009-09-01

    SummaryThe identification and verification of ecohydrologic flow indicators has found new support as the importance of ecological flow regimes is recognized in modern water resources management, particularly in river restoration and reservoir management. An ecohydrologic indicator system reflecting the unique characteristics of Taiwan's water resources and hydrology has been developed, the Taiwan ecohydrological indicator system (TEIS). A major challenge for the water resources community is using the TEIS to provide environmental flow rules that improve existing water resources management. This paper examines data from the extensive network of flow monitoring stations in Taiwan using TEIS statistics to define and refine environmental flow options in Taiwan. Multivariate statistical methods were used to examine TEIS statistics for 102 stations representing the geographic and land use diversity of Taiwan. The Pearson correlation coefficient showed high multicollinearity between the TEIS statistics. Watersheds were separated into upper and lower-watershed locations. An analysis of variance indicated significant differences between upstream, more natural, and downstream, more developed, locations in the same basin with hydrologic indicator redundancy in flow change and magnitude statistics. Issues of multicollinearity were examined using a Principal Component Analysis (PCA) with the first three components related to general flow and high/low flow statistics, frequency and time statistics, and quantity statistics. These principle components would explain about 85% of the total variation. A major conclusion is that managers must be aware of differences among basins, as well as differences within basins that will require careful selection of management procedures to achieve needed flow regimes.

  7. Predicting Statistical Response and Extreme Events in Uncertainty Quantification through Reduced-Order Models

    NASA Astrophysics Data System (ADS)

    Qi, D.; Majda, A.

    2017-12-01

    A low-dimensional reduced-order statistical closure model is developed for quantifying the uncertainty in statistical sensitivity and intermittency in principal model directions with largest variability in high-dimensional turbulent system and turbulent transport models. Imperfect model sensitivity is improved through a recent mathematical strategy for calibrating model errors in a training phase, where information theory and linear statistical response theory are combined in a systematic fashion to achieve the optimal model performance. The idea in the reduced-order method is from a self-consistent mathematical framework for general systems with quadratic nonlinearity, where crucial high-order statistics are approximated by a systematic model calibration procedure. Model efficiency is improved through additional damping and noise corrections to replace the expensive energy-conserving nonlinear interactions. Model errors due to the imperfect nonlinear approximation are corrected by tuning the model parameters using linear response theory with an information metric in a training phase before prediction. A statistical energy principle is adopted to introduce a global scaling factor in characterizing the higher-order moments in a consistent way to improve model sensitivity. Stringent models of barotropic and baroclinic turbulence are used to display the feasibility of the reduced-order methods. Principal statistical responses in mean and variance can be captured by the reduced-order models with accuracy and efficiency. Besides, the reduced-order models are also used to capture crucial passive tracer field that is advected by the baroclinic turbulent flow. It is demonstrated that crucial principal statistical quantities like the tracer spectrum and fat-tails in the tracer probability density functions in the most important large scales can be captured efficiently with accuracy using the reduced-order tracer model in various dynamical regimes of the flow field with distinct statistical structures.

  8. Hydrochemical characteristics and water quality assessment of surface water and groundwater in Songnen plain, Northeast China.

    PubMed

    Zhang, Bing; Song, Xianfang; Zhang, Yinghua; Han, Dongmei; Tang, Changyuan; Yu, Yilei; Ma, Ying

    2012-05-15

    Water quality is the critical factor that influence on human health and quantity and quality of grain production in semi-humid and semi-arid area. Songnen plain is one of the grain bases in China, as well as one of the three major distribution regions of soda saline-alkali soil in the world. To assess the water quality, surface water and groundwater were sampled and analyzed by fuzzy membership analysis and multivariate statistics. The surface water were gather into class I, IV and V, while groundwater were grouped as class I, II, III and V by fuzzy membership analysis. The water samples were grouped into four categories according to irrigation water quality assessment diagrams of USDA. Most water samples distributed in category C1-S1, C2-S2 and C3-S3. Three groups were generated from hierarchical cluster analysis. Four principal components were extracted from principal component analysis. The indicators to water quality assessment were Na, HCO(3), NO(3), Fe, Mn and EC from principal component analysis. We conclude that surface water and shallow groundwater are suitable for irrigation, the reservoir and deep groundwater in upstream are the resources for drinking. The water for drinking should remove of the naturally occurring ions of Fe and Mn. The control of sodium and salinity hazard is required for irrigation. The integrated management of surface water and groundwater for drinking and irrigation is to solve the water issues. Copyright © 2012 Elsevier Ltd. All rights reserved.

  9. Examining parents' ratings of middle-school students' academic self-regulation using principal axis factoring analysis.

    PubMed

    Chen, Peggy P; Cleary, Timothy J; Lui, Angela M

    2015-09-01

    This study examined the reliability and validity of a parent rating scale, the Self-Regulation Strategy Inventory: Parent Rating Scale (SRSI-PRS), using a sample of 451 parents of sixth- and seventh-grade middle-school students. Principal axis factoring (PAF) analysis revealed a 3-factor structure for the 23-item SRSI-PRS: (a) Managing Behavior and Learning (α = .92), (b) Maladaptive Regulatory Behaviors (α = .76), and (c) Managing Environment (α = .84). The majority of the observed relations between these 3 subscales, and the SRSI-SR, student motivation beliefs, and student mathematics grades were statistically significant and in the small to medium range. After controlling for various student variables and motivation indices of parental involvement, 2 SRSI-PRS factors (Managing Behavior and Learning, Maladaptive Regulatory Behaviors) reliably predicted students' achievement in their mathematics course. This study provides initial support for the validity and reliability of the SRSI-PRS and underscores the advantages of obtaining parental ratings of students' SRL behaviors. (c) 2015 APA, all rights reserved).

  10. From The Pierre Auger Observatory to AugerPrime

    NASA Astrophysics Data System (ADS)

    Parra, Alejandra; Martínez Bravo, Oscar; Pierre Auger Collaboration

    2017-06-01

    In the present work we report the principal motivation and reasons for the new stage of the Pierre Auger Observatory, AugerPrime. This upgrade has as its principal goal to clarify the origin of the highest energy cosmic rays through improvement in studies of the mass composition. To accomplished this goal, AugerPrime will use air shower universality, which states that extensive air showers can be completely described by three parameters: the primary energy E 0, the atmospheric shower depth of maximum X max, and the number of muons, Nμ . The Auger Collaboration has planned to complement its surface array (SD), based on water-Cherenkov detectors (WCD) with scintillator detectors, calls SSD (Scintillator Surface Detector). These will be placed at the top of each WCD station. The SSD will allow a shower to shower analysis, instead of the statistical analysis that the Observatory has previously done, to determine the mass composition of the primary particle by the electromagnetic to muonic ratio.

  11. Gambling, games of skill and human ecology: a pilot study by a multidimensional analysis approach.

    PubMed

    Valera, Luca; Giuliani, Alessandro; Gizzi, Alessio; Tartaglia, Francesco; Tambone, Vittoradolfo

    2015-01-01

    The present pilot study aims at analyzing the human activity of playing in the light of an indicator of human ecology (HE). We highlighted the four essential anthropological dimensions (FEAD), starting from the analysis of questionnaires administered to actual gamers. The coherence between theoretical construct and observational data is a remarkable proof-of-concept of the possibility of establishing an experimentally motivated link between a philosophical construct (coming from Huizinga's Homo ludens definition) and actual gamers' motivation pattern. The starting hypothesis is that the activity of playing becomes ecological (and thus not harmful) when it achieves the harmony between the FEAD, thus realizing HE; conversely, it becomes at risk of creating some form of addiction, when destroying FEAD balance. We analyzed the data by means of variable clustering (oblique principal components) so to experimentally verify the existence of the hypothesized dimensions. The subsequent projection of statistical units (gamers) on the orthogonal space spanned by principal components allowed us to generate a meaningful, albeit preliminary, clusterization of gamer profiles.

  12. Discrimination of serum Raman spectroscopy between normal and colorectal cancer

    NASA Astrophysics Data System (ADS)

    Li, Xiaozhou; Yang, Tianyue; Yu, Ting; Li, Siqi

    2011-07-01

    Raman spectroscopy of tissues has been widely studied for the diagnosis of various cancers, but biofluids were seldom used as the analyte because of the low concentration. Herein, serum of 30 normal people, 46 colon cancer, and 44 rectum cancer patients were measured Raman spectra and analyzed. The information of Raman peaks (intensity and width) and that of the fluorescence background (baseline function coefficients) were selected as parameters for statistical analysis. Principal component regression (PCR) and partial least square regression (PLSR) were used on the selected parameters separately to see the performance of the parameters. PCR performed better than PLSR in our spectral data. Then linear discriminant analysis (LDA) was used on the principal components (PCs) of the two regression method on the selected parameters, and a diagnostic accuracy of 88% and 83% were obtained. The conclusion is that the selected features can maintain the information of original spectra well and Raman spectroscopy of serum has the potential for the diagnosis of colorectal cancer.

  13. Metabolomic evaluation of ginsenosides distribution in Panax genus (Panax ginseng and Panax quinquefolius) using multivariate statistical analysis.

    PubMed

    Pace, Roberto; Martinelli, Ernesto Marco; Sardone, Nicola; D E Combarieu, Eric

    2015-03-01

    Ginseng is any one of the eleven species belonging to the genus Panax of the family Araliaceae and is found in North America and in eastern Asia. Ginseng is characterized by the presence of ginsenosides. Principally Panax ginseng and Panax quinquefolius are the adaptogenic herbs and are commonly distributed as health food markets. In the present study high performance liquid chromatography has been used to identify and quantify ginsenosides in the two subject species and the different parts of the plant (roots, neck, leaves, flowers, fruits). The power of this chromatographic technique to evaluate the identity of botanical material and to distinguishing different part of the plants has been investigated with metabolomic technique such as principal component analysis. Metabolomics provide a good opportunity for mining useful chemical information from the chromatographic data set resulting an important tool for quality evaluation of medicinal plants in the authenticity, consistency and efficacy. Copyright © 2015 Elsevier B.V. All rights reserved.

  14. High Accuracy Passive Magnetic Field-Based Localization for Feedback Control Using Principal Component Analysis.

    PubMed

    Foong, Shaohui; Sun, Zhenglong

    2016-08-12

    In this paper, a novel magnetic field-based sensing system employing statistically optimized concurrent multiple sensor outputs for precise field-position association and localization is presented. This method capitalizes on the independence between simultaneous spatial field measurements at multiple locations to induce unique correspondences between field and position. This single-source-multi-sensor configuration is able to achieve accurate and precise localization and tracking of translational motion without contact over large travel distances for feedback control. Principal component analysis (PCA) is used as a pseudo-linear filter to optimally reduce the dimensions of the multi-sensor output space for computationally efficient field-position mapping with artificial neural networks (ANNs). Numerical simulations are employed to investigate the effects of geometric parameters and Gaussian noise corruption on PCA assisted ANN mapping performance. Using a 9-sensor network, the sensing accuracy and closed-loop tracking performance of the proposed optimal field-based sensing system is experimentally evaluated on a linear actuator with a significantly more expensive optical encoder as a comparison.

  15. Sensor Failure Detection of FASSIP System using Principal Component Analysis

    NASA Astrophysics Data System (ADS)

    Sudarno; Juarsa, Mulya; Santosa, Kussigit; Deswandri; Sunaryo, Geni Rina

    2018-02-01

    In the nuclear reactor accident of Fukushima Daiichi in Japan, the damages of core and pressure vessel were caused by the failure of its active cooling system (diesel generator was inundated by tsunami). Thus researches on passive cooling system for Nuclear Power Plant are performed to improve the safety aspects of nuclear reactors. The FASSIP system (Passive System Simulation Facility) is an installation used to study the characteristics of passive cooling systems at nuclear power plants. The accuracy of sensor measurement of FASSIP system is essential, because as the basis for determining the characteristics of a passive cooling system. In this research, a sensor failure detection method for FASSIP system is developed, so the indication of sensor failures can be detected early. The method used is Principal Component Analysis (PCA) to reduce the dimension of the sensor, with the Squarred Prediction Error (SPE) and statistic Hotteling criteria for detecting sensor failure indication. The results shows that PCA method is capable to detect the occurrence of a failure at any sensor.

  16. Design of experiments and principal component analysis as approaches for enhancing performance of gas-diffusional air-breathing bilirubin oxidase cathode

    NASA Astrophysics Data System (ADS)

    Babanova, Sofia; Artyushkova, Kateryna; Ulyanova, Yevgenia; Singhal, Sameer; Atanassov, Plamen

    2014-01-01

    Two statistical methods, design of experiments (DOE) and principal component analysis (PCA) are employed to investigate and improve performance of air-breathing gas-diffusional enzymatic electrodes. DOE is utilized as a tool for systematic organization and evaluation of various factors affecting the performance of the composite system. Based on the results from the DOE, an improved cathode is constructed. The current density generated utilizing the improved cathode (755 ± 39 μA cm-2 at 0.3 V vs. Ag/AgCl) is 2-5 times higher than the highest current density previously achieved. Three major factors contributing to the cathode performance are identified: the amount of enzyme, the volume of phosphate buffer used to immobilize the enzyme, and the thickness of the gas-diffusion layer (GDL). PCA is applied as an independent confirmation tool to support conclusions made by DOE and to visualize the contribution of factors in individual cathode configurations.

  17. Assessment of technological level of stem cell research using principal component analysis.

    PubMed

    Do Cho, Sung; Hwan Hyun, Byung; Kim, Jae Kyeom

    2016-01-01

    In general, technological levels have been assessed based on specialist's opinion through the methods such as Delphi. But in such cases, results could be significantly biased per study design and individual expert. In this study, therefore scientific literatures and patents were selected by means of analytic indexes for statistic approach and technical assessment of stem cell fields. The analytic indexes, numbers and impact indexes of scientific literatures and patents, were weighted based on principal component analysis, and then, were summated into the single value. Technological obsolescence was calculated through the cited half-life of patents issued by the United States Patents and Trademark Office and was reflected in technological level assessment. As results, ranks of each nation's in reference to the technology level were rated by the proposed method. Furthermore we were able to evaluate strengthens and weaknesses thereof. Although our empirical research presents faithful results, in the further study, there is a need to compare the existing methods and the suggested method.

  18. Statistical analysis for improving data precision in the SPME GC-MS analysis of blackberry (Rubus ulmifolius Schott) volatiles.

    PubMed

    D'Agostino, M F; Sanz, J; Martínez-Castro, I; Giuffrè, A M; Sicari, V; Soria, A C

    2014-07-01

    Statistical analysis has been used for the first time to evaluate the dispersion of quantitative data in the solid-phase microextraction (SPME) followed by gas chromatography-mass spectrometry (GC-MS) analysis of blackberry (Rubus ulmifolius Schott) volatiles with the aim of improving their precision. Experimental and randomly simulated data were compared using different statistical parameters (correlation coefficients, Principal Component Analysis loadings and eigenvalues). Non-random factors were shown to significantly contribute to total dispersion; groups of volatile compounds could be associated with these factors. A significant improvement of precision was achieved when considering percent concentration ratios, rather than percent values, among those blackberry volatiles with a similar dispersion behavior. As novelty over previous references, and to complement this main objective, the presence of non-random dispersion trends in data from simple blackberry model systems was evidenced. Although the influence of the type of matrix on data precision was proved, the possibility of a better understanding of the dispersion patterns in real samples was not possible from model systems. The approach here used was validated for the first time through the multicomponent characterization of Italian blackberries from different harvest years. Copyright © 2014 Elsevier B.V. All rights reserved.

  19. Psychological Analyses of Courageous Performance in Military Personnel

    DTIC Science & Technology

    1986-11-01

    schedule HR heart rate IBI inter- beat interval N number of subjects NS not statistically significant P probability PCA principal components analysis RAQ...tones in the range of 400 to 600 Hz, set at a level of 60 dB, transmitted for 1 sec binaurally through earphones from a commercial oscillator. The...because of interference on the recording trace. Cardiac activity was measured in terms of heart rate (HR). The number of beats /minute was estimared by

  20. Geometric, Statistical, and Topological Modeling of Intrinsic Data Manifolds: Application to 3D Shapes

    DTIC Science & Technology

    2009-01-01

    representation to a simple curve in 3D by using the Whitney embedding theorem. In a very ludic way, we propose to combine phases one and two to...elimination principle which takes advantage of the designed parametrization. To further refine discrimination among objects, we introduce a post...packing numbers and design of principal curves. IEEE transactions on Pattern Analysis and Machine Intel- ligence, 22(3):281-297, 2000. [68] M. H. Yang, Face

  1. Principal component analysis in construction of 3D human knee joint models using a statistical shape model method.

    PubMed

    Tsai, Tsung-Yuan; Li, Jing-Sheng; Wang, Shaobai; Li, Pingyue; Kwon, Young-Min; Li, Guoan

    2015-01-01

    The statistical shape model (SSM) method that uses 2D images of the knee joint to predict the three-dimensional (3D) joint surface model has been reported in the literature. In this study, we constructed a SSM database using 152 human computed tomography (CT) knee joint models, including the femur, tibia and patella and analysed the characteristics of each principal component of the SSM. The surface models of two in vivo knees were predicted using the SSM and their 2D bi-plane fluoroscopic images. The predicted models were compared to their CT joint models. The differences between the predicted 3D knee joint surfaces and the CT image-based surfaces were 0.30 ± 0.81 mm, 0.34 ± 0.79 mm and 0.36 ± 0.59 mm for the femur, tibia and patella, respectively (average ± standard deviation). The computational time for each bone of the knee joint was within 30 s using a personal computer. The analysis of this study indicated that the SSM method could be a useful tool to construct 3D surface models of the knee with sub-millimeter accuracy in real time. Thus, it may have a broad application in computer-assisted knee surgeries that require 3D surface models of the knee.

  2. National Institutes of Health Funding in Radiation Oncology: A Snapshot

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Steinberg, Michael; McBride, William H.; Vlashi, Erina

    Currently, pay lines for National Institutes of Health (NIH) grants are at a historical low. In this climate of fierce competition, knowledge about the funding situation in a small field like radiation oncology becomes very important for career planning and recruitment of faculty. Unfortunately, these data cannot be easily extracted from the NIH's database because it does not discriminate between radiology and radiation oncology departments. At the start of fiscal year 2013 we extracted records for 952 individual grants, which were active at the time of analysis from the NIH database. Proposals originating from radiation oncology departments were identified manually.more » Descriptive statistics were generated using the JMP statistical software package. Our analysis identified 197 grants in radiation oncology. These proposals came from 134 individual investigators in 43 academic institutions. The majority of the grants (118) were awarded to principal investigators at the full professor level, and 122 principal investigators held a PhD degree. In 79% of the grants, the research topic fell into the field of biology, 13% in the field of medical physics. Only 7.6% of the proposals were clinical investigations. Our data suggest that the field of radiation oncology is underfunded by the NIH and that the current level of support does not match the relevance of radiation oncology for cancer patients or the potential of its academic work force.« less

  3. National Institutes of Health funding in radiation oncology: a snapshot.

    PubMed

    Steinberg, Michael; McBride, William H; Vlashi, Erina; Pajonk, Frank

    2013-06-01

    Currently, pay lines for National Institutes of Health (NIH) grants are at a historical low. In this climate of fierce competition, knowledge about the funding situation in a small field like radiation oncology becomes very important for career planning and recruitment of faculty. Unfortunately, these data cannot be easily extracted from the NIH's database because it does not discriminate between radiology and radiation oncology departments. At the start of fiscal year 2013 we extracted records for 952 individual grants, which were active at the time of analysis from the NIH database. Proposals originating from radiation oncology departments were identified manually. Descriptive statistics were generated using the JMP statistical software package. Our analysis identified 197 grants in radiation oncology. These proposals came from 134 individual investigators in 43 academic institutions. The majority of the grants (118) were awarded to principal investigators at the full professor level, and 122 principal investigators held a PhD degree. In 79% of the grants, the research topic fell into the field of biology, 13% in the field of medical physics. Only 7.6% of the proposals were clinical investigations. Our data suggest that the field of radiation oncology is underfunded by the NIH and that the current level of support does not match the relevance of radiation oncology for cancer patients or the potential of its academic work force. Copyright © 2013 Elsevier Inc. All rights reserved.

  4. PCA leverage: outlier detection for high-dimensional functional magnetic resonance imaging data.

    PubMed

    Mejia, Amanda F; Nebel, Mary Beth; Eloyan, Ani; Caffo, Brian; Lindquist, Martin A

    2017-07-01

    Outlier detection for high-dimensional (HD) data is a popular topic in modern statistical research. However, one source of HD data that has received relatively little attention is functional magnetic resonance images (fMRI), which consists of hundreds of thousands of measurements sampled at hundreds of time points. At a time when the availability of fMRI data is rapidly growing-primarily through large, publicly available grassroots datasets-automated quality control and outlier detection methods are greatly needed. We propose principal components analysis (PCA) leverage and demonstrate how it can be used to identify outlying time points in an fMRI run. Furthermore, PCA leverage is a measure of the influence of each observation on the estimation of principal components, which are often of interest in fMRI data. We also propose an alternative measure, PCA robust distance, which is less sensitive to outliers and has controllable statistical properties. The proposed methods are validated through simulation studies and are shown to be highly accurate. We also conduct a reliability study using resting-state fMRI data from the Autism Brain Imaging Data Exchange and find that removal of outliers using the proposed methods results in more reliable estimation of subject-level resting-state networks using independent components analysis. © The Author 2017. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  5. THESEUS: maximum likelihood superpositioning and analysis of macromolecular structures

    PubMed Central

    Theobald, Douglas L.; Wuttke, Deborah S.

    2008-01-01

    Summary THESEUS is a command line program for performing maximum likelihood (ML) superpositions and analysis of macromolecular structures. While conventional superpositioning methods use ordinary least-squares (LS) as the optimization criterion, ML superpositions provide substantially improved accuracy by down-weighting variable structural regions and by correcting for correlations among atoms. ML superpositioning is robust and insensitive to the specific atoms included in the analysis, and thus it does not require subjective pruning of selected variable atomic coordinates. Output includes both likelihood-based and frequentist statistics for accurate evaluation of the adequacy of a superposition and for reliable analysis of structural similarities and differences. THESEUS performs principal components analysis for analyzing the complex correlations found among atoms within a structural ensemble. PMID:16777907

  6. Lithology and mineralogy recognition from geochemical logging tool data using multivariate statistical analysis.

    PubMed

    Konaté, Ahmed Amara; Ma, Huolin; Pan, Heping; Qin, Zhen; Ahmed, Hafizullah Abba; Dembele, N'dji Dit Jacques

    2017-10-01

    The availability of a deep well that penetrates deep into the Ultra High Pressure (UHP) metamorphic rocks is unusual and consequently offers a unique chance to study the metamorphic rocks. One such borehole is located in the southern part of Donghai County in the Sulu UHP metamorphic belt of Eastern China, from the Chinese Continental Scientific Drilling Main hole. This study reports the results obtained from the analysis of oxide log data. A geochemical logging tool provides in situ, gamma ray spectroscopy measurements of major and trace elements in the borehole. Dry weight percent oxide concentration logs obtained for this study were SiO 2 , K 2 O, TiO 2 , H 2 O, CO 2 , Na 2 O, Fe 2 O 3 , FeO, CaO, MnO, MgO, P 2 O 5 and Al 2 O 3 . Cross plot and Principal Component Analysis methods were applied for lithology characterization and mineralogy description respectively. Cross plot analysis allows lithological variations to be characterized. Principal Component Analysis shows that the oxide logs can be summarized by two components related to the feldspar and hydrous minerals. This study has shown that geochemical logging tool data is accurate and adequate to be tremendously useful in UHP metamorphic rocks analysis. Copyright © 2017 Elsevier Ltd. All rights reserved.

  7. Statistics as Tools in Library Planning: On the State and Institutional Level.

    ERIC Educational Resources Information Center

    Trezza, Alphonse F.

    The principal uses of statistics in library planning may be illustrated by examples from the state of Illinois. State law specifies that the Illinois State Library compile and publish statistics on libraries. State agencies also play an important and expanding role in this effort. The state library now compiles statistics on all types of…

  8. Assessment of trace elements levels in patients with Type 2 diabetes using multivariate statistical analysis.

    PubMed

    Badran, M; Morsy, R; Soliman, H; Elnimr, T

    2016-01-01

    The trace elements metabolism has been reported to possess specific roles in the pathogenesis and progress of diabetes mellitus. Due to the continuous increase in the population of patients with Type 2 diabetes (T2D), this study aims to assess the levels and inter-relationships of fast blood glucose (FBG) and serum trace elements in Type 2 diabetic patients. This study was conducted on 40 Egyptian Type 2 diabetic patients and 36 healthy volunteers (Hospital of Tanta University, Tanta, Egypt). The blood serum was digested and then used to determine the levels of 24 trace elements using an inductive coupled plasma mass spectroscopy (ICP-MS). Multivariate statistical analysis depended on correlation coefficient, cluster analysis (CA) and principal component analysis (PCA), were used to analysis the data. The results exhibited significant changes in FBG and eight of trace elements, Zn, Cu, Se, Fe, Mn, Cr, Mg, and As, levels in the blood serum of Type 2 diabetic patients relative to those of healthy controls. The statistical analyses using multivariate statistical techniques were obvious in the reduction of the experimental variables, and grouping the trace elements in patients into three clusters. The application of PCA revealed a distinct difference in associations of trace elements and their clustering patterns in control and patients group in particular for Mg, Fe, Cu, and Zn that appeared to be the most crucial factors which related with Type 2 diabetes. Therefore, on the basis of this study, the contributors of trace elements content in Type 2 diabetic patients can be determine and specify with correlation relationship and multivariate statistical analysis, which confirm that the alteration of some essential trace metals may play a role in the development of diabetes mellitus. Copyright © 2015 Elsevier GmbH. All rights reserved.

  9. Representation of Probability Density Functions from Orbit Determination using the Particle Filter

    NASA Technical Reports Server (NTRS)

    Mashiku, Alinda K.; Garrison, James; Carpenter, J. Russell

    2012-01-01

    Statistical orbit determination enables us to obtain estimates of the state and the statistical information of its region of uncertainty. In order to obtain an accurate representation of the probability density function (PDF) that incorporates higher order statistical information, we propose the use of nonlinear estimation methods such as the Particle Filter. The Particle Filter (PF) is capable of providing a PDF representation of the state estimates whose accuracy is dependent on the number of particles or samples used. For this method to be applicable to real case scenarios, we need a way of accurately representing the PDF in a compressed manner with little information loss. Hence we propose using the Independent Component Analysis (ICA) as a non-Gaussian dimensional reduction method that is capable of maintaining higher order statistical information obtained using the PF. Methods such as the Principal Component Analysis (PCA) are based on utilizing up to second order statistics, hence will not suffice in maintaining maximum information content. Both the PCA and the ICA are applied to two scenarios that involve a highly eccentric orbit with a lower apriori uncertainty covariance and a less eccentric orbit with a higher a priori uncertainty covariance, to illustrate the capability of the ICA in relation to the PCA.

  10. Using Structural Equation Modeling To Fit Models Incorporating Principal Components.

    ERIC Educational Resources Information Center

    Dolan, Conor; Bechger, Timo; Molenaar, Peter

    1999-01-01

    Considers models incorporating principal components from the perspectives of structural-equation modeling. These models include the following: (1) the principal-component analysis of patterned matrices; (2) multiple analysis of variance based on principal components; and (3) multigroup principal-components analysis. Discusses fitting these models…

  11. Bioclimatic Classification of Northeast Asia for climate change response

    NASA Astrophysics Data System (ADS)

    Choi, Y.; Jeon, S. W.; Lim, C. H.

    2016-12-01

    As climate change has been getting worse, we should monitor the change of biodiversity, and distribution of species to handle the crisis and take advantage of climate change. The development of bioclimatic map which classifies land into homogenous zones by similar environment properties is the first step to establish a strategy. Statistically derived classifications of land provide useful spatial frameworks to support ecosystem research, monitoring and policy decisions. Many countries are trying to make this kind of map and actively utilize it to ecosystem conservation and management. However, the Northeast Asia including North Korea doesn't have detailed environmental information, and has not built environmental classification map. Therefore, this study presents a bioclimatic map of Northeast Asia based on statistical clustering of bioclimate data. Bioclim data ver1.4 which provided by WorldClim were considered for inclusion in a model. Eight of the most relevant climate variables were selected by correlation analysis, based on previous studies. Principal Components Analysis (PCA) was used to explain 86% of the variation into three independent dimensions, which were subsequently clustered using an ISODATA clustering. The bioclimatic zone of Northeast Asia could consist of 29, 35, and 50 zones. This bioclimatic map has a 30' resolution. To assess the accuracy, the correlation coefficient was calculated between the first principal component values of the classification variables and the vegetation index, Gross Primary Production (GPP). It shows about 0.5 Pearson correlation coefficient. This study constructed Northeast Asia bioclimatic map by statistical method with high resolution, but in order to better reflect the realities, the variety of climate variables should be considered. Also, further studies should do more quantitative and qualitative validation in various ways. Then, this could be used more effectively to support decision making on climate change adaptation.

  12. Modeling Pair Distribution Functions of Rare-Earth Phosphate Glasses Using Principal Component Analysis.

    PubMed

    Cole, Jacqueline M; Cheng, Xie; Payne, Michael C

    2016-11-07

    The use of principal component analysis (PCA) to statistically infer features of local structure from experimental pair distribution function (PDF) data is assessed on a case study of rare-earth phosphate glasses (REPGs). Such glasses, codoped with two rare-earth ions (R and R') of different sizes and optical properties, are of interest to the laser industry. The determination of structure-property relationships in these materials is an important aspect of their technological development. Yet, realizing the local structure of codoped REPGs presents significant challenges relative to their singly doped counterparts; specifically, R and R' are difficult to distinguish in terms of establishing relative material compositions, identifying atomic pairwise correlation profiles in a PDF that are associated with each ion, and resolving peak overlap of such profiles in PDFs. This study demonstrates that PCA can be employed to help overcome these structural complications, by statistically inferring trends in PDFs that exist for a restricted set of experimental data on REPGs, and using these as training data to predict material compositions and PDF profiles in unknown codoped REPGs. The application of these PCA methods to resolve individual atomic pairwise correlations in t(r) signatures is also presented. The training methods developed for these structural predictions are prevalidated by testing their ability to reproduce known physical phenomena, such as the lanthanide contraction, on PDF signatures of the structurally simpler singly doped REPGs. The intrinsic limitations of applying PCA to analyze PDFs relative to the quality control of source data, data processing, and sample definition, are also considered. While this case study is limited to lanthanide-doped REPGs, this type of statistical inference may easily be extended to other inorganic solid-state materials and be exploited in large-scale data-mining efforts that probe many t(r) functions.

  13. Facial anthropometric differences among gender, ethnicity, and age groups.

    PubMed

    Zhuang, Ziqing; Landsittel, Douglas; Benson, Stacey; Roberge, Raymond; Shaffer, Ronald

    2010-06-01

    The impact of race/ethnicity upon facial anthropometric data in the US workforce, on the development of personal protective equipment, has not been investigated to any significant degree. The proliferation of minority populations in the US workforce has increased the need to investigate differences in facial dimensions among these workers. The objective of this study was to determine the face shape and size differences among race and age groups from the National Institute for Occupational Safety and Health survey of 3997 US civilian workers. Survey participants were divided into two gender groups, four racial/ethnic groups, and three age groups. Measurements of height, weight, neck circumference, and 18 facial dimensions were collected using traditional anthropometric techniques. A multivariate analysis of the data was performed using Principal Component Analysis. An exploratory analysis to determine the effect of different demographic factors had on anthropometric features was assessed via a linear model. The 21 anthropometric measurements, body mass index, and the first and second principal component scores were dependent variables, while gender, ethnicity, age, occupation, weight, and height served as independent variables. Gender significantly contributes to size for 19 of 24 dependent variables. African-Americans have statistically shorter, wider, and shallower noses than Caucasians. Hispanic workers have 14 facial features that are significantly larger than Caucasians, while their nose protrusion, height, and head length are significantly shorter. The other ethnic group was composed primarily of Asian subjects and has statistically different dimensions from Caucasians for 16 anthropometric values. Nineteen anthropometric values for subjects at least 45 years of age are statistically different from those measured for subjects between 18 and 29 years of age. Workers employed in manufacturing, fire fighting, healthcare, law enforcement, and other occupational groups have facial features that differ significantly than those in construction. Statistically significant differences in facial anthropometric dimensions (P < 0.05) were noted between males and females, all racial/ethnic groups, and the subjects who were at least 45 years old when compared to workers between 18 and 29 years of age. These findings could be important to the design and manufacture of respirators, as well as employers responsible for supplying respiratory protective equipment to their employees.

  14. Multivariate statistical and lead isotopic analyses approach to identify heavy metal sources in topsoil from the industrial zone of Beijing Capital Iron and Steel Factory.

    PubMed

    Zhu, Guangxu; Guo, Qingjun; Xiao, Huayun; Chen, Tongbin; Yang, Jun

    2017-06-01

    Heavy metals are considered toxic to humans and ecosystems. In the present study, heavy metal concentration in soil was investigated using the single pollution index (PIi), the integrated Nemerow pollution index (PIN), and the geoaccumulation index (Igeo) to determine metal accumulation and its pollution status at the abandoned site of the Capital Iron and Steel Factory in Beijing and its surrounding area. Multivariate statistical (principal component analysis and correlation analysis), geostatistical analysis (ArcGIS tool), combined with stable Pb isotopic ratios, were applied to explore the characteristics of heavy metal pollution and the possible sources of pollutants. The results indicated that heavy metal elements show different degrees of accumulation in the study area, the observed trend of the enrichment factors, and the geoaccumulation index was Hg > Cd > Zn > Cr > Pb > Cu ≈ As > Ni. Hg, Cd, Zn, and Cr were the dominant elements that influenced soil quality in the study area. The Nemerow index method indicated that all of the heavy metals caused serious pollution except Ni. Multivariate statistical analysis indicated that Cd, Zn, Cu, and Pb show obvious correlation and have higher loads on the same principal component, suggesting that they had the same sources, which are related to industrial activities and vehicle emissions. The spatial distribution maps based on ordinary kriging showed that high concentrations of heavy metals were located in the local factory area and in the southeast-northwest part of the study region, corresponding with the predominant wind directions. Analyses of lead isotopes confirmed that Pb in the study soils is predominantly derived from three Pb sources: dust generated during steel production, coal combustion, and the natural background. Moreover, the ternary mixture model based on lead isotope analysis indicates that lead in the study soils originates mainly from anthropogenic sources, which contribute much more than the natural sources. Our study could not only reveal the overall situation of heavy metal contamination, but also identify the specific pollution sources.

  15. Optimization of Dissolution Compartments in a Biorelevant Dissolution Apparatus Golem v2, Supported by Multivariate Analysis.

    PubMed

    Stupák, Ivan; Pavloková, Sylvie; Vysloužil, Jakub; Dohnal, Jiří; Čulen, Martin

    2017-11-23

    Biorelevant dissolution instruments represent an important tool for pharmaceutical research and development. These instruments are designed to simulate the dissolution of drug formulations in conditions most closely mimicking the gastrointestinal tract. In this work, we focused on the optimization of dissolution compartments/vessels for an updated version of the biorelevant dissolution apparatus-Golem v2. We designed eight compartments of uniform size but different inner geometry. The dissolution performance of the compartments was tested using immediate release caffeine tablets and evaluated by standard statistical methods and principal component analysis. Based on two phases of dissolution testing (using 250 and 100 mL of dissolution medium), we selected two compartment types yielding the highest measurement reproducibility. We also confirmed a statistically ssignificant effect of agitation rate and dissolution volume on the extent of drug dissolved and measurement reproducibility.

  16. Provenance establishment of coffee using solution ICP-MS and ICP-AES.

    PubMed

    Valentin, Jenna L; Watling, R John

    2013-11-01

    Statistical interpretation of the concentrations of 59 elements, determined using solution based inductively coupled plasma mass spectrometry (ICP-MS) and inductively coupled plasma emission spectroscopy (ICP-AES), was used to establish the provenance of coffee samples from 15 countries across five continents. Data confirmed that the harvest year, degree of ripeness and whether the coffees were green or roasted had little effect on the elemental composition of the coffees. The application of linear discriminant analysis and principal component analysis of the elemental concentrations permitted up to 96.9% correct classification of the coffee samples according to their continent of origin. When samples from each continent were considered separately, up to 100% correct classification of coffee samples into their countries, and plantations of origin was achieved. This research demonstrates the potential of using elemental composition, in combination with statistical classification methods, for accurate provenance establishment of coffee. Copyright © 2013 Elsevier Ltd. All rights reserved.

  17. Improving the Principal Selection Process to Enhance the Opportunities for Women.

    ERIC Educational Resources Information Center

    Chapman, Judith

    1986-01-01

    Presents statistical profiles of Australian women principals and reviews research on school administrator selection in Australia, the United Kingdom, and the United States. To ensure equity, specific recommendations are given concerning vacancy announcements, criteria identification, consideration of evidence, and interviewing and decision-making…

  18. Teacher Contract Non-Renewal: Midwest, Rocky Mountains, and Southeast

    ERIC Educational Resources Information Center

    Nixon, Andy; Dam, Margaret; Packard, Abbot L.

    2012-01-01

    This quantitative study investigated reasons that school principals recommend non-renewal of probationary teachers' contracts. Principal survey results from three regions of the US (Midwest, Rocky Mountains, & Southeast) were analyzed using the Kruskal-Wallis and Mann-Whitney U statistical procedures, while significance was tested applying a…

  19. An Examination of Principal Leadership Styles and Their Influence on School Performance as Measured by Adequate Yearly Progress at Selected Title I Elementary Schools in South Carolina

    ERIC Educational Resources Information Center

    Martin, Tammy Faith

    2012-01-01

    The purpose of this study was to examine principal leadership styles and their influence on school performance as measured by adequate yearly progress at selected Title I schools in South Carolina. The main focus of the research study was to complete descriptive statistics on principal leadership styles in schools that met or did not meet adequate…

  20. Applications of genetic algorithms on the structure-activity relationship analysis of some cinnamamides.

    PubMed

    Hou, T J; Wang, J M; Liao, N; Xu, X J

    1999-01-01

    Quantitative structure-activity relationships (QSARs) for 35 cinnamamides were studied. By using a genetic algorithm (GA), a group of multiple regression models with high fitness scores was generated. From the statistical analyses of the descriptors used in the evolution procedure, the principal features affecting the anticonvulsant activity were found. The significant descriptors include the partition coefficient, the molar refraction, the Hammet sigma constant of the substituents on the benzene ring, and the formation energy of the molecules. It could be found that the steric complementarity and the hydrophobic interaction between the inhibitors and the receptor were very important to the biological activity, while the contribution of the electronic effect was not so obvious. Moreover, by construction of the spline models for these four principal descriptors, the effective range for each descriptor was identified.

  1. Infrared micro-spectroscopic studies of epithelial cells

    PubMed Central

    Romeo, Melissa; Mohlenhoff, Brian; Jennings, Michael; Diem, Max

    2009-01-01

    We report results from a study of human and canine mucosal cells, investigated by infrared micro-spectroscopy, and analyzed by methods of multivariate statistics. We demonstrate that the infrared spectra of individual cells are sensitive to the stage of maturation, and that a distinction between healthy and diseased cells will be possible. Since this report is written for an audience not familiar with infrared micro-spectroscopy, a short introduction into this field is presented along with a summary of principal component analysis. PMID:16797481

  2. Occurrence of pharmaceutically active compounds during 1-year period in wastewaters from four wastewater treatment plants in Seville (Spain).

    PubMed

    Santos, J L; Aparicio, I; Callejón, M; Alonso, E

    2009-05-30

    Several pharmaceutically active compounds have been monitored during 1-year period in influent and effluent wastewater from wastewater treatment plants (WWTPs) to evaluate their temporal evolution and removal from wastewater and to know which variables have influence in their removal rates. Pharmaceutical compounds monitored were four antiinflammatory drugs (diclofenac, ibuprofen, ketoprofen and naproxen), an antiepileptic drug (carbamazepine) and a nervous stimulant (caffeine). All of the pharmaceutically active compounds monitored, except diclofenac, were detected in influent and effluent wastewater. Mean concentrations measured in influent wastewater were 6.17, 0.48, 93.6, 1.83 and 5.41 microg/L for caffeine, carbamazepine, ibuprofen, ketoprofen and naproxen, respectively. Mean concentrations measured in effluent wastewater were 2.02, 0.56, 8.20, 0.84 and 2.10 microg/L for caffeine, carbamazepine, ibuprofen, ketoprofen and naproxen, respectively. Mean removal rates of the pharmaceuticals varied from 8.1% (carbamazepine) to 87.5% (ibuprofen). The existence of relationships between the concentrations of the pharmaceutical compounds, their removal rates, the characterization parameters of influent wastewaters and the WWTP control design parameters has been studied by means of statistical analysis (correlation and principal component analysis). With both statistical analyses, high correlations were obtained between the concentration of the pharmaceutical compounds and the characterization parameters of influent wastewaters; and between the removal rates of the pharmaceutical compounds, the removal rates of the characterization parameters of influent wastewaters and the WWTP hydraulic retention times. Principal component analysis showed the existence of two main components accounting for 76% of the total variability.

  3. The Raman spectrum character of skin tumor induced by UVB

    NASA Astrophysics Data System (ADS)

    Wu, Shulian; Hu, Liangjun; Wang, Yunxia; Li, Yongzeng

    2016-03-01

    In our study, the skin canceration processes induced by UVB were analyzed from the perspective of tissue spectrum. A home-made Raman spectral system with a millimeter order excitation laser spot size combined with a multivariate statistical analysis for monitoring the skin changed irradiated by UVB was studied and the discrimination were evaluated. Raman scattering signals of the SCC and normal skin were acquired. Spectral differences in Raman spectra were revealed. Linear discriminant analysis (LDA) based on principal component analysis (PCA) were employed to generate diagnostic algorithms for the classification of skin SCC and normal. The results indicated that Raman spectroscopy combined with PCA-LDA demonstrated good potential for improving the diagnosis of skin cancers.

  4. Population connectivity of the plating coral Agaricia lamarcki from southwest Puerto Rico

    NASA Astrophysics Data System (ADS)

    Hammerman, Nicholas M.; Rivera-Vicens, Ramon E.; Galaska, Matthew P.; Weil, Ernesto; Appledoorn, Richard S.; Alfaro, Monica; Schizas, Nikolaos V.

    2018-03-01

    Identifying genetic connectivity and discrete population boundaries is an important objective for management of declining Caribbean reef-building corals. A double digest restriction-associated DNA sequencing protocol was utilized to generate 321 single nucleotide polymorphisms to estimate patterns of horizontal and vertical gene flow in the brooding Caribbean plate coral, Agaricia lamarcki. Individual colonies ( n = 59) were sampled from eight locations throughout southwestern Puerto Rico from six shallow ( 10-20 m) and two mesophotic habitats ( 30-40 m). Descriptive summary statistics (fixation index, F ST), analysis of molecular variance, and analysis through landscape and ecological associations and discriminant analysis of principal components estimated high population connectivity with subtle subpopulation structure among all sampling localities.

  5. Principal component analysis of normalized full spectrum mass spectrometry data in multiMS-toolbox: An effective tool to identify important factors for classification of different metabolic patterns and bacterial strains.

    PubMed

    Cejnar, Pavel; Kuckova, Stepanka; Prochazka, Ales; Karamonova, Ludmila; Svobodova, Barbora

    2018-06-15

    Explorative statistical analysis of mass spectrometry data is still a time-consuming step. We analyzed critical factors for application of principal component analysis (PCA) in mass spectrometry and focused on two whole spectrum based normalization techniques and their application in the analysis of registered peak data and, in comparison, in full spectrum data analysis. We used this technique to identify different metabolic patterns in the bacterial culture of Cronobacter sakazakii, an important foodborne pathogen. Two software utilities, the ms-alone, a python-based utility for mass spectrometry data preprocessing and peak extraction, and the multiMS-toolbox, an R software tool for advanced peak registration and detailed explorative statistical analysis, were implemented. The bacterial culture of Cronobacter sakazakii was cultivated on Enterobacter sakazakii Isolation Agar, Blood Agar Base and Tryptone Soya Agar for 24 h and 48 h and applied by the smear method on an Autoflex speed MALDI-TOF mass spectrometer. For three tested cultivation media only two different metabolic patterns of Cronobacter sakazakii were identified using PCA applied on data normalized by two different normalization techniques. Results from matched peak data and subsequent detailed full spectrum analysis identified only two different metabolic patterns - a cultivation on Enterobacter sakazakii Isolation Agar showed significant differences to the cultivation on the other two tested media. The metabolic patterns for all tested cultivation media also proved the dependence on cultivation time. Both whole spectrum based normalization techniques together with the full spectrum PCA allow identification of important discriminative factors in experiments with several variable condition factors avoiding any problems with improper identification of peaks or emphasis on bellow threshold peak data. The amounts of processed data remain still manageable. Both implemented software utilities are available free of charge from http://uprt.vscht.cz/ms. Copyright © 2018 John Wiley & Sons, Ltd.

  6. Multivariate statistical analysis of heavy metal concentration in soils of Yelagiri Hills, Tamilnadu, India--spectroscopical approach.

    PubMed

    Chandrasekaran, A; Ravisankar, R; Harikrishnan, N; Satapathy, K K; Prasad, M V R; Kanagasabapathy, K V

    2015-02-25

    Anthropogenic activities increase the accumulation of heavy metals in the soil environment. Soil pollution significantly reduces environmental quality and affects the human health. In the present study soil samples were collected at different locations of Yelagiri Hills, Tamilnadu, India for heavy metal analysis. The samples were analyzed for twelve selected heavy metals (Mg, Al, K, Ca, Ti, Fe, V, Cr, Mn, Co, Ni and Zn) using energy dispersive X-ray fluorescence (EDXRF) spectroscopy. Heavy metals concentration in soil were investigated using enrichment factor (EF), geo-accumulation index (Igeo), contamination factor (CF) and pollution load index (PLI) to determine metal accumulation, distribution and its pollution status. Heavy metal toxicity risk was assessed using soil quality guidelines (SQGs) given by target and intervention values of Dutch soil standards. The concentration of Ni, Co, Zn, Cr, Mn, Fe, Ti, K, Al, Mg were mainly controlled by natural sources. Multivariate statistical methods such as correlation matrix, principal component analysis and cluster analysis were applied for the identification of heavy metal sources (anthropogenic/natural origin). Geo-statistical methods such as kirging identified hot spots of metal contamination in road areas influenced mainly by presence of natural rocks. Copyright © 2014 Elsevier B.V. All rights reserved.

  7. A new statistical PCA-ICA algorithm for location of R-peaks in ECG.

    PubMed

    Chawla, M P S; Verma, H K; Kumar, Vinod

    2008-09-16

    The success of ICA to separate the independent components from the mixture depends on the properties of the electrocardiogram (ECG) recordings. This paper discusses some of the conditions of independent component analysis (ICA) that could affect the reliability of the separation and evaluation of issues related to the properties of the signals and number of sources. Principal component analysis (PCA) scatter plots are plotted to indicate the diagnostic features in the presence and absence of base-line wander in interpreting the ECG signals. In this analysis, a newly developed statistical algorithm by authors, based on the use of combined PCA-ICA for two correlated channels of 12-channel ECG data is proposed. ICA technique has been successfully implemented in identifying and removal of noise and artifacts from ECG signals. Cleaned ECG signals are obtained using statistical measures like kurtosis and variance of variance after ICA processing. This analysis also paper deals with the detection of QRS complexes in electrocardiograms using combined PCA-ICA algorithm. The efficacy of the combined PCA-ICA algorithm lies in the fact that the location of the R-peaks is bounded from above and below by the location of the cross-over points, hence none of the peaks are ignored or missed.

  8. The Trauma of Adolescent Suicide: A Time for Special Leadership by Principals.

    ERIC Educational Resources Information Center

    Dempsey, Richard A.

    This monograph provides principals and school officials with information about coping with adolescent suicide. Section 1, "Introduction," discusses the uncomfortable nature of the topic, cites statistics, and recommends that preventive programs be developed. Section 2, "Causes of Suicide," analyzes stress and depression among youth and suggests…

  9. 50 CFR 300.183 - Permit holder reporting and recordkeeping requirements.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... person required to obtain a trade permit under § 300.182 retains, at his/her principal place of business... his/her principal place of business, a copy of each biweekly report and all supporting records for a... regulated under this subpart, biweekly reports, statistical documents, catch documents, re-export...

  10. Connecting Principal Leadership, Teacher Collaboration, and Student Achievement

    ERIC Educational Resources Information Center

    Goddard, Yvonne L.; Miller, Robert; Larsen, Ross; Goddard, Roger; Madsen, Jean; Schroeder, Patricia

    2010-01-01

    The purpose of this paper was to test the relationship between principal leadership and teacher collaboration around instructional improvement to determine whether these measures were statistically related and whether, together, they were associated with academic achievement in elementary schools. Data were obtained from 1,600 teachers in 96…

  11. Statistical Data Analyses of Trace Chemical, Biochemical, and Physical Analytical Signatures

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Udey, Ruth Norma

    Analytical and bioanalytical chemistry measurement results are most meaningful when interpreted using rigorous statistical treatments of the data. The same data set may provide many dimensions of information depending on the questions asked through the applied statistical methods. Three principal projects illustrated the wealth of information gained through the application of statistical data analyses to diverse problems.

  12. Investigation of cell wall composition related to stem lodging resistance in wheat (Triticum aestivum L.) by FTIR spectroscopy.

    PubMed

    Wang, Jian; Zhu, Jinmao; Huang, RuZhu; Yang, YuSheng

    2012-07-01

    We explored the rapid qualitative analysis of wheat cultivars with good lodging resistances by Fourier transform infrared resonance (FTIR) spectroscopy and multivariate statistical analysis. FTIR imaging showing that wheat stem cell walls were mainly composed of cellulose, pectin, protein, and lignin. Principal components analysis (PCA) was used to eliminate multicollinearity among multiple peak absorptions. PCA revealed the developmental internodes of wheat stems could be distributed from low to high along the load of the second principal component, which was consistent with the corresponding bands of cellulose in the FTIR spectra of the cell walls. Furthermore, four distinct stem populations could also be identified by spectral features related to their corresponding mechanical properties via PCA and cluster analysis. Histochemical staining of four types of wheat stems with various abilities to resist lodging revealed that cellulose contributed more than lignin to the ability to resist lodging. These results strongly suggested that the main cell wall component responsible for these differences was cellulose. Therefore, the combination of multivariate analysis and FTIR could rapidly screen wheat cultivars with good lodging resistance. Furthermore, the application of these methods to a much wider range of cultivars of unknown mechanical properties promises to be of interest.

  13. Automatically detect and track infrared small targets with kernel Fukunaga-Koontz transform and Kalman prediction.

    PubMed

    Liu, Ruiming; Liu, Erqi; Yang, Jie; Zeng, Yong; Wang, Fanglin; Cao, Yuan

    2007-11-01

    Fukunaga-Koontz transform (FKT), stemming from principal component analysis (PCA), is used in many pattern recognition and image-processing fields. It cannot capture the higher-order statistical property of natural images, so its detection performance is not satisfying. PCA has been extended into kernel PCA in order to capture the higher-order statistics. However, thus far there have been no researchers who have definitely proposed kernel FKT (KFKT) and researched its detection performance. For accurately detecting potential small targets from infrared images, we first extend FKT into KFKT to capture the higher-order statistical properties of images. Then a framework based on Kalman prediction and KFKT, which can automatically detect and track small targets, is developed. Results of experiments show that KFKT outperforms FKT and the proposed framework is competent to automatically detect and track infrared point targets.

  14. Automatically detect and track infrared small targets with kernel Fukunaga-Koontz transform and Kalman prediction

    NASA Astrophysics Data System (ADS)

    Liu, Ruiming; Liu, Erqi; Yang, Jie; Zeng, Yong; Wang, Fanglin; Cao, Yuan

    2007-11-01

    Fukunaga-Koontz transform (FKT), stemming from principal component analysis (PCA), is used in many pattern recognition and image-processing fields. It cannot capture the higher-order statistical property of natural images, so its detection performance is not satisfying. PCA has been extended into kernel PCA in order to capture the higher-order statistics. However, thus far there have been no researchers who have definitely proposed kernel FKT (KFKT) and researched its detection performance. For accurately detecting potential small targets from infrared images, we first extend FKT into KFKT to capture the higher-order statistical properties of images. Then a framework based on Kalman prediction and KFKT, which can automatically detect and track small targets, is developed. Results of experiments show that KFKT outperforms FKT and the proposed framework is competent to automatically detect and track infrared point targets.

  15. Batch Statistical Process Monitoring Approach to a Cocrystallization Process.

    PubMed

    Sarraguça, Mafalda C; Ribeiro, Paulo R S; Dos Santos, Adenilson O; Lopes, João A

    2015-12-01

    Cocrystals are defined as crystalline structures composed of two or more compounds that are solid at room temperature held together by noncovalent bonds. Their main advantages are the increase of solubility, bioavailability, permeability, stability, and at the same time retaining active pharmaceutical ingredient bioactivity. The cocrystallization between furosemide and nicotinamide by solvent evaporation was monitored on-line using near-infrared spectroscopy (NIRS) as a process analytical technology tool. The near-infrared spectra were analyzed using principal component analysis. Batch statistical process monitoring was used to create control charts to perceive the process trajectory and define control limits. Normal and non-normal operating condition batches were performed and monitored with NIRS. The use of NIRS associated with batch statistical process models allowed the detection of abnormal variations in critical process parameters, like the amount of solvent or amount of initial components present in the cocrystallization. © 2015 Wiley Periodicals, Inc. and the American Pharmacists Association.

  16. Characterization of Type Ia Supernova Light Curves Using Principal Component Analysis of Sparse Functional Data

    NASA Astrophysics Data System (ADS)

    He, Shiyuan; Wang, Lifan; Huang, Jianhua Z.

    2018-04-01

    With growing data from ongoing and future supernova surveys, it is possible to empirically quantify the shapes of SNIa light curves in more detail, and to quantitatively relate the shape parameters with the intrinsic properties of SNIa. Building such relationships is critical in controlling systematic errors associated with supernova cosmology. Based on a collection of well-observed SNIa samples accumulated in the past years, we construct an empirical SNIa light curve model using a statistical method called the functional principal component analysis (FPCA) for sparse and irregularly sampled functional data. Using this method, the entire light curve of an SNIa is represented by a linear combination of principal component functions, and the SNIa is represented by a few numbers called “principal component scores.” These scores are used to establish relations between light curve shapes and physical quantities such as intrinsic color, interstellar dust reddening, spectral line strength, and spectral classes. These relations allow for descriptions of some critical physical quantities based purely on light curve shape parameters. Our study shows that some important spectral feature information is being encoded in the broad band light curves; for instance, we find that the light curve shapes are correlated with the velocity and velocity gradient of the Si II λ6355 line. This is important for supernova surveys (e.g., LSST and WFIRST). Moreover, the FPCA light curve model is used to construct the entire light curve shape, which in turn is used in a functional linear form to adjust intrinsic luminosity when fitting distance models.

  17. Multivariate analysis for stormwater quality characteristics identification from different urban surface types in macau.

    PubMed

    Huang, J; Du, P; Ao, C; Ho, M; Lei, M; Zhao, D; Wang, Z

    2007-12-01

    Statistical analysis of stormwater runoff data enables general identification of runoff characteristics. Six catchments with different urban surface type including roofs, roadway, park, and residential/commercial in Macau were selected for sampling and study during the period from June 2005 to September 2006. Based on univariate statistical analysis of data sampled, major pollutants discharged from different urban surface type were identified. As for iron roof runoff, Zn is the most significant pollutant. The major pollutants from urban roadway runoff are TSS and COD. Stormwater runoff from commercial/residential and Park catchments show high level of COD, TN, and TP concentration. Principal component analysis was further done for identification of linkages between stormwater quality and urban surface types. Two potential pollution sources were identified for study catchments with different urban surface types. The first one is referred as nutrients losses, soil losses and organic pollutants discharges, the second is related to heavy metals losses. PCA was proved to be a viable tool to explain the type of pollution sources and its mechanism for different urban surface type catchments.

  18. 3D Texture Analysis in Renal Cell Carcinoma Tissue Image Grading

    PubMed Central

    Cho, Nam-Hoon; Choi, Heung-Kook

    2014-01-01

    One of the most significant processes in cancer cell and tissue image analysis is the efficient extraction of features for grading purposes. This research applied two types of three-dimensional texture analysis methods to the extraction of feature values from renal cell carcinoma tissue images, and then evaluated the validity of the methods statistically through grade classification. First, we used a confocal laser scanning microscope to obtain image slices of four grades of renal cell carcinoma, which were then reconstructed into 3D volumes. Next, we extracted quantitative values using a 3D gray level cooccurrence matrix (GLCM) and a 3D wavelet based on two types of basis functions. To evaluate their validity, we predefined 6 different statistical classifiers and applied these to the extracted feature sets. In the grade classification results, 3D Haar wavelet texture features combined with principal component analysis showed the best discrimination results. Classification using 3D wavelet texture features was significantly better than 3D GLCM, suggesting that the former has potential for use in a computer-based grading system. PMID:25371701

  19. SPSS and SAS programs for determining the number of components using parallel analysis and velicer's MAP test.

    PubMed

    O'Connor, B P

    2000-08-01

    Popular statistical software packages do not have the proper procedures for determining the number of components in factor and principal components analyses. Parallel analysis and Velicer's minimum average partial (MAP) test are validated procedures, recommended widely by statisticians. However, many researchers continue to use alternative, simpler, but flawed procedures, such as the eigenvalues-greater-than-one rule. Use of the proper procedures might be increased if these procedures could be conducted within familiar software environments. This paper describes brief and efficient programs for using SPSS and SAS to conduct parallel analyses and the MAP test.

  20. Pulse gas chromatographic study of adsorption of substituted aromatics and heterocyclic molecules on MIL-47 at zero coverage.

    PubMed

    Duerinck, Tim; Couck, Sarah; Vermoortele, Frederik; De Vos, Dirk E; Baron, Gino V; Denayer, Joeri F M

    2012-10-02

    The low coverage adsorptive properties of the MIL-47 metal organic framework toward aromatic and heterocyclic molecules are reported in this paper. The effect of molecular functionality and size on Henry adsorption constants and adsorption enthalpies of alkyl and heteroatom functionalized benzene derivates and heterocyclic molecules was studied using pulse gas chromatography. By means of statistical analysis, experimental data was analyzed and modeled using principal component analysis and partial least-squares regression. Structure-property relationships were established, revealing and confirming several trends. Among the molecular properties governing the adsorption process, vapor pressure, mean polarizability, and dipole moment play a determining role.

  1. Quantitative research.

    PubMed

    Watson, Roger

    2015-04-01

    This article describes the basic tenets of quantitative research. The concepts of dependent and independent variables are addressed and the concept of measurement and its associated issues, such as error, reliability and validity, are explored. Experiments and surveys – the principal research designs in quantitative research – are described and key features explained. The importance of the double-blind randomised controlled trial is emphasised, alongside the importance of longitudinal surveys, as opposed to cross-sectional surveys. Essential features of data storage are covered, with an emphasis on safe, anonymous storage. Finally, the article explores the analysis of quantitative data, considering what may be analysed and the main uses of statistics in analysis.

  2. Statistical shape modeling of human cochlea: alignment and principal component analysis

    NASA Astrophysics Data System (ADS)

    Poznyakovskiy, Anton A.; Zahnert, Thomas; Fischer, Björn; Lasurashvili, Nikoloz; Kalaidzidis, Yannis; Mürbe, Dirk

    2013-02-01

    The modeling of the cochlear labyrinth in living subjects is hampered by insufficient resolution of available clinical imaging methods. These methods usually provide resolutions higher than 125 μm. This is too crude to record the position of basilar membrane and, as a result, keep apart even the scala tympani from other scalae. This problem could be avoided by the means of atlas-based segmentation. The specimens can endure higher radiation loads and, conversely, provide better-resolved images. The resulting surface can be used as the seed for atlas-based segmentation. To serve this purpose, we have developed a statistical shape model (SSM) of human scala tympani based on segmentations obtained from 10 μCT image stacks. After segmentation, we aligned the resulting surfaces using Procrustes alignment. This algorithm was slightly modified to accommodate single models with nodes which do not necessarily correspond to salient features and vary in number between models. We have established correspondence by mutual proximity between nodes. Rather than using the standard Euclidean norm, we have applied an alternative logarithmic norm to improve outlier treatment. The minimization was done using BFGS method. We have also split the surface nodes along an octree to reduce computation cost. Subsequently, we have performed the principal component analysis of the training set with Jacobi eigenvalue algorithm. We expect the resulting method to help acquiring not only better understanding in interindividual variations of cochlear anatomy, but also a step towards individual models for pre-operative diagnostics prior to cochlear implant insertions.

  3. A first application of independent component analysis to extracting structure from stock returns.

    PubMed

    Back, A D; Weigend, A S

    1997-08-01

    This paper explores the application of a signal processing technique known as independent component analysis (ICA) or blind source separation to multivariate financial time series such as a portfolio of stocks. The key idea of ICA is to linearly map the observed multivariate time series into a new space of statistically independent components (ICs). We apply ICA to three years of daily returns of the 28 largest Japanese stocks and compare the results with those obtained using principal component analysis. The results indicate that the estimated ICs fall into two categories, (i) infrequent large shocks (responsible for the major changes in the stock prices), and (ii) frequent smaller fluctuations (contributing little to the overall level of the stocks). We show that the overall stock price can be reconstructed surprisingly well by using a small number of thresholded weighted ICs. In contrast, when using shocks derived from principal components instead of independent components, the reconstructed price is less similar to the original one. ICA is shown to be a potentially powerful method of analyzing and understanding driving mechanisms in financial time series. The application to portfolio optimization is described in Chin and Weigend (1998).

  4. Screening of oil sources by using comprehensive two-dimensional gas chromatography/time-of-flight mass spectrometry and multivariate statistical analysis.

    PubMed

    Zhang, Wanfeng; Zhu, Shukui; He, Sheng; Wang, Yanxin

    2015-02-06

    Using comprehensive two-dimensional gas chromatography coupled to time-of-flight mass spectrometry (GC×GC/TOFMS), volatile and semi-volatile organic compounds in crude oil samples from different reservoirs or regions were analyzed for the development of a molecular fingerprint database. Based on the GC×GC/TOFMS fingerprints of crude oils, principal component analysis (PCA) and cluster analysis were used to distinguish the oil sources and find biomarkers. As a supervised technique, the geological characteristics of crude oils, including thermal maturity, sedimentary environment etc., are assigned to the principal components. The results show that tri-aromatic steroid (TAS) series are the suitable marker compounds in crude oils for the oil screening, and the relative abundances of individual TAS compounds have excellent correlation with oil sources. In order to correct the effects of some other external factors except oil sources, the variables were defined as the content ratio of some target compounds and 13 parameters were proposed for the screening of oil sources. With the developed model, the crude oils were easily discriminated, and the result is in good agreement with the practical geological setting. Copyright © 2014 Elsevier B.V. All rights reserved.

  5. What Are the Characteristics of Principals Identified As Effective by Teachers?

    ERIC Educational Resources Information Center

    Fowler, William J., Jr.

    This exploratory study investigated which characteristics of a principal are identified as effective by teachers in the same school setting. The data were obtained from the Schools and Staffing Study of 1988, from the National Center for Education Statistics (NCES). The Teacher Questionnaire of the Schools and Staffing Survey (SASS) questioned…

  6. The balanced survivor average causal effect.

    PubMed

    Greene, Tom; Joffe, Marshall; Hu, Bo; Li, Liang; Boucher, Ken

    2013-05-07

    Statistical analysis of longitudinal outcomes is often complicated by the absence of observable values in patients who die prior to their scheduled measurement. In such cases, the longitudinal data are said to be "truncated by death" to emphasize that the longitudinal measurements are not simply missing, but are undefined after death. Recently, the truncation by death problem has been investigated using the framework of principal stratification to define the target estimand as the survivor average causal effect (SACE), which in the context of a two-group randomized clinical trial is the mean difference in the longitudinal outcome between the treatment and control groups for the principal stratum of always-survivors. The SACE is not identified without untestable assumptions. These assumptions have often been formulated in terms of a monotonicity constraint requiring that the treatment does not reduce survival in any patient, in conjunction with assumed values for mean differences in the longitudinal outcome between certain principal strata. In this paper, we introduce an alternative estimand, the balanced-SACE, which is defined as the average causal effect on the longitudinal outcome in a particular subset of the always-survivors that is balanced with respect to the potential survival times under the treatment and control. We propose a simple estimator of the balanced-SACE that compares the longitudinal outcomes between equivalent fractions of the longest surviving patients between the treatment and control groups and does not require a monotonicity assumption. We provide expressions for the large sample bias of the estimator, along with sensitivity analyses and strategies to minimize this bias. We consider statistical inference under a bootstrap resampling procedure.

  7. An Empirical Cumulus Parameterization Scheme for a Global Spectral Model

    NASA Technical Reports Server (NTRS)

    Rajendran, K.; Krishnamurti, T. N.; Misra, V.; Tao, W.-K.

    2004-01-01

    Realistic vertical heating and drying profiles in a cumulus scheme is important for obtaining accurate weather forecasts. A new empirical cumulus parameterization scheme based on a procedure to improve the vertical distribution of heating and moistening over the tropics is developed. The empirical cumulus parameterization scheme (ECPS) utilizes profiles of Tropical Rainfall Measuring Mission (TRMM) based heating and moistening derived from the European Centre for Medium- Range Weather Forecasts (ECMWF) analysis. A dimension reduction technique through rotated principal component analysis (RPCA) is performed on the vertical profiles of heating (Q1) and drying (Q2) over the convective regions of the tropics, to obtain the dominant modes of variability. Analysis suggests that most of the variance associated with the observed profiles can be explained by retaining the first three modes. The ECPS then applies a statistical approach in which Q1 and Q2 are expressed as a linear combination of the first three dominant principal components which distinctly explain variance in the troposphere as a function of the prevalent large-scale dynamics. The principal component (PC) score which quantifies the contribution of each PC to the corresponding loading profile is estimated through a multiple screening regression method which yields the PC score as a function of the large-scale variables. The profiles of Q1 and Q2 thus obtained are found to match well with the observed profiles. The impact of the ECPS is investigated in a series of short range (1-3 day) prediction experiments using the Florida State University global spectral model (FSUGSM, T126L14). Comparisons between short range ECPS forecasts and those with the modified Kuo scheme show a very marked improvement in the skill in ECPS forecasts. This improvement in the forecast skill with ECPS emphasizes the importance of incorporating realistic vertical distributions of heating and drying in the model cumulus scheme. This also suggests that in the absence of explicit models for convection, the proposed statistical scheme improves the modeling of the vertical distribution of heating and moistening in areas of deep convection.

  8. Investigation of carbon dioxide emission in China by primary component analysis.

    PubMed

    Zhang, Jing; Wang, Cheng-Ming; Liu, Lian; Guo, Hang; Liu, Guo-Dong; Li, Yuan-Wei; Deng, Shi-Huai

    2014-02-15

    Principal component analysis (PCA) is employed to investigate the relationship between CO2 emissions (COEs) stemming from fossil fuel burning and cement manufacturing and their affecting factors. Eight affecting factors, namely, Population (P), Urban Population (UP); the Output Values of Primary Industry (PIOV), Secondary Industry (SIOV), and Tertiary Industry (TIOV); and the Proportions of Primary Industry's Output Value (PPIOV), Secondary Industry's Output Value (PSIOV), and Tertiary Industry's Output Value (PTIOV), are chosen. PCA is employed to eliminate the multicollinearity of the affecting factors. Two principal components, which can explain 92.86% of the variance of the eight affecting factors, are chosen as variables in the regression analysis. Ordinary least square regression is used to estimate multiple linear regression models, in which COEs and the principal components serve as dependent and independent variables, respectively. The results are given in the following. (1) Theoretically, the carbon intensities of PIOV, SIOV, and TIOV are 2573.4693, 552.7036, and 606.0791 kt per one billion $, respectively. The incomplete statistical data, the different statistical standards, and the ideology of self sufficiency and peasantry appear to show that the carbon intensity of PIOV is higher than those of SIOV and TIOV in China. (2) PPIOV, PSIOV, and PTIOV influence the fluctuations of COE. The parameters of PPIOV, PSIOV, and PTIOV are -2706946.7564, 2557300.5450, and 3924767.9807 kt, respectively. As the economic structure of China is strongly tied to technology level, the period when PIOV plays the leading position is characterized by lagging technology and economic developing. Thus, the influence of PPIOV has a negative value. As the increase of PSIOV and PTIOV is always followed by technological innovation and economic development, PSIOV and PTIOV have the opposite influence. (3) The carbon intensities of P and UP are 1.1029 and 1.7862 kt per thousand people, respectively. The carbon intensity of the rural population can be inferred to be lower than 1.1029 kt per thousand people. The characteristics of poverty and the use of bio-energy in rural areas result in a carbon intensity of the rural population that is lower than that of P. Copyright © 2013 Elsevier B.V. All rights reserved.

  9. Interannual variability of the frequency and intensity of tropical cyclones striking the California coast

    NASA Astrophysics Data System (ADS)

    Mendez, F. J.; Rueda, A.; Barnard, P.; Mori, N.; Nakajo, S.; Albuquerque, J.

    2016-12-01

    Hurricanes hitting California have a very low ocurrence probability due to typically cool ocean temperature and westward tracks. However, damages associated to these improbable events would be dramatic in Southern California and understanding the oceanographic and atmospheric drivers is of paramount importance for coastal risk management for present and future climates. A statistical analysis of the historical events is very difficult due to the limited resolution of atmospheric and oceanographic forcing data available. In this work, we propose a combination of: (a) climate-based statistical downscaling methods (Espejo et al, 2015); and (b) a synthetic stochastic tropical cyclone (TC) model (Nakajo et al, 2014). To build the statistical downscaling model, Y=f(X), we apply a combination of principal component analysis and the k-means classification algorithm to find representative patterns from large-scale may-to-november averaged monthly anomalies of SST and thermocline depth fields in Tropical Pacific (predictor X) and the associated historical tropical cyclones in Eastern North Pacific basin (predictand Y). As data for the historical occurrence and paths of tropical cyclones are scarce, we apply a stochastic TC model which is based on a Monte Carlo simulation of the joint distribution of track, minimum sea level pressure and translation speed of the historical events in the Eastern Central Pacific Ocean. Results will show the ability of the approach to explain the interannual variability of the frequency and intensity of TCs in Southern California, which is clearly related to post El Niño Eastern Pacific and El Niño Central Pacific. References Espejo, A., Méndez, F.J., Diez, J., Medina, R., Al-Yahyai, S. (2015) Seasonal probabilistic forecasting of tropical cyclone activity in the North Indian Ocean, Journal of Flood Risk Management, DOI: 10.1111/jfr3.12197 Nakajo, S., N. Mori, T. Yasuda, and H. Mase (2014) Global Stochastic Tropical Cyclone Model Based on Principal Component Analysis and Cluster Analysis, Journal of Applied Meteorology and Climatology, DOI: 10.1175/JAMC-D-13-08.1

  10. Reservoir zonation based on statistical analyses: A case study of the Nubian sandstone, Gulf of Suez, Egypt

    NASA Astrophysics Data System (ADS)

    El Sharawy, Mohamed S.; Gaafar, Gamal R.

    2016-12-01

    Both reservoir engineers and petrophysicists have been concerned about dividing a reservoir into zones for engineering and petrophysics purposes. Through decades, several techniques and approaches were introduced. Out of them, statistical reservoir zonation, stratigraphic modified Lorenz (SML) plot and the principal component and clustering analyses techniques were chosen to apply on the Nubian sandstone reservoir of Palaeozoic - Lower Cretaceous age, Gulf of Suez, Egypt, by using five adjacent wells. The studied reservoir consists mainly of sandstone with some intercalation of shale layers with varying thickness from one well to another. The permeability ranged from less than 1 md to more than 1000 md. The statistical reservoir zonation technique, depending on core permeability, indicated that the cored interval of the studied reservoir can be divided into two zones. Using reservoir properties such as porosity, bulk density, acoustic impedance and interval transit time indicated also two zones with an obvious variation in separation depth and zones continuity. The stratigraphic modified Lorenz (SML) plot indicated the presence of more than 9 flow units in the cored interval as well as a high degree of microscopic heterogeneity. On the other hand, principal component and cluster analyses, depending on well logging data (gamma ray, sonic, density and neutron), indicated that the whole reservoir can be divided at least into four electrofacies having a noticeable variation in reservoir quality, as correlated with the measured permeability. Furthermore, continuity or discontinuity of the reservoir zones can be determined using this analysis.

  11. An Independent Filter for Gene Set Testing Based on Spectral Enrichment.

    PubMed

    Frost, H Robert; Li, Zhigang; Asselbergs, Folkert W; Moore, Jason H

    2015-01-01

    Gene set testing has become an indispensable tool for the analysis of high-dimensional genomic data. An important motivation for testing gene sets, rather than individual genomic variables, is to improve statistical power by reducing the number of tested hypotheses. Given the dramatic growth in common gene set collections, however, testing is often performed with nearly as many gene sets as underlying genomic variables. To address the challenge to statistical power posed by large gene set collections, we have developed spectral gene set filtering (SGSF), a novel technique for independent filtering of gene set collections prior to gene set testing. The SGSF method uses as a filter statistic the p-value measuring the statistical significance of the association between each gene set and the sample principal components (PCs), taking into account the significance of the associated eigenvalues. Because this filter statistic is independent of standard gene set test statistics under the null hypothesis but dependent under the alternative, the proportion of enriched gene sets is increased without impacting the type I error rate. As shown using simulated and real gene expression data, the SGSF algorithm accurately filters gene sets unrelated to the experimental outcome resulting in significantly increased gene set testing power.

  12. Reproducibility-optimized test statistic for ranking genes in microarray studies.

    PubMed

    Elo, Laura L; Filén, Sanna; Lahesmaa, Riitta; Aittokallio, Tero

    2008-01-01

    A principal goal of microarray studies is to identify the genes showing differential expression under distinct conditions. In such studies, the selection of an optimal test statistic is a crucial challenge, which depends on the type and amount of data under analysis. While previous studies on simulated or spike-in datasets do not provide practical guidance on how to choose the best method for a given real dataset, we introduce an enhanced reproducibility-optimization procedure, which enables the selection of a suitable gene- anking statistic directly from the data. In comparison with existing ranking methods, the reproducibilityoptimized statistic shows good performance consistently under various simulated conditions and on Affymetrix spike-in dataset. Further, the feasibility of the novel statistic is confirmed in a practical research setting using data from an in-house cDNA microarray study of asthma-related gene expression changes. These results suggest that the procedure facilitates the selection of an appropriate test statistic for a given dataset without relying on a priori assumptions, which may bias the findings and their interpretation. Moreover, the general reproducibilityoptimization procedure is not limited to detecting differential expression only but could be extended to a wide range of other applications as well.

  13. QSAR study of anthranilic acid sulfonamides as inhibitors of methionine aminopeptidase-2 using LS-SVM and GRNN based on principal components.

    PubMed

    Shahlaei, Mohsen; Sabet, Razieh; Ziari, Maryam Bahman; Moeinifard, Behzad; Fassihi, Afshin; Karbakhsh, Reza

    2010-10-01

    Quantitative relationships between molecular structure and methionine aminopeptidase-2 inhibitory activity of a series of cytotoxic anthranilic acid sulfonamide derivatives were discovered. We have demonstrated the detailed application of two efficient nonlinear methods for evaluation of quantitative structure-activity relationships of the studied compounds. Components produced by principal component analysis as input of developed nonlinear models were used. The performance of the developed models namely PC-GRNN and PC-LS-SVM were tested by several validation methods. The resulted PC-LS-SVM model had a high statistical quality (R(2)=0.91 and R(CV)(2)=0.81) for predicting the cytotoxic activity of the compounds. Comparison between predictability of PC-GRNN and PC-LS-SVM indicates that later method has higher ability to predict the activity of the studied molecules. Copyright (c) 2010 Elsevier Masson SAS. All rights reserved.

  14. Fractal analysis of scatter imaging signatures to distinguish breast pathologies

    NASA Astrophysics Data System (ADS)

    Eguizabal, Alma; Laughney, Ashley M.; Krishnaswamy, Venkataramanan; Wells, Wendy A.; Paulsen, Keith D.; Pogue, Brian W.; López-Higuera, José M.; Conde, Olga M.

    2013-02-01

    Fractal analysis combined with a label-free scattering technique is proposed for describing the pathological architecture of tumors. Clinicians and pathologists are conventionally trained to classify abnormal features such as structural irregularities or high indices of mitosis. The potential of fractal analysis lies in the fact of being a morphometric measure of the irregular structures providing a measure of the object's complexity and self-similarity. As cancer is characterized by disorder and irregularity in tissues, this measure could be related to tumor growth. Fractal analysis has been probed in the understanding of the tumor vasculature network. This work addresses the feasibility of applying fractal analysis to the scattering power map (as a physical modeling) and principal components (as a statistical modeling) provided by a localized reflectance spectroscopic system. Disorder, irregularity and cell size variation in tissue samples is translated into the scattering power and principal components magnitude and its fractal dimension is correlated with the pathologist assessment of the samples. The fractal dimension is computed applying the box-counting technique. Results show that fractal analysis of ex-vivo fresh tissue samples exhibits separated ranges of fractal dimension that could help classifier combining the fractal results with other morphological features. This contrast trend would help in the discrimination of tissues in the intraoperative context and may serve as a useful adjunct to surgeons.

  15. Characterization and Discrimination of Oueslati Virgin Olive Oils from Adult and Young Trees in Different Ripening Stages Using Sterols, Pigments, and Alcohols in Tandem with Chemometrics.

    PubMed

    Chtourou, Fatma; Jabeur, Hazem; Lazzez, Ayda; Bouaziz, Mohamed

    2017-05-03

    Dynamics of squalene, sterol, aliphatic alcohol, pigment, and triterpenic diol accumulations in olive oils from adult and young trees of the Oueslati cultivar were studied for two consecutive years, 2013-2014 and 2014-2015. Data were compared statistically for differences by age of trees, maturation of olive, and year of harvesting. Results showed that the mean campesterol content in olive oil from adult trees at the green stage of maturation was significantly (p < 0.02) above the limit established by IOC legislation. However, the mean values of campesterol and Δ-7-stigmastenol were significantly (p < 0.01) above the limits in oils from young trees at the black stage of ripening. Principal component analysis was applied to alcohols, squalene, pigments, and sterols having noncompliance with the legislation. Then, data of 36 samples were subjected to a discriminant analysis with "maturation" as grouping variable and principal components as input variables. The model revealed clear discrimination of each tree age/maturation stage group.

  16. The role of water chemistry in the distribution of Austropotamobius pallipes (Crustacea Decapoda Astacidae) in Piedmont (Italy).

    PubMed

    Favaro, Livio; Tirelli, Tina; Pessani, Daniela

    2010-01-01

    Over the last decades, the populations of Austropotamobius pallipes have decreased markedly all over Europe. If we evaluate the ecological factors that determine its presence, we will have information that could guide conservation decisions. This study aims to investigate the chemical-physical demands of A. pallipes in NW Italy. To this end, we investigated 98 sites. We performed Principal Component Analysis using chemical-physical parameters, collected in both presence and absence sites. We then used principal components with eigenvalue > 1 to run Discriminant Function Analysis and Logistic Regression. The statistics on the concentration of Ca(2+), water hardness, pH and BOD(5) were significantly different in the presence and in the absence sites. pH and BOD(5) played the most important role in separating the presence from the absence locations. These findings are further evidence that we should reduce dissolved organic matter and fine particles in order to contribute to species management and conservation. Copyright 2009 Académie des sciences. Published by Elsevier SAS. All rights reserved.

  17. Crude oil price forecasting based on hybridizing wavelet multiple linear regression model, particle swarm optimization techniques, and principal component analysis.

    PubMed

    Shabri, Ani; Samsudin, Ruhaidah

    2014-01-01

    Crude oil prices do play significant role in the global economy and are a key input into option pricing formulas, portfolio allocation, and risk measurement. In this paper, a hybrid model integrating wavelet and multiple linear regressions (MLR) is proposed for crude oil price forecasting. In this model, Mallat wavelet transform is first selected to decompose an original time series into several subseries with different scale. Then, the principal component analysis (PCA) is used in processing subseries data in MLR for crude oil price forecasting. The particle swarm optimization (PSO) is used to adopt the optimal parameters of the MLR model. To assess the effectiveness of this model, daily crude oil market, West Texas Intermediate (WTI), has been used as the case study. Time series prediction capability performance of the WMLR model is compared with the MLR, ARIMA, and GARCH models using various statistics measures. The experimental results show that the proposed model outperforms the individual models in forecasting of the crude oil prices series.

  18. A Principal Component Analysis/Fuzzy Comprehensive Evaluation for Rockburst Potential in Kimberlite

    NASA Astrophysics Data System (ADS)

    Pu, Yuanyuan; Apel, Derek; Xu, Huawei

    2018-02-01

    Kimberlite is an igneous rock which sometimes bears diamonds. Most of the diamonds mined in the world today are found in kimberlite ores. Burst potential in kimberlite has not been investigated, because kimberlite is mostly mined using open-pit mining, which poses very little threat of rock bursting. However, as the mining depth keeps increasing, the mines convert to underground mining methods, which can pose a threat of rock bursting in kimberlite. This paper focuses on the burst potential of kimberlite at a diamond mine in northern Canada. A combined model with the methods of principal component analysis (PCA) and fuzzy comprehensive evaluation (FCE) is developed to process data from 12 different locations in kimberlite pipes. Based on calculated 12 fuzzy evaluation vectors, 8 locations show a moderate burst potential, 2 locations show no burst potential, and 2 locations show strong and violent burst potential, respectively. Using statistical principles, a Mahalanobis distance is adopted to build a comprehensive fuzzy evaluation vector for the whole mine and the final evaluation for burst potential is moderate, which is verified by a practical rockbursting situation at mine site.

  19. Physician performance assessment using a composite quality index.

    PubMed

    Liu, Kaibo; Jain, Shabnam; Shi, Jianjun

    2013-07-10

    Assessing physician performance is important for the purposes of measuring and improving quality of service and reducing healthcare delivery costs. In recent years, physician performance scorecards have been used to provide feedback on individual measures; however, one key challenge is how to develop a composite quality index that combines multiple measures for overall physician performance evaluation. A controversy arises over establishing appropriate weights to combine indicators in multiple dimensions, and cannot be easily resolved. In this study, we proposed a generic unsupervised learning approach to develop a single composite index for physician performance assessment by using non-negative principal component analysis. We developed a new algorithm named iterative quadratic programming to solve the numerical issue in the non-negative principal component analysis approach. We conducted real case studies to demonstrate the performance of the proposed method. We provided interpretations from both statistical and clinical perspectives to evaluate the developed composite ranking score in practice. In addition, we implemented the root cause assessment techniques to explain physician performance for improvement purposes. Copyright © 2012 John Wiley & Sons, Ltd.

  20. Dynamics and spatio-temporal variability of environmental factors in Eastern Australia using functional principal component analysis

    USGS Publications Warehouse

    Szabo, J.K.; Fedriani, E.M.; Segovia-Gonzalez, M. M.; Astheimer, L.B.; Hooper, M.J.

    2010-01-01

    This paper introduces a new technique in ecology to analyze spatial and temporal variability in environmental variables. By using simple statistics, we explore the relations between abiotic and biotic variables that influence animal distributions. However, spatial and temporal variability in rainfall, a key variable in ecological studies, can cause difficulties to any basic model including time evolution. The study was of a landscape scale (three million square kilometers in eastern Australia), mainly over the period of 19982004. We simultaneously considered qualitative spatial (soil and habitat types) and quantitative temporal (rainfall) variables in a Geographical Information System environment. In addition to some techniques commonly used in ecology, we applied a new method, Functional Principal Component Analysis, which proved to be very suitable for this case, as it explained more than 97% of the total variance of the rainfall data, providing us with substitute variables that are easier to manage and are even able to explain rainfall patterns. The main variable came from a habitat classification that showed strong correlations with rainfall values and soil types. ?? 2010 World Scientific Publishing Company.

  1. Crude Oil Price Forecasting Based on Hybridizing Wavelet Multiple Linear Regression Model, Particle Swarm Optimization Techniques, and Principal Component Analysis

    PubMed Central

    Shabri, Ani; Samsudin, Ruhaidah

    2014-01-01

    Crude oil prices do play significant role in the global economy and are a key input into option pricing formulas, portfolio allocation, and risk measurement. In this paper, a hybrid model integrating wavelet and multiple linear regressions (MLR) is proposed for crude oil price forecasting. In this model, Mallat wavelet transform is first selected to decompose an original time series into several subseries with different scale. Then, the principal component analysis (PCA) is used in processing subseries data in MLR for crude oil price forecasting. The particle swarm optimization (PSO) is used to adopt the optimal parameters of the MLR model. To assess the effectiveness of this model, daily crude oil market, West Texas Intermediate (WTI), has been used as the case study. Time series prediction capability performance of the WMLR model is compared with the MLR, ARIMA, and GARCH models using various statistics measures. The experimental results show that the proposed model outperforms the individual models in forecasting of the crude oil prices series. PMID:24895666

  2. Principal Component Relaxation Mode Analysis of an All-Atom Molecular Dynamics Simulation of Human Lysozyme

    NASA Astrophysics Data System (ADS)

    Nagai, Toshiki; Mitsutake, Ayori; Takano, Hiroshi

    2013-02-01

    A new relaxation mode analysis method, which is referred to as the principal component relaxation mode analysis method, has been proposed to handle a large number of degrees of freedom of protein systems. In this method, principal component analysis is carried out first and then relaxation mode analysis is applied to a small number of principal components with large fluctuations. To reduce the contribution of fast relaxation modes in these principal components efficiently, we have also proposed a relaxation mode analysis method using multiple evolution times. The principal component relaxation mode analysis method using two evolution times has been applied to an all-atom molecular dynamics simulation of human lysozyme in aqueous solution. Slow relaxation modes and corresponding relaxation times have been appropriately estimated, demonstrating that the method is applicable to protein systems.

  3. Assessing Statistical Competencies in Clinical and Translational Science Education: One Size Does Not Fit All

    PubMed Central

    Lindsell, Christopher J.; Welty, Leah J.; Mazumdar, Madhu; Thurston, Sally W.; Rahbar, Mohammad H.; Carter, Rickey E.; Pollock, Bradley H.; Cucchiara, Andrew J.; Kopras, Elizabeth J.; Jovanovic, Borko D.; Enders, Felicity T.

    2014-01-01

    Abstract Introduction Statistics is an essential training component for a career in clinical and translational science (CTS). Given the increasing complexity of statistics, learners may have difficulty selecting appropriate courses. Our question was: what depth of statistical knowledge do different CTS learners require? Methods For three types of CTS learners (principal investigator, co‐investigator, informed reader of the literature), each with different backgrounds in research (no previous research experience, reader of the research literature, previous research experience), 18 experts in biostatistics, epidemiology, and research design proposed levels for 21 statistical competencies. Results Statistical competencies were categorized as fundamental, intermediate, or specialized. CTS learners who intend to become independent principal investigators require more specialized training, while those intending to become informed consumers of the medical literature require more fundamental education. For most competencies, less training was proposed for those with more research background. Discussion When selecting statistical coursework, the learner's research background and career goal should guide the decision. Some statistical competencies are considered to be more important than others. Baseline knowledge assessments may help learners identify appropriate coursework. Conclusion Rather than one size fits all, tailoring education to baseline knowledge, learner background, and future goals increases learning potential while minimizing classroom time. PMID:25212569

  4. [Discrimination of types of polyacrylamide based on near infrared spectroscopy coupled with least square support vector machine].

    PubMed

    Zhang, Hong-Guang; Yang, Qin-Min; Lu, Jian-Gang

    2014-04-01

    In this paper, a novel discriminant methodology based on near infrared spectroscopic analysis technique and least square support vector machine was proposed for rapid and nondestructive discrimination of different types of Polyacrylamide. The diffuse reflectance spectra of samples of Non-ionic Polyacrylamide, Anionic Polyacrylamide and Cationic Polyacrylamide were measured. Then principal component analysis method was applied to reduce the dimension of the spectral data and extract of the principal compnents. The first three principal components were used for cluster analysis of the three different types of Polyacrylamide. Then those principal components were also used as inputs of least square support vector machine model. The optimization of the parameters and the number of principal components used as inputs of least square support vector machine model was performed through cross validation based on grid search. 60 samples of each type of Polyacrylamide were collected. Thus a total of 180 samples were obtained. 135 samples, 45 samples for each type of Polyacrylamide, were randomly split into a training set to build calibration model and the rest 45 samples were used as test set to evaluate the performance of the developed model. In addition, 5 Cationic Polyacrylamide samples and 5 Anionic Polyacrylamide samples adulterated with different proportion of Non-ionic Polyacrylamide were also prepared to show the feasibilty of the proposed method to discriminate the adulterated Polyacrylamide samples. The prediction error threshold for each type of Polyacrylamide was determined by F statistical significance test method based on the prediction error of the training set of corresponding type of Polyacrylamide in cross validation. The discrimination accuracy of the built model was 100% for prediction of the test set. The prediction of the model for the 10 mixing samples was also presented, and all mixing samples were accurately discriminated as adulterated samples. The overall results demonstrate that the discrimination method proposed in the present paper can rapidly and nondestructively discriminate the different types of Polyacrylamide and the adulterated Polyacrylamide samples, and offered a new approach to discriminate the types of Polyacrylamide.

  5. Effect of the statin therapy on biochemical laboratory tests--a chemometrics study.

    PubMed

    Durceková, Tatiana; Mocák, Ján; Boronová, Katarína; Balla, Ján

    2011-01-05

    Statins are the first-line choice for lowering total and LDL cholesterol levels and very important medicaments for reducing the risk of coronary artery disease. The aim of this study is therefore assessment of the results of biochemical tests characterizing the condition of 172 patients before and after administration of statins. For this purpose, several chemometric tools, namely principal component analysis, cluster analysis, discriminant analysis, logistic regression, KNN classification, ROC analysis, descriptive statistics and ANOVA were used. Mutual relations of 11 biochemical laboratory tests, the patient's age and gender were investigated in detail. Achieved results enable to evaluate the extent of the statin treatment in each individual case. They may also help in monitoring the dynamic progression of the disease. Copyright © 2010 Elsevier B.V. All rights reserved.

  6. Characteristics of Teachers Nominated for an Accelerated Principal Preparation Program

    ERIC Educational Resources Information Center

    Rios, Steve J.; Reyes-Guerra, Daniel

    2012-01-01

    This article reports the initial evaluation results of a new accelerated, job-embedded principal preparation program funded by a Race to the Top Grant (U.S. Department of Education, 2012a) in Florida. Descriptive statistics, t-tests, and chi-square analyses were used to describe the characteristics of a group of potential applicants nominated to…

  7. Public High School Principals' Perceptions of Academic Reform. Center for Education Statistics Bulletin.

    ERIC Educational Resources Information Center

    Center for Education Statistics (ED/OERI), Washington, DC.

    A survey of public high school principals asked which policies, programs, and practices designed to improve learning were currently in operation at their schools, and whether these policies were instituted or substantially strengthened in the past 5 years. These policies reflect the school-level recommendations for education reform made in "A…

  8. Adolescent Suicide. The Trauma of Adolescent Suicide. A Time for Special Leadership by Principals.

    ERIC Educational Resources Information Center

    Dempsey, Richard A.

    This monograph was written to help principals and other school personnel explore the issues of adolescent suicide and its prevention. Chapter 1 presents statistics on the incidence of adolescent suicides and suicide attempts. Chapter 2, Causes of Suicide, reviews developmental tasks of adolescence, lists several contributors to adolescent suicide,…

  9. A Correlational Study of Principals' Leadership Style and Teacher Absenteeism

    ERIC Educational Resources Information Center

    Carter, Jason

    2010-01-01

    The purpose of this study was to determine whether McGregor's Theory X and Theory Y, gender, age, and years of experience of principals form a composite explaining the variation in teacher absences. It sought to determine whether all or any of these variables would be statistically significant in explaining the variance in absences for teachers.…

  10. Assessment of groundwater quality of Ballia district, Uttar Pradesh, India, with reference to arsenic contamination using multivariate statistical analysis

    NASA Astrophysics Data System (ADS)

    Singh, Asha Lata; Singh, Vipin Kumar

    2018-06-01

    A total of 22 water quality parameters were selected for the analysis of groundwater samples with reference to arsenic contamination. Samples were collected in the pre-monsoon and monsoon seasons of the year 2013. The maximum arsenic concentration in both the pre-monsoon and monsoon seasons was approximately the same, i.e., the maximum arsenic concentration being 75.60 and 74.46 µg/L in pre-monsoon and monsoon, respectively. Out of 72 collected samples, three were below the WHO guideline value of 10 µg/L for arsenic concentration. In 95.83% of the groundwater samples, the arsenic concentration was above the permissible limit. Nickel, manganese, and chromium concentrations were above the permissible limits in nearly all samples except for chromium concentration in a few pre-monsoon samples. However, the total iron concentrations in 23 samples (31.94%) were above the permissible limit. A total of six and seven principal components (PCs) were extracted using principal component analysis during the pre-monsoon and monsoon seasons, respectively, accounting for 76.25 and 78.52% of the total variation during two consecutive seasons. Correlation statistics revealed that the arsenic concentration was positively correlated with phosphate, iron, ammonium, bicarbonate, and manganese concentrations but negatively correlated with oxidation reduction potential (ORP), sulfate concentration, electrical conductivity, and total dissolved solids concentration. The negative correlation of arsenic with ORP suggested reducing conditions prevailing in the groundwater. The trilinear Piper diagram revealed calcium and magnesium enrichment of groundwater with an abundance of chloride ions but no predominance of bicarbonate ions. Thus, the groundwater fell into Ca2+ - Mg2+ - Cl- - SO4 2- category.

  11. The development of an instrument to measure quality of vision: the Quality of Vision (QoV) questionnaire.

    PubMed

    McAlinden, Colm; Pesudovs, Konrad; Moore, Jonathan E

    2010-11-01

    To develop an instrument to measure subjective quality of vision: the Quality of Vision (QoV) questionnaire. A 30-item instrument was designed with 10 symptoms rated in each of three scales (frequency, severity, and bothersome). The QoV was completed by 900 subjects in groups of spectacle wearers, contact lens wearers, and those having had laser refractive surgery, intraocular refractive surgery, or eye disease and investigated with Rasch analysis and traditional statistics. Validity and reliability were assessed by Rasch fit statistics, principal components analysis (PCA), person separation, differential item functioning (DIF), item targeting, construct validity (correlation with visual acuity, contrast sensitivity, total root mean square [RMS] higher order aberrations [HOA]), and test-retest reliability (two-way random intraclass correlation coefficients [ICC] and 95% repeatability coefficients [R(c)]). Rasch analysis demonstrated good precision, reliability, and internal consistency for all three scales (mean square infit and outfit within 0.81-1.27; PCA >60% variance explained by the principal component; person separation 2.08, 2.10, and 2.01 respectively; and minimal DIF). Construct validity was indicated by strong correlations with visual acuity, contrast sensitivity and RMS HOA. Test-retest reliability was evidenced by a minimum ICC of 0.867 and a minimum 95% R(c) of 1.55 units. The QoV Questionnaire consists of a Rasch-tested, linear-scaled, 30-item instrument on three scales providing a QoV score in terms of symptom frequency, severity, and bothersome. It is suitable for measuring QoV in patients with all types of refractive correction, eye surgery, and eye disease that cause QoV problems.

  12. Characterizing pigments with hyperspectral imaging variable false-color composites

    NASA Astrophysics Data System (ADS)

    Hayem-Ghez, Anita; Ravaud, Elisabeth; Boust, Clotilde; Bastian, Gilles; Menu, Michel; Brodie-Linder, Nancy

    2015-11-01

    Hyperspectral imaging has been used for pigment characterization on paintings for the last 10 years. It is a noninvasive technique, which mixes the power of spectrophotometry and that of imaging technologies. We have access to a visible and near-infrared hyperspectral camera, ranging from 400 to 1000 nm in 80-160 spectral bands. In order to treat the large amount of data that this imaging technique generates, one can use statistical tools such as principal component analysis (PCA). To conduct the characterization of pigments, researchers mostly use PCA, convex geometry algorithms and the comparison of resulting clusters to database spectra with a specific tolerance (like the Spectral Angle Mapper tool on the dedicated software ENVI). Our approach originates from false-color photography and aims at providing a simple tool to identify pigments thanks to imaging spectroscopy. It can be considered as a quick first analysis to see the principal pigments of a painting, before using a more complete multivariate statistical tool. We study pigment spectra, for each kind of hue (blue, green, red and yellow) to identify the wavelength maximizing spectral differences. The case of red pigments is most interesting because our methodology can discriminate the red pigments very well—even red lakes, which are always difficult to identify. As for the yellow and blue categories, it represents a good progress of IRFC photography for pigment discrimination. We apply our methodology to study the pigments on a painting by Eustache Le Sueur, a French painter of the seventeenth century. We compare the results to other noninvasive analysis like X-ray fluorescence and optical microscopy. Finally, we draw conclusions about the advantages and limits of the variable false-color image method using hyperspectral imaging.

  13. Principal component analysis of biometric traits to reveal body confirmation in local hill cattle of Himalayan state of Himachal Pradesh, India.

    PubMed

    Verma, Deepak; Sankhyan, Varun; Katoch, Sanjeet; Thakur, Yash Pal

    2015-12-01

    In the present study, biometric traits (body length [BL], heart girth [HG], paunch girth (PG), forelimb length (FLL), hind limb length (HLL), face length, forehead width, forehead length, height at hump, hump length (HL), hook to hook distance, pin to pin distance, tail length (TL), TL up to switch, horn length, horn circumference, and ear length were studied in 218 adult hill cattle of Himachal Pradesh for phenotypic characterization. Morphological and biometrical observations were recorded on 218 hill cattle randomly selected from different districts within the breeding tract. Multivariate statistics and principal component analysis are used to account for the maximum portion of variation present in the original set of variables with a minimum number of composite variables through Statistical software, SAS 9.2. Five components were extracted which accounted for 65.9% of variance. The first component explained general body confirmation and explained 34.7% variation. It was represented by significant loading for BL, HG, PG, FLL, and HLL. Communality estimate ranged from 0.41 (HL) to 0.88 (TL). Second, third, fourth, and fifth component had a high loading for tail characteristics, horn characteristics, facial biometrics, and rear body, respectively. The result of component analysis of biometric traits suggested that indigenous hill cattle of Himachal Pradesh are small and compact size cattle with a medium hump, horizontally placed short ears, and a long tail. The study also revealed that factors extracted from the present investigation could be used in breeding programs with sufficient reduction in the number of biometric traits to be recorded to explain the body confirmation.

  14. New predictor of aortic enlargement in uncomplicated type B aortic dissection based on elliptic Fourier analysis.

    PubMed

    Sato, Hiroshi; Ito, Toshiro; Kuroda, Yosuke; Uchiyama, Hiroki; Watanabe, Toshitaka; Yasuda, Naomi; Nakazawa, Junji; Harada, Ryo; Kawaharada, Nobuyoshi

    2017-12-01

    This study aimed to re-examine the conventional predictive factors for dissected aortic enlargement, such as the aortic and false lumen diameter and to consider whether the morphological elements of the dissected aorta could be predictors by quantifying the 'shape' of the true lumen based on elliptic Fourier analysis. A total of 80 patients with uncomplicated type B aortic dissection were included. The patients were divided into 'Enlargement group' and 'No Change group.' Between the 2 groups, the mean systolic blood pressure during follow-up, aortic and false lumen maximum diameters, and analysed morphological data were compared using each statistical method. The maximum aortic and false lumen diameters were significantly larger in the Enlargement group than in the No Change group (39.3 vs 35.9 mm; P = 0.0058) (23.5 vs 18.2 mm; P = 0.000095). The principal component 1, which is the data calculated by elliptic Fourier analysis, was significantly lower in the Enlargement group than in the No Change group (0.020 vs - 0.072; P = 0.000049). The mean systolic blood pressure ≥130 mmHg, aortic diameter, false lumen diameter and principal component 1 were included in the Cox proportional hazard model as covariates to determine the significant predictive variable. Principal component 1 demonstrated the only significance with aortic enlargement on multivariate analysis (odds ratio = 0.32; P = 0.048). The analysed and calculated morphological data of the shape of the true lumen can be more effective predictive factors of aortic enlargement of type B dissection than the conventional factors. © The Author 2017. Published by Oxford University Press on behalf of the European Association for Cardio-Thoracic Surgery. All rights reserved.

  15. Forensic analysis of Salvia divinorum using multivariate statistical procedures. Part I: discrimination from related Salvia species.

    PubMed

    Willard, Melissa A Bodnar; McGuffin, Victoria L; Smith, Ruth Waddell

    2012-01-01

    Salvia divinorum is a hallucinogenic herb that is internationally regulated. In this study, salvinorin A, the active compound in S. divinorum, was extracted from S. divinorum plant leaves using a 5-min extraction with dichloromethane. Four additional Salvia species (Salvia officinalis, Salvia guaranitica, Salvia splendens, and Salvia nemorosa) were extracted using this procedure, and all extracts were analyzed by gas chromatography-mass spectrometry. Differentiation of S. divinorum from other Salvia species was successful based on visual assessment of the resulting chromatograms. To provide a more objective comparison, the total ion chromatograms (TICs) were subjected to principal components analysis (PCA). Prior to PCA, the TICs were subjected to a series of data pretreatment procedures to minimize non-chemical sources of variance in the data set. Successful discrimination of S. divinorum from the other four Salvia species was possible based on visual assessment of the PCA scores plot. To provide a numerical assessment of the discrimination, a series of statistical procedures such as Euclidean distance measurement, hierarchical cluster analysis, Student's t tests, Wilcoxon rank-sum tests, and Pearson product moment correlation were also applied to the PCA scores. The statistical procedures were then compared to determine the advantages and disadvantages for forensic applications.

  16. Integrated data management for clinical studies: automatic transformation of data models with semantic annotations for principal investigators, data managers and statisticians.

    PubMed

    Dugas, Martin; Dugas-Breit, Susanne

    2014-01-01

    Design, execution and analysis of clinical studies involves several stakeholders with different professional backgrounds. Typically, principle investigators are familiar with standard office tools, data managers apply electronic data capture (EDC) systems and statisticians work with statistics software. Case report forms (CRFs) specify the data model of study subjects, evolve over time and consist of hundreds to thousands of data items per study. To avoid erroneous manual transformation work, a converting tool for different representations of study data models was designed. It can convert between office format, EDC and statistics format. In addition, it supports semantic annotations, which enable precise definitions for data items. A reference implementation is available as open source package ODMconverter at http://cran.r-project.org.

  17. Statistical mixture design selective extraction of compounds with antioxidant activity and total polyphenol content from Trichilia catigua.

    PubMed

    Lonni, Audrey Alesandra Stinghen Garcia; Longhini, Renata; Lopes, Gisely Cristiny; de Mello, João Carlos Palazzo; Scarminio, Ieda Spacino

    2012-03-16

    Statistical design mixtures of water, methanol, acetone and ethanol were used to extract material from Trichilia catigua (Meliaceae) barks to study the effects of different solvents and their mixtures on its yield, total polyphenol content and antioxidant activity. The experimental results and their response surface models showed that quaternary mixtures with approximately equal proportions of all four solvents provided the highest yields, total polyphenol contents and antioxidant activities of the crude extracts followed by ternary design mixtures. Principal component and hierarchical clustering analysis of the HPLC-DAD spectra of the chromatographic peaks of 1:1:1:1 water-methanol-acetone-ethanol mixture extracts indicate the presence of cinchonains, gallic acid derivatives, natural polyphenols, flavanoids, catechins, and epicatechins. Copyright © 2011 Elsevier B.V. All rights reserved.

  18. Electron microprobe analysis program for biological specimens: BIOMAP

    NASA Technical Reports Server (NTRS)

    Edwards, B. F.

    1972-01-01

    BIOMAP is a Univac 1108 compatible program which facilitates the electron probe microanalysis of biological specimens. Input data are X-ray intensity data from biological samples, the X-ray intensity and composition data from a standard sample and the electron probe operating parameters. Outputs are estimates of the weight percentages of the analyzed elements, the distribution of these estimates for sets of red blood cells and the probabilities for correlation between elemental concentrations. An optional feature statistically estimates the X-ray intensity and residual background of a principal standard relative to a series of standards.

  19. Foot anthropometry and morphology phenomena.

    PubMed

    Agić, Ante; Nikolić, Vasilije; Mijović, Budimir

    2006-12-01

    Foot structure description is important for many reasons. The foot anthropometric morphology phenomena are analyzed together with hidden biomechanical functionality in order to fully characterize foot structure and function. For younger Croatian population the scatter data of the individual foot variables were interpolated by multivariate statistics. Foot structure descriptors are influenced by many factors, as a style of life, race, climate, and things of the great importance in human society. Dominant descriptors are determined by principal component analysis. Some practical recommendation and conclusion for medical, sportswear and footwear practice are highlighted.

  20. Differentiation of aflatoxigenic and non-aflatoxigenic strains of Aspergilli by FT-IR spectroscopy.

    PubMed

    Atkinson, Curtis; Pechanova, Olga; Sparks, Darrell L; Brown, Ashli; Rodriguez, Jose M

    2014-01-01

    Fourier transform infrared spectroscopy (FT-IR) is a well-established and widely accepted methodology to identify and differentiate diverse microbial species. In this study, FT-IR was used to differentiate 20 strains of ubiquitous and agronomically important phytopathogens of Aspergillus flavus and Aspergillus parasiticus. By analyzing their spectral profiles via principal component and cluster analysis, differentiation was achieved between the aflatoxin-producing and nonproducing strains of both fungal species. This study thus indicates that FT-IR coupled to multivariate statistics can rapidly differentiate strains of Aspergilli based on their toxigenicity.

  1. Reliability analysis of structural ceramics subjected to biaxial flexure

    NASA Technical Reports Server (NTRS)

    Chao, Luen-Yuan; Shetty, Dinesh K.

    1991-01-01

    The reliability of alumina disks subjected to biaxial flexure is predicted on the basis of statistical fracture theory using a critical strain energy release rate fracture criterion. Results on a sintered silicon nitride are consistent with reliability predictions based on pore-initiated penny-shaped cracks with preferred orientation normal to the maximum principal stress. Assumptions with regard to flaw types and their orientations in each ceramic can be justified by fractography. It is shown that there are no universal guidelines for selecting fracture criteria or assuming flaw orientations in reliability analyses.

  2. Bi-telescopic, deep, simultaneous meteor observations

    NASA Technical Reports Server (NTRS)

    Taff, L. G.

    1986-01-01

    A statistical summary is presented of 10 hours of observing sporadic meteors and two meteor showers using the Experimental Test System of the Lincoln Laboratory. The observatory is briefly described along with the real-time and post-processing hardware, the analysis, and the data reduction. The principal observational results are given for the sporadic meteor zenithal hourly rates. The unique properties of the observatory include twin telescopes to allow the discrimination of meteors by parallax, deep limiting magnitude, good time resolution, and sophisticated real-time and post-observing video processing.

  3. The influence of foot hyperpronation on pelvic biomechanics during stance phase of the gait: A biomechanical simulation study.

    PubMed

    Yazdani, Farzaneh; Razeghi, Mohsen; Karimi, Mohammad Taghi; Raeisi Shahraki, Hadi; Salimi Bani, Milad

    2018-05-01

    Despite the theoretical link between foot hyperpronation and biomechanical dysfunction of the pelvis, the literature lacks evidence that confirms this assumption in truly hyperpronated feet subjects during gait. Changes in the kinematic pattern of the pelvic segment were assessed in 15 persons with hyperpronated feet and compared to a control group of 15 persons with normally aligned feet during the stance phase of gait based on biomechanical musculoskeletal simulation. Kinematic and kinetic data were collected while participants walked at a comfortable self-selected speed. A generic OpenSim musculoskeletal model with 23 degrees of freedom and 92 muscles was scaled for each participant. OpenSim inverse kinematic analysis was applied to calculate segment angles in the sagittal, frontal and horizontal planes. Principal component analysis was employed as a data reduction technique, as well as a computational tool to obtain principal component scores. Independent-sample t-test was used to detect group differences. The difference between groups in scores for the first principal component in the sagittal plane was statistically significant (p = 0.01; effect size = 1.06), but differences between principal component scores in the frontal and horizontal planes were not significant. The hyperpronation group had greater anterior pelvic tilt during 20%-80% of the stance phase. In conclusion, in persons with hyperpronation we studied the role of the pelvic segment was mainly to maintain postural balance in the sagittal plane by increasing anterior pelvic inclination. Since anterior pelvic tilt may be associated with low back symptoms, the evaluation of foot posture should be considered in assessing the patients with low back and pelvic dysfunction.

  4. Regionalization of precipitation characteristics in Iran's Lake Urmia basin

    NASA Astrophysics Data System (ADS)

    Fazel, Nasim; Berndtsson, Ronny; Uvo, Cintia Bertacchi; Madani, Kaveh; Kløve, Bjørn

    2018-04-01

    Lake Urmia in northwest Iran, once one of the largest hypersaline lakes in the world, has shrunk by almost 90% in area and 80% in volume during the last four decades. To improve the understanding of regional differences in water availability throughout the region and to refine the existing information on precipitation variability, this study investigated the spatial pattern of precipitation for the Lake Urmia basin. Daily rainfall time series from 122 precipitation stations with different record lengths were used to extract 15 statistical descriptors comprising 25th percentile, 75th percentile, and coefficient of variation for annual and seasonal total precipitation. Principal component analysis in association with cluster analysis identified three main homogeneous precipitation groups in the lake basin. The first sub-region (group 1) includes stations located in the center and southeast; the second sub-region (group 2) covers mostly northern and northeastern part of the basin, and the third sub-region (group 3) covers the western and southern edges of the basin. Results of principal component (PC) and clustering analyses showed that seasonal precipitation variation is the most important feature controlling the spatial pattern of precipitation in the lake basin. The 25th and 75th percentiles of winter and autumn are the most important variables controlling the spatial pattern of the first rotated principal component explaining about 32% of the total variance. Summer and spring precipitation variations are the most important variables in the second and third rotated principal components, respectively. Seasonal variation in precipitation amount and seasonality are explained by topography and influenced by the lake and westerly winds that are related to the strength of the North Atlantic Oscillation. Despite using incomplete time series with different lengths, the identified sub-regions are physically meaningful.

  5. A study of two unsupervised data driven statistical methodologies for detecting and classifying damages in structural health monitoring

    NASA Astrophysics Data System (ADS)

    Tibaduiza, D.-A.; Torres-Arredondo, M.-A.; Mujica, L. E.; Rodellar, J.; Fritzen, C.-P.

    2013-12-01

    This article is concerned with the practical use of Multiway Principal Component Analysis (MPCA), Discrete Wavelet Transform (DWT), Squared Prediction Error (SPE) measures and Self-Organizing Maps (SOM) to detect and classify damages in mechanical structures. The formalism is based on a distributed piezoelectric active sensor network for the excitation and detection of structural dynamic responses. Statistical models are built using PCA when the structure is known to be healthy either directly from the dynamic responses or from wavelet coefficients at different scales representing Time-frequency information. Different damages on the tested structures are simulated by adding masses at different positions. The data from the structure in different states (damaged or not) are then projected into the different principal component models by each actuator in order to obtain the input feature vectors for a SOM from the scores and the SPE measures. An aircraft fuselage from an Airbus A320 and a multi-layered carbon fiber reinforced plastic (CFRP) plate are used as examples to test the approaches. Results are presented, compared and discussed in order to determine their potential in structural health monitoring. These results showed that all the simulated damages were detectable and the selected features proved capable of separating all damage conditions from the undamaged state for both approaches.

  6. PCA as a practical indicator of OPLS-DA model reliability.

    PubMed

    Worley, Bradley; Powers, Robert

    Principal Component Analysis (PCA) and Orthogonal Projections to Latent Structures Discriminant Analysis (OPLS-DA) are powerful statistical modeling tools that provide insights into separations between experimental groups based on high-dimensional spectral measurements from NMR, MS or other analytical instrumentation. However, when used without validation, these tools may lead investigators to statistically unreliable conclusions. This danger is especially real for Partial Least Squares (PLS) and OPLS, which aggressively force separations between experimental groups. As a result, OPLS-DA is often used as an alternative method when PCA fails to expose group separation, but this practice is highly dangerous. Without rigorous validation, OPLS-DA can easily yield statistically unreliable group separation. A Monte Carlo analysis of PCA group separations and OPLS-DA cross-validation metrics was performed on NMR datasets with statistically significant separations in scores-space. A linearly increasing amount of Gaussian noise was added to each data matrix followed by the construction and validation of PCA and OPLS-DA models. With increasing added noise, the PCA scores-space distance between groups rapidly decreased and the OPLS-DA cross-validation statistics simultaneously deteriorated. A decrease in correlation between the estimated loadings (added noise) and the true (original) loadings was also observed. While the validity of the OPLS-DA model diminished with increasing added noise, the group separation in scores-space remained basically unaffected. Supported by the results of Monte Carlo analyses of PCA group separations and OPLS-DA cross-validation metrics, we provide practical guidelines and cross-validatory recommendations for reliable inference from PCA and OPLS-DA models.

  7. Application of statistical shape analysis for the estimation of bone and forensic age using the shapes of the 2nd, 3rd, and 4th cervical vertebrae in a young Japanese population.

    PubMed

    Rhee, Chang-Hoon; Shin, Sang Min; Choi, Yong-Seok; Yamaguchi, Tetsutaro; Maki, Koutaro; Kim, Yong-Il; Kim, Seong-Sik; Park, Soo-Byung; Son, Woo-Sung

    2015-12-01

    From computed tomographic images, the dentocentral synchondrosis can be identified in the second cervical vertebra. This can demarcate the border between the odontoid process and the body of the 2nd cervical vertebra and serve as a good model for the prediction of bone and forensic age. Nevertheless, until now, there has been no application of the 2nd cervical vertebra based on the dentocentral synchondrosis. In this study, statistical shape analysis was used to build bone and forensic age estimation regression models. Following the principles of statistical shape analysis and principal components analysis, we used cone-beam computed tomography (CBCT) to evaluate a Japanese population (35 males and 45 females, from 5 to 19 years old). The narrowest prediction intervals among the multivariate regression models were 19.63 for bone age and 2.99 for forensic age. There was no significant difference between form space and shape space in the bone and forensic age estimation models. However, for gender comparison, the bone and forensic age estimation models for males had the higher explanatory power. This study derived an improved objective and quantitative method for bone and forensic age estimation based on only the 2nd, 3rd and 4th cervical vertebral shapes. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  8. Population Analysis of Disabled Children by Departments in France

    NASA Astrophysics Data System (ADS)

    Meidatuzzahra, Diah; Kuswanto, Heri; Pech, Nicolas; Etchegaray, Amélie

    2017-06-01

    In this study, a statistical analysis is performed by model the variations of the disabled about 0-19 years old population among French departments. The aim is to classify the departments according to their profile determinants (socioeconomic and behavioural profiles). The analysis is focused on two types of methods: principal component analysis (PCA) and multiple correspondences factorial analysis (MCA) to review which one is the best methods for interpretation of the correlation between the determinants of disability (independent variable). The PCA is the best method for interpretation of the correlation between the determinants of disability (independent variable). The PCA reduces 14 determinants of disability to 4 axes, keeps 80% of total information, and classifies them into 7 classes. The MCA reduces the determinants to 3 axes, retains only 30% of information, and classifies them into 4 classes.

  9. Bayesian and Phylogenic Approaches for Studying Relationships among Table Olive Cultivars.

    PubMed

    Ben Ayed, Rayda; Ennouri, Karim; Ben Amar, Fathi; Moreau, Fabienne; Triki, Mohamed Ali; Rebai, Ahmed

    2017-08-01

    To enhance table olive tree authentication, relationship, and productivity, we consider the analysis of 18 worldwide table olive cultivars (Olea europaea L.) based on morphological, biological, and physicochemical markers analyzed by bioinformatic and biostatistic tools. Accordingly, we assess the relationships between the studied varieties, on the one hand, and the potential productivity-quantitative parameter links on the other hand. The bioinformatic analysis based on the graphical representation of the matrix of Euclidean distances, the principal components analysis, unweighted pair group method with arithmetic mean, and principal coordinate analysis (PCoA) revealed three major clusters which were not correlated with the geographic origin. The statistical analysis based on Kendall's and Spearman correlation coefficients suggests two highly significant associations with both fruit color and pollinization and the productivity character. These results are confirmed by the multiple linear regression prediction models. In fact, based on the coefficient of determination (R 2 ) value, the best model demonstrated the power of the pollinization on the tree productivity (R 2  = 0.846). Moreover, the derived directed acyclic graph showed that only two direct influences are detected: effect of tolerance on fruit and stone symmetry on side and effect of tolerance on stone form and oil content on the other side. This work provides better understanding of the diversity available in worldwide table olive cultivars and supplies an important contribution for olive breeding and authenticity.

  10. Hyperspectral imaging coupled with chemometric analysis for non-invasive differentiation of black pens

    NASA Astrophysics Data System (ADS)

    Chlebda, Damian K.; Majda, Alicja; Łojewski, Tomasz; Łojewska, Joanna

    2016-11-01

    Differentiation of the written text can be performed with a non-invasive and non-contact tool that connects conventional imaging methods with spectroscopy. Hyperspectral imaging (HSI) is a relatively new and rapid analytical technique that can be applied in forensic science disciplines. It allows an image of the sample to be acquired, with full spectral information within every pixel. For this paper, HSI and three statistical methods (hierarchical cluster analysis, principal component analysis, and spectral angle mapper) were used to distinguish between traces of modern black gel pen inks. Non-invasiveness and high efficiency are among the unquestionable advantages of ink differentiation using HSI. It is also less time-consuming than traditional methods such as chromatography. In this study, a set of 45 modern gel pen ink marks deposited on a paper sheet were registered. The spectral characteristics embodied in every pixel were extracted from an image and analysed using statistical methods, externally and directly on the hypercube. As a result, different black gel inks deposited on paper can be distinguished and classified into several groups, in a non-invasive manner.

  11. A Dimensionally Reduced Clustering Methodology for Heterogeneous Occupational Medicine Data Mining.

    PubMed

    Saâdaoui, Foued; Bertrand, Pierre R; Boudet, Gil; Rouffiac, Karine; Dutheil, Frédéric; Chamoux, Alain

    2015-10-01

    Clustering is a set of techniques of the statistical learning aimed at finding structures of heterogeneous partitions grouping homogenous data called clusters. There are several fields in which clustering was successfully applied, such as medicine, biology, finance, economics, etc. In this paper, we introduce the notion of clustering in multifactorial data analysis problems. A case study is conducted for an occupational medicine problem with the purpose of analyzing patterns in a population of 813 individuals. To reduce the data set dimensionality, we base our approach on the Principal Component Analysis (PCA), which is the statistical tool most commonly used in factorial analysis. However, the problems in nature, especially in medicine, are often based on heterogeneous-type qualitative-quantitative measurements, whereas PCA only processes quantitative ones. Besides, qualitative data are originally unobservable quantitative responses that are usually binary-coded. Hence, we propose a new set of strategies allowing to simultaneously handle quantitative and qualitative data. The principle of this approach is to perform a projection of the qualitative variables on the subspaces spanned by quantitative ones. Subsequently, an optimal model is allocated to the resulting PCA-regressed subspaces.

  12. Data Analysis & Statistical Methods for Command File Errors

    NASA Technical Reports Server (NTRS)

    Meshkat, Leila; Waggoner, Bruce; Bryant, Larry

    2014-01-01

    This paper explains current work on modeling for managing the risk of command file errors. It is focused on analyzing actual data from a JPL spaceflight mission to build models for evaluating and predicting error rates as a function of several key variables. We constructed a rich dataset by considering the number of errors, the number of files radiated, including the number commands and blocks in each file, as well as subjective estimates of workload and operational novelty. We have assessed these data using different curve fitting and distribution fitting techniques, such as multiple regression analysis, and maximum likelihood estimation to see how much of the variability in the error rates can be explained with these. We have also used goodness of fit testing strategies and principal component analysis to further assess our data. Finally, we constructed a model of expected error rates based on the what these statistics bore out as critical drivers to the error rate. This model allows project management to evaluate the error rate against a theoretically expected rate as well as anticipate future error rates.

  13. Multivariate statistical analysis strategy for multiple misfire detection in internal combustion engines

    NASA Astrophysics Data System (ADS)

    Hu, Chongqing; Li, Aihua; Zhao, Xingyang

    2011-02-01

    This paper proposes a multivariate statistical analysis approach to processing the instantaneous engine speed signal for the purpose of locating multiple misfire events in internal combustion engines. The state of each cylinder is described with a characteristic vector extracted from the instantaneous engine speed signal following a three-step procedure. These characteristic vectors are considered as the values of various procedure parameters of an engine cycle. Therefore, determination of occurrence of misfire events and identification of misfiring cylinders can be accomplished by a principal component analysis (PCA) based pattern recognition methodology. The proposed algorithm can be implemented easily in practice because the threshold can be defined adaptively without the information of operating conditions. Besides, the effect of torsional vibration on the engine speed waveform is interpreted as the presence of super powerful cylinder, which is also isolated by the algorithm. The misfiring cylinder and the super powerful cylinder are often adjacent in the firing sequence, thus missing detections and false alarms can be avoided effectively by checking the relationship between the cylinders.

  14. Monitoring the metering performance of an electronic voltage transformer on-line based on cyber-physics correlation analysis

    NASA Astrophysics Data System (ADS)

    Zhang, Zhu; Li, Hongbin; Tang, Dengping; Hu, Chen; Jiao, Yang

    2017-10-01

    Metering performance is the key parameter of an electronic voltage transformer (EVT), and it requires high accuracy. The conventional off-line calibration method using a standard voltage transformer is not suitable for the key equipment in a smart substation, which needs on-line monitoring. In this article, we propose a method for monitoring the metering performance of an EVT on-line based on cyber-physics correlation analysis. By the electrical and physical properties of a substation running in three-phase symmetry, the principal component analysis method is used to separate the metering deviation caused by the primary fluctuation and the EVT anomaly. The characteristic statistics of the measured data during operation are extracted, and the metering performance of the EVT is evaluated by analyzing the change in statistics. The experimental results show that the method successfully monitors the metering deviation of a Class 0.2 EVT accurately. The method demonstrates the accurate evaluation of on-line monitoring of the metering performance on an EVT without a standard voltage transformer.

  15. Geographical origin discrimination of lentils (Lens culinaris Medik.) using 1H NMR fingerprinting and multivariate statistical analyses.

    PubMed

    Longobardi, Francesco; Innamorato, Valentina; Di Gioia, Annalisa; Ventrella, Andrea; Lippolis, Vincenzo; Logrieco, Antonio F; Catucci, Lucia; Agostiano, Angela

    2017-12-15

    Lentil samples coming from two different countries, i.e. Italy and Canada, were analysed using untargeted 1 H NMR fingerprinting in combination with chemometrics in order to build models able to classify them according to their geographical origin. For such aim, Soft Independent Modelling of Class Analogy (SIMCA), k-Nearest Neighbor (k-NN), Principal Component Analysis followed by Linear Discriminant Analysis (PCA-LDA) and Partial Least Squares-Discriminant Analysis (PLS-DA) were applied to the NMR data and the results were compared. The best combination of average recognition (100%) and cross-validation prediction abilities (96.7%) was obtained for the PCA-LDA. All the statistical models were validated both by using a test set and by carrying out a Monte Carlo Cross Validation: the obtained performances were found to be satisfying for all the models, with prediction abilities higher than 95% demonstrating the suitability of the developed methods. Finally, the metabolites that mostly contributed to the lentil discrimination were indicated. Copyright © 2017 Elsevier Ltd. All rights reserved.

  16. Factorial and Item-Level Invariance of a Principal Perspectives Survey: German and U.S. Principals.

    PubMed

    Wang, Chuang; Hancock, Dawson R; Muller, Ulrich

    This study examined the factorial and item-level invariance of a survey of principals' job satisfaction and perspectives about reasons and barriers to becoming a principal with a sample of US principals and another sample of German principals. Confirmatory factor analysis (CFA) and differential item functioning (DIF) analysis were employed at the test and item level, respectively. A single group CFA was conducted first, and the model was found to fit the data collected. The factorial invariance between the German and the US principals was tested through three steps: (a) configural invariance; (b) measurement invariance; and (c) structural invariance. The results suggest that the survey is a viable measure of principals' job satisfaction and perspectives about reasons and barriers to becoming a principal because principals from two different cultures shared a similar pattern on all three constructs. The DIF analysis further revealed that 22 out of the 28 items functioned similarly between German and US principals.

  17. Wavelet decomposition based principal component analysis for face recognition using MATLAB

    NASA Astrophysics Data System (ADS)

    Sharma, Mahesh Kumar; Sharma, Shashikant; Leeprechanon, Nopbhorn; Ranjan, Aashish

    2016-03-01

    For the realization of face recognition systems in the static as well as in the real time frame, algorithms such as principal component analysis, independent component analysis, linear discriminate analysis, neural networks and genetic algorithms are used for decades. This paper discusses an approach which is a wavelet decomposition based principal component analysis for face recognition. Principal component analysis is chosen over other algorithms due to its relative simplicity, efficiency, and robustness features. The term face recognition stands for identifying a person from his facial gestures and having resemblance with factor analysis in some sense, i.e. extraction of the principal component of an image. Principal component analysis is subjected to some drawbacks, mainly the poor discriminatory power and the large computational load in finding eigenvectors, in particular. These drawbacks can be greatly reduced by combining both wavelet transform decomposition for feature extraction and principal component analysis for pattern representation and classification together, by analyzing the facial gestures into space and time domain, where, frequency and time are used interchangeably. From the experimental results, it is envisaged that this face recognition method has made a significant percentage improvement in recognition rate as well as having a better computational efficiency.

  18. Open dumps in the Hellenic prefecture of Laconia: statistical analysis of characteristics and restoration prioritization on the basis of a field survey.

    PubMed

    Tsatsarelis, Thomas; Antonopoulos, Ioannis; Karagiannidis, Avraam; Perkoulidis, George

    2007-10-01

    This study presents an assessment of the current status of open dumps in Laconia prefecture of Peloponnese in southern Greece, where all open dumps are targeted for closure by 2008. An extensive field survey was conducted in 2005 to register existing sites in the prefecture. The data collected included the site area and age, waste depth, type of disposed waste, distance from nearest populated area, local geographical features and observed practices of open burning and soil coverage. On the basis of the collected data, a GIS database was developed, and the above parameters were statistically analysed. Subsequently, a decision tool for the restoration of open dumps was implemented, which led to the prioritization of site restorations and specific decisions about appropriate restoration steps for each site. The sites requiring restoration were then further classified using Principal Component Analysis, in order to categorize them into groups suitable for similar restoration work, thus facilitating fund allocation and subsequent restoration project management.

  19. Statistical and clustering analysis for disturbances: A case study of voltage dips in wind farms

    DOE PAGES

    Garcia-Sanchez, Tania; Gomez-Lazaro, Emilio; Muljadi, Eduard; ...

    2016-01-28

    This study proposes and evaluates an alternative statistical methodology to analyze a large number of voltage dips. For a given voltage dip, a set of lengths is first identified to characterize the root mean square (rms) voltage evolution along the disturbance, deduced from partial linearized time intervals and trajectories. Principal component analysis and K-means clustering processes are then applied to identify rms-voltage patterns and propose a reduced number of representative rms-voltage profiles from the linearized trajectories. This reduced group of averaged rms-voltage profiles enables the representation of a large amount of disturbances, which offers a visual and graphical representation ofmore » their evolution along the events, aspects that were not previously considered in other contributions. The complete process is evaluated on real voltage dips collected in intense field-measurement campaigns carried out in a wind farm in Spain among different years. The results are included in this paper.« less

  20. Data on xylem sap proteins from Mn- and Fe-deficient tomato plants obtained using shotgun proteomics.

    PubMed

    Ceballos-Laita, Laura; Gutierrez-Carbonell, Elain; Takahashi, Daisuke; Abadía, Anunciación; Uemura, Matsuo; Abadía, Javier; López-Millán, Ana Flor

    2018-04-01

    This article contains consolidated proteomic data obtained from xylem sap collected from tomato plants grown in Fe- and Mn-sufficient control, as well as Fe-deficient and Mn-deficient conditions. Data presented here cover proteins identified and quantified by shotgun proteomics and Progenesis LC-MS analyses: proteins identified with at least two peptides and showing changes statistically significant (ANOVA; p ≤ 0.05) and above a biologically relevant selected threshold (fold ≥ 2) between treatments are listed. The comparison between Fe-deficient, Mn-deficient and control xylem sap samples using a multivariate statistical data analysis (Principal Component Analysis, PCA) is also included. Data included in this article are discussed in depth in the research article entitled "Effects of Fe and Mn deficiencies on the protein profiles of tomato ( Solanum lycopersicum) xylem sap as revealed by shotgun analyses" [1]. This dataset is made available to support the cited study as well to extend analyses at a later stage.

  1. A statistical motion model based on biomechanical simulations for data fusion during image-guided prostate interventions.

    PubMed

    Hu, Yipeng; Morgan, Dominic; Ahmed, Hashim Uddin; Pendsé, Doug; Sahu, Mahua; Allen, Clare; Emberton, Mark; Hawkes, David; Barratt, Dean

    2008-01-01

    A method is described for generating a patient-specific, statistical motion model (SMM) of the prostate gland. Finite element analysis (FEA) is used to simulate the motion of the gland using an ultrasound-based 3D FE model over a range of plausible boundary conditions and soft-tissue properties. By applying principal component analysis to the displacements of the FE mesh node points inside the gland, the simulated deformations are then used as training data to construct the SMM. The SMM is used to both predict the displacement field over the whole gland and constrain a deformable surface registration algorithm, given only a small number of target points on the surface of the deformed gland. Using 3D transrectal ultrasound images of the prostates of five patients, acquired before and after imposing a physical deformation, to evaluate the accuracy of predicted landmark displacements, the mean target registration error was found to be less than 1.9 mm.

  2. Effective Thermal Inactivation of the Spores of Bacillus cereus Biofilms Using Microwave.

    PubMed

    Park, Hyong Seok; Yang, Jungwoo; Choi, Hee Jung; Kim, Kyoung Heon

    2017-07-28

    Microwave sterilization was performed to inactivate the spores of biofilms of Bacillus cereus involved in foodborne illness. The sterilization conditions, such as the amount of water and the operating temperature and treatment time, were optimized using statistical analysis based on 15 runs of experimental results designed by the Box-Behnken method. Statistical analysis showed that the optimal conditions for the inactivation of B. cereus biofilms were 14 ml of water, 108°C of temperature, and 15 min of treatment time. Interestingly, response surface plots showed that the amount of water is the most important factor for microwave sterilization under the present conditions. Complete inactivation by microwaves was achieved in 5 min, and the inactivation efficiency by microwave was obviously higher than that by conventional steam autoclave. Finally, confocal laser scanning microscopy images showed that the principal effect of microwave treatment was cell membrane disruption. Thus, this study can contribute to the development of a process to control food-associated pathogens.

  3. Quality of stormwater runoff discharged from Massachusetts highways, 2005-07

    USGS Publications Warehouse

    Smith, Kirk P.; Granato, Gregory E.

    2010-01-01

    The U.S. Geological Survey (USGS), in cooperation with U.S. Department of Transportation Federal Highway Administration and the Massachusetts Department of Transportation, conducted a field study from September 2005 through September 2007 to characterize the quality of highway runoff for a wide range of constituents. The highways studied had annual average daily traffic (AADT) volumes from about 3,000 to more than 190,000 vehicles per day. Highway-monitoring stations were installed at 12 locations in Massachusetts on 8 highways. The 12 monitoring stations were subdivided into 4 primary, 4 secondary, and 4 test stations. Each site contained a 100-percent impervious drainage area that included two or more catch basins sharing a common outflow pipe. Paired primary and secondary stations were located within a few miles of each other on a limited-access section of the same highway. Most of the data were collected at the primary and secondary stations, which were located on four principal highways (Route 119, Route 2, Interstate 495, and Interstate 95). The secondary stations were operated simultaneously with the primary stations for at least a year. Data from the four test stations (Route 8, Interstate 195, Interstate 190, and Interstate 93) were used to determine the transferability of the data collected from the principal highways to other highways characterized by different construction techniques, land use, and geography. Automatic-monitoring techniques were used to collect composite samples of highway runoff and make continuous measurements of several physical characteristics. Flowweighted samples of highway runoff were collected automatically during approximately 140 rain and mixed rain, sleet, and snowstorms. These samples were analyzed for physical characteristics and concentrations of 6 dissolved major ions, total nutrients, 8 total-recoverable metals, suspended sediment, and 85 semivolatile organic compounds (SVOCs), which include priority polyaromatic hydrocarbons (PAHs), phthalate esters, and other anthropogenic or naturally occurring organic compounds. The distribution of particle size of suspended sediment also was determined for composite samples of highway runoff. Samples of highway runoff were collected year round and under various dry antecedent conditions throughout the 2-year sampling period. In addition to samples of highway runoff, supplemental samples also were collected of sediment in highway runoff, background soils, berm materials, maintenance sands, deicing compounds, and vegetation matter. These additional samples were collected near or on the highways to support data analysis. There were few statistically significant differences between populations of constituent concentrations in samples from the primary and secondary stations on the same principal highways (Mann-Whitney test, 95-percent confidence level). Similarly, there were few statistically significant differences between populations of constituent concentrations for the four principal highways (data from the paired primary and secondary stations for each principal highway) and populations for test stations with similar AADT volumes. Exceptions to this include several total-recoverable metals for stations on Route 2 and Interstate 195 (highways with moderate AADT volumes), and for stations on Interstate 95 and Interstate 93 (highways with high AADT volumes). Supplemental data collected during this study indicate that many of these differences may be explained by the quantity, as well as the quality, of the sediment in samples of highway runoff. Nonparametric statistical methods also were used to test for differences between populations of sample constituent concentrations among the four principal highways that differed mainly in traffic volume. These results indicate that there were few statistically significant differences (Mann-Whitney test, 95-percent confidence level) for populations of concentrations of most total-recoverable metals

  4. Using assemblage data in ecological indicators: A comparison and evaluation of commonly available statistical tools

    USGS Publications Warehouse

    Smith, Joseph M.; Mather, Martha E.

    2012-01-01

    Ecological indicators are science-based tools used to assess how human activities have impacted environmental resources. For monitoring and environmental assessment, existing species assemblage data can be used to make these comparisons through time or across sites. An impediment to using assemblage data, however, is that these data are complex and need to be simplified in an ecologically meaningful way. Because multivariate statistics are mathematical relationships, statistical groupings may not make ecological sense and will not have utility as indicators. Our goal was to define a process to select defensible and ecologically interpretable statistical simplifications of assemblage data in which researchers and managers can have confidence. For this, we chose a suite of statistical methods, compared the groupings that resulted from these analyses, identified convergence among groupings, then we interpreted the groupings using species and ecological guilds. When we tested this approach using a statewide stream fish dataset, not all statistical methods worked equally well. For our dataset, logistic regression (Log), detrended correspondence analysis (DCA), cluster analysis (CL), and non-metric multidimensional scaling (NMDS) provided consistent, simplified output. Specifically, the Log, DCA, CL-1, and NMDS-1 groupings were ≥60% similar to each other, overlapped with the fluvial-specialist ecological guild, and contained a common subset of species. Groupings based on number of species (e.g., Log, DCA, CL and NMDS) outperformed groupings based on abundance [e.g., principal components analysis (PCA) and Poisson regression]. Although the specific methods that worked on our test dataset have generality, here we are advocating a process (e.g., identifying convergent groupings with redundant species composition that are ecologically interpretable) rather than the automatic use of any single statistical tool. We summarize this process in step-by-step guidance for the future use of these commonly available ecological and statistical methods in preparing assemblage data for use in ecological indicators.

  5. A statistical human rib cage geometry model accounting for variations by age, sex, stature and body mass index.

    PubMed

    Shi, Xiangnan; Cao, Libo; Reed, Matthew P; Rupp, Jonathan D; Hoff, Carrie N; Hu, Jingwen

    2014-07-18

    In this study, we developed a statistical rib cage geometry model accounting for variations by age, sex, stature and body mass index (BMI). Thorax CT scans were obtained from 89 subjects approximately evenly distributed among 8 age groups and both sexes. Threshold-based CT image segmentation was performed to extract the rib geometries, and a total of 464 landmarks on the left side of each subject׳s ribcage were collected to describe the size and shape of the rib cage as well as the cross-sectional geometry of each rib. Principal component analysis and multivariate regression analysis were conducted to predict rib cage geometry as a function of age, sex, stature, and BMI, all of which showed strong effects on rib cage geometry. Except for BMI, all parameters also showed significant effects on rib cross-sectional area using a linear mixed model. This statistical rib cage geometry model can serve as a geometric basis for developing a parametric human thorax finite element model for quantifying effects from different human attributes on thoracic injury risks. Copyright © 2014 Elsevier Ltd. All rights reserved.

  6. Using radar imagery for crop discrimination: a statistical and conditional probability study

    USGS Publications Warehouse

    Haralick, R.M.; Caspall, F.; Simonett, D.S.

    1970-01-01

    A number of the constraints with which remote sensing must contend in crop studies are outlined. They include sensor, identification accuracy, and congruencing constraints; the nature of the answers demanded of the sensor system; and the complex temporal variances of crops in large areas. Attention is then focused on several methods which may be used in the statistical analysis of multidimensional remote sensing data.Crop discrimination for radar K-band imagery is investigated by three methods. The first one uses a Bayes decision rule, the second a nearest-neighbor spatial conditional probability approach, and the third the standard statistical techniques of cluster analysis and principal axes representation.Results indicate that crop type and percent of cover significantly affect the strength of the radar return signal. Sugar beets, corn, and very bare ground are easily distinguishable, sorghum, alfalfa, and young wheat are harder to distinguish. Distinguishability will be improved if the imagery is examined in time sequence so that changes between times of planning, maturation, and harvest provide additional discriminant tools. A comparison between radar and photography indicates that radar performed surprisingly well in crop discrimination in western Kansas and warrants further study.

  7. Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition, paranoid personality disorder diagnosis: a unitary or a two-dimensional construct?

    PubMed

    Falkum, Erik; Pedersen, Geir; Karterud, Sigmund

    2009-01-01

    This article examines reliability and validity aspects of the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV) paranoid personality disorder (PPD) diagnosis. Patients with personality disorders (n = 930) from the Norwegian network of psychotherapeutic day hospitals, of which 114 had PPD, were included in the study. Frequency distribution, chi(2), correlations, reliability statistics, exploratory, and confirmatory factor analyses were performed. The distribution of PPD criteria revealed no distinct boundary between patients with and without PPD. Diagnostic category membership was obtained in 37 of 64 theoretically possible ways. The PPD criteria formed a separate factor in a principal component analysis, whereas a confirmatory factor analysis indicated that the DSM-IV PPD construct consists of 2 separate dimensions as follows: suspiciousness and hostility. The reliability of the unitary PPD scale was only 0.70, probably partly due to the apparent 2-dimensionality of the construct. Persistent unwarranted doubts about the loyalty of friends had the highest diagnostic efficiency, whereas unwarranted accusations of infidelity of partner had particularly poor indicator properties. The reliability and validity of the unitary PPD construct may be questioned. The 2-dimensional PPD model should be further explored.

  8. Damages detection in cylindrical metallic specimens by means of statistical baseline models and updated daily temperature profiles

    NASA Astrophysics Data System (ADS)

    Villamizar-Mejia, Rodolfo; Mujica-Delgado, Luis-Eduardo; Ruiz-Ordóñez, Magda-Liliana; Camacho-Navarro, Jhonatan; Moreno-Beltrán, Gustavo

    2017-05-01

    In previous works, damage detection of metallic specimens exposed to temperature changes has been achieved by using a statistical baseline model based on Principal Component Analysis (PCA), piezodiagnostics principle and taking into account temperature effect by augmenting the baseline model or by using several baseline models according to the current temperature. In this paper a new approach is presented, where damage detection is based in a new index that combine Q and T2 statistical indices with current temperature measurements. Experimental tests were achieved in a carbon-steel pipe of 1m length and 1.5 inches diameter, instrumented with piezodevices acting as actuators or sensors. A PCA baseline model was obtained to a temperature of 21º and then T2 and Q statistical indices were obtained for a 24h temperature profile. Also, mass adding at different points of pipe between sensor and actuator was used as damage. By using the combined index the temperature contribution can be separated and a better differentiation of damages respect to undamaged cases can be graphically obtained.

  9. Development and evaluation of statistical shape modeling for principal inner organs on torso CT images.

    PubMed

    Zhou, Xiangrong; Xu, Rui; Hara, Takeshi; Hirano, Yasushi; Yokoyama, Ryujiro; Kanematsu, Masayuki; Hoshi, Hiroaki; Kido, Shoji; Fujita, Hiroshi

    2014-07-01

    The shapes of the inner organs are important information for medical image analysis. Statistical shape modeling provides a way of quantifying and measuring shape variations of the inner organs in different patients. In this study, we developed a universal scheme that can be used for building the statistical shape models for different inner organs efficiently. This scheme combines the traditional point distribution modeling with a group-wise optimization method based on a measure called minimum description length to provide a practical means for 3D organ shape modeling. In experiments, the proposed scheme was applied to the building of five statistical shape models for hearts, livers, spleens, and right and left kidneys by use of 50 cases of 3D torso CT images. The performance of these models was evaluated by three measures: model compactness, model generalization, and model specificity. The experimental results showed that the constructed shape models have good "compactness" and satisfied the "generalization" performance for different organ shape representations; however, the "specificity" of these models should be improved in the future.

  10. Using statistical text classification to identify health information technology incidents

    PubMed Central

    Chai, Kevin E K; Anthony, Stephen; Coiera, Enrico; Magrabi, Farah

    2013-01-01

    Objective To examine the feasibility of using statistical text classification to automatically identify health information technology (HIT) incidents in the USA Food and Drug Administration (FDA) Manufacturer and User Facility Device Experience (MAUDE) database. Design We used a subset of 570 272 incidents including 1534 HIT incidents reported to MAUDE between 1 January 2008 and 1 July 2010. Text classifiers using regularized logistic regression were evaluated with both ‘balanced’ (50% HIT) and ‘stratified’ (0.297% HIT) datasets for training, validation, and testing. Dataset preparation, feature extraction, feature selection, cross-validation, classification, performance evaluation, and error analysis were performed iteratively to further improve the classifiers. Feature-selection techniques such as removing short words and stop words, stemming, lemmatization, and principal component analysis were examined. Measurements κ statistic, F1 score, precision and recall. Results Classification performance was similar on both the stratified (0.954 F1 score) and balanced (0.995 F1 score) datasets. Stemming was the most effective technique, reducing the feature set size to 79% while maintaining comparable performance. Training with balanced datasets improved recall (0.989) but reduced precision (0.165). Conclusions Statistical text classification appears to be a feasible method for identifying HIT reports within large databases of incidents. Automated identification should enable more HIT problems to be detected, analyzed, and addressed in a timely manner. Semi-supervised learning may be necessary when applying machine learning to big data analysis of patient safety incidents and requires further investigation. PMID:23666777

  11. The Relation between Factor Score Estimates, Image Scores, and Principal Component Scores

    ERIC Educational Resources Information Center

    Velicer, Wayne F.

    1976-01-01

    Investigates the relation between factor score estimates, principal component scores, and image scores. The three methods compared are maximum likelihood factor analysis, principal component analysis, and a variant of rescaled image analysis. (RC)

  12. A Study of the Effect of Secondary School Leadership Styles on Student Achievement in Selected Secondary School in Louisiana

    ERIC Educational Resources Information Center

    Harris, Cydnie Ellen Smith

    2012-01-01

    The effect of the leadership style of the secondary school principal on student achievement in select public schools in Louisiana was examined in this study. The null hypothesis was that there was no statistically significant difference between principal leadership style and student academic achievement. The researcher submitted the LEAD-Self…

  13. The Relationship between Knowledge Management and Organizational Learning with the Effectiveness of Ordinary and Smart Secondary School Principals

    ERIC Educational Resources Information Center

    Khammar, Zahra; Heidarzadegan, Alireza; Balaghat, Seyed Reza; Salehi, Hadi

    2013-01-01

    This study aimed to investigate the relationship between knowledge management and organizational learning with the effectiveness of ordinary and smart high school principals in Zahedan Pre-province. The statistical community of this research is 1350 male and female teachers teaching in ordinary and smart students of high schools in that 300 ones…

  14. Microglia Morphological Categorization in a Rat Model of Neuroinflammation by Hierarchical Cluster and Principal Components Analysis.

    PubMed

    Fernández-Arjona, María Del Mar; Grondona, Jesús M; Granados-Durán, Pablo; Fernández-Llebrez, Pedro; López-Ávalos, María D

    2017-01-01

    It is known that microglia morphology and function are closely related, but only few studies have objectively described different morphological subtypes. To address this issue, morphological parameters of microglial cells were analyzed in a rat model of aseptic neuroinflammation. After the injection of a single dose of the enzyme neuraminidase (NA) within the lateral ventricle (LV) an acute inflammatory process occurs. Sections from NA-injected animals and sham controls were immunolabeled with the microglial marker IBA1, which highlights ramifications and features of the cell shape. Using images obtained by section scanning, individual microglial cells were sampled from various regions (septofimbrial nucleus, hippocampus and hypothalamus) at different times post-injection (2, 4 and 12 h). Each cell yielded a set of 15 morphological parameters by means of image analysis software. Five initial parameters (including fractal measures) were statistically different in cells from NA-injected rats (most of them IL-1β positive, i.e., M1-state) compared to those from control animals (none of them IL-1β positive, i.e., surveillant state). However, additional multimodal parameters were revealed more suitable for hierarchical cluster analysis (HCA). This method pointed out the classification of microglia population in four clusters. Furthermore, a linear discriminant analysis (LDA) suggested three specific parameters to objectively classify any microglia by a decision tree. In addition, a principal components analysis (PCA) revealed two extra valuable variables that allowed to further classifying microglia in a total of eight sub-clusters or types. The spatio-temporal distribution of these different morphotypes in our rat inflammation model allowed to relate specific morphotypes with microglial activation status and brain location. An objective method for microglia classification based on morphological parameters is proposed. Main points Microglia undergo a quantifiable morphological change upon neuraminidase induced inflammation.Hierarchical cluster and principal components analysis allow morphological classification of microglia.Brain location of microglia is a relevant factor.

  15. Microglia Morphological Categorization in a Rat Model of Neuroinflammation by Hierarchical Cluster and Principal Components Analysis

    PubMed Central

    Fernández-Arjona, María del Mar; Grondona, Jesús M.; Granados-Durán, Pablo; Fernández-Llebrez, Pedro; López-Ávalos, María D.

    2017-01-01

    It is known that microglia morphology and function are closely related, but only few studies have objectively described different morphological subtypes. To address this issue, morphological parameters of microglial cells were analyzed in a rat model of aseptic neuroinflammation. After the injection of a single dose of the enzyme neuraminidase (NA) within the lateral ventricle (LV) an acute inflammatory process occurs. Sections from NA-injected animals and sham controls were immunolabeled with the microglial marker IBA1, which highlights ramifications and features of the cell shape. Using images obtained by section scanning, individual microglial cells were sampled from various regions (septofimbrial nucleus, hippocampus and hypothalamus) at different times post-injection (2, 4 and 12 h). Each cell yielded a set of 15 morphological parameters by means of image analysis software. Five initial parameters (including fractal measures) were statistically different in cells from NA-injected rats (most of them IL-1β positive, i.e., M1-state) compared to those from control animals (none of them IL-1β positive, i.e., surveillant state). However, additional multimodal parameters were revealed more suitable for hierarchical cluster analysis (HCA). This method pointed out the classification of microglia population in four clusters. Furthermore, a linear discriminant analysis (LDA) suggested three specific parameters to objectively classify any microglia by a decision tree. In addition, a principal components analysis (PCA) revealed two extra valuable variables that allowed to further classifying microglia in a total of eight sub-clusters or types. The spatio-temporal distribution of these different morphotypes in our rat inflammation model allowed to relate specific morphotypes with microglial activation status and brain location. An objective method for microglia classification based on morphological parameters is proposed. Main points Microglia undergo a quantifiable morphological change upon neuraminidase induced inflammation.Hierarchical cluster and principal components analysis allow morphological classification of microglia.Brain location of microglia is a relevant factor. PMID:28848398

  16. Functional data analysis of sleeping energy expenditure.

    PubMed

    Lee, Jong Soo; Zakeri, Issa F; Butte, Nancy F

    2017-01-01

    Adequate sleep is crucial during childhood for metabolic health, and physical and cognitive development. Inadequate sleep can disrupt metabolic homeostasis and alter sleeping energy expenditure (SEE). Functional data analysis methods were applied to SEE data to elucidate the population structure of SEE and to discriminate SEE between obese and non-obese children. Minute-by-minute SEE in 109 children, ages 5-18, was measured in room respiration calorimeters. A smoothing spline method was applied to the calorimetric data to extract the true smoothing function for each subject. Functional principal component analysis was used to capture the important modes of variation of the functional data and to identify differences in SEE patterns. Combinations of functional principal component analysis and classifier algorithm were used to classify SEE. Smoothing effectively removed instrumentation noise inherent in the room calorimeter data, providing more accurate data for analysis of the dynamics of SEE. SEE exhibited declining but subtly undulating patterns throughout the night. Mean SEE was markedly higher in obese than non-obese children, as expected due to their greater body mass. SEE was higher among the obese than non-obese children (p<0.01); however, the weight-adjusted mean SEE was not statistically different (p>0.1, after post hoc testing). Functional principal component scores for the first two components explained 77.8% of the variance in SEE and also differed between groups (p = 0.037). Logistic regression, support vector machine or random forest classification methods were able to distinguish weight-adjusted SEE between obese and non-obese participants with good classification rates (62-64%). Our results implicate other factors, yet to be uncovered, that affect the weight-adjusted SEE of obese and non-obese children. Functional data analysis revealed differences in the structure of SEE between obese and non-obese children that may contribute to disruption of metabolic homeostasis.

  17. Professional Standards and Performance Evaluation for Principals in China: A Policy Analysis of the Development of Principal Standards

    ERIC Educational Resources Information Center

    Liu, Shujie; Xu, Xianxuan; Grant, Leslie; Strong, James; Fang, Zheng

    2017-01-01

    This article presents the results of an interpretive policy analysis of China's Ministry of Education Standards (2013) for the professional practice of principals. In addition to revealing the evolution of the evaluation of principals in China and the processes by which this policy is formulated, a comparative analysis was conducted to compare it…

  18. Chemical discrimination of lubricant marketing types using direct analysis in real time time-of-flight mass spectrometry.

    PubMed

    Maric, Mark; Harvey, Lauren; Tomcsak, Maren; Solano, Angelique; Bridge, Candice

    2017-06-30

    In comparison to other violent crimes, sexual assaults suffer from very low prosecution and conviction rates especially in the absence of DNA evidence. As a result, the forensic community needs to utilize other forms of trace contact evidence, like lubricant evidence, in order to provide a link between the victim and the assailant. In this study, 90 personal bottled and condom lubricants from the three main marketing types, silicone-based, water-based and condoms, were characterized by direct analysis in real time time of flight mass spectrometry (DART-TOFMS). The instrumental data was analyzed by multivariate statistics including hierarchal cluster analysis, principal component analysis, and linear discriminant analysis. By interpreting the mass spectral data with multivariate statistics, 12 discrete groupings were identified, indicating inherent chemical diversity not only between but within the three main marketing groups. A number of unique chemical markers, both major and minor, were identified, other than the three main chemical components (i.e. PEG, PDMS and nonoxynol-9) currently used for lubricant classification. The data was validated by a stratified 20% withheld cross-validation which demonstrated that there was minimal overlap between the groupings. Based on the groupings identified and unique features of each group, a highly discriminating statistical model was then developed that aims to provide the foundation for the development of a forensic lubricant database that may eventually be applied to casework. Copyright © 2017 John Wiley & Sons, Ltd. Copyright © 2017 John Wiley & Sons, Ltd.

  19. Big-Data RHEED analysis for understanding epitaxial film growth processes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Vasudevan, Rama K; Tselev, Alexander; Baddorf, Arthur P

    Reflection high energy electron diffraction (RHEED) has by now become a standard tool for in-situ monitoring of film growth by pulsed laser deposition and molecular beam epitaxy. Yet despite the widespread adoption and wealth of information in RHEED image, most applications are limited to observing intensity oscillations of the specular spot, and much additional information on growth is discarded. With ease of data acquisition and increased computation speeds, statistical methods to rapidly mine the dataset are now feasible. Here, we develop such an approach to the analysis of the fundamental growth processes through multivariate statistical analysis of RHEED image sequence.more » This approach is illustrated for growth of LaxCa1-xMnO3 films grown on etched (001) SrTiO3 substrates, but is universal. The multivariate methods including principal component analysis and k-means clustering provide insight into the relevant behaviors, the timing and nature of a disordered to ordered growth change, and highlight statistically significant patterns. Fourier analysis yields the harmonic components of the signal and allows separation of the relevant components and baselines, isolating the assymetric nature of the step density function and the transmission spots from the imperfect layer-by-layer (LBL) growth. These studies show the promise of big data approaches to obtaining more insight into film properties during and after epitaxial film growth. Furthermore, these studies open the pathway to use forward prediction methods to potentially allow significantly more control over growth process and hence final film quality.« less

  20. Analysis of Exhaled Breath Volatile Organic Compounds in Inflammatory Bowel Disease: A Pilot Study.

    PubMed

    Hicks, Lucy C; Huang, Juzheng; Kumar, Sacheen; Powles, Sam T; Orchard, Timothy R; Hanna, George B; Williams, Horace R T

    2015-09-01

    Distinguishing between the inflammatory bowel diseases [IBD], Crohn's disease [CD] and ulcerative colitis [UC], is important for determining management and prognosis. Selected ion flow tube mass spectrometry [SIFT-MS] may be used to analyse volatile organic compounds [VOCs] in exhaled breath: these may be altered in disease states, and distinguishing breath VOC profiles can be identified. The aim of this pilot study was to identify, quantify, and analyse VOCs present in the breath of IBD patients and controls, potentially providing insights into disease pathogenesis and complementing current diagnostic algorithms. SIFT-MS breath profiling of 56 individuals [20 UC, 18 CD, and 18 healthy controls] was undertaken. Multivariate analysis included principal components analysis and partial least squares discriminant analysis with orthogonal signal correction [OSC-PLS-DA]. Receiver operating characteristic [ROC] analysis was performed for each comparative analysis using statistically significant VOCs. OSC-PLS-DA modelling was able to distinguish both CD and UC from healthy controls and from one other with good sensitivity and specificity. ROC analysis using combinations of statistically significant VOCs [dimethyl sulphide, hydrogen sulphide, hydrogen cyanide, ammonia, butanal, and nonanal] gave integrated areas under the curve of 0.86 [CD vs healthy controls], 0.74 [UC vs healthy controls], and 0.83 [CD vs UC]. Exhaled breath VOC profiling was able to distinguish IBD patients from controls, as well as to separate UC from CD, using both multivariate and univariate statistical techniques. Copyright © 2015 European Crohn’s and Colitis Organisation (ECCO). Published by Oxford University Press. All rights reserved. For permissions, please email: journals.permissions@oup.com.

  1. Two worlds collide: Image analysis methods for quantifying structural variation in cluster molecular dynamics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Steenbergen, K. G., E-mail: kgsteen@gmail.com; Gaston, N.

    2014-02-14

    Inspired by methods of remote sensing image analysis, we analyze structural variation in cluster molecular dynamics (MD) simulations through a unique application of the principal component analysis (PCA) and Pearson Correlation Coefficient (PCC). The PCA analysis characterizes the geometric shape of the cluster structure at each time step, yielding a detailed and quantitative measure of structural stability and variation at finite temperature. Our PCC analysis captures bond structure variation in MD, which can be used to both supplement the PCA analysis as well as compare bond patterns between different cluster sizes. Relying only on atomic position data, without requirement formore » a priori structural input, PCA and PCC can be used to analyze both classical and ab initio MD simulations for any cluster composition or electronic configuration. Taken together, these statistical tools represent powerful new techniques for quantitative structural characterization and isomer identification in cluster MD.« less

  2. Fast classification of hazelnut cultivars through portable infrared spectroscopy and chemometrics

    NASA Astrophysics Data System (ADS)

    Manfredi, Marcello; Robotti, Elisa; Quasso, Fabio; Mazzucco, Eleonora; Calabrese, Giorgio; Marengo, Emilio

    2018-01-01

    The authentication and traceability of hazelnuts is very important for both the consumer and the food industry, to safeguard the protected varieties and the food quality. This study investigates the use of a portable FTIR spectrometer coupled to multivariate statistical analysis for the classification of raw hazelnuts. The method discriminates hazelnuts from different origins/cultivars based on differences of the signal intensities of their IR spectra. The multivariate classification methods, namely principal component analysis (PCA) followed by linear discriminant analysis (LDA) and partial least square discriminant analysis (PLS-DA), with or without variable selection, allowed a very good discrimination among the groups, with PLS-DA coupled to variable selection providing the best results. Due to the fast analysis, high sensitivity, simplicity and no sample preparation, the proposed analytical methodology could be successfully used to verify the cultivar of hazelnuts, and the analysis can be performed quickly and directly on site.

  3. Two worlds collide: image analysis methods for quantifying structural variation in cluster molecular dynamics.

    PubMed

    Steenbergen, K G; Gaston, N

    2014-02-14

    Inspired by methods of remote sensing image analysis, we analyze structural variation in cluster molecular dynamics (MD) simulations through a unique application of the principal component analysis (PCA) and Pearson Correlation Coefficient (PCC). The PCA analysis characterizes the geometric shape of the cluster structure at each time step, yielding a detailed and quantitative measure of structural stability and variation at finite temperature. Our PCC analysis captures bond structure variation in MD, which can be used to both supplement the PCA analysis as well as compare bond patterns between different cluster sizes. Relying only on atomic position data, without requirement for a priori structural input, PCA and PCC can be used to analyze both classical and ab initio MD simulations for any cluster composition or electronic configuration. Taken together, these statistical tools represent powerful new techniques for quantitative structural characterization and isomer identification in cluster MD.

  4. Trends in stratospheric ozone profiles using functional mixed models

    NASA Astrophysics Data System (ADS)

    Park, A.; Guillas, S.; Petropavlovskikh, I.

    2013-11-01

    This paper is devoted to the modeling of altitude-dependent patterns of ozone variations over time. Umkehr ozone profiles (quarter of Umkehr layer) from 1978 to 2011 are investigated at two locations: Boulder (USA) and Arosa (Switzerland). The study consists of two statistical stages. First we approximate ozone profiles employing an appropriate basis. To capture primary modes of ozone variations without losing essential information, a functional principal component analysis is performed. It penalizes roughness of the function and smooths excessive variations in the shape of the ozone profiles. As a result, data-driven basis functions (empirical basis functions) are obtained. The coefficients (principal component scores) corresponding to the empirical basis functions represent dominant temporal evolution in the shape of ozone profiles. We use those time series coefficients in the second statistical step to reveal the important sources of the patterns and variations in the profiles. We estimate the effects of covariates - month, year (trend), quasi-biennial oscillation, the solar cycle, the Arctic oscillation, the El Niño/Southern Oscillation cycle and the Eliassen-Palm flux - on the principal component scores of ozone profiles using additive mixed effects models. The effects are represented as smooth functions and the smooth functions are estimated by penalized regression splines. We also impose a heteroscedastic error structure that reflects the observed seasonality in the errors. The more complex error structure enables us to provide more accurate estimates of influences and trends, together with enhanced uncertainty quantification. Also, we are able to capture fine variations in the time evolution of the profiles, such as the semi-annual oscillation. We conclude by showing the trends by altitude over Boulder and Arosa, as well as for total column ozone. There are great variations in the trends across altitudes, which highlights the benefits of modeling ozone profiles.

  5. Managerial Ownership in Nursing Homes: Staffing, Quality, and Financial Performance.

    PubMed

    Huang, Sean Shenghsiu; Bowblis, John R

    2017-06-20

    Ownership of nursing homes (NHs) has primarily focused broadly on differences between for-profit (FP), nonprofit (NFP), and government-operated facilities. Yet, among FPs, the understanding of detailed ownership structures at individual NHs is rather limited. Particularly, NH administrators may hold significant equity interests in their facilities, leading to heterogeneous financial incentives and NH outcomes. Through the principal-agent theory, this article studies how managerial ownership of individual facilities affects NH outcomes. We use a unique panel dataset of Ohio NHs (2005-2010) to empirically examine the relationship between managerial equity ownership and NH staffing, quality, and financial performance. We identify facility administrators as owner-managers if they have more than 5% of the equity stakes or are relatives of the owners. The statistical analysis is based on the pooled ordinary least squares and NH-fixed effect models. We find that owner-managed NHs are associated with higher nursing staff levels compared to other FP NHs. Surprisingly, despite higher staffing levels, owner-managed NHs are not associated with better quality and we find no statistically significant difference in financial performance between owner-managed and nonowner-managed FP NHs. Our results do not support the principal-agent model and we offer alternative explanations for future research. Our findings provide empirical evidence that NH ownership structures are more nuanced than simply broadly categorizing facilities as FP or NFP, and our results do not fully align with the standard principal-agent model. The role of managerial ownership should be considered in future NH research and policy discussions. © The Author 2017. Published by Oxford University Press on behalf of The Gerontological Society of America. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  6. Principal Leadership for Technology-enhanced Learning in Science

    NASA Astrophysics Data System (ADS)

    Gerard, Libby F.; Bowyer, Jane B.; Linn, Marcia C.

    2008-02-01

    Reforms such as technology-enhanced instruction require principal leadership. Yet, many principals report that they need help to guide implementation of science and technology reforms. We identify strategies for helping principals provide this leadership. A two-phase design is employed. In the first phase we elicit principals' varied ideas about the Technology-enhanced Learning in Science (TELS) curriculum materials being implemented by teachers in their schools, and in the second phase we engage principals in a leadership workshop designed based on the ideas they generated. Analysis uses an emergent coding scheme to categorize principals' ideas, and a knowledge integration framework to capture the development of these ideas. The analysis suggests that principals frame their thinking about the implementation of TELS in terms of: principal leadership, curriculum, educational policy, teacher learning, student outcomes and financial resources. They seek to improve their own knowledge to support this reform. The principals organize their ideas around individual school goals and current political issues. Principals prefer professional development activities that engage them in reviewing curricula and student work with other principals. Based on the analysis, this study offers guidelines for creating learning opportunities that enhance principals' leadership abilities in technology and science reform.

  7. Atomic-scale phase composition through multivariate statistical analysis of atom probe tomography data.

    PubMed

    Keenan, Michael R; Smentkowski, Vincent S; Ulfig, Robert M; Oltman, Edward; Larson, David J; Kelly, Thomas F

    2011-06-01

    We demonstrate for the first time that multivariate statistical analysis techniques can be applied to atom probe tomography data to estimate the chemical composition of a sample at the full spatial resolution of the atom probe in three dimensions. Whereas the raw atom probe data provide the specific identity of an atom at a precise location, the multivariate results can be interpreted in terms of the probabilities that an atom representing a particular chemical phase is situated there. When aggregated to the size scale of a single atom (∼0.2 nm), atom probe spectral-image datasets are huge and extremely sparse. In fact, the average spectrum will have somewhat less than one total count per spectrum due to imperfect detection efficiency. These conditions, under which the variance in the data is completely dominated by counting noise, test the limits of multivariate analysis, and an extensive discussion of how to extract the chemical information is presented. Efficient numerical approaches to performing principal component analysis (PCA) on these datasets, which may number hundreds of millions of individual spectra, are put forward, and it is shown that PCA can be computed in a few seconds on a typical laptop computer.

  8. Sequential analysis of hydrochemical data for watershed characterization.

    PubMed

    Thyne, Geoffrey; Güler, Cüneyt; Poeter, Eileen

    2004-01-01

    A methodology for characterizing the hydrogeology of watersheds using hydrochemical data that combine statistical, geochemical, and spatial techniques is presented. Surface water and ground water base flow and spring runoff samples (180 total) from a single watershed are first classified using hierarchical cluster analysis. The statistical clusters are analyzed for spatial coherence confirming that the clusters have a geological basis corresponding to topographic flowpaths and showing that the fractured rock aquifer behaves as an equivalent porous medium on the watershed scale. Then principal component analysis (PCA) is used to determine the sources of variation between parameters. PCA analysis shows that the variations within the dataset are related to variations in calcium, magnesium, SO4, and HCO3, which are derived from natural weathering reactions, and pH, NO3, and chlorine, which indicate anthropogenic impact. PHREEQC modeling is used to quantitatively describe the natural hydrochemical evolution for the watershed and aid in discrimination of samples that have an anthropogenic component. Finally, the seasonal changes in the water chemistry of individual sites were analyzed to better characterize the spatial variability of vertical hydraulic conductivity. The integrated result provides a method to characterize the hydrogeology of the watershed that fully utilizes traditional data.

  9. Statistical shape analysis of clavicular cortical bone with applications to the development of mean and boundary shape models.

    PubMed

    Lu, Yuan-Chiao; Untaroiu, Costin D

    2013-09-01

    During car collisions, the shoulder belt exposes the occupant's clavicle to large loading conditions which often leads to a bone fracture. To better understand the geometric variability of clavicular cortical bone which may influence its injury tolerance, twenty human clavicles were evaluated using statistical shape analysis. The interior and exterior clavicular cortical bone surfaces were reconstructed from CT-scan images. Registration between one selected template and the remaining 19 clavicle models was conducted to remove translation and rotation differences. The correspondences of landmarks between the models were then established using coordinates and surface normals. Three registration methods were compared: the LM-ICP method; the global method; and the SHREC method. The LM-ICP registration method showed better performance than the global and SHREC registration methods, in terms of compactness, generalization, and specificity. The first four principal components obtained by using the LM-ICP registration method account for 61% and 67% of the overall anatomical variation for the exterior and interior cortical bone shapes, respectively. The length was found to be the most significant variation mode of the human clavicle. The mean and two boundary shape models were created using the four most significant principal components to investigate the size and shape variation of clavicular cortical bone. In the future, boundary shape models could be used to develop probabilistic finite element models which may help to better understand the variability in biomechanical responses and injuries to the clavicle. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  10. A theoretical study in extracting the essential features and dynamics of molecular motions: Intrinsic geometry methods for PF(5) pseudorotations and statistical methods for argon clusters

    NASA Astrophysics Data System (ADS)

    Panahi, Nima S.

    We studied the problem of understanding and computing the essential features and dynamics of molecular motions through the development of two theories for two different systems. First, we studied the process of the Berry Pseudorotation of PF5 and the rotations it induces in the molecule through its natural and intrinsic geometric nature by setting it in the language of fiber bundles and graph theory. With these tools, we successfully extracted the essentials of the process' loops and induced rotations. The infinite number of pseudorotation loops were broken down into a small set of essential loops called "super loops", with their intrinsic properties and link to the physical movements of the molecule extensively studied. In addition, only the three "self-edge loops" generated any induced rotations, and then only a finite number of classes of them. Second, we studied applying the statistical methods of Principal Components Analysis (PCA) and Principal Coordinate Analysis (PCO) to capture only the most important changes in Argon clusters so as to reduce computational costs and graph the potential energy surface (PES) in three dimensions respectively. Both methods proved successful, but PCA was only partially successful since one will only see advantages for PES database systems much larger than those both currently being studied and those that can be computationally studied in the next few decades to come. In addition, PCA is only needed for the very rare case of a PES database that does not already include Hessian eigenvalues.

  11. The linkage between geopotential height and monthly precipitation in Iran

    NASA Astrophysics Data System (ADS)

    Shirvani, Amin; Fadaei, Amir Sabetan; Landman, Willem A.

    2018-04-01

    This paper investigates the linkage between large-scale atmospheric circulation and monthly precipitation during November to April over Iran. Canonical correlation analysis (CCA) is used to set up the statistical linkage between the 850 hPa geopotential height large-scale circulation and monthly precipitation over Iran for the period 1968-2010. The monthly precipitation dataset for 50 synoptic stations distributed in different climate regions of Iran is considered as the response variable in the CCA. The monthly geopotential height reanalysis dataset over an area between 10° N and 60° N and from 20° E to 80° E is utilized as the explanatory variable in the CCA. Principal component analysis (PCA) as a pre-filter is used for data reduction for both explanatory and response variables before applying CCA. The optimal number of principal components and canonical variables to be retained in the CCA equations is determined using the highest average cross-validated Kendall's tau value. The 850 hPa geopotential height pattern over the Red Sea, Saudi Arabia, and Persian Gulf is found to be the major pattern related to Iranian monthly precipitation. The Pearson correlation between the area averaged of the observed and predicted precipitation over the study area for Jan, Feb, March, April, November, and December months are statistically significant at the 5% significance level and are 0.78, 0.80, 0.82, 0.74, 0.79, and 0.61, respectively. The relative operating characteristic (ROC) indicates that the highest scores for the above- and below-normal precipitation categories are, respectively, for February and April and the lowest scores found for December.

  12. Nonlinear multivariate and time series analysis by neural network methods

    NASA Astrophysics Data System (ADS)

    Hsieh, William W.

    2004-03-01

    Methods in multivariate statistical analysis are essential for working with large amounts of geophysical data, data from observational arrays, from satellites, or from numerical model output. In classical multivariate statistical analysis, there is a hierarchy of methods, starting with linear regression at the base, followed by principal component analysis (PCA) and finally canonical correlation analysis (CCA). A multivariate time series method, the singular spectrum analysis (SSA), has been a fruitful extension of the PCA technique. The common drawback of these classical methods is that only linear structures can be correctly extracted from the data. Since the late 1980s, neural network methods have become popular for performing nonlinear regression and classification. More recently, neural network methods have been extended to perform nonlinear PCA (NLPCA), nonlinear CCA (NLCCA), and nonlinear SSA (NLSSA). This paper presents a unified view of the NLPCA, NLCCA, and NLSSA techniques and their applications to various data sets of the atmosphere and the ocean (especially for the El Niño-Southern Oscillation and the stratospheric quasi-biennial oscillation). These data sets reveal that the linear methods are often too simplistic to describe real-world systems, with a tendency to scatter a single oscillatory phenomenon into numerous unphysical modes or higher harmonics, which can be largely alleviated in the new nonlinear paradigm.

  13. ACCOUNTING FOR CALIBRATION UNCERTAINTIES IN X-RAY ANALYSIS: EFFECTIVE AREAS IN SPECTRAL FITTING

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lee, Hyunsook; Kashyap, Vinay L.; Drake, Jeremy J.

    2011-04-20

    While considerable advance has been made to account for statistical uncertainties in astronomical analyses, systematic instrumental uncertainties have been generally ignored. This can be crucial to a proper interpretation of analysis results because instrumental calibration uncertainty is a form of systematic uncertainty. Ignoring it can underestimate error bars and introduce bias into the fitted values of model parameters. Accounting for such uncertainties currently requires extensive case-specific simulations if using existing analysis packages. Here, we present general statistical methods that incorporate calibration uncertainties into spectral analysis of high-energy data. We first present a method based on multiple imputation that can bemore » applied with any fitting method, but is necessarily approximate. We then describe a more exact Bayesian approach that works in conjunction with a Markov chain Monte Carlo based fitting. We explore methods for improving computational efficiency, and in particular detail a method of summarizing calibration uncertainties with a principal component analysis of samples of plausible calibration files. This method is implemented using recently codified Chandra effective area uncertainties for low-resolution spectral analysis and is verified using both simulated and actual Chandra data. Our procedure for incorporating effective area uncertainty is easily generalized to other types of calibration uncertainties.« less

  14. Wavelet methodology to improve single unit isolation in primary motor cortex cells

    PubMed Central

    Ortiz-Rosario, Alexis; Adeli, Hojjat; Buford, John A.

    2016-01-01

    The proper isolation of action potentials recorded extracellularly from neural tissue is an active area of research in the fields of neuroscience and biomedical signal processing. This paper presents an isolation methodology for neural recordings using the wavelet transform (WT), a statistical thresholding scheme, and the principal component analysis (PCA) algorithm. The effectiveness of five different mother wavelets was investigated: biorthogonal, Daubachies, discrete Meyer, symmetric, and Coifman; along with three different wavelet coefficient thresholding schemes: fixed form threshold, Stein’s unbiased estimate of risk, and minimax; and two different thresholding rules: soft and hard thresholding. The signal quality was evaluated using three different statistical measures: mean-squared error, root-mean squared, and signal to noise ratio. The clustering quality was evaluated using two different statistical measures: isolation distance, and L-ratio. This research shows that the selection of the mother wavelet has a strong influence on the clustering and isolation of single unit neural activity, with the Daubachies 4 wavelet and minimax thresholding scheme performing the best. PMID:25794461

  15. Modeling pair distribution functions of rare-earth phosphate glasses using principal component analysis

    DOE PAGES

    Cole, Jacqueline M.; Cheng, Xie; Payne, Michael C.

    2016-10-18

    The use of principal component analysis (PCA) to statistically infer features of local structure from experimental pair distribution function (PDF) data is assessed on a case study of rare-earth phosphate glasses (REPGs). Such glasses, co-doped with two rare-earth ions (R and R’) of different sizes and optical properties, are of interest to the laser industry. The determination of structure-property relationships in these materials is an important aspect of their technological development. Yet, realizing the local structure of co-doped REPGs presents significant challenges relative to their singly-doped counterparts; specifically, R and R’ are difficult to distinguish in terms of establishing relativemore » material compositions, identifying atomic pairwise correlation profiles in a PDF that are associated with each ion, and resolving peak overlap of such profiles in PDFs. This study demonstrates that PCA can be employed to help overcome these structural complications, by statistically inferring trends in PDFs that exist for a restricted set of experimental data on REPGs, and using these as training data to predict material compositions and PDF profiles in unknown co-doped REPGs. The application of these PCA methods to resolve individual atomic pairwise correlations in t(r) signatures is also presented. The training methods developed for these structural predictions are pre-validated by testing their ability to reproduce known physical phenomena, such as the lanthanide contraction, on PDF signatures of the structurally simpler singly-doped REPGs. The intrinsic limitations of applying PCA to analyze PDFs relative to the quality control of source data, data processing, and sample definition, are also considered. Furthermore, while this case study is limited to lanthanide-doped REPGs, this type of statistical inference may easily be extended to other inorganic solid-state materials, and be exploited in large-scale data-mining efforts that probe many t(r) functions.« less

  16. Utilizing Hierarchical Clustering to improve Efficiency of Self-Organizing Feature Map to Identify Hydrological Homogeneous Regions

    NASA Astrophysics Data System (ADS)

    Farsadnia, Farhad; Ghahreman, Bijan

    2016-04-01

    Hydrologic homogeneous group identification is considered both fundamental and applied research in hydrology. Clustering methods are among conventional methods to assess the hydrological homogeneous regions. Recently, Self-Organizing feature Map (SOM) method has been applied in some studies. However, the main problem of this method is the interpretation on the output map of this approach. Therefore, SOM is used as input to other clustering algorithms. The aim of this study is to apply a two-level Self-Organizing feature map and Ward hierarchical clustering method to determine the hydrologic homogenous regions in North and Razavi Khorasan provinces. At first by principal component analysis, we reduced SOM input matrix dimension, then the SOM was used to form a two-dimensional features map. To determine homogeneous regions for flood frequency analysis, SOM output nodes were used as input into the Ward method. Generally, the regions identified by the clustering algorithms are not statistically homogeneous. Consequently, they have to be adjusted to improve their homogeneity. After adjustment of the homogeneity regions by L-moment tests, five hydrologic homogeneous regions were identified. Finally, adjusted regions were created by a two-level SOM and then the best regional distribution function and associated parameters were selected by the L-moment approach. The results showed that the combination of self-organizing maps and Ward hierarchical clustering by principal components as input is more effective than the hierarchical method, by principal components or standardized inputs to achieve hydrologic homogeneous regions.

  17. Stress distribution in maxillary first molar periodontium using straight pull headgear with vertical and horizontal tubes: A finite element analysis.

    PubMed

    Feizbakhsh, Masood; Kadkhodaei, Mahmoud; Zandian, Dana; Hosseinpour, Zahra

    2017-01-01

    One of the most effective ways for distal movement of molars to treat Class II malocclusion is using extraoral force through a headgear device. The purpose of this study was the comparison of stress distribution in maxillary first molar periodontium using straight pull headgear in vertical and horizontal tubes through finite element method. Based on the real geometry model, a basic model of the first molar and maxillary bone was obtained using three-dimensional imaging of the skull. After the geometric modeling of periodontium components through CATIA software and the definition of mechanical properties and element classification, a force of 150 g for each headgear was defined in ABAQUS software. Consequently, Von Mises and Principal stresses were evaluated. The statistical analysis was performed using T-paired and Wilcoxon nonparametric tests. Extension of areas with Von Mises and Principal stresses utilizing straight pull headgear with a vertical tube was not different from that of using a horizontal tube, but the numerical value of the Von Mises stress in the vertical tube was significantly reduced ( P < 0/05). On the other hand, the difference of the principal stress between both tubes was not significant ( P > 0/05). Based on the results, when force applied to the straight pull headgear with a vertical tube, Von Mises stress was reduced significantly in comparison with the horizontal tube. Therefore, to correct the mesiolingual movement of the maxillary first molar, vertical headgear tube is recommended.

  18. La estadistica en el planeamiento educativo (Statistics in Educational Planning).

    ERIC Educational Resources Information Center

    Leon Pacheco, Tomas

    1971-01-01

    This document is an English-language abstract (approximately 1500 words) summarizing the author's definitions of the principal physical and human characteristics of elementary and secondary education as presently constituted in Mexico so that school personnel may comply with Mexican regulations that force them to supply educational statistics. For…

  19. The hoard of Beçin—non-destructive analysis of the silver coins

    NASA Astrophysics Data System (ADS)

    Rodrigues, M.; Schreiner, M.; Mäder, M.; Melcher, M.; Guerra, M.; Salomon, J.; Radtke, M.; Alram, M.; Schindel, N.

    2010-05-01

    We report the results of an analytical investigation on 416 silver-copper coins stemming from the Ottoman Empire (end of 16th and beginning of 17th centuries), using synchrotron micro X-ray fluorescence analysis (SRXRF). In the past, analyses had already been conducted with energy dispersive X-ray fluorescence analysis (EDXRF), scanning electron microscopy with energy dispersive X-ray spectrometry (SEM/EDX) and proton induced X-ray emission spectroscopy (PIXE). With this combination of techniques it was possible to confirm the fineness of the coinage as well as to study the provenance of the alloy used for the coins. For the interpretation of the data statistical analysis (principal component analysis—PCA) has been performed. A definite local assignment was explored and significant clustering was obtained regarding the minor and trace elements composing the coin alloys.

  20. Use of Cusp Catastrophe for Risk Analysis of Navigational Environment: A Case Study of Three Gorges Reservoir Area

    PubMed Central

    Hao, Guozhu

    2016-01-01

    A water traffic system is a huge, nonlinear, complex system, and its stability is affected by various factors. Water traffic accidents can be considered to be a kind of mutation of a water traffic system caused by the coupling of multiple navigational environment factors. In this study, the catastrophe theory, principal component analysis (PCA), and multivariate statistics are integrated to establish a situation recognition model for a navigational environment with the aim of performing a quantitative analysis of the situation of this environment via the extraction and classification of its key influencing factors; in this model, the natural environment and traffic environment are considered to be two control variables. The Three Gorges Reservoir area of the Yangtze River is considered as an example, and six critical factors, i.e., the visibility, wind, current velocity, route intersection, channel dimension, and traffic flow, are classified into two principal components: the natural environment and traffic environment. These two components are assumed to have the greatest influence on the navigation risk. Then, the cusp catastrophe model is employed to identify the safety situation of the regional navigational environment in the Three Gorges Reservoir area. The simulation results indicate that the situation of the navigational environment of this area is gradually worsening from downstream to upstream. PMID:27391057

  1. Adoption of health information technologies by physicians for clinical practice: The Andalusian case.

    PubMed

    Villalba-Mora, Elena; Casas, Isabel; Lupiañez-Villanueva, Francisco; Maghiros, Ioannis

    2015-07-01

    We investigated the level of adoption of Health Information Technologies (HIT) services, and the factors that influence this, amongst specialised and primary care physicians; in Andalusia, Spain. We analysed the physicians' responses to an online survey. First, we performed a statistical descriptive analysis of the data; thereafter, a principal component analysis; and finally an order logit model to explain the effect of the use in the adoption and to analyse which are the existing barriers. The principal component analysis revealed three main uses of Health Information Technologies: Electronic Health Records (EHR), ePrescription and patient management and telemedicine services. Results from an ordered logit model showed that the frequency of use of HIT is associated with the physicians' perceived usefulness. Lack of financing appeared as a common barrier to the adoption of the three types of services. For ePrescription and patient management, the physician's lack of skills is still a barrier. In the case of telemedicine services, lack of security and lack of interest amongst professionals are the existing barriers. EHR functionalities are fully adopted, in terms of perceived usefulness. EPrescription and patient management are almost fully adopted, while telemedicine is in an early stage of adoption. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  2. Use of Cusp Catastrophe for Risk Analysis of Navigational Environment: A Case Study of Three Gorges Reservoir Area.

    PubMed

    Jiang, Dan; Hao, Guozhu; Huang, Liwen; Zhang, Dan

    2016-01-01

    A water traffic system is a huge, nonlinear, complex system, and its stability is affected by various factors. Water traffic accidents can be considered to be a kind of mutation of a water traffic system caused by the coupling of multiple navigational environment factors. In this study, the catastrophe theory, principal component analysis (PCA), and multivariate statistics are integrated to establish a situation recognition model for a navigational environment with the aim of performing a quantitative analysis of the situation of this environment via the extraction and classification of its key influencing factors; in this model, the natural environment and traffic environment are considered to be two control variables. The Three Gorges Reservoir area of the Yangtze River is considered as an example, and six critical factors, i.e., the visibility, wind, current velocity, route intersection, channel dimension, and traffic flow, are classified into two principal components: the natural environment and traffic environment. These two components are assumed to have the greatest influence on the navigation risk. Then, the cusp catastrophe model is employed to identify the safety situation of the regional navigational environment in the Three Gorges Reservoir area. The simulation results indicate that the situation of the navigational environment of this area is gradually worsening from downstream to upstream.

  3. Chemometric investigation of light-shade effects on essential oil yield and morphology of Moroccan Myrtus communis L.

    PubMed

    Fadil, Mouhcine; Farah, Abdellah; Ihssane, Bouchaib; Haloui, Taoufik; Lebrazi, Sara; Zghari, Badreddine; Rachiq, Saâd

    2016-01-01

    To investigate the effect of environmental factors such as light and shade on essential oil yield and morphological traits of Moroccan Myrtus communis, a chemometric study was conducted on 20 individuals growing under two contrasting light environments. The study of individual's parameters by principal component analysis has shown that essential oil yield, altitude, and leaves thickness were positively correlated between them and negatively correlated with plants height, leaves length and leaves width. Principal component analysis and hierarchical cluster analysis have also shown that the individuals of each sampling site were grouped separately. The one-way ANOVA test has confirmed the effect of light and shade on essential oil yield and morphological parameters by showing a statistically significant difference between them from the shaded side to the sunny one. Finally, the multiple linear model containing main, interaction and quadratic terms was chosen for the modeling of essential oil yield in terms of morphological parameters. Sun plants have a small height, small leaves length and width, but they are thicker and richer in essential oil than shade plants which have shown almost the opposite. The highlighted multiple linear model can be used to predict essential oil yield in the studied area.

  4. TU-FG-201-05: Varian MPC as a Statistical Process Control Tool

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Carver, A; Rowbottom, C

    Purpose: Quality assurance in radiotherapy requires the measurement of various machine parameters to ensure they remain within permitted values over time. In Truebeam release 2.0 the Machine Performance Check (MPC) was released allowing beam output and machine axis movements to be assessed in a single test. We aim to evaluate the Varian Machine Performance Check (MPC) as a tool for Statistical Process Control (SPC). Methods: Varian’s MPC tool was used on three Truebeam and one EDGE linac for a period of approximately one year. MPC was commissioned against independent systems. After this period the data were reviewed to determine whethermore » or not the MPC was useful as a process control tool. Analyses on individual tests were analysed using Shewhart control plots, using Matlab for analysis. Principal component analysis was used to determine if a multivariate model was of any benefit in analysing the data. Results: Control charts were found to be useful to detect beam output changes, worn T-nuts and jaw calibration issues. Upper and lower control limits were defined at the 95% level. Multivariate SPC was performed using Principal Component Analysis. We found little evidence of clustering beyond that which might be naively expected such as beam uniformity and beam output. Whilst this makes multivariate analysis of little use it suggests that each test is giving independent information. Conclusion: The variety of independent parameters tested in MPC makes it a sensitive tool for routine machine QA. We have determined that using control charts in our QA programme would rapidly detect changes in machine performance. The use of control charts allows large quantities of tests to be performed on all linacs without visual inspection of all results. The use of control limits alerts users when data are inconsistent with previous measurements before they become out of specification. A. Carver has received a speaker’s honorarium from Varian.« less

  5. The MetabolomeExpress Project: enabling web-based processing, analysis and transparent dissemination of GC/MS metabolomics datasets.

    PubMed

    Carroll, Adam J; Badger, Murray R; Harvey Millar, A

    2010-07-14

    Standardization of analytical approaches and reporting methods via community-wide collaboration can work synergistically with web-tool development to result in rapid community-driven expansion of online data repositories suitable for data mining and meta-analysis. In metabolomics, the inter-laboratory reproducibility of gas-chromatography/mass-spectrometry (GC/MS) makes it an obvious target for such development. While a number of web-tools offer access to datasets and/or tools for raw data processing and statistical analysis, none of these systems are currently set up to act as a public repository by easily accepting, processing and presenting publicly submitted GC/MS metabolomics datasets for public re-analysis. Here, we present MetabolomeExpress, a new File Transfer Protocol (FTP) server and web-tool for the online storage, processing, visualisation and statistical re-analysis of publicly submitted GC/MS metabolomics datasets. Users may search a quality-controlled database of metabolite response statistics from publicly submitted datasets by a number of parameters (eg. metabolite, species, organ/biofluid etc.). Users may also perform meta-analysis comparisons of multiple independent experiments or re-analyse public primary datasets via user-friendly tools for t-test, principal components analysis, hierarchical cluster analysis and correlation analysis. They may interact with chromatograms, mass spectra and peak detection results via an integrated raw data viewer. Researchers who register for a free account may upload (via FTP) their own data to the server for online processing via a novel raw data processing pipeline. MetabolomeExpress https://www.metabolome-express.org provides a new opportunity for the general metabolomics community to transparently present online the raw and processed GC/MS data underlying their metabolomics publications. Transparent sharing of these data will allow researchers to assess data quality and draw their own insights from published metabolomics datasets.

  6. Leadership and Personality: Is There a Relationship between Self-Assessed Leadership Traits and Self-Assessed Personality Traits of Female Elementary School Principals in the Hampton Roads Area of Virginia?

    ERIC Educational Resources Information Center

    Ireland, Lakisha Nicole

    2017-01-01

    This study attempted to determine if there were statistically significant relationships between leadership traits and personality traits of female elementary school principals who serve in school districts located within the Hampton Roads area of Virginia. This study examined randomly selected participants from three school divisions. These…

  7. "A Compliment Is All I Need"--Teachers Telling Principals How to Promote Their Staff's Self-Efficacy

    ERIC Educational Resources Information Center

    Kass, Efrat

    2013-01-01

    The purpose of the present study is to compare the perceptions of teachers representing opposite ends of the self-efficacy spectrum regarding the effects of the principal's behavior on their professional self-efficacy. In the first quantitative stage, a statistical procedure was conducted to identify the two groups of teachers: a group of 16…

  8. Histogram of gradient and binarized statistical image features of wavelet subband-based palmprint features extraction

    NASA Astrophysics Data System (ADS)

    Attallah, Bilal; Serir, Amina; Chahir, Youssef; Boudjelal, Abdelwahhab

    2017-11-01

    Palmprint recognition systems are dependent on feature extraction. A method of feature extraction using higher discrimination information was developed to characterize palmprint images. In this method, two individual feature extraction techniques are applied to a discrete wavelet transform of a palmprint image, and their outputs are fused. The two techniques used in the fusion are the histogram of gradient and the binarized statistical image features. They are then evaluated using an extreme learning machine classifier before selecting a feature based on principal component analysis. Three palmprint databases, the Hong Kong Polytechnic University (PolyU) Multispectral Palmprint Database, Hong Kong PolyU Palmprint Database II, and the Delhi Touchless (IIDT) Palmprint Database, are used in this study. The study shows that our method effectively identifies and verifies palmprints and outperforms other methods based on feature extraction.

  9. Monitoring of an antigen manufacturing process.

    PubMed

    Zavatti, Vanessa; Budman, Hector; Legge, Raymond; Tamer, Melih

    2016-06-01

    Fluorescence spectroscopy in combination with multivariate statistical methods was employed as a tool for monitoring the manufacturing process of pertactin (PRN), one of the virulence factors of Bordetella pertussis utilized in whopping cough vaccines. Fluorophores such as amino acids and co-enzymes were detected throughout the process. The fluorescence data collected at different stages of the fermentation and purification process were treated employing principal component analysis (PCA). Through PCA, it was feasible to identify sources of variability in PRN production. Then, partial least square (PLS) was employed to correlate the fluorescence spectra obtained from pure PRN samples and the final protein content measured by a Kjeldahl test from these samples. In view that a statistically significant correlation was found between fluorescence and PRN levels, this approach could be further used as a method to predict the final protein content.

  10. THESEUS: maximum likelihood superpositioning and analysis of macromolecular structures.

    PubMed

    Theobald, Douglas L; Wuttke, Deborah S

    2006-09-01

    THESEUS is a command line program for performing maximum likelihood (ML) superpositions and analysis of macromolecular structures. While conventional superpositioning methods use ordinary least-squares (LS) as the optimization criterion, ML superpositions provide substantially improved accuracy by down-weighting variable structural regions and by correcting for correlations among atoms. ML superpositioning is robust and insensitive to the specific atoms included in the analysis, and thus it does not require subjective pruning of selected variable atomic coordinates. Output includes both likelihood-based and frequentist statistics for accurate evaluation of the adequacy of a superposition and for reliable analysis of structural similarities and differences. THESEUS performs principal components analysis for analyzing the complex correlations found among atoms within a structural ensemble. ANSI C source code and selected binaries for various computing platforms are available under the GNU open source license from http://monkshood.colorado.edu/theseus/ or http://www.theseus3d.org.

  11. Use of Principal Components Analysis to Explain Controls on Nutrient Fluxes to the Chesapeake Bay

    NASA Astrophysics Data System (ADS)

    Rice, K. C.; Mills, A. L.

    2017-12-01

    The Chesapeake Bay watershed, on the east coast of the United States, encompasses about 166,000-square kilometers (km2) of diverse land use, which includes a mixture of forested, agricultural, and developed land. The watershed is now managed under a Total Daily Maximum Load (TMDL), which requires implementation of management actions by 2025 that are sufficient to reduce nitrogen, phosphorus, and suspended-sediment fluxes to the Chesapeake Bay and restore the bay's water quality. We analyzed nutrient and sediment data along with land-use and climatic variables in nine sub watersheds to better understand the drivers of flux within the watershed and to provide relevant management implications. The nine sub watersheds range in area from 300 to 30,000 km2, and the analysis period was 1985-2014. The 31 variables specific to each sub watershed were highly statistically significantly correlated, so Principal Components Analysis was used to reduce the dimensionality of the dataset. The analysis revealed that about 80% of the variability in the whole dataset can be explained by discharge, flux, and concentration of nutrients and sediment. The first two principal components (PCs) explained about 68% of the total variance. PC1 loaded strongly on discharge and flux, and PC2 loaded on concentration. The PC scores of both PC1 and PC2 varied by season. Subsequent analysis of PC1 scores versus PC2 scores, broken out by sub watershed, revealed management implications. Some of the largest sub watersheds are largely driven by discharge, and consequently large fluxes. In contrast, some of the smaller sub watersheds are more variable in nutrient concentrations than discharge and flux. Our results suggest that, given no change in discharge, a reduction in nutrient flux to the streams in the smaller watersheds could result in a proportionately larger decrease in fluxes of nutrients down the river to the bay, than in the larger watersheds.

  12. Principal component directed partial least squares analysis for combining nuclear magnetic resonance and mass spectrometry data in metabolomics: application to the detection of breast cancer.

    PubMed

    Gu, Haiwei; Pan, Zhengzheng; Xi, Bowei; Asiago, Vincent; Musselman, Brian; Raftery, Daniel

    2011-02-07

    Nuclear magnetic resonance (NMR) spectroscopy and mass spectrometry (MS) are the two most commonly used analytical tools in metabolomics, and their complementary nature makes the combination particularly attractive. A combined analytical approach can improve the potential for providing reliable methods to detect metabolic profile alterations in biofluids or tissues caused by disease, toxicity, etc. In this paper, (1)H NMR spectroscopy and direct analysis in real time (DART)-MS were used for the metabolomics analysis of serum samples from breast cancer patients and healthy controls. Principal component analysis (PCA) of the NMR data showed that the first principal component (PC1) scores could be used to separate cancer from normal samples. However, no such obvious clustering could be observed in the PCA score plot of DART-MS data, even though DART-MS can provide a rich and informative metabolic profile. Using a modified multivariate statistical approach, the DART-MS data were then reevaluated by orthogonal signal correction (OSC) pretreated partial least squares (PLS), in which the Y matrix in the regression was set to the PC1 score values from the NMR data analysis. This approach, and a similar one using the first latent variable from PLS-DA of the NMR data resulted in a significant improvement of the separation between the disease samples and normals, and a metabolic profile related to breast cancer could be extracted from DART-MS. The new approach allows the disease classification to be expressed on a continuum as opposed to a binary scale and thus better represents the disease and healthy classifications. An improved metabolic profile obtained by combining MS and NMR by this approach may be useful to achieve more accurate disease detection and gain more insight regarding disease mechanisms and biology. Copyright © 2010 Elsevier B.V. All rights reserved.

  13. Development of a scale for attitude toward condom use for migrant workers in India.

    PubMed

    Talukdar, Arunansu; Bal, Runa; Sanyal, Debasis; Roy, Krishnendu; Talukdar, Payel Sengupta

    2008-02-01

    The propaganda for the use of condoms remains one of the mainstay for prevention of human immunodeficiency virus (HIV) transmission. In spite of the proven efficacy of condom, some moral, social and psychological obstacles are still prevalent, hindering the use of condoms. The study tried to construct a short condom-attitude scale for use among the migrant workers, a major bridge population in India. The study was conducted among the male migrant workers who were 18-49 years old, sexually active and had heard about condoms and were engaged in nonformal jobs. We recruited 234 and 280 candidates for Phase 1 and Phase 2 respectively. Ten items from the original 40-item Brown's ATC (attitude towards condom) scale were selected in Phase 1. After analysis of Phase 1 results, using principal component analysis six items were found appropriate for measuring attitude towards condom use. These six items were then administered in another group in Phase 2. Utilizing Pearson's correlations, scale items were examined in terms of their mean response scores and the correlation matrix between items. Cornbach's alpha and construct validity were also assessed for the entire sample. Study subjects were categorized as condom users and nonusers. The scale structure was explored by analyzing response scores with respect to the items, using principal component analysis followed by varimax rotation analysis. Principal component analysis revealed that the first factor accounted for 71% of the variance, with eigenvalue greater than one. Eigenvalues of the second factor was less than one. Application of screen test suggests only one factor was dominant. Mean score of six items among condom users was 20.45 and that among nonusers was 16.67, which was statistically significant (P<0.01). Cornbach's alpha coefficient was 0.92. This tailor-made attitude-toward-condom-use scale, targeted for most vulnerable people in India, can be included in any rapid survey for assessing the existing beliefs and attitudes toward condoms and also for evaluating efficacy of an intervention program.

  14. Biennial Survey of Education in the United States, 1936-1938. Bulletin, 1940, No. 2. Chapter I: Statistical Summary of Education, 1937-38

    ERIC Educational Resources Information Center

    Foster, Emily M.

    1942-01-01

    The U.S. Office of Education is required by law to collect statistics to show the condition and progress of education. Statistics can be made available, on a national scale, to the extent that school administrators, principals, and college officials cooperate on a voluntary basis with the Office of Education in making the facts available. This…

  15. A Label-Free Fluorescent Array Sensor Utilizing Liposome Encapsulating Calcein for Discriminating Target Proteins by Principal Component Analysis

    PubMed Central

    Imamura, Ryota; Murata, Naoki; Shimanouchi, Toshinori; Yamashita, Kaoru; Fukuzawa, Masayuki; Noda, Minoru

    2017-01-01

    A new fluorescent arrayed biosensor has been developed to discriminate species and concentrations of target proteins by using plural different phospholipid liposome species encapsulating fluorescent molecules, utilizing differences in permeation of the fluorescent molecules through the membrane to modulate liposome-target protein interactions. This approach proposes a basically new label-free fluorescent sensor, compared with the common technique of developed fluorescent array sensors with labeling. We have confirmed a high output intensity of fluorescence emission related to characteristics of the fluorescent molecules dependent on their concentrations when they leak from inside the liposomes through the perturbed lipid membrane. After taking an array image of the fluorescence emission from the sensor using a CMOS imager, the output intensities of the fluorescence were analyzed by a principal component analysis (PCA) statistical method. It is found from PCA plots that different protein species with several concentrations were successfully discriminated by using the different lipid membranes with high cumulative contribution ratio. We also confirmed that the accuracy of the discrimination by the array sensor with a single shot is higher than that of a single sensor with multiple shots. PMID:28714873

  16. A Label-Free Fluorescent Array Sensor Utilizing Liposome Encapsulating Calcein for Discriminating Target Proteins by Principal Component Analysis.

    PubMed

    Imamura, Ryota; Murata, Naoki; Shimanouchi, Toshinori; Yamashita, Kaoru; Fukuzawa, Masayuki; Noda, Minoru

    2017-07-15

    A new fluorescent arrayed biosensor has been developed to discriminate species and concentrations of target proteins by using plural different phospholipid liposome species encapsulating fluorescent molecules, utilizing differences in permeation of the fluorescent molecules through the membrane to modulate liposome-target protein interactions. This approach proposes a basically new label-free fluorescent sensor, compared with the common technique of developed fluorescent array sensors with labeling. We have confirmed a high output intensity of fluorescence emission related to characteristics of the fluorescent molecules dependent on their concentrations when they leak from inside the liposomes through the perturbed lipid membrane. After taking an array image of the fluorescence emission from the sensor using a CMOS imager, the output intensities of the fluorescence were analyzed by a principal component analysis (PCA) statistical method. It is found from PCA plots that different protein species with several concentrations were successfully discriminated by using the different lipid membranes with high cumulative contribution ratio. We also confirmed that the accuracy of the discrimination by the array sensor with a single shot is higher than that of a single sensor with multiple shots.

  17. High-dimensional inference with the generalized Hopfield model: principal component analysis and corrections.

    PubMed

    Cocco, S; Monasson, R; Sessak, V

    2011-05-01

    We consider the problem of inferring the interactions between a set of N binary variables from the knowledge of their frequencies and pairwise correlations. The inference framework is based on the Hopfield model, a special case of the Ising model where the interaction matrix is defined through a set of patterns in the variable space, and is of rank much smaller than N. We show that maximum likelihood inference is deeply related to principal component analysis when the amplitude of the pattern components ξ is negligible compared to √N. Using techniques from statistical mechanics, we calculate the corrections to the patterns to the first order in ξ/√N. We stress the need to generalize the Hopfield model and include both attractive and repulsive patterns in order to correctly infer networks with sparse and strong interactions. We present a simple geometrical criterion to decide how many attractive and repulsive patterns should be considered as a function of the sampling noise. We moreover discuss how many sampled configurations are required for a good inference, as a function of the system size N and of the amplitude ξ. The inference approach is illustrated on synthetic and biological data.

  18. Chemical Polymorphism of Essential Oils of Artemisia vulgaris Growing Wild in Lithuania.

    PubMed

    Judzentiene, Asta; Budiene, Jurga

    2018-02-01

    Compositional variability of mugwort (Artemisia vulgaris L.) essential oils has been investigated in the study. Plant material (over ground parts at full flowering stage) was collected from forty-four wild populations in Lithuania. The oils from aerial parts were obtained by hydrodistillation and analyzed by GC(FID) and GC/MS. In total, up to 111 components were determined in the oils. As the major constituents were found: sabinene, 1,8-cineole, artemisia ketone, both thujone isomers, camphor, cis-chrysanthenyl acetate, davanone and davanone B. The compositional data were subjected to statistical analysis. The application of PCA (Principal Component Analysis) and AHC (Agglomerative Hierarchical Clustering) allowed grouping the oils into six clusters. AHC permitted to distinguish an artemisia ketone chemotype, which, to the best of our knowledge, is very scarce. Additionally, two rare cis-chrysanthenyl acetate and sabinene oil types were determined for the plants growing in Lithuania. Besides, davanone was found for the first time as a principal component in mugwort oils. The performed study revealed significant chemical polymorphism of essential oils in mugwort plants native to Lithuania; it has expanded our chemotaxonomic knowledge both of A. vulgaris species and Artemisia genus. © 2018 Wiley-VHCA AG, Zurich, Switzerland.

  19. Derivation of Boundary Manikins: A Principal Component Analysis

    NASA Technical Reports Server (NTRS)

    Young, Karen; Margerum, Sarah; Barr, Abbe; Ferrer, Mike A.; Rajulu, Sudhakar

    2008-01-01

    When designing any human-system interface, it is critical to provide realistic anthropometry to properly represent how a person fits within a given space. This study aimed to identify a minimum number of boundary manikins or representative models of subjects anthropometry from a target population, which would realistically represent the population. The boundary manikin anthropometry was derived using, Principal Component Analysis (PCA). PCA is a statistical approach to reduce a multi-dimensional dataset using eigenvectors and eigenvalues. The measurements used in the PCA were identified as those measurements critical for suit and cockpit design. The PCA yielded a total of 26 manikins per gender, as well as their anthropometry from the target population. Reduction techniques were implemented to reduce this number further with a final result of 20 female and 22 male subjects. The anthropometry of the boundary manikins was then be used to create 3D digital models (to be discussed in subsequent papers) intended for use by designers to test components of their space suit design, to verify that the requirements specified in the Human Systems Integration Requirements (HSIR) document are met. The end-goal is to allow for designers to generate suits which accommodate the diverse anthropometry of the user population.

  20. Application of Principal Component Analysis to NIR Spectra of Phyllosilicates: A Technique for Identifying Phyllosilicates on Mars

    NASA Technical Reports Server (NTRS)

    Rampe, E. B.; Lanza, N. L.

    2012-01-01

    Orbital near-infrared (NIR) reflectance spectra of the martian surface from the OMEGA and CRISM instruments have identified a variety of phyllosilicates in Noachian terrains. The types of phyllosilicates present on Mars have important implications for the aqueous environments in which they formed, and, thus, for recognizing locales that may have been habitable. Current identifications of phyllosilicates from martian NIR data are based on the positions of spectral absorptions relative to laboratory data of well-characterized samples and from spectral ratios; however, some phyllosilicates can be difficult to distinguish from one another with these methods (i.e. illite vs. muscovite). Here we employ a multivariate statistical technique, principal component analysis (PCA), to differentiate between spectrally similar phyllosilicate minerals. PCA is commonly used in a variety of industries (pharmaceutical, agricultural, viticultural) to discriminate between samples. Previous work using PCA to analyze raw NIR reflectance data from mineral mixtures has shown that this is a viable technique for identifying mineral types, abundances, and particle sizes. Here, we evaluate PCA of second-derivative NIR reflectance data as a method for classifying phyllosilicates and test whether this method can be used to identify phyllosilicates on Mars.

  1. A Novel Acoustic Sensor Approach to Classify Seeds Based on Sound Absorption Spectra

    PubMed Central

    Gasso-Tortajada, Vicent; Ward, Alastair J.; Mansur, Hasib; Brøchner, Torben; Sørensen, Claus G.; Green, Ole

    2010-01-01

    A non-destructive and novel in situ acoustic sensor approach based on the sound absorption spectra was developed for identifying and classifying different seed types. The absorption coefficient spectra were determined by using the impedance tube measurement method. Subsequently, a multivariate statistical analysis, i.e., principal component analysis (PCA), was performed as a way to generate a classification of the seeds based on the soft independent modelling of class analogy (SIMCA) method. The results show that the sound absorption coefficient spectra of different seed types present characteristic patterns which are highly dependent on seed size and shape. In general, seed particle size and sphericity were inversely related with the absorption coefficient. PCA presented reliable grouping capabilities within the diverse seed types, since the 95% of the total spectral variance was described by the first two principal components. Furthermore, the SIMCA classification model based on the absorption spectra achieved optimal results as 100% of the evaluation samples were correctly classified. This study contains the initial structuring of an innovative method that will present new possibilities in agriculture and industry for classifying and determining physical properties of seeds and other materials. PMID:22163455

  2. Characterizing Variability of Modular Brain Connectivity with Constrained Principal Component Analysis

    PubMed Central

    Hirayama, Jun-ichiro; Hyvärinen, Aapo; Kiviniemi, Vesa; Kawanabe, Motoaki; Yamashita, Okito

    2016-01-01

    Characterizing the variability of resting-state functional brain connectivity across subjects and/or over time has recently attracted much attention. Principal component analysis (PCA) serves as a fundamental statistical technique for such analyses. However, performing PCA on high-dimensional connectivity matrices yields complicated “eigenconnectivity” patterns, for which systematic interpretation is a challenging issue. Here, we overcome this issue with a novel constrained PCA method for connectivity matrices by extending the idea of the previously proposed orthogonal connectivity factorization method. Our new method, modular connectivity factorization (MCF), explicitly introduces the modularity of brain networks as a parametric constraint on eigenconnectivity matrices. In particular, MCF analyzes the variability in both intra- and inter-module connectivities, simultaneously finding network modules in a principled, data-driven manner. The parametric constraint provides a compact module-based visualization scheme with which the result can be intuitively interpreted. We develop an optimization algorithm to solve the constrained PCA problem and validate our method in simulation studies and with a resting-state functional connectivity MRI dataset of 986 subjects. The results show that the proposed MCF method successfully reveals the underlying modular eigenconnectivity patterns in more general situations and is a promising alternative to existing methods. PMID:28002474

  3. A SAR and QSAR study of new artemisinin compounds with antimalarial activity.

    PubMed

    Santos, Cleydson Breno R; Vieira, Josinete B; Lobato, Cleison C; Hage-Melim, Lorane I S; Souto, Raimundo N P; Lima, Clarissa S; Costa, Elizabeth V M; Brasil, Davi S B; Macêdo, Williams Jorge C; Carvalho, José Carlos T

    2013-12-30

    The Hartree-Fock method and the 6-31G** basis set were employed to calculate the molecular properties of artemisinin and 20 derivatives with antimalarial activity. Maps of molecular electrostatic potential (MEPs) and molecular docking were used to investigate the interaction between ligands and the receptor (heme). Principal component analysis and hierarchical cluster analysis were employed to select the most important descriptors related to activity. The correlation between biological activity and molecular properties was obtained using the partial least squares and principal component regression methods. The regression PLS and PCR models built in this study were also used to predict the antimalarial activity of 30 new artemisinin compounds with unknown activity. The models obtained showed not only statistical significance but also predictive ability. The significant molecular descriptors related to the compounds with antimalarial activity were the hydration energy (HE), the charge on the O11 oxygen atom (QO11), the torsion angle O1-O2-Fe-N2 (D2) and the maximum rate of R/Sanderson Electronegativity (RTe+). These variables led to a physical and structural explanation of the molecular properties that should be selected for when designing new ligands to be used as antimalarial agents.

  4. The assessment of facial variation in 4747 British school children.

    PubMed

    Toma, Arshed M; Zhurov, Alexei I; Playle, Rebecca; Marshall, David; Rosin, Paul L; Richmond, Stephen

    2012-12-01

    The aim of this study is to identify key components contributing to facial variation in a large population-based sample of 15.5-year-old children (2514 females and 2233 males). The subjects were recruited from the Avon Longitudinal Study of Parents and Children. Three-dimensional facial images were obtained for each subject using two high-resolution Konica Minolta laser scanners. Twenty-one reproducible facial landmarks were identified and their coordinates were recorded. The facial images were registered using Procrustes analysis. Principal component analysis was then employed to identify independent groups of correlated coordinates. For the total data set, 14 principal components (PCs) were identified which explained 82 per cent of the total variance, with the first three components accounting for 46 per cent of the variance. Similar results were obtained for males and females separately with only subtle gender differences in some PCs. Facial features may be treated as a multidimensional statistical continuum with respect to the PCs. The first three PCs characterize the face in terms of height, width, and prominence of the nose. The derived PCs may be useful to identify and classify faces according to a scale of normality.

  5. Intercomparison of air quality data using principal component analysis, and forecasting of PM₁₀ and PM₂.₅ concentrations using artificial neural networks, in Thessaloniki and Helsinki.

    PubMed

    Voukantsis, Dimitris; Karatzas, Kostas; Kukkonen, Jaakko; Räsänen, Teemu; Karppinen, Ari; Kolehmainen, Mikko

    2011-03-01

    In this paper we propose a methodology consisting of specific computational intelligence methods, i.e. principal component analysis and artificial neural networks, in order to inter-compare air quality and meteorological data, and to forecast the concentration levels for environmental parameters of interest (air pollutants). We demonstrate these methods to data monitored in the urban areas of Thessaloniki and Helsinki in Greece and Finland, respectively. For this purpose, we applied the principal component analysis method in order to inter-compare the patterns of air pollution in the two selected cities. Then, we proceeded with the development of air quality forecasting models for both studied areas. On this basis, we formulated and employed a novel hybrid scheme in the selection process of input variables for the forecasting models, involving a combination of linear regression and artificial neural networks (multi-layer perceptron) models. The latter ones were used for the forecasting of the daily mean concentrations of PM₁₀ and PM₂.₅ for the next day. Results demonstrated an index of agreement between measured and modelled daily averaged PM₁₀ concentrations, between 0.80 and 0.85, while the kappa index for the forecasting of the daily averaged PM₁₀ concentrations reached 60% for both cities. Compared with previous corresponding studies, these statistical parameters indicate an improved performance of air quality parameters forecasting. It was also found that the performance of the models for the forecasting of the daily mean concentrations of PM₁₀ was not substantially different for both cities, despite the major differences of the two urban environments under consideration. Copyright © 2011 Elsevier B.V. All rights reserved.

  6. Career Paths in Educational Leadership: Examining Principals' Narratives

    ERIC Educational Resources Information Center

    Parylo, Oksana; Zepeda, Sally J.; Bengtson, Ed

    2012-01-01

    This qualitative study analyzes the career path narratives of active principals. Structural narrative analysis was supplemented with sociolinguistic theory and thematic narrative analysis to discern the similarities and differences, as well as the patterns in the language used by participating principals. Thematic analysis found four major themes…

  7. [Application of chemometrics in composition-activity relationship research of traditional Chinese medicine].

    PubMed

    Han, Sheng-Nan

    2014-07-01

    Chemometrics is a new branch of chemistry which is widely applied to various fields of analytical chemistry. Chemometrics can use theories and methods of mathematics, statistics, computer science and other related disciplines to optimize the chemical measurement process and maximize access to acquire chemical information and other information on material systems by analyzing chemical measurement data. In recent years, traditional Chinese medicine has attracted widespread attention. In the research of traditional Chinese medicine, it has been a key problem that how to interpret the relationship between various chemical components and its efficacy, which seriously restricts the modernization of Chinese medicine. As chemometrics brings the multivariate analysis methods into the chemical research, it has been applied as an effective research tool in the composition-activity relationship research of Chinese medicine. This article reviews the applications of chemometrics methods in the composition-activity relationship research in recent years. The applications of multivariate statistical analysis methods (such as regression analysis, correlation analysis, principal component analysis, etc. ) and artificial neural network (such as back propagation artificial neural network, radical basis function neural network, support vector machine, etc. ) are summarized, including the brief fundamental principles, the research contents and the advantages and disadvantages. Finally, the existing main problems and prospects of its future researches are proposed.

  8. Characterization and discrimination of raw and vinegar-baked Bupleuri radix based on UHPLC-Q-TOF-MS coupled with multivariate statistical analysis.

    PubMed

    Lei, Tianli; Chen, Shifeng; Wang, Kai; Zhang, Dandan; Dong, Lin; Lv, Chongning; Wang, Jing; Lu, Jincai

    2018-02-01

    Bupleuri Radix is a commonly used herb in clinic, and raw and vinegar-baked Bupleuri Radix are both documented in the Pharmacopoeia of People's Republic of China. According to the theories of traditional Chinese medicine, Bupleuri Radix possesses different therapeutic effects before and after processing. However, the chemical mechanism of this processing is still unknown. In this study, ultra-high-performance liquid chromatography with quadruple time-of-flight mass spectrometry coupled with multivariate statistical analysis including principal component analysis and orthogonal partial least square-discriminant analysis was developed to holistically compare the difference between raw and vinegar-baked Bupleuri Radix for the first time. As a result, 50 peaks in raw and processed Bupleuri Radix were detected, respectively, and a total of 49 peak chemical compounds were identified. Saikosaponin a, saikosaponin d, saikosaponin b 3 , saikosaponin e, saikosaponin c, saikosaponin b 2 , saikosaponin b 1 , 4''-O-acetyl-saikosaponin d, hyperoside and 3',4'-dimethoxy quercetin were explored as potential markers of raw and vinegar-baked Bupleuri Radix. This study has been successfully applied for global analysis of raw and vinegar-processed samples. Furthermore, the underlying hepatoprotective mechanism of Bupleuri Radix was predicted, which was related to the changes of chemical profiling. Copyright © 2017 John Wiley & Sons, Ltd.

  9. Multivariate analysis of cytokine profiles in pregnancy complications.

    PubMed

    Azizieh, Fawaz; Dingle, Kamaludin; Raghupathy, Raj; Johnson, Kjell; VanderPlas, Jacob; Ansari, Ali

    2018-03-01

    The immunoregulation to tolerate the semiallogeneic fetus during pregnancy includes a harmonious dynamic balance between anti- and pro-inflammatory cytokines. Several earlier studies reported significantly different levels and/or ratios of several cytokines in complicated pregnancy as compared to normal pregnancy. However, as cytokines operate in networks with potentially complex interactions, it is also interesting to compare groups with multi-cytokine data sets, with multivariate analysis. Such analysis will further examine how great the differences are, and which cytokines are more different than others. Various multivariate statistical tools, such as Cramer test, classification and regression trees, partial least squares regression figures, 2-dimensional Kolmogorov-Smirmov test, principal component analysis and gap statistic, were used to compare cytokine data of normal vs anomalous groups of different pregnancy complications. Multivariate analysis assisted in examining if the groups were different, how strongly they differed, in what ways they differed and further reported evidence for subgroups in 1 group (pregnancy-induced hypertension), possibly indicating multiple causes for the complication. This work contributes to a better understanding of cytokines interaction and may have important implications on targeting cytokine balance modulation or design of future medications or interventions that best direct management or prevention from an immunological approach. © 2018 The Authors. American Journal of Reproductive Immunology Published by John Wiley & Sons Ltd.

  10. The bio-optical properties of CDOM as descriptor of lake stratification.

    PubMed

    Bracchini, Luca; Dattilo, Arduino Massimo; Hull, Vincent; Loiselle, Steven Arthur; Martini, Silvia; Rossi, Claudio; Santinelli, Chiara; Seritti, Alfredo

    2006-11-01

    Multivariate statistical techniques are used to demonstrate the fundamental role of CDOM optical properties in the description of water masses during the summer stratification of a deep lake. PC1 was linked with dissolved species and PC2 with suspended particles. In the first principal component that the role of CDOM bio-optical properties give a better description of the stratification of the Salto Lake with respect to temperature. The proposed multivariate approach can be used for the analysis of different stratified aquatic ecosystems in relation to interaction between bio-optical properties and stratification of the water body.

  11. Area estimation using multiyear designs and partial crop identification

    NASA Technical Reports Server (NTRS)

    Sielken, R. L., Jr.

    1984-01-01

    Statistical procedures were developed for large area assessments using both satellite and conventional data. Crop acreages, other ground cover indices, and measures of change were the principal characteristics of interest. These characteristics are capable of being estimated from samples collected possibly from several sources at varying times, with different levels of identification. Multiyear analysis techniques were extended to include partially identified samples; the best current year sampling design corresponding to a given sampling history was determined; weights reflecting the precision or confidence in each observation were identified and utilized, and the variation in estimates incorporating partially identified samples were quantified.

  12. Color enhancement of landsat agricultural imagery: JPL LACIE image processing support task

    NASA Technical Reports Server (NTRS)

    Madura, D. P.; Soha, J. M.; Green, W. B.; Wherry, D. B.; Lewis, S. D.

    1978-01-01

    Color enhancement techniques were applied to LACIE LANDSAT segments to determine if such enhancement can assist analysis in crop identification. The procedure involved increasing the color range by removing correlation between components. First, a principal component transformation was performed, followed by contrast enhancement to equalize component variances, followed by an inverse transformation to restore familiar color relationships. Filtering was applied to lower order components to reduce color speckle in the enhanced products. Use of single acquisition and multiple acquisition statistics to control the enhancement were compared, and the effects of normalization investigated. Evaluation is left to LACIE personnel.

  13. Implementation of quality by design principles in the development of microsponges as drug delivery carriers: Identification and optimization of critical factors using multivariate statistical analyses and design of experiments studies.

    PubMed

    Simonoska Crcarevska, Maja; Dimitrovska, Aneta; Sibinovska, Nadica; Mladenovska, Kristina; Slavevska Raicki, Renata; Glavas Dodov, Marija

    2015-07-15

    Microsponges drug delivery system (MDDC) was prepared by double emulsion-solvent-diffusion technique using rotor-stator homogenization. Quality by design (QbD) concept was implemented for the development of MDDC with potential to be incorporated into semisolid dosage form (gel). Quality target product profile (QTPP) and critical quality attributes (CQA) were defined and identified, accordingly. Critical material attributes (CMA) and Critical process parameters (CPP) were identified using quality risk management (QRM) tool, failure mode, effects and criticality analysis (FMECA). CMA and CPP were identified based on results obtained from principal component analysis (PCA-X&Y) and partial least squares (PLS) statistical analysis along with literature data, product and process knowledge and understanding. FMECA identified amount of ethylcellulose, chitosan, acetone, dichloromethane, span 80, tween 80 and water ratio in primary/multiple emulsions as CMA and rotation speed and stirrer type used for organic solvent removal as CPP. The relationship between identified CPP and particle size as CQA was described in the design space using design of experiments - one-factor response surface method. Obtained results from statistically designed experiments enabled establishment of mathematical models and equations that were used for detailed characterization of influence of identified CPP upon MDDC particle size and particle size distribution and their subsequent optimization. Copyright © 2015 Elsevier B.V. All rights reserved.

  14. Linear Discriminant Analysis Achieves High Classification Accuracy for the BOLD fMRI Response to Naturalistic Movie Stimuli

    PubMed Central

    Mandelkow, Hendrik; de Zwart, Jacco A.; Duyn, Jeff H.

    2016-01-01

    Naturalistic stimuli like movies evoke complex perceptual processes, which are of great interest in the study of human cognition by functional MRI (fMRI). However, conventional fMRI analysis based on statistical parametric mapping (SPM) and the general linear model (GLM) is hampered by a lack of accurate parametric models of the BOLD response to complex stimuli. In this situation, statistical machine-learning methods, a.k.a. multivariate pattern analysis (MVPA), have received growing attention for their ability to generate stimulus response models in a data-driven fashion. However, machine-learning methods typically require large amounts of training data as well as computational resources. In the past, this has largely limited their application to fMRI experiments involving small sets of stimulus categories and small regions of interest in the brain. By contrast, the present study compares several classification algorithms known as Nearest Neighbor (NN), Gaussian Naïve Bayes (GNB), and (regularized) Linear Discriminant Analysis (LDA) in terms of their classification accuracy in discriminating the global fMRI response patterns evoked by a large number of naturalistic visual stimuli presented as a movie. Results show that LDA regularized by principal component analysis (PCA) achieved high classification accuracies, above 90% on average for single fMRI volumes acquired 2 s apart during a 300 s movie (chance level 0.7% = 2 s/300 s). The largest source of classification errors were autocorrelations in the BOLD signal compounded by the similarity of consecutive stimuli. All classifiers performed best when given input features from a large region of interest comprising around 25% of the voxels that responded significantly to the visual stimulus. Consistent with this, the most informative principal components represented widespread distributions of co-activated brain regions that were similar between subjects and may represent functional networks. In light of these results, the combination of naturalistic movie stimuli and classification analysis in fMRI experiments may prove to be a sensitive tool for the assessment of changes in natural cognitive processes under experimental manipulation. PMID:27065832

  15. Forest statistics for Southeast Georgia, 1996

    Treesearch

    Michael T. Thompson; Raymond M. Sheffield

    1997-01-01

    This report highlights the principal findings of the seventh forest survey of Southeast Georgia. Field work began in November 1995 and was completed in November 1996. Six previous surveys, completed in 1934, 1952, 1960, 1971, 1981, and 1988 provide statistics for measuring changes and trends over the past 62 years. This report primarily emphasizes the changes and...

  16. Artificial neural networks and multiple linear regression model using principal components to estimate rainfall over South America

    NASA Astrophysics Data System (ADS)

    Soares dos Santos, T.; Mendes, D.; Rodrigues Torres, R.

    2016-01-01

    Several studies have been devoted to dynamic and statistical downscaling for analysis of both climate variability and climate change. This paper introduces an application of artificial neural networks (ANNs) and multiple linear regression (MLR) by principal components to estimate rainfall in South America. This method is proposed for downscaling monthly precipitation time series over South America for three regions: the Amazon; northeastern Brazil; and the La Plata Basin, which is one of the regions of the planet that will be most affected by the climate change projected for the end of the 21st century. The downscaling models were developed and validated using CMIP5 model output and observed monthly precipitation. We used general circulation model (GCM) experiments for the 20th century (RCP historical; 1970-1999) and two scenarios (RCP 2.6 and 8.5; 2070-2100). The model test results indicate that the ANNs significantly outperform the MLR downscaling of monthly precipitation variability.

  17. Artificial neural networks and multiple linear regression model using principal components to estimate rainfall over South America

    NASA Astrophysics Data System (ADS)

    dos Santos, T. S.; Mendes, D.; Torres, R. R.

    2015-08-01

    Several studies have been devoted to dynamic and statistical downscaling for analysis of both climate variability and climate change. This paper introduces an application of artificial neural networks (ANN) and multiple linear regression (MLR) by principal components to estimate rainfall in South America. This method is proposed for downscaling monthly precipitation time series over South America for three regions: the Amazon, Northeastern Brazil and the La Plata Basin, which is one of the regions of the planet that will be most affected by the climate change projected for the end of the 21st century. The downscaling models were developed and validated using CMIP5 model out- put and observed monthly precipitation. We used GCMs experiments for the 20th century (RCP Historical; 1970-1999) and two scenarios (RCP 2.6 and 8.5; 2070-2100). The model test results indicate that the ANN significantly outperforms the MLR downscaling of monthly precipitation variability.

  18. Optimizing protection efforts for amphibian conservation in Mediterranean landscapes

    NASA Astrophysics Data System (ADS)

    García-Muñoz, Enrique; Ceacero, Francisco; Carretero, Miguel A.; Pedrajas-Pulido, Luis; Parra, Gema; Guerrero, Francisco

    2013-05-01

    Amphibians epitomize the modern biodiversity crisis, and attract great attention from the scientific community since a complex puzzle of factors has influence on their disappearance. However, these factors are multiple and spatially variable, and declining in each locality is due to a particular combination of causes. This study shows a suitable statistical procedure to determine threats to amphibian species in medium size administrative areas. For our study case, ten biological and ecological variables feasible to affect the survival of 15 amphibian species were categorized and reduced through Principal Component Analysis. The principal components extracted were related to ecological plasticity, reproductive potential, and specificity of breeding habitats. Finally, the factor scores of species were joined in a presence-absence matrix that gives us information to identify where and why conservation management are requires. In summary, this methodology provides the necessary information to maximize benefits of conservation measures in small areas by identifying which ecological factors need management efforts and where should we focus them on.

  19. Monitoring and evaluation of the water quality of Budeasa Reservoir-Arges River, Romania.

    PubMed

    Ion, Antoanela; Vladescu, Luminita; Badea, Irinel Adriana; Comanescu, Laura

    2016-09-01

    The purpose of this study was to monitor and record the specific characteristics and properties of the Arges River water in the Budeasa Reservoir (the principal water resources of municipal tap water of the big Romanian city Pitesti and surrounding area) for a period of 5 years (2005-2009). The monitored physical and chemical parameters were turbidity, pH, electrical conductivity, chemical oxygen demand, 5 days biochemical oxygen demand, free dissolved oxygen, nitrite, nitrate, ammonia nitrogen, chloride, total dissolved iron ions, sulfate, manganese, phosphate, total alkalinity, and total hardness. The results were discussed in correlation with the precipitation values during the study. Monthly and annual values of each parameter determined in the period January 2005-December 2009 were used as a basis for the classification of Budeasa Reservoir water, according to the European legislation, as well as for assessing its quality as a drinking water supply. Principal component analysis and Pearson correlation coefficients were used as statistical procedures in order to evaluate the data obtained during this study.

  20. Craters on Earth, Moon, and Mars: Multivariate classification and mode of origin

    USGS Publications Warehouse

    Pike, R.J.

    1974-01-01

    Testing extraterrestrial craters and candidate terrestrial analogs for morphologic similitude is treated as a problem in numerical taxonomy. According to a principal-components solution and a cluster analysis, 402 representative craters on the Earth, the Moon, and Mars divide into two major classes of contrasting shapes and modes of origin. Craters of net accumulation of material (cratered lunar domes, Martian "calderas," and all terrestrial volcanoes except maars and tuff rings) group apart from craters of excavation (terrestrial meteorite impact and experimental explosion craters, typical Martian craters, and all other lunar craters). Maars and tuff rings belong to neither group but are transitional. The classification criteria are four independent attributes of topographic geometry derived from seven descriptive variables by the principal-components transformation. Morphometric differences between crater bowl and raised rim constitute the strongest of the four components. Although single topographic variables cannot confidently predict the genesis of individual extraterrestrial craters, multivariate statistical models constructed from several variables can distinguish consistently between large impact craters and volcanoes. ?? 1974.

  1. Development and Validation of the Caring Loneliness Scale.

    PubMed

    Karhe, Liisa; Kaunonen, Marja; Koivisto, Anna-Maija

    2016-12-01

    The Caring Loneliness Scale (CARLOS) includes 5 categories derived from earlier qualitative research. This article assesses the reliability and construct validity of a scale designed to measure patient experiences of loneliness in a professional caring relationship. Statistical analysis with 4 different sample sizes included Cronbach's alpha and exploratory factor analysis with principal axis factoring extraction. The sample size of 250 gave the most useful and comprehensible structure, but all 4 samples yielded underlying content of loneliness experiences. The initial 5 categories were reduced to 4 factors with 24 items and Cronbach's alpha ranging from .77 to .90. The findings support the reliability and validity of CARLOS for the assessment of Finnish breast cancer and heart surgery patients' experiences but as all instruments, further validation is needed.

  2. Statistical performance and information content of time lag analysis and redundancy analysis in time series modeling.

    PubMed

    Angeler, David G; Viedma, Olga; Moreno, José M

    2009-11-01

    Time lag analysis (TLA) is a distance-based approach used to study temporal dynamics of ecological communities by measuring community dissimilarity over increasing time lags. Despite its increased use in recent years, its performance in comparison with other more direct methods (i.e., canonical ordination) has not been evaluated. This study fills this gap using extensive simulations and real data sets from experimental temporary ponds (true zooplankton communities) and landscape studies (landscape categories as pseudo-communities) that differ in community structure and anthropogenic stress history. Modeling time with a principal coordinate of neighborhood matrices (PCNM) approach, the canonical ordination technique (redundancy analysis; RDA) consistently outperformed the other statistical tests (i.e., TLAs, Mantel test, and RDA based on linear time trends) using all real data. In addition, the RDA-PCNM revealed different patterns of temporal change, and the strength of each individual time pattern, in terms of adjusted variance explained, could be evaluated, It also identified species contributions to these patterns of temporal change. This additional information is not provided by distance-based methods. The simulation study revealed better Type I error properties of the canonical ordination techniques compared with the distance-based approaches when no deterministic component of change was imposed on the communities. The simulation also revealed that strong emphasis on uniform deterministic change and low variability at other temporal scales is needed to result in decreased statistical power of the RDA-PCNM approach relative to the other methods. Based on the statistical performance of and information content provided by RDA-PCNM models, this technique serves ecologists as a powerful tool for modeling temporal change of ecological (pseudo-) communities.

  3. FADTTS: functional analysis of diffusion tensor tract statistics.

    PubMed

    Zhu, Hongtu; Kong, Linglong; Li, Runze; Styner, Martin; Gerig, Guido; Lin, Weili; Gilmore, John H

    2011-06-01

    The aim of this paper is to present a functional analysis of a diffusion tensor tract statistics (FADTTS) pipeline for delineating the association between multiple diffusion properties along major white matter fiber bundles with a set of covariates of interest, such as age, diagnostic status and gender, and the structure of the variability of these white matter tract properties in various diffusion tensor imaging studies. The FADTTS integrates five statistical tools: (i) a multivariate varying coefficient model for allowing the varying coefficient functions in terms of arc length to characterize the varying associations between fiber bundle diffusion properties and a set of covariates, (ii) a weighted least squares estimation of the varying coefficient functions, (iii) a functional principal component analysis to delineate the structure of the variability in fiber bundle diffusion properties, (iv) a global test statistic to test hypotheses of interest, and (v) a simultaneous confidence band to quantify the uncertainty in the estimated coefficient functions. Simulated data are used to evaluate the finite sample performance of FADTTS. We apply FADTTS to investigate the development of white matter diffusivities along the splenium of the corpus callosum tract and the right internal capsule tract in a clinical study of neurodevelopment. FADTTS can be used to facilitate the understanding of normal brain development, the neural bases of neuropsychiatric disorders, and the joint effects of environmental and genetic factors on white matter fiber bundles. The advantages of FADTTS compared with the other existing approaches are that they are capable of modeling the structured inter-subject variability, testing the joint effects, and constructing their simultaneous confidence bands. However, FADTTS is not crucial for estimation and reduces to the functional analysis method for the single measure. Copyright © 2011 Elsevier Inc. All rights reserved.

  4. Trends in Public and Private School Principal Demographics and Qualifications: 1987-88 to 2011-12. Stats in Brief. NCES 2016-189

    ERIC Educational Resources Information Center

    Hill, Jason; Ottem, Randolph; DeRoche, John

    2016-01-01

    Using data from seven administrations of the Schools and Staffing Survey (SASS), this Statistics in Brief examines trends in public and private school principal demographics, experience, and compensation over 25 years, from 1987-88 through 2011-12. Data are drawn from the 1987-88, 1990-91, 1993-94, 1999-2000, 2003-04, 2007-08, and 2011-12 survey…

  5. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Harris, Michael D.; Dater, Manasi; Whitaker, Ross

    In this study, statistical shape modeling (SSM) was used to quantify three-dimensional (3D) variation and morphologic differences between femurs with and without cam femoroacetabular impingement (FAI). 3D surfaces were generated from CT scans of femurs from 41 controls and 30 cam FAI patients. SSM correspondence particles were optimally positioned on each surface using a gradient descent energy function. Mean shapes for control and patient groups were defined from the resulting particle configurations. Morphological differences between group mean shapes and between the control mean and individual patients were calculated. Principal component analysis was used to describe anatomical variation present in bothmore » groups. The first 6 modes (or principal components) captured statistically significant shape variations, which comprised 84% of cumulative variation among the femurs. Shape variation was greatest in femoral offset, greater trochanter height, and the head-neck junction. The mean cam femur shape protruded above the control mean by a maximum of 3.3 mm with sustained protrusions of 2.5-3.0 mm along the anterolateral head-neck junction and distally along the anterior neck, corresponding well with reported cam lesion locations and soft-tissue damage. This study provides initial evidence that SSM can describe variations in femoral morphology in both controls and cam FAI patients and may be useful for developing new measurements of pathological anatomy. SSM may also be applied to characterize cam FAI severity and provide templates to guide patient-specific surgical resection of bone.« less

  6. Use of multispectral Ikonos imagery for discriminating between conventional and conservation agricultural tillage practices

    USGS Publications Warehouse

    Vina, Andres; Peters, Albert J.; Ji, Lei

    2003-01-01

    There is a global concern about the increase in atmospheric concentrations of greenhouse gases. One method being discussed to encourage greenhouse gas mitigation efforts is based on a trading system whereby carbon emitters can buy effective mitigation efforts from farmers implementing conservation tillage practices. These practices sequester carbon from the atmosphere, and such a trading system would require a low-cost and accurate method of verification. Remote sensing technology can offer such a verification technique. This paper is focused on the use of standard image processing procedures applied to a multispectral Ikonos image, to determine whether it is possible to validate that farmers have complied with agreements to implement conservation tillage practices. A principal component analysis (PCA) was performed in order to isolate image variance in cropped fields. Analyses of variance (ANOVA) statistical procedures were used to evaluate the capability of each Ikonos band and each principal component to discriminate between conventional and conservation tillage practices. A logistic regression model was implemented on the principal component most effective in discriminating between conventional and conservation tillage, in order to produce a map of the probability of conventional tillage. The Ikonos imagery, in combination with ground-reference information, proved to be a useful tool for verification of conservation tillage practices.

  7. Modal recovery of sea-level variability in the South China Sea using merged altimeter data

    NASA Astrophysics Data System (ADS)

    Jiang, Haoyu; Chen, Ge

    2015-09-01

    Using 20 years (1993-2012) of merged data recorded by contemporary multi-altimeter missions, a variety of sea-level variability modes are recovered in the South China Sea employing three-dimensional harmonic extraction. In terms of the long-term variation, the South China Sea is estimated to have a rising sea-level linear trend of 5.39 mm/a over these 20 years. Among the modes extracted, the seven most statistically significant periodic or quasi-periodic modes are identified as principal modes. The geographical distributions of the magnitudes and phases of the modes are displayed. In terms of intraannual and annual regimes, two principal modes with strict semiannual and annual periods are found, with the annual variability having the largest amplitudes among the seven modes. For interannual and decadal regimes, five principal modes at approximately 18, 21, 23, 28, and 112 months are found with the most mode-active region being to the east of Vietnam. For the phase distributions, a series of amphidromes are observed as twins, termed "amphidrome twins", comprising rotating dipole systems. The stability of periodic modes is investigated employing joint spatiotemporal analysis of latitude/longitude sections. Results show that all periodic modes are robust, revealing the richness and complexity of sea-level modes in the South China Sea.

  8. Temporal and spatial assessment of river surface water quality using multivariate statistical techniques: a study in Can Tho City, a Mekong Delta area, Vietnam.

    PubMed

    Phung, Dung; Huang, Cunrui; Rutherford, Shannon; Dwirahmadi, Febi; Chu, Cordia; Wang, Xiaoming; Nguyen, Minh; Nguyen, Nga Huy; Do, Cuong Manh; Nguyen, Trung Hieu; Dinh, Tuan Anh Diep

    2015-05-01

    The present study is an evaluation of temporal/spatial variations of surface water quality using multivariate statistical techniques, comprising cluster analysis (CA), principal component analysis (PCA), factor analysis (FA) and discriminant analysis (DA). Eleven water quality parameters were monitored at 38 different sites in Can Tho City, a Mekong Delta area of Vietnam from 2008 to 2012. Hierarchical cluster analysis grouped the 38 sampling sites into three clusters, representing mixed urban-rural areas, agricultural areas and industrial zone. FA/PCA resulted in three latent factors for the entire research location, three for cluster 1, four for cluster 2, and four for cluster 3 explaining 60, 60.2, 80.9, and 70% of the total variance in the respective water quality. The varifactors from FA indicated that the parameters responsible for water quality variations are related to erosion from disturbed land or inflow of effluent from sewage plants and industry, discharges from wastewater treatment plants and domestic wastewater, agricultural activities and industrial effluents, and contamination by sewage waste with faecal coliform bacteria through sewer and septic systems. Discriminant analysis (DA) revealed that nephelometric turbidity units (NTU), chemical oxygen demand (COD) and NH₃ are the discriminating parameters in space, affording 67% correct assignation in spatial analysis; pH and NO₂ are the discriminating parameters according to season, assigning approximately 60% of cases correctly. The findings suggest a possible revised sampling strategy that can reduce the number of sampling sites and the indicator parameters responsible for large variations in water quality. This study demonstrates the usefulness of multivariate statistical techniques for evaluation of temporal/spatial variations in water quality assessment and management.

  9. Big-data reflection high energy electron diffraction analysis for understanding epitaxial film growth processes.

    PubMed

    Vasudevan, Rama K; Tselev, Alexander; Baddorf, Arthur P; Kalinin, Sergei V

    2014-10-28

    Reflection high energy electron diffraction (RHEED) has by now become a standard tool for in situ monitoring of film growth by pulsed laser deposition and molecular beam epitaxy. Yet despite the widespread adoption and wealth of information in RHEED images, most applications are limited to observing intensity oscillations of the specular spot, and much additional information on growth is discarded. With ease of data acquisition and increased computation speeds, statistical methods to rapidly mine the data set are now feasible. Here, we develop such an approach to the analysis of the fundamental growth processes through multivariate statistical analysis of a RHEED image sequence. This approach is illustrated for growth of La(x)Ca(1-x)MnO(3) films grown on etched (001) SrTiO(3) substrates, but is universal. The multivariate methods including principal component analysis and k-means clustering provide insight into the relevant behaviors, the timing and nature of a disordered to ordered growth change, and highlight statistically significant patterns. Fourier analysis yields the harmonic components of the signal and allows separation of the relevant components and baselines, isolating the asymmetric nature of the step density function and the transmission spots from the imperfect layer-by-layer (LBL) growth. These studies show the promise of big data approaches to obtaining more insight into film properties during and after epitaxial film growth. Furthermore, these studies open the pathway to use forward prediction methods to potentially allow significantly more control over growth process and hence final film quality.

  10. An Introductory Application of Principal Components to Cricket Data

    ERIC Educational Resources Information Center

    Manage, Ananda B. W.; Scariano, Stephen M.

    2013-01-01

    Principal Component Analysis is widely used in applied multivariate data analysis, and this article shows how to motivate student interest in this topic using cricket sports data. Here, principal component analysis is successfully used to rank the cricket batsmen and bowlers who played in the 2012 Indian Premier League (IPL) competition. In…

  11. Least Principal Components Analysis (LPCA): An Alternative to Regression Analysis.

    ERIC Educational Resources Information Center

    Olson, Jeffery E.

    Often, all of the variables in a model are latent, random, or subject to measurement error, or there is not an obvious dependent variable. When any of these conditions exist, an appropriate method for estimating the linear relationships among the variables is Least Principal Components Analysis. Least Principal Components are robust, consistent,…

  12. An Updated Review of Meat Authenticity Methods and Applications.

    PubMed

    Vlachos, Antonios; Arvanitoyannis, Ioannis S; Tserkezou, Persefoni

    2016-05-18

    Adulteration of foods is a serious economic problem concerning most foodstuffs, and in particular meat products. Since high-priced meat demand premium prices, producers of meat-based products might be tempted to blend these products with lower cost meat. Moreover, the labeled meat contents may not be met. Both types of adulteration are difficult to detect and lead to deterioration of product quality. For the consumer, it is of outmost importance to guarantee both authenticity and compliance with product labeling. The purpose of this article is to review the state of the art of meat authenticity with analytical and immunochemical methods with the focus on the issue of geographic origin and sensory characteristics. This review is also intended to provide an overview of the various currently applied statistical analyses (multivariate analysis (MAV), such as principal component analysis, discriminant analysis, cluster analysis, etc.) and their effectiveness for meat authenticity.

  13. ELICIT: An alternative imprecise weight elicitation technique for use in multi-criteria decision analysis for healthcare.

    PubMed

    Diaby, Vakaramoko; Sanogo, Vassiki; Moussa, Kouame Richard

    2016-01-01

    In this paper, the readers are introduced to ELICIT, an imprecise weight elicitation technique for multicriteria decision analysis for healthcare. The application of ELICIT consists of two steps: the rank ordering of evaluation criteria based on decision-makers' (DMs) preferences using the principal component analysis; and the estimation of criteria weights and their descriptive statistics using the variable interdependent analysis and the Monte Carlo method. The application of ELICIT is illustrated with a hypothetical case study involving the elicitation of weights for five criteria used to select the best device for eye surgery. The criteria were ranked from 1-5, based on a strict preference relationship established by the DMs. For each criterion, the deterministic weight was estimated as well as the standard deviation and 95% credibility interval. ELICIT is appropriate in situations where only ordinal DMs' preferences are available to elicit decision criteria weights.

  14. Evaluation of the environmental contamination at an abandoned mining site using multivariate statistical techniques--the Rodalquilar (Southern Spain) mining district.

    PubMed

    Bagur, M G; Morales, S; López-Chicano, M

    2009-11-15

    Unsupervised and supervised pattern recognition techniques such as hierarchical cluster analysis, principal component analysis, factor analysis and linear discriminant analysis have been applied to water samples recollected in Rodalquilar mining district (Southern Spain) in order to identify different sources of environmental pollution caused by the abandoned mining industry. The effect of the mining activity on waters was monitored determining the concentration of eleven elements (Mn, Ba, Co, Cu, Zn, As, Cd, Sb, Hg, Au and Pb) by inductively coupled plasma mass spectrometry (ICP-MS). The Box-Cox transformation has been used to transform the data set in normal form in order to minimize the non-normal distribution of the geochemical data. The environmental impact is affected mainly by the mining activity developed in the zone, the acid drainage and finally by the chemical treatment used for the benefit of gold.

  15. Functional principal component analysis of glomerular filtration rate curves after kidney transplant.

    PubMed

    Dong, Jianghu J; Wang, Liangliang; Gill, Jagbir; Cao, Jiguo

    2017-01-01

    This article is motivated by some longitudinal clinical data of kidney transplant recipients, where kidney function progression is recorded as the estimated glomerular filtration rates at multiple time points post kidney transplantation. We propose to use the functional principal component analysis method to explore the major source of variations of glomerular filtration rate curves. We find that the estimated functional principal component scores can be used to cluster glomerular filtration rate curves. Ordering functional principal component scores can detect abnormal glomerular filtration rate curves. Finally, functional principal component analysis can effectively estimate missing glomerular filtration rate values and predict future glomerular filtration rate values.

  16. Tools based on multivariate statistical analysis for classification of soil and groundwater in Apulian agricultural sites.

    PubMed

    Ielpo, Pierina; Leardi, Riccardo; Pappagallo, Giuseppe; Uricchio, Vito Felice

    2017-06-01

    In this paper, the results obtained from multivariate statistical techniques such as PCA (Principal component analysis) and LDA (Linear discriminant analysis) applied to a wide soil data set are presented. The results have been compared with those obtained on a groundwater data set, whose samples were collected together with soil ones, within the project "Improvement of the Regional Agro-meteorological Monitoring Network (2004-2007)". LDA, applied to soil data, has allowed to distinguish the geographical origin of the sample from either one of the two macroaeras: Bari and Foggia provinces vs Brindisi, Lecce e Taranto provinces, with a percentage of correct prediction in cross validation of 87%. In the case of the groundwater data set, the best classification was obtained when the samples were grouped into three macroareas: Foggia province, Bari province and Brindisi, Lecce and Taranto provinces, by reaching a percentage of correct predictions in cross validation of 84%. The obtained information can be very useful in supporting soil and water resource management, such as the reduction of water consumption and the reduction of energy and chemical (nutrients and pesticides) inputs in agriculture.

  17. Comparison of Machine Learning Methods for the Arterial Hypertension Diagnostics

    PubMed Central

    Belo, David; Gamboa, Hugo

    2017-01-01

    The paper presents results of machine learning approach accuracy applied analysis of cardiac activity. The study evaluates the diagnostics possibilities of the arterial hypertension by means of the short-term heart rate variability signals. Two groups were studied: 30 relatively healthy volunteers and 40 patients suffering from the arterial hypertension of II-III degree. The following machine learning approaches were studied: linear and quadratic discriminant analysis, k-nearest neighbors, support vector machine with radial basis, decision trees, and naive Bayes classifier. Moreover, in the study, different methods of feature extraction are analyzed: statistical, spectral, wavelet, and multifractal. All in all, 53 features were investigated. Investigation results show that discriminant analysis achieves the highest classification accuracy. The suggested approach of noncorrelated feature set search achieved higher results than data set based on the principal components. PMID:28831239

  18. Spatio-temporal variability of hydro-chemical characteristics of coastal waters of Gulf of Mannar Marine Biosphere Reserve (GoMMBR), South India

    NASA Astrophysics Data System (ADS)

    Kathiravan, K.; Natesan, Usha; Vishnunath, R.

    2017-03-01

    The intention of this study was to appraise the spatial and temporal variations in the physico-chemical parameters of coastal waters of Rameswaram Island, Gulf of Mannar Marine Biosphere Reserve, south India, using multivariate statistical techniques, such as cluster analysis, factor analysis and principal component analysis. Spatio-temporal variations among the physico-chemical parameters are observed in the coastal waters of Gulf of Mannar, especially during northeast and post monsoon seasons. It is inferred that the high loadings of pH, temperature, suspended particulate matter, salinity, dissolved oxygen, biochemical oxygen demand, chlorophyll a, nutrient species of nitrogen and phosphorus strongly determine the discrimination of coastal water quality. Results highlight the important role of monsoonal variations to determine the coastal water quality around Rameswaram Island.

  19. Comparing Networks from a Data Analysis Perspective

    NASA Astrophysics Data System (ADS)

    Li, Wei; Yang, Jing-Yu

    To probe network characteristics, two predominant ways of network comparison are global property statistics and subgraph enumeration. However, they suffer from limited information and exhaustible computing. Here, we present an approach to compare networks from the perspective of data analysis. Initially, the approach projects each node of original network as a high-dimensional data point, and the network is seen as clouds of data points. Then the dispersion information of the principal component analysis (PCA) projection of the generated data clouds can be used to distinguish networks. We applied this node projection method to the yeast protein-protein interaction networks and the Internet Autonomous System networks, two types of networks with several similar higher properties. The method can efficiently distinguish one from the other. The identical result of different datasets from independent sources also indicated that the method is a robust and universal framework.

  20. Chemical profiling of guarana seeds (Paullinia cupana) from different geographical origins using UPLC-QTOF-MS combined with chemometrics.

    PubMed

    da Silva, Givaldo Souza; Canuto, Kirley Marques; Ribeiro, Paulo Riceli Vasconcelos; de Brito, Edy Sousa; Nascimento, Madson Moreira; Zocolo, Guilherme Julião; Coutinho, Janclei Pereira; de Jesus, Raildo Mota

    2017-12-01

    Paullinia cupana, commonly known as guarana, is an Amazonian fruit whose seeds are used to produce the powdered guarana, which is rich in caffeine and consumed for its stimulating activity. The metabolic profile of guarana from the two largest producing regions was investigated using UPLC-MS combined with multivariate statistical analysis. The principal component analysis (PCA) showed significant differences between samples produced in the states of Bahia and Amazonas. The metabolites responsible for the differentiation were identified by orthogonal partial least squares discriminant analysis (OPLS-DA). Fourteen phenolic compounds were characterized in guarana powder samples, and catechin, epicatechin, B-type procyanidin dimer, A-type procyanidin trimer and A-type procyanidin dimer were the main compounds responsible for the geographical variation of the samples. Copyright © 2017. Published by Elsevier Ltd.

  1. Multivariate Analysis and Prediction of Dioxin-Furan ...

    EPA Pesticide Factsheets

    Peer Review Draft of Regional Methods Initiative Final Report Dioxins, which are bioaccumulative and environmentally persistent, pose an ongoing risk to human and ecosystem health. Fish constitute a significant source of dioxin exposure for humans and fish-eating wildlife. Current dioxin analytical methods are costly, time-consuming, and produce hazardous by-products. A Danish team developed a novel, multivariate statistical methodology based on the covariance of dioxin-furan congener Toxic Equivalences (TEQs) and fatty acid methyl esters (FAMEs) and applied it to North Atlantic Ocean fishmeal samples. The goal of the current study was to attempt to extend this Danish methodology to 77 whole and composite fish samples from three trophic groups: predator (whole largemouth bass), benthic (whole flathead and channel catfish) and forage fish (composite bluegill, pumpkinseed and green sunfish) from two dioxin contaminated rivers (Pocatalico R. and Kanawha R.) in West Virginia, USA. Multivariate statistical analyses, including, Principal Components Analysis (PCA), Hierarchical Clustering, and Partial Least Squares Regression (PLS), were used to assess the relationship between the FAMEs and TEQs in these dioxin contaminated freshwater fish from the Kanawha and Pocatalico Rivers. These three multivariate statistical methods all confirm that the pattern of Fatty Acid Methyl Esters (FAMEs) in these freshwater fish covaries with and is predictive of the WHO TE

  2. Non-targeted 1H NMR fingerprinting and multivariate statistical analyses for the characterisation of the geographical origin of Italian sweet cherries.

    PubMed

    Longobardi, F; Ventrella, A; Bianco, A; Catucci, L; Cafagna, I; Gallo, V; Mastrorilli, P; Agostiano, A

    2013-12-01

    In this study, non-targeted (1)H NMR fingerprinting was used in combination with multivariate statistical techniques for the classification of Italian sweet cherries based on their different geographical origins (Emilia Romagna and Puglia). As classification techniques, Soft Independent Modelling of Class Analogy (SIMCA), Partial Least Squares Discriminant Analysis (PLS-DA), and Linear Discriminant Analysis (LDA) were carried out and the results were compared. For LDA, before performing a refined selection of the number/combination of variables, two different strategies for a preliminary reduction of the variable number were tested. The best average recognition and CV prediction abilities (both 100.0%) were obtained for all the LDA models, although PLS-DA also showed remarkable performances (94.6%). All the statistical models were validated by observing the prediction abilities with respect to an external set of cherry samples. The best result (94.9%) was obtained with LDA by performing a best subset selection procedure on a set of 30 principal components previously selected by a stepwise decorrelation. The metabolites that mostly contributed to the classification performances of such LDA model, were found to be malate, glucose, fructose, glutamine and succinate. Copyright © 2013 Elsevier Ltd. All rights reserved.

  3. Statistical downscaling of GCM simulations to streamflow using relevance vector machine

    NASA Astrophysics Data System (ADS)

    Ghosh, Subimal; Mujumdar, P. P.

    2008-01-01

    General circulation models (GCMs), the climate models often used in assessing the impact of climate change, operate on a coarse scale and thus the simulation results obtained from GCMs are not particularly useful in a comparatively smaller river basin scale hydrology. The article presents a methodology of statistical downscaling based on sparse Bayesian learning and Relevance Vector Machine (RVM) to model streamflow at river basin scale for monsoon period (June, July, August, September) using GCM simulated climatic variables. NCEP/NCAR reanalysis data have been used for training the model to establish a statistical relationship between streamflow and climatic variables. The relationship thus obtained is used to project the future streamflow from GCM simulations. The statistical methodology involves principal component analysis, fuzzy clustering and RVM. Different kernel functions are used for comparison purpose. The model is applied to Mahanadi river basin in India. The results obtained using RVM are compared with those of state-of-the-art Support Vector Machine (SVM) to present the advantages of RVMs over SVMs. A decreasing trend is observed for monsoon streamflow of Mahanadi due to high surface warming in future, with the CCSR/NIES GCM and B2 scenario.

  4. Female Traditional Principals and Co-Principals: Experiences of Role Conflict and Job Satisfaction

    ERIC Educational Resources Information Center

    Eckman, Ellen Wexler; Kelber, Sheryl Talcott

    2010-01-01

    This paper presents a secondary analysis of survey data focusing on role conflict and job satisfaction of 102 female principals. Data were collected from 51 female traditional principals and 51 female co-principals. By examining the traditional and co-principal leadership models as experienced by female principals, this paper addresses the impact…

  5. Generative statistical modeling of left atrial appendage appearance to substantiate clinical paradigms for stroke risk stratification

    NASA Astrophysics Data System (ADS)

    Sanatkhani, Soroosh; Menon, Prahlad G.

    2018-03-01

    Left atrial appendage (LAA) is the source of 91% of the thrombi in patients with atrial arrhythmias ( 2.3 million US adults), turning this region into a potential threat for stroke. LAA geometries have been clinically categorized into four appearance groups viz. Cauliflower, Cactus, Chicken-Wing and WindSock, based on visual appearance in 3D volume visualizations of contrast-enhanced computed tomography (CT) imaging, and have further been correlated with stroke risk by considering clinical mortality statistics. However, such classification from visual appearance is limited by human subjectivity and is not sophisticated enough to address all the characteristics of the geometries. Quantification of LAA geometry metrics can reveal a more repeatable and reliable estimate on the characteristics of the LAA which correspond with stasis risk, and in-turn cardioembolic risk. We present an approach to quantify the appearance of the LAA in patients in atrial fibrillation (AF) using a weighted set of baseline eigen-modes of LAA appearance variation, as a means to objectify classification of patient-specific LAAs into the four accepted clinical appearance groups. Clinical images of 16 patients (4 per LAA appearance category) with atrial fibrillation (AF) were identified and visualized as volume images. All the volume images were rigidly reoriented in order to be spatially co-registered, normalized in terms of intensity, resampled and finally reshaped appropriately to carry out principal component analysis (PCA), in order to parametrize the LAA region's appearance based on principal components (PCs/eigen mode) of greyscale appearance, generating 16 eigen-modes of appearance variation. Our pilot studies show that the most dominant LAA appearance (i.e. reconstructable using the fewest eigen-modes) resembles the Chicken-Wing class, which is known to have the lowest stroke risk per clinical mortality statistics. Our findings indicate the possibility that LAA geometries with high risk of stroke are higher-order statistical variants of underlying lower risk shapes.

  6. Investigation of the Impact of Extracting and Exchanging Health Information by Using Internet and Social Networks.

    PubMed

    Pistolis, John; Zimeras, Stelios; Chardalias, Kostas; Roupa, Zoe; Fildisis, George; Diomidous, Marianna

    2016-06-01

    Social networks (1) have been embedded in our daily life for a long time. They constitute a powerful tool used nowadays for both searching and exchanging information on different issues by using Internet searching engines (Google, Bing, etc.) and Social Networks (Facebook, Twitter etc.). In this paper, are presented the results of a research based on the frequency and the type of the usage of the Internet and the Social Networks by the general public and the health professionals. The objectives of the research were focused on the investigation of the frequency of seeking and meticulously searching for health information in the social media by both individuals and health practitioners. The exchanging of information is a procedure that involves the issues of reliability and quality of information. In this research, by using advanced statistical techniques an effort is made to investigate the participant's profile in using social networks for searching and exchanging information on health issues. Based on the answers 93 % of the people, use the Internet to find information on health-subjects. Considering principal component analysis, the most important health subjects were nutrition (0.719 %), respiratory issues (0.79 %), cardiological issues (0.777%), psychological issues (0.667%) and total (73.8%). The research results, based on different statistical techniques revealed that the 61.2% of the males and 56.4% of the females intended to use the social networks for searching medical information. Based on the principal components analysis, the most important sources that the participants mentioned, were the use of the Internet and social networks for exchanging information on health issues. These sources proved to be of paramount importance to the participants of the study. The same holds for nursing, medical and administrative staff in hospitals.

  7. A statistical approach to discriminate between non-fallers, rare fallers and frequent fallers in older adults based on posturographic data.

    PubMed

    Maranesi, E; Merlo, A; Fioretti, S; Zemp, D D; Campanini, I; Quadri, P

    2016-02-01

    Identification of future non-fallers, infrequent and frequent fallers among older people would permit focusing the delivery of prevention programs on selected individuals. Posturographic parameters have been proven to differentiate between non-fallers and frequent fallers, but not between the first group and infrequent fallers. In this study, postural stability with eyes open and closed on both a firm and a compliant surface and while performing a cognitive task was assessed in a consecutive sample of 130 cognitively able elderly, mean age 77(7)years, categorized as non-fallers (N=67), infrequent fallers (one/two falls, N=45) and frequent fallers (more than two falls, N=18) according to their last year fall history. Principal Component Analysis was used to select the most significant features from a set of 17posturographic parameters. Next, variables derived from principal component analysis were used to test, in each task, group differences between the three groups. One parameter based on a combination of a set of Centre of Pressure anterior-posterior variables obtained from the eyes-open on a compliant surface task was statistically different among all groups, thus distinguishing infrequent fallers from both non-fallers (P<0.05) and frequent fallers (P<0.05). For the first time, a method based on posturographic data to retrospectively discriminate infrequent fallers was obtained. The joint use of both the eyes-open on a compliant surface condition and this new parameter could be used, in a future study, to improve the performance of protocols and to verify the ability of this method to identify new-fallers in elderly without cognitive impairment. Copyright © 2015 Elsevier Ltd. All rights reserved.

  8. A method to estimate the effect of deformable image registration uncertainties on daily dose mapping

    PubMed Central

    Murphy, Martin J.; Salguero, Francisco J.; Siebers, Jeffrey V.; Staub, David; Vaman, Constantin

    2012-01-01

    Purpose: To develop a statistical sampling procedure for spatially-correlated uncertainties in deformable image registration and then use it to demonstrate their effect on daily dose mapping. Methods: Sequential daily CT studies are acquired to map anatomical variations prior to fractionated external beam radiotherapy. The CTs are deformably registered to the planning CT to obtain displacement vector fields (DVFs). The DVFs are used to accumulate the dose delivered each day onto the planning CT. Each DVF has spatially-correlated uncertainties associated with it. Principal components analysis (PCA) is applied to measured DVF error maps to produce decorrelated principal component modes of the errors. The modes are sampled independently and reconstructed to produce synthetic registration error maps. The synthetic error maps are convolved with dose mapped via deformable registration to model the resulting uncertainty in the dose mapping. The results are compared to the dose mapping uncertainty that would result from uncorrelated DVF errors that vary randomly from voxel to voxel. Results: The error sampling method is shown to produce synthetic DVF error maps that are statistically indistinguishable from the observed error maps. Spatially-correlated DVF uncertainties modeled by our procedure produce patterns of dose mapping error that are different from that due to randomly distributed uncertainties. Conclusions: Deformable image registration uncertainties have complex spatial distributions. The authors have developed and tested a method to decorrelate the spatial uncertainties and make statistical samples of highly correlated error maps. The sample error maps can be used to investigate the effect of DVF uncertainties on daily dose mapping via deformable image registration. An initial demonstration of this methodology shows that dose mapping uncertainties can be sensitive to spatial patterns in the DVF uncertainties. PMID:22320766

  9. Separating Putative Pathogens from Background Contamination with Principal Orthogonal Decomposition: Evidence for Leptospira in the Ugandan Neonatal Septisome

    PubMed Central

    Schiff, Steven J.; Kiwanuka, Julius; Riggio, Gina; Nguyen, Lan; Mu, Kevin; Sproul, Emily; Bazira, Joel; Mwanga-Amumpaire, Juliet; Tumusiime, Dickson; Nyesigire, Eunice; Lwanga, Nkangi; Bogale, Kaleb T.; Kapur, Vivek; Broach, James R.; Morton, Sarah U.; Warf, Benjamin C.; Poss, Mary

    2016-01-01

    Neonatal sepsis (NS) is responsible for over 1 million yearly deaths worldwide. In the developing world, NS is often treated without an identified microbial pathogen. Amplicon sequencing of the bacterial 16S rRNA gene can be used to identify organisms that are difficult to detect by routine microbiological methods. However, contaminating bacteria are ubiquitous in both hospital settings and research reagents and must be accounted for to make effective use of these data. In this study, we sequenced the bacterial 16S rRNA gene obtained from blood and cerebrospinal fluid (CSF) of 80 neonates presenting with NS to the Mbarara Regional Hospital in Uganda. Assuming that patterns of background contamination would be independent of pathogenic microorganism DNA, we applied a novel quantitative approach using principal orthogonal decomposition to separate background contamination from potential pathogens in sequencing data. We designed our quantitative approach contrasting blood, CSF, and control specimens and employed a variety of statistical random matrix bootstrap hypotheses to estimate statistical significance. These analyses demonstrate that Leptospira appears present in some infants presenting within 48 h of birth, indicative of infection in utero, and up to 28 days of age, suggesting environmental exposure. This organism cannot be cultured in routine bacteriological settings and is enzootic in the cattle that often live in close proximity to the rural peoples of western Uganda. Our findings demonstrate that statistical approaches to remove background organisms common in 16S sequence data can reveal putative pathogens in small volume biological samples from newborns. This computational analysis thus reveals an important medical finding that has the potential to alter therapy and prevention efforts in a critically ill population. PMID:27379237

  10. Forest statistics for Central Georgia, 1982

    Treesearch

    Raymond M. Sheffield; John B. Tansey

    1982-01-01

    This report highlights the principal findings of the fifth forest survey of Central Georgia. Fieldwork began in October 1981 and was completed in June 1982. Four previous surveys, completed in 1936, 1952, 1961, and 1972, provide statistics for measuring changes and trends over the past 46 years. The primary emphasis in this report is on the changes and trends since...

  11. Forest statistics for North Central Georgia, 1998

    Treesearch

    Michael T. Thompson

    1998-01-01

    This report highlights the principal findings of the seventh forest survey of North Central Georgia. Field work began in June 1997 and was completed in November 1997. Six previous surveys, completed in 1936, 1953, 196 1, 1972, 1983, and 1989 provide statistics for measuring changes and trends over the past 6 1 years. This report primarily emphasizes the changes and...

  12. Forest statistics for South Florida, 1995

    Treesearch

    Michael T. Thompson

    1996-01-01

    This report highlights the principal findings of the seventh forest survey of South Florida. Field work began in September 1994 and was completed in November 1994. Six previous surveys, completed in 1936, 1949, 1959, 1970, 1980, and 1988 provide statistics for measuring changes and trends over the past 59 years. This report primarily emphasizes the changes and trends...

  13. Forest statistics for Central Georgia, 1997

    Treesearch

    Michael T. Thompson

    1998-01-01

    This report highlights the principal findings of the seventh forest survey of Central Georgia. Field work began in November 1996 and was completed in August 1997. Six previous surveys, completed in 1936, 1952, 1961, 1972, 1982, and 1989 provide statistics for measuring changes and trends over the past 61 years. This report primarily emphasizes the changes and trends...

  14. Forest statistics for Central Florida - 1995

    Treesearch

    Mark J. Brown

    1996-01-01

    This report highlights the principal findings of the seventh forest survey of Central Florida. Field work began in February 1995 and was completed in May 1995. Six previous surveys, completed in 1936, 1949, 1959, 1970, 1960, and 1988 provide statistics for measuring changes and trends over the past 59 years. This report primarily emphasizes the changes and trends since...

  15. Forest statistics for Southwest Georgia, 1996

    Treesearch

    Raymond M. Sheffield; Michael T. Thompson

    1997-01-01

    This report highlights the principal findings of the seventh forest survey of Southwest Georgia. Field work began in June 1995 and was completed in November 1995. Six previous surveys, completed in 1934, 1951, 1960, 1971, 1981, and 1988 provide statistics for measuring changes and trends over the past 62 years. This report primarily emphasizes the changes and trends...

  16. Forest statistics for Northeast Florida, 1980

    Treesearch

    Raymond M. Sheffield

    1981-01-01

    This report highlights the principal findings of the fifth forest survey of Northeast Florida. Fieldwork began in June 1979 and was completed in December 1979. Four previous surveys, completed in 1934, 1949, 1959, and 1970, provide statistics for measuring changes and trends over the past 46 years. The primary emphasis in this report is on the changes and trends since...

  17. Indicators of School Crime and Safety: 2017. NCES 2018-036/NCJ 251413

    ERIC Educational Resources Information Center

    Zhang, Anlan; Wang, Ke; Zhang, Jizhi; Kemp, Jana; Diliberti, Melissa; Oudekerk, Barbara A.

    2018-01-01

    A joint effort by the National Center for Education Statistics and the Bureau of Justice Statistics, this annual report examines crime occurring in schools and colleges. This report presents data on crime at school from the perspectives of students, teachers, principals, and the general population from an array of sources--the National Crime…

  18. Forest statistics for Virginia, 1992

    Treesearch

    Tony G. Johnson

    1992-01-01

    This report highlights the principal findings of the sixth forest survey of Virginia. Field work began in October 1990 and was completed in January 1992. Five previous surveys, completed in 1940, 1957, 1966, 1977, and 1986, provide statistics for measuring changes and trends over the past 52 years. The primary emphasis in this report is on the changes and trends since...

  19. Forest statistics for the Northern Piedmont of Virginia 1976

    Treesearch

    Raymond M. Sheffield

    1976-01-01

    This report highlights the principal findings of the fourth inventory of the timber resource in the Northern Piedmont of Virginia. The inventory was started in March 1976 and completed in August 1976. Three previous inventories, completed in 1940, 1957, and 1965, provide statistics for measuring changes and trends over the past 36 years. In this report, the primary...

  20. Forest statistics for the Coastal Plain of Virginia, 1976

    Treesearch

    Noel D. Cost

    1976-01-01

    This report highlights the principal findings of the fourth inventory of the timber resource in the coastal Plain of Virginia. The inventory was started in February 1975 and completed in November 1975. Three previous inventories, completed in 1940, 1956, and 1966, provide statistics for measuring changes and trends over the past 36 years. In this report, the primary...

  1. Forest statistics for Northeast Florida, 1987

    Treesearch

    Mark J. Brown

    1987-01-01

    This report highlights the principal findings of the sixth forest survey of Northeast Florida. Field work began in January 1987 and was completed in July 1987. Five previous surveys, completed in 1934, 1949, 1959, 1970, and 1980, provide statistics for measuring changes and trends over the past 53 years. The primary emphasis in this report is on the changes and trends...

  2. Forest statistics for Northwest Florida, 1979

    Treesearch

    Raymond M. Sheffield

    1979-01-01

    This report highlights the principal findings of the fifth forest survey of Northwest Florida. Fieldwork began in September 1978 and was completed in June 1979. Four previous surveys, completed in 1934, 1949, 1959, and 1969, provide statistics for measuring changes and trends over the past 45 years. The primary emphasis in this report is on the changes and trends since...

  3. Forest statistics for Northeast Florida, 1995

    Treesearch

    Raymond M. Sheffield

    1995-01-01

    This report highlights the principal findings of the seventh forest survey of Northeast Florida. Field work began in April 1994 and was completed in May 1995. Six previous surveys, completed in 1934, 1949. 1959, 1970, 1980, and 1987 provide statistics for measuring changes and trends over the past 61 years. The primary emphasis in this report is on the changes and...

  4. Forest statistics for North Georgia, 1983

    Treesearch

    John B. Tansey

    1983-01-01

    This report highlights the principal findings of the fifth forest survey of North Georgia. Fieldwork began in September 1982 and was completed in January 1983. Four previous surveys, completed in 1936, 1953, 1961, and 1972, provide statistics for measuring changes and trends over the past 47 years. The primary emphasis in this report is on the changes and trends since...

  5. Forest statistics for North Carolina, 1984

    Treesearch

    William A. Bechtold

    1984-01-01

    This report highlights the principal findings of the fifth forest survey of North Carolina, Fieldwork began in November 1982 and was completed in September 1984, Four previous surveys, completed in 1938, 1956, 1964, and 1974, provide statistics for measuring changes and trends over the past 46 years, The primary emphasis in this report is on the changes and trends...

  6. Forest statistics for North Central Georgia, 1983

    Treesearch

    John B. Tansey

    1983-01-01

    This report highlights the principal findings of the fifth forest survey of North Central Georgia. Fieldwork began in May 1982 and was completed in September 1982. Four previous surveys, completed in 1936, 1953, 1961, and 1972, provide statistics for measuring changes and trends over the past 47 years. The primary emphasis in this report is on the changes and trends...

  7. Forest statistics for South Carolina, 1978

    Treesearch

    Raymond M. Sheffield

    1978-01-01

    This report highlights the principal findings of the fifth inventory of South Carolina's forests. Fieldwork began in April 1977 and was completed in August 1978. Four previous statewide inventories, completed in 1936, 1947, 19.58, and 1968, provide statistics for measuring changes and trends over the past 42 years. The primary emphasis in this report is on the...

  8. Forest statistics for Southeast Georgia, 1981

    Treesearch

    Raymond M. Sheffield

    1982-01-01

    This report highlights the principal findings of the fifth forest survey of Southeast Georgia, Fieldwork began in November 1980 and was completed in October 1981. Four previous surveys, completed in 1934, 1952, 1960, and 1971, provide statistics for measuring changes and trends over the past 47 years. The primary emphasis in this report is on the changes and trends...

  9. Forest statistics for the Southern Piedmont of Virginia 1976

    Treesearch

    Raymond M. Sheffield

    1976-01-01

    This report highlights the principal findings of the fourth inventory of the timber resource in the Southern Piedmont of Virginia. The inventory was started in February 1975 and completed in November 1975. Three previous inventories, completed in 1940, 1956, and 1966, provide statistics for measuring changes and trends over the past 36 years. In this report, the...

  10. Data Sharing and the Development of the Cleveland Clinic Statistical Education Dataset Repository

    ERIC Educational Resources Information Center

    Nowacki, Amy S.

    2013-01-01

    Examples are highly sought by both students and teachers. This is particularly true as many statistical instructors aim to engage their students and increase active participation. While simulated datasets are functional, they lack real perspective and the intricacies of actual data. In order to obtain real datasets, the principal investigator of a…

  11. Forest statistics for Virginia, 1986

    Treesearch

    Mark J. Brown

    1986-01-01

    This report highlights the principal findings of the fifth forest survey of Virginia. Fieldwork began in September 1984 and was completed in November 1985. Four previous surveys, completed in 1940, 1957, 1966, and 1977, provide statistics for measuring changes and trends over the past 46 years. The primary emphasis in this report is on the changes and trends since 1977...

  12. Forest statistics for Southwest Georgia, 1981

    Treesearch

    Raymond M. Sheffield

    1981-01-01

    This report highlights the principal findings of the fifth forest survey of southwest Georgia, Fieldwork began in May 1980 and was completed in November 1980. Four previous surveys, completed in 1938, 1951, 1960, 1971, provide statistics for measuring changes and trends over the past 47 years. The primary emphasis in this report is on the changes and trends since 1971...

  13. Forest statistics for the Northern Mountain region of Virginia 1977

    Treesearch

    Raymond M. Sheffield

    1977-01-01

    This report highlights the principal findings of the fourth inventory of timber resources in the Northern Mountain Region of Virginia. The inventory was started in August 1976 and completed in December 1976. Three previous inventories, completed in 1940, 1957 and 1966, provide statistics for measuring changes and trends over the past 37 years. In this report, the...

  14. Forest statistics for Central Florida - 1980

    Treesearch

    Raymond M. Sheffield

    1981-01-01

    This report highlights the principal findings of the fifth forest survey of Central Florida. Fieldwork began in December 1979 and was completed in March 1980. Four previous surveys, completed in 1936, 1949, 1959, and 1970, provide statistics for measuring changes and trends over the past 44 years. The primary emphasis in this report is on the changes and trends since...

  15. Forest statistics for South Carolina, 1986

    Treesearch

    John B. Tansey

    1986-01-01

    This report highlights the principal findings of the sixth forest survey in South Carolina. Fieldwork began in November 1985 and was completed in September 1986. Five previous surveys, completed in 1936, 1947, 1958, 1968, and 1978, provide statistics for measuring changes and trends over the past 50 years, The primary emphasis in this report is on the changes and...

  16. Forest statistics for Southwest Georgia, 1988

    Treesearch

    Michael T. Thompson

    1988-01-01

    This report highlights the principal findings of the sixth forest survey in southwest Georgia. Field work began in October 1987 and was completed in January 1988. Five previous surveys, completed in 1934, 1951, 1960, 1971, and 1981, provide statistics for measuring changes and trends over the past 54 years. The primary emphasis in this report is on the changes and...

  17. Forest statistics for Central Florida - 1988

    Treesearch

    Mark J. Brown

    1988-01-01

    This report highlights the principal findings of the sixth forest survey of Central Florida. Field work began in July 1987 and was completed in September 1987. Five previous surveys, completed in 1936, 1949, 1959, 1970, and 1980, provide statistics for measuring changes and trends over the past 52 years. The primary emphasis in this report is on the changes and trends...

  18. The contribution of executive functions to emergent mathematic skills in preschool children.

    PubMed

    Espy, Kimberly Andrews; McDiarmid, Melanie M; Cwik, Mary F; Stalets, Melissa Meade; Hamby, Arlena; Senn, Theresa E

    2004-01-01

    Mathematical ability is related to both activation of the prefrontal cortex in neuroimaging studies of adults and to executive functions in school-age children. The purpose of this study was to determine whether executive functions were related to emergent mathematical proficiency in preschool children. Preschool children (N = 96) were administered an executive function battery that was reduced empirically to working memory (WM), inhibitory control (IC), and shifting abilities by calculating composite scores derived from principal component analysis. Both WM and IC predicted early arithmetic competency, with the observed relations robust after controlling statistically for child age, maternal education, and child vocabulary. Only IC accounted for unique variance in mathematical skills, after the contribution of other executive functions were controlled statistically as well. Specific executive functions are related to emergent mathematical proficiency in this age range. Longitudinal studies using structural equation modeling are necessary to better characterize these ontogenetic relations.

  19. Ice Mass Change in Greenland and Antarctica Between 1993 and 2013 from Satellite Gravity Measurements

    NASA Technical Reports Server (NTRS)

    Talpe, Matthieu J.; Nerem, R. Steven; Forootan, Ehsan; Schmidt, Michael; Lemoine, Frank G.; Enderlin, Ellyn M.; Landerer, Felix W.

    2017-01-01

    We construct long-term time series of Greenland and Antarctic ice sheet mass change from satellite gravity measurements. A statistical reconstruction approach is developed based on a principal component analysis (PCA) to combine high-resolution spatial modes from the Gravity Recovery and Climate Experiment (GRACE) mission with the gravity information from conventional satellite tracking data. Uncertainties of this reconstruction are rigorously assessed; they include temporal limitations for short GRACE measurements, spatial limitations for the low-resolution conventional tracking data measurements, and limitations of the estimated statistical relationships between low- and high-degree potential coefficients reflected in the PCA modes. Trends of mass variations in Greenland and Antarctica are assessed against a number of previous studies. The resulting time series for Greenland show a higher rate of mass loss than other methods before 2000, while the Antarctic ice sheet appears heavily influenced by interannual variations.

  20. Exploring relationships between Dairy Herd Improvement monitors of performance and the Transition Cow Index in Wisconsin dairy herds.

    PubMed

    Schultz, K K; Bennett, T B; Nordlund, K V; Döpfer, D; Cook, N B

    2016-09-01

    Transition cow management has been tracked via the Transition Cow Index (TCI; AgSource Cooperative Services, Verona, WI) since 2006. Transition Cow Index was developed to measure the difference between actual and predicted milk yield at first test day to evaluate the relative success of the transition period program. This project aimed to assess TCI in relation to all commonly used Dairy Herd Improvement (DHI) metrics available through AgSource Cooperative Services. Regression analysis was used to isolate variables that were relevant to TCI, and then principal components analysis and network analysis were used to determine the relative strength and relatedness among variables. Finally, cluster analysis was used to segregate herds based on similarity of relevant variables. The DHI data were obtained from 2,131 Wisconsin dairy herds with test-day mean ≥30 cows, which were tested ≥10 times throughout the 2014 calendar year. The original list of 940 DHI variables was reduced through expert-driven selection and regression analysis to 23 variables. The K-means cluster analysis produced 5 distinct clusters. Descriptive statistics were calculated for the 23 variables per cluster grouping. Using principal components analysis, cluster analysis, and network analysis, 4 parameters were isolated as most relevant to TCI; these were energy-corrected milk, 3 measures of intramammary infection (dry cow cure rate, linear somatic cell count score in primiparous cows, and new infection rate), peak ratio, and days in milk at peak milk production. These variables together with cow and newborn calf survival measures form a group of metrics that can be used to assist in the evaluation of overall transition period performance. Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

Top