Comparing and combining biomarkers as principal surrogates for time-to-event clinical endpoints.
Gabriel, Erin E; Sachs, Michael C; Gilbert, Peter B
2015-02-10
Principal surrogate endpoints are useful as targets for phase I and II trials. In many recent trials, multiple post-randomization biomarkers are measured. However, few statistical methods exist for comparing or combining biomarkers as principal surrogates, and none of these methods, to our knowledge, utilizes time-to-event clinical endpoint information. We propose a Weibull model extension of the semi-parametric estimated maximum likelihood method that allows for the inclusion of multiple biomarkers in the same risk model as multivariate candidate principal surrogates. We propose several methods for comparing candidate principal surrogates and evaluating multivariate principal surrogates. These include the time-dependent and surrogate-dependent true and false positive fractions, the time-dependent and the integrated standardized total gain, and the cumulative distribution function of the risk difference. We illustrate the operating characteristics of our proposed methods in simulations and outline how these statistics can be used to evaluate and compare candidate principal surrogates. We use these methods to investigate candidate surrogates in the Diabetes Control and Complications Trial. Copyright © 2014 John Wiley & Sons, Ltd.
Transforming Graph Data for Statistical Relational Learning
2012-10-01
Excerpt fragments: "… Jordan, 2003), PLSA (Hofmann, 1999) … classification via RMN (Taskar et al., 2003) or SVM (Hasan, Chaoji, Salem, & Zaki, 2006) … hierarchical … dimensionality reduction methods such as Principal Component Analysis (PCA) and Principal Factor Analysis (PFA) … clustering algorithm. Journal of the Royal Statistical Society, Series C, Applied Statistics, 28, 100-108. Hasan, M. A., Chaoji, V., Salem, S., & Zaki, M. …"
Dascălu, Cristina Gena; Antohe, Magda Ecaterina
2009-01-01
Based on eigenvalue and eigenvector analysis, principal component analysis aims to identify, from a set of parameters, the subspace of main components that suffices to characterize the whole set. Interpreting the data as a cloud of points, we find through geometrical transformations the directions along which the cloud's dispersion is maximal: the lines that pass through the cloud's center of weight and have a maximal density of points around them (found by defining an appropriate criterion function and minimizing it). This method can be used to simplify the statistical analysis of questionnaires, because it helps select from a set of items only the most relevant ones, which cover the variation of the whole data set. For instance, in the presented sample we started from a questionnaire with 28 items and, applying principal component analysis, identified 7 principal components, or main items, which significantly simplifies the subsequent statistical analysis.
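To make the eigen-decomposition route described above concrete, here is a minimal Python sketch: PCA of a respondent-by-item matrix via the eigenvectors of the covariance matrix, retaining the components that cover most of the dispersion. The 200 respondents, the random data, and the 80% variance threshold are illustrative assumptions, not values from the study.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 28))            # 200 respondents x 28 items (synthetic)

# Center the items and form the covariance matrix.
Xc = X - X.mean(axis=0)
cov = np.cov(Xc, rowvar=False)

# Eigenvectors give the principal directions; eigenvalues measure the
# dispersion of the point cloud along each direction.
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Retain enough components to cover most of the total variation,
# e.g. an 80% threshold (illustrative choice).
explained = np.cumsum(eigvals) / eigvals.sum()
k = int(np.searchsorted(explained, 0.80)) + 1
scores = Xc @ eigvecs[:, :k]              # respondents projected onto k components
print(k, scores.shape)
```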
NASA Astrophysics Data System (ADS)
Bookstein, Fred L.
1995-08-01
Recent advances in computational geometry have greatly extended the range of neuroanatomical questions that can be approached by rigorous quantitative methods. One of the major current challenges in this area is to describe the variability of human cortical surface form and its implications for individual differences in neurophysiological functioning. Existing techniques for representation of stochastically invaginated surfaces do not conduce to the necessary parametric statistical summaries. In this paper, following a hint from David Van Essen and Heather Drury, I sketch a statistical method customized for the constraints of this complex data type. Cortical surface form is represented by its Riemannian metric tensor and averaged according to parameters of a smooth averaged surface. Sulci are represented by integral trajectories of the smaller principal strains of this metric, and their statistics follow the statistics of that relative metric. The diagrams visualizing this tensor analysis look like alligator leather but summarize all aspects of cortical surface form in between the principal sulci, the reliable ones; no flattening is required.
Jankovic, Marko; Ogawa, Hidemitsu
2004-10-01
Principal Component Analysis (PCA) and Principal Subspace Analysis (PSA) are classic techniques in statistical data analysis, feature extraction and data compression. Given a set of multivariate measurements, PCA and PSA provide a smaller set of "basis vectors" with less redundancy, and a subspace spanned by them, respectively. Artificial neurons and neural networks have been shown to perform PSA and PCA when gradient ascent (descent) learning rules are used, which is related to the constrained maximization (minimization) of statistical objective functions. Due to their low complexity, such algorithms and their implementation in neural networks are potentially useful for tracking slow changes of correlations in the input data or for updating eigenvectors with new samples. In this paper we propose a PCA learning algorithm that is fully homogeneous with respect to neurons. The algorithm is obtained by modifying one of the best-known PSA learning algorithms, the Subspace Learning Algorithm (SLA). The modification is based on the Time-Oriented Hierarchical Method (TOHM), which uses two distinct time scales. On the faster time scale, the PSA algorithm is responsible for the "behavior" of all output neurons. On the slower scale, output neurons compete to fulfill their "own interests": basis vectors in the principal subspace are rotated toward the principal eigenvectors. At the end of the paper we briefly analyze how (and why) the time-oriented hierarchical method can be used to transform any existing neural-network PSA method into a PCA method.
Principal component regression analysis with SPSS.
Liu, R X; Kuang, J; Gong, Q; Hou, X L
2003-06-01
The paper introduces indices for diagnosing multicollinearity, the basic principle of principal component regression, and a method for determining the 'best' equation. An example illustrates how to perform principal component regression analysis with SPSS 10.0, including all calculation steps of the principal component regression and the operation of the linear regression, factor analysis, descriptives, compute variable, and bivariate correlations procedures in SPSS 10.0. Principal component regression analysis can be used to overcome the disturbance of multicollinearity, and with SPSS it yields a simplified, faster, and accurate statistical analysis.
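The SPSS workflow above amounts to a two-step pipeline: extract principal components, then regress the response on their scores. A minimal Python sketch under assumed synthetic data (the near-collinear predictor pair and the choice of three components are illustrative):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(1)
n, p = 100, 6
X = rng.normal(size=(n, p))
X[:, 1] = X[:, 0] + 0.01 * rng.normal(size=n)   # near-collinear pair of predictors
y = X @ rng.normal(size=p) + rng.normal(size=n)

# Regress y on a few leading principal components instead of the raw,
# collinear predictors; this stabilizes the coefficient estimates.
pcr = make_pipeline(PCA(n_components=3), LinearRegression())
pcr.fit(X, y)
print(pcr.score(X, y))                          # R^2 on the training data
```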
Sparse approximation of currents for statistics on curves and surfaces.
Durrleman, Stanley; Pennec, Xavier; Trouvé, Alain; Ayache, Nicholas
2008-01-01
Computing, processing, and visualizing statistics on shapes such as curves or surfaces is a real challenge, with many applications ranging from medical image analysis to computational geometry. Modelling such geometrical primitives with currents avoids both the feature-based approach and the point-correspondence method. This framework has proved powerful for registering brain surfaces and for measuring geometrical invariants. However, while state-of-the-art methods perform pairwise registrations efficiently, new numerical schemes are required to process groupwise statistics, because complexity increases as the database grows. Statistics such as the mean and principal modes of a set of shapes often have a heavy and highly redundant representation. We therefore propose to find an adapted basis on which the mean and principal modes have a sparse decomposition. Besides the computational improvement, this sparse representation offers a way to visualize and interpret statistics on currents. Experiments show the relevance of the approach on 34 sets of 70 sulcal lines and on 50 sets of 10 meshes of deep brain structures.
Statistical Data Analyses of Trace Chemical, Biochemical, and Physical Analytical Signatures
DOE Office of Scientific and Technical Information (OSTI.GOV)
Udey, Ruth Norma
Analytical and bioanalytical chemistry measurement results are most meaningful when interpreted using rigorous statistical treatments of the data. The same data set may provide many dimensions of information depending on the questions asked through the applied statistical methods. Three principal projects illustrated the wealth of information gained through the application of statistical data analyses to diverse problems.
Principal Component Analysis: Resources for an Essential Application of Linear Algebra
ERIC Educational Resources Information Center
Pankavich, Stephen; Swanson, Rebecca
2015-01-01
Principal Component Analysis (PCA) is a highly useful topic within an introductory Linear Algebra course, especially since it can be used to incorporate a number of applied projects. This method represents an essential application and extension of the Spectral Theorem and is commonly used within a variety of fields, including statistics,…
Applications of Nonlinear Principal Components Analysis to Behavioral Data.
ERIC Educational Resources Information Center
Hicks, Marilyn Maginley
1981-01-01
An empirical investigation of the statistical procedure entitled nonlinear principal components analysis was conducted on a known equation and on measurement data in order to demonstrate the procedure and examine its potential usefulness. This method was suggested by R. Gnanadesikan and based on an early paper of Karl Pearson. (Author/AL)
NASA Astrophysics Data System (ADS)
Qi, D.; Majda, A.
2017-12-01
A low-dimensional reduced-order statistical closure model is developed for quantifying the uncertainty in statistical sensitivity and intermittency in the principal model directions with largest variability in high-dimensional turbulent systems and turbulent transport models. Imperfect model sensitivity is improved through a recent mathematical strategy for calibrating model errors in a training phase, where information theory and linear statistical response theory are combined in a systematic fashion to achieve optimal model performance. The reduced-order method builds on a self-consistent mathematical framework for general systems with quadratic nonlinearity, where crucial high-order statistics are approximated by a systematic model calibration procedure. Model efficiency is improved through additional damping and noise corrections that replace the expensive energy-conserving nonlinear interactions. Model errors due to the imperfect nonlinear approximation are corrected by tuning the model parameters using linear response theory with an information metric in a training phase before prediction. A statistical energy principle is adopted to introduce a global scaling factor that characterizes the higher-order moments in a consistent way and improves model sensitivity. Stringent models of barotropic and baroclinic turbulence are used to demonstrate the feasibility of the reduced-order methods. Principal statistical responses in mean and variance can be captured by the reduced-order models with accuracy and efficiency. The reduced-order models are also used to capture the crucial passive tracer field advected by the baroclinic turbulent flow. It is demonstrated that crucial principal statistical quantities, such as the tracer spectrum and the fat tails in the tracer probability density functions at the most important large scales, can be captured efficiently and accurately using the reduced-order tracer model in various dynamical regimes of the flow field with distinct statistical structures.
ASCS online fault detection and isolation based on an improved MPCA
NASA Astrophysics Data System (ADS)
Peng, Jianxin; Liu, Haiou; Hu, Yuhui; Xi, Junqiang; Chen, Huiyan
2014-09-01
Multi-way principal component analysis (MPCA) has received considerable attention and been widely used in process monitoring. A traditional MPCA algorithm unfolds multiple batches of historical data into a two-dimensional matrix and cuts the matrix along the time axis to form subspaces. However, low subspace efficiency and difficult fault isolation are common disadvantages of the principal component model. This paper presents a new subspace construction method based on a kernel density estimation function that can effectively reduce the amount of stored subspace information. The MPCA model and the knowledge base are built on the new subspace. Fault detection and isolation with the squared prediction error (SPE) statistic and the Hotelling T² statistic are then realized in process monitoring. When a fault occurs, fault isolation based on the SPE statistic is achieved by residual contribution analysis of the different variables. For fault isolation based on the T² statistic, the relationship between the statistic indicator and the state variables is constructed, and constraint conditions are presented to check the validity of the fault isolation. Then, to improve the robustness of fault isolation to unexpected disturbances, a statistical method is adopted to relate single subspaces to multiple subspaces and increase the correct rate of fault isolation. Finally, fault detection and isolation based on the improved MPCA is used to monitor the automatic shift control system (ASCS) to demonstrate the correctness and effectiveness of the algorithm. The research proposes a new subspace construction method to reduce the required storage capacity and improve the robustness of the principal component model, and it relates the state variables to the fault detection indicators for fault isolation.
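The two monitoring statistics named above can be sketched compactly: T² measures deviation inside the principal component subspace, SPE measures deviation in the residual space, and per-variable residual contributions support isolation. A minimal Python sketch with synthetic in-control data and an injected sensor fault; the paper's kernel-density subspace construction is not reproduced.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(2)
X_train = rng.normal(size=(500, 10))        # in-control historical data (synthetic)
pca = PCA(n_components=3).fit(X_train)

def monitoring_stats(x):
    """Hotelling T^2 in the model subspace and SPE in the residual space."""
    t = pca.transform(x[None, :])[0]                    # component scores
    t2 = float(np.sum(t**2 / pca.explained_variance_))  # T^2 statistic
    x_hat = pca.inverse_transform(t[None, :])[0]        # reconstruction
    spe = float(np.sum((x - x_hat) ** 2))               # squared prediction error
    return t2, spe

# A sample with one shifted sensor inflates SPE; the per-variable residual
# contributions then point to the faulty variable (isolation).
x_fault = rng.normal(size=10)
x_fault[4] += 5.0
t2, spe = monitoring_stats(x_fault)
contrib = (x_fault - pca.inverse_transform(pca.transform(x_fault[None, :]))[0]) ** 2
print(t2, spe, int(np.argmax(contrib)))     # argmax should flag variable 4
```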
A novel principal component analysis for spatially misaligned multivariate air pollution data.
Jandarov, Roman A; Sheppard, Lianne A; Sampson, Paul D; Szpiro, Adam A
2017-01-01
We propose novel methods for predictive (sparse) PCA with spatially misaligned data. These methods identify principal component loading vectors that explain as much variability in the observed data as possible, while also ensuring the corresponding principal component scores can be predicted accurately by means of spatial statistics at locations where air pollution measurements are not available. This will make it possible to identify important mixtures of air pollutants and to quantify their health effects in cohort studies, where currently available methods cannot be used. We demonstrate the utility of predictive (sparse) PCA in simulated data and apply the approach to annual averages of particulate matter speciation data from national Environmental Protection Agency (EPA) regulatory monitors.
An application of principal component analysis to the clavicle and clavicle fixation devices.
Daruwalla, Zubin J; Courtis, Patrick; Fitzpatrick, Clare; Fitzpatrick, David; Mullett, Hannan
2010-03-26
Principal component analysis (PCA) enables the building of statistical shape models of bones and joints. This has been used in conjunction with computer assisted surgery in the past. However, PCA of the clavicle has not been performed. Using PCA, we present a novel method that examines the major modes of size and three-dimensional shape variation in male and female clavicles and suggests a method of grouping the clavicle into size and shape categories. Twenty-one high-resolution computerized tomography scans of the clavicle were reconstructed and analyzed using a specifically developed statistical software package. After performing statistical shape analysis, PCA was applied to study the factors that account for anatomical variation. The first principal component, representing size, accounted for 70.5 percent of anatomical variation. The addition of a further three principal components accounted for almost 87 percent. Using statistical shape analysis, clavicles in males have a greater lateral depth and are longer, wider and thicker than in females. However, the sternal angle in females is larger than in males. PCA confirmed these differences between genders but also noted that men exhibit greater variance, and it classified clavicles into five morphological groups. This unique approach is the first that standardizes a clavicular orientation. It provides information that is useful to both the biomedical engineer and the clinician. Other applications include implant design with regard to modifying current or designing future clavicle fixation devices. Our findings support the need for further development of clavicle fixation devices and raise the question of whether gender-specific devices are necessary.
NASA Technical Reports Server (NTRS)
Williams, D. L.; Borden, F. Y.
1977-01-01
Methods to accurately delineate the types of land cover in the urban-rural transition zone of metropolitan areas were considered. The application of principal components analysis to multidate LANDSAT imagery was investigated as a means of reducing the overlap between residential and agricultural spectral signatures. The statistical concepts of principal components analysis were discussed, as well as the results of this analysis when applied to multidate LANDSAT imagery of the Washington, D.C. metropolitan area.
Reconstruction Error and Principal Component Based Anomaly Detection in Hyperspectral Imagery
2014-03-27
Excerpt fragments: "… 2003), and (Jackson, D. A., 1993). In 1933, Hotelling (Hotelling, 1933), who coined the term 'principal components,' surmised that there was a … goodness of fit and multivariate quality control with the statistic $Q_i = (X_{i(1\times p)} - \hat{X}_{i(1\times p)})(X_{i(1\times p)} - \hat{X}_{i(1\times p)})^{T}$ (Eq. 20), where, under the … sparsely targeted scenes through SNR or other methods. 5) Customize sorting and histogram construction methods in Multiple PCA to avoid redundancy."
Mahler, Barbara J.
2008-01-01
The statistical analyses taken together indicate that the geochemistry at the freshwater-zone wells is more variable than that at the transition-zone wells. The geochemical variability at the freshwater-zone wells might result from dilution of ground water by meteoric water. This is indicated by relatively constant major ion molar ratios; a preponderance of positive correlations between SC, major ions, and trace elements; and a principal components analysis in which the major ions are strongly loaded on the first principal component. Much of the variability at three of the four transition-zone wells might result from the use of different laboratory analytical methods or reporting procedures during the period of sampling. This is reflected by a lack of correlation between SC and major ion concentrations at the transition-zone wells and by a principal components analysis in which the variability is fairly evenly distributed across several principal components. The statistical analyses further indicate that, although the transition-zone wells are less well connected to surficial hydrologic conditions than the freshwater-zone wells, there is some connection but the response time is longer.
NASA Astrophysics Data System (ADS)
Shirota, Yukari; Hashimoto, Takako; Fitri Sari, Riri
2018-03-01
Visualizing time-series big data has become very important. In this paper we discuss a new analysis method, called "statistical shape analysis" or "geometry-driven statistics," applied to time-series statistical data in economics. We analyse changes in agriculture value added and industry value added (as a percentage of GDP) from 2000 to 2010 in Asia. We handle the data as a set of landmarks on a two-dimensional image and examine the deformation using principal components. The core of the method is the principal components of the given formation, which are the eigenvectors of its bending energy matrix. The local deformation can be expressed as a set of non-affine transformations, which give us information about the local differences between 2000 and 2010. Because a non-affine transformation can be decomposed into a set of partial warps, we present the partial warps visually. Statistical shape analysis is widely used in biology, but applications in economics are scarce. In this paper, we investigate its potential for analysing economic data.
Azevedo, C F; Nascimento, M; Silva, F F; Resende, M D V; Lopes, P S; Guimarães, S E F; Glória, L S
2015-10-09
A significant contribution of molecular genetics is the direct use of DNA information to identify genetically superior individuals, and genome-wide selection (GWS) can be used for this purpose. GWS consists of analyzing a large number of single nucleotide polymorphism markers widely distributed in the genome; however, because the number of markers is much larger than the number of genotyped individuals, and such markers are highly correlated, special statistical methods are required. Among these methods, independent component regression, principal component regression, partial least squares, and partial principal components stand out. The aim of this study was thus to apply these dimensionality reduction methods to GWS of carcass traits in an F2 (Piau x commercial line) pig population. The results show similarities between the principal and independent component methods, which provided the most accurate genomic breeding value estimates for most carcass traits in pigs.
Tipton, John; Hooten, Mevin B.; Goring, Simon
2017-01-01
Scientific records of temperature and precipitation have been kept for several hundred years, but for many areas, only a shorter record exists. To understand climate change, there is a need for rigorous statistical reconstructions of the paleoclimate using proxy data. Paleoclimate proxy data are often sparse, noisy, indirect measurements of the climate process of interest, making each proxy uniquely challenging to model statistically. We reconstruct spatially explicit temperature surfaces from sparse and noisy measurements recorded at historical United States military forts and other observer stations from 1820 to 1894. One common method for reconstructing the paleoclimate from proxy data is principal component regression (PCR). With PCR, one learns a statistical relationship between the paleoclimate proxy data and a set of climate observations that are used as patterns for potential reconstruction scenarios. We explore PCR in a Bayesian hierarchical framework, extending classical PCR in a variety of ways. First, we model the latent principal components probabilistically, accounting for measurement error in the observational data. Next, we extend our method to better accommodate outliers that occur in the proxy data. Finally, we explore alternatives to the truncation of lower-order principal components using different regularization techniques. One fundamental challenge in paleoclimate reconstruction efforts is the lack of out-of-sample data for predictive validation. Cross-validation is of potential value, but is computationally expensive and potentially sensitive to outliers in sparse data scenarios. To overcome the limitations that a lack of out-of-sample records presents, we test our methods using a simulation study, applying proper scoring rules including a computationally efficient approximation to leave-one-out cross-validation using the log score to validate model performance. The result of our analysis is a spatially explicit reconstruction of spatio-temporal temperature from a very sparse historical record.
The Influence Function of Principal Component Analysis by Self-Organizing Rule.
Higuchi; Eguchi
1998-07-28
This article is concerned with a neural network approach to principal component analysis (PCA). An algorithm for PCA by the self-organizing rule has been proposed and its robustness observed through the simulation study by Xu and Yuille (1995). In this article, the robustness of the algorithm against outliers is investigated by using the theory of influence function. The influence function of the principal component vector is given in an explicit form. Through this expression, the method is shown to be robust against any directions orthogonal to the principal component vector. In addition, a statistic generated by the self-organizing rule is proposed to assess the influence of data in PCA.
Surzhikov, V D; Surzhikov, D V
2014-01-01
The search for and measurement of causal relationships between exposure to air pollution and the health state of the population is based on systems analysis and risk assessment to improve the quality of research. For this purpose, modern statistical analysis is applied, using tests of independence, principal component analysis, and discriminant function analysis. The analysis separated four main components from all atmospheric pollutants: for diseases of the circulatory system, the main principal component is associated with concentrations of suspended solids, nitrogen dioxide, carbon monoxide, and hydrogen fluoride; for respiratory diseases, the main principal component is closely associated with suspended solids, sulfur dioxide, nitrogen dioxide, and charcoal black. The discriminant function was shown to serve as a measure of the level of air pollution.
USDA-ARS?s Scientific Manuscript database
Six methods were compared with respect to spectral fingerprinting of a well-characterized series of broccoli samples. Spectral fingerprints were acquired for finely-powdered solid samples using Fourier transform-infrared (IR) and Fourier transform-near infrared (NIR) spectrometry and for aqueous met...
Mansouri, Majdi; Nounou, Mohamed N; Nounou, Hazem N
2017-09-01
In our previous work, we demonstrated the effectiveness of the linear multiscale principal component analysis (PCA)-based moving window (MW)-generalized likelihood ratio test (GLRT) technique over the classical PCA and multiscale principal component analysis (MSPCA)-based GLRT methods. The developed fault detection algorithm provided optimal properties by maximizing the detection probability for a particular false alarm rate (FAR) across different window sizes. However, most real systems are nonlinear, and the linear PCA method cannot adequately handle this nonlinearity. Thus, in this paper, we first apply a nonlinear PCA to obtain an accurate principal component of a set of data and handle a wide range of nonlinearities using the kernel principal component analysis (KPCA) model, which is among the most popular nonlinear statistical methods. Second, we extend the MW-GLRT technique to one that applies exponential weights to the residuals in the moving window (instead of equal weighting), as this can further improve fault detection performance by reducing the FAR using an exponentially weighted moving average (EWMA). The developed detection method, called EWMA-GLRT, provides improved properties such as smaller missed detection rates, smaller FARs, and a smaller average run length. The idea behind the developed EWMA-GLRT is to compute a new GLRT statistic that integrates current and previous data information in a decreasing exponential fashion, giving more weight to the more recent data. This provides a more accurate estimation of the GLRT statistic and a stronger memory that enables better decision making with respect to fault detection. A KPCA-based EWMA-GLRT method is therefore developed and used in practice to improve fault detection in biological phenomena modeled by S-systems and to enhance monitoring of the process mean. The idea is to combine the advantages of the proposed EWMA-GLRT fault detection chart with the KPCA model. It is used to enhance fault detection of the Cad System in E. coli model by monitoring some of the key variables involved in this model, such as enzymes, transport proteins, regulatory proteins, lysine, and cadaverine. The results demonstrate the effectiveness of the proposed KPCA-based EWMA-GLRT method over the Q, GLRT, EWMA, Shewhart, and moving window-GLRT methods. The detection performance is assessed and evaluated in terms of FAR, missed detection rates, and average run length (ARL1) values.
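A loose sketch of two of the ingredients named above: the reconstruction-error residual of a KPCA model, and an EWMA filter that weights recent residuals more heavily. The RBF kernel parameters, the 3-sigma threshold, and the injected fault are illustrative assumptions on synthetic data; the authors' GLRT statistic is not reproduced here.

```python
import numpy as np
from sklearn.decomposition import KernelPCA

rng = np.random.default_rng(3)
train = rng.normal(size=(300, 5))                  # normal operating data (synthetic)
kpca = KernelPCA(n_components=3, kernel="rbf", gamma=0.1,
                 fit_inverse_transform=True).fit(train)

def residual(x):
    """Squared reconstruction error of one sample under the KPCA model."""
    z = kpca.transform(x[None, :])
    return float(np.sum((kpca.inverse_transform(z)[0] - x) ** 2))

# Alarm threshold from training-phase residuals (illustrative 3-sigma rule).
base = np.array([residual(x) for x in train])
thresh = base.mean() + 3.0 * base.std()

# EWMA of the residual statistic: current and past information combined with
# exponentially decaying weights; lambda controls the memory of the chart.
lam, ewma = 0.2, base.mean()
stream = np.vstack([rng.normal(size=(50, 5)),
                    rng.normal(size=(50, 5)) + 1.5])   # fault enters at t = 50
for t, x in enumerate(stream):
    ewma = lam * residual(x) + (1.0 - lam) * ewma
    if ewma > thresh:
        print("alarm at t =", t)
        break
```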
HAYAT, ALIASGHAR; ABDOLLAHI, BIJAN; ZAINABADI, HASAN REZA; ARASTEH, HAMID REZA
2015-01-01
Introduction: The increased emphasis on standards-based school accountability since the passage of the No Child Left Behind Act of 2001 is focusing critical attention on the professional development of school principals and their ability to meet the challenges of improving student outcomes. Against this background, the current study examined the professional development needs of Shiraz high school principals. Methods: The statistical population consisted of 343 principals of Shiraz high schools, of whom 250 were selected using the Krejcie and Morgan (1978) sample size determination table. To collect the data, a questionnaire developed by Salazar (2007) was administered. This questionnaire was designed for professional development in leadership skills/competencies and consisted of 25 items in each leadership performance domain, using five-point Likert-type scales. The content validity of the questionnaire was confirmed, and the Cronbach's alpha coefficient was 0.78. To analyze the data, descriptive statistics and paired-samples t-tests were used, and the data were analyzed with SPSS 14. Results: The principals' "Importance" ratings were always higher than their "Actual proficiency" ratings. The mean difference between the "Importance" and "Actual proficiency" ratings on "Organizing resources" was 2.11, making it the highest "need" area; the lowest need area was "Managing the organization and operational procedures" at 0.81. There was a statistically significant difference between the means of "Importance" and the corresponding means of "Actual proficiency" (difference of means = 1.48, t = 49.38, p < 0.001). Conclusion: Based on the results, the most important professional development needs of the principals included organizing resources, resolving complex problems, understanding student development and learning, developing the vision and the mission, building team commitment, understanding measurement, evaluation and assessment strategies, facilitating the change process, solving problems, and making decisions. In other words, the principals had statistically significant professional development needs in all areas of educational leadership. The results also suggest that today's school principals need to grow and learn throughout their careers through ongoing professional development. PMID:26269786
ERIC Educational Resources Information Center
Yung-Kuan, Chan; Hsieh, Ming-Yuan; Lee, Chin-Feng; Huang, Chih-Cheng; Ho, Li-Chih
2017-01-01
Under the hyper-dynamic education situation, this research, in order to comprehensively explore the interplays between Teacher Competence Demands (TCD) and Learning Organization Requests (LOR), cross-employs the data-refinement methods of Descriptive Statistics (DS), Analysis of Variance (ANOVA), and Principal Components Analysis (PCA)…
Missing data is a common problem in the application of statistical techniques. In principal component analysis (PCA), a technique for dimensionality reduction, incomplete data points are either discarded or imputed using interpolation methods. Such approaches are less valid when ...
Research of facial feature extraction based on MMC
NASA Astrophysics Data System (ADS)
Xue, Donglin; Zhao, Jiufen; Tang, Qinhong; Shi, Shaokun
2017-07-01
Based on the maximum margin criterion (MMC), a new algorithm for statistically uncorrelated optimal discriminant vectors and a new algorithm for orthogonal optimal discriminant vectors are proposed for feature extraction. The purpose of the maximum margin criterion is to maximize the inter-class scatter while simultaneously minimizing the intra-class scatter after the projection. Compared with the original MMC method and the principal component analysis (PCA) method, the proposed methods are better at reducing or eliminating the statistical correlation between features and at improving the recognition rate. Experimental results on the Olivetti Research Laboratory (ORL) face database show that the new feature extraction method based on the statistically uncorrelated maximum margin criterion (SUMMC) is better in terms of recognition rate and stability. In addition, the relations between the maximum margin criterion and the Fisher criterion for feature extraction are revealed.
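The criterion itself is easy to state in code: MMC seeks projection directions maximizing tr(Wᵀ(Sb − Sw)W), i.e., the top eigenvectors of the between-class minus within-class scatter difference. A minimal sketch on synthetic two-class data; the paper's uncorrelated and orthogonal variants are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(11)
# Two classes in 10-D (synthetic stand-ins for face features).
X1 = rng.normal(0.0, 1.0, size=(50, 10)) + 2.0
X2 = rng.normal(0.0, 1.0, size=(50, 10))
X = np.vstack([X1, X2])
y = np.array([0] * 50 + [1] * 50)

mean = X.mean(axis=0)
Sb = np.zeros((10, 10))   # between-class scatter
Sw = np.zeros((10, 10))   # within-class scatter
for c in (0, 1):
    Xc = X[y == c]
    mc = Xc.mean(axis=0)
    Sb += len(Xc) * np.outer(mc - mean, mc - mean)
    Sw += (Xc - mc).T @ (Xc - mc)

# MMC: maximize tr(W^T (Sb - Sw) W) -> top eigenvectors of (Sb - Sw).
vals, vecs = np.linalg.eigh(Sb - Sw)
W = vecs[:, ::-1][:, :2]          # two discriminant directions
print((X @ W).shape)              # projected features for recognition
```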
Lindsell, Christopher J.; Welty, Leah J.; Mazumdar, Madhu; Thurston, Sally W.; Rahbar, Mohammad H.; Carter, Rickey E.; Pollock, Bradley H.; Cucchiara, Andrew J.; Kopras, Elizabeth J.; Jovanovic, Borko D.; Enders, Felicity T.
2014-01-01
Introduction: Statistics is an essential training component for a career in clinical and translational science (CTS). Given the increasing complexity of statistics, learners may have difficulty selecting appropriate courses. Our question was: what depth of statistical knowledge do different CTS learners require? Methods: For three types of CTS learners (principal investigator, co-investigator, informed reader of the literature), each with different backgrounds in research (no previous research experience, reader of the research literature, previous research experience), 18 experts in biostatistics, epidemiology, and research design proposed levels for 21 statistical competencies. Results: Statistical competencies were categorized as fundamental, intermediate, or specialized. CTS learners who intend to become independent principal investigators require more specialized training, while those intending to become informed consumers of the medical literature require more fundamental education. For most competencies, less training was proposed for those with more research background. Discussion: When selecting statistical coursework, the learner's research background and career goal should guide the decision. Some statistical competencies are considered to be more important than others. Baseline knowledge assessments may help learners identify appropriate coursework. Conclusion: Rather than one size fits all, tailoring education to baseline knowledge, learner background, and future goals increases learning potential while minimizing classroom time. PMID:25212569
Code of Federal Regulations, 2014 CFR
2014-07-01
... methods employed in statistical compilations. The principal title of each exhibit should state what it... furnished: (i) Market research. (a) The following data and information shall be provided: (1) A clear and detailed description of the sample, observational, and data preparation designs, including definitions of...
A Genealogical Interpretation of Principal Components Analysis
McVean, Gil
2009-01-01
Principal components analysis, PCA, is a statistical method commonly used in population genetics to identify structure in the distribution of genetic variation across geographical location and ethnic background. However, while the method is often used to inform about historical demographic processes, little is known about the relationship between fundamental demographic parameters and the projection of samples onto the primary axes. Here I show that for SNP data the projection of samples onto the principal components can be obtained directly from considering the average coalescent times between pairs of haploid genomes. The result provides a framework for interpreting PCA projections in terms of underlying processes, including migration, geographical isolation, and admixture. I also demonstrate a link between PCA and Wright's fst and show that SNP ascertainment has a largely simple and predictable effect on the projection of samples. Using examples from human genetics, I discuss the application of these results to empirical data and the implications for inference. PMID:19834557
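The paper's key identity, that PC projections are recoverable from pairwise quantities alone, mirrors the classical double-centering result: the centered Gram matrix, and hence the PC scores, can be computed from squared pairwise distances between samples. A minimal numerical check on synthetic haploid genotypes; the coalescent-time interpretation is the paper's contribution and is not modeled here.

```python
import numpy as np

rng = np.random.default_rng(4)
G = rng.integers(0, 2, size=(40, 1000)).astype(float)   # 40 haploid genomes (synthetic)
Gc = G - G.mean(axis=0)
n = G.shape[0]

# Double-centering the matrix of squared pairwise distances recovers the
# Gram matrix Gc @ Gc.T, whose eigenvectors give the PC projections.
D2 = np.sum((Gc[:, None, :] - Gc[None, :, :]) ** 2, axis=2)
J = np.eye(n) - np.ones((n, n)) / n
B = -0.5 * J @ D2 @ J
vals, vecs = np.linalg.eigh(B)
proj = vecs[:, ::-1][:, :2] * np.sqrt(vals[::-1][:2])   # top-2 projections

# The same projections (up to sign) come from standard PCA of the genotypes.
u, s, vt = np.linalg.svd(Gc, full_matrices=False)
print(np.allclose(np.abs(proj), np.abs(u[:, :2] * s[:2]), atol=1e-6))
```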
Principal Component Analysis in the Spectral Analysis of the Dynamic Laser Speckle Patterns
NASA Astrophysics Data System (ADS)
Ribeiro, K. M.; Braga, R. A., Jr.; Horgan, G. W.; Ferreira, D. D.; Safadi, T.
2014-02-01
Dynamic laser speckle is a phenomenon observed in the optical patterns formed when a changing surface is illuminated with coherent light; the dynamic change of the speckle patterns caused by biological material is known as biospeckle. Usually, these patterns of optical interference evolving in time are analyzed by graphical or numerical methods; analysis in the frequency domain has also been an option, but it involves large computational requirements, which demands new approaches to filter the images in time. Principal component analysis (PCA) works with the statistical decorrelation of data and can be used as a data filter. In this context, the present work evaluated the PCA technique for filtering biospeckle image data in time, aiming to reduce computation time and improve the robustness of the filtering. Sixty-four biospeckle images observed in time on a maize seed were used. The images were arranged in a data matrix and statistically decorrelated by the PCA technique, and the reconstructed signals were analyzed using the routine graphical and numerical methods for biospeckle analysis. Results showed the potential of the PCA tool for filtering dynamic laser speckle data, with the definition of markers of principal components related to the biological phenomena and with the advantage of fast computational processing.
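The filtering step described above reduces to a simple recipe: decorrelate the time dimension with PCA and reconstruct from a few leading components. A minimal sketch in which random images stand in for the 64 biospeckle frames; keeping k = 5 components is an illustrative choice.

```python
import numpy as np

rng = np.random.default_rng(5)
T, H, W = 64, 32, 32                     # 64 speckle images in time (synthetic)
stack = rng.normal(size=(T, H, W))

# Arrange the data as (pixels x time): each row is one pixel's time signal,
# and the 64 time points are the variables to be decorrelated by PCA.
X = stack.reshape(T, -1).T
mu = X.mean(axis=0)
u, s, vt = np.linalg.svd(X - mu, full_matrices=False)

# Keep the leading k temporal components (assumed to carry the biological
# activity) and reconstruct a filtered image stack from them alone.
k = 5
X_filt = (u[:, :k] * s[:k]) @ vt[:k, :] + mu
stack_filt = X_filt.T.reshape(T, H, W)
print(stack_filt.shape)                  # (64, 32, 32), filtered in time
```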
Considering Horn's Parallel Analysis from a Random Matrix Theory Point of View.
Saccenti, Edoardo; Timmerman, Marieke E
2017-03-01
Horn's parallel analysis is a widely used method for assessing the number of principal components and common factors. We discuss the theoretical foundations of parallel analysis for principal components based on a covariance matrix by making use of arguments from random matrix theory. In particular, we show that (i) for the first component, parallel analysis is an inferential method equivalent to the Tracy-Widom test, (ii) its use to test high-order eigenvalues is equivalent to the use of the joint distribution of the eigenvalues, and thus should be discouraged, and (iii) a formal test for higher-order components can be obtained based on a Tracy-Widom approximation. We illustrate the performance of the two testing procedures using simulated data generated under both a principal component model and a common factors model. For the principal component model, the Tracy-Widom test performs consistently in all conditions, while parallel analysis shows unpredictable behavior for higher-order components. For the common factor model, including major and minor factors, both procedures are heuristic approaches, with variable performance. We conclude that the Tracy-Widom procedure is preferred over parallel analysis for statistically testing the number of principal components based on a covariance matrix.
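For reference, a minimal sketch of the parallel analysis procedure under discussion: compare observed eigenvalues with a quantile of eigenvalues from random data of the same shape. This is the common simplified variant on a correlation matrix, with illustrative simulation settings; the Tracy-Widom alternative replaces the simulated null with an analytic one.

```python
import numpy as np

def parallel_analysis(X, n_sim=200, quantile=0.95, seed=0):
    """Retain components whose correlation-matrix eigenvalues exceed the
    chosen quantile of eigenvalues obtained from random data."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    obs = np.sort(np.linalg.eigvalsh(np.corrcoef(X, rowvar=False)))[::-1]
    null = np.empty((n_sim, p))
    for i in range(n_sim):
        R = rng.normal(size=(n, p))
        null[i] = np.sort(np.linalg.eigvalsh(np.corrcoef(R, rowvar=False)))[::-1]
    thresh = np.quantile(null, quantile, axis=0)
    return int(np.sum(obs > thresh))

rng = np.random.default_rng(1)
scores = rng.normal(size=(300, 2))               # two true components
loadings = rng.normal(size=(2, 10))
X = scores @ loadings + rng.normal(size=(300, 10))
print(parallel_analysis(X))                      # typically reports 2
```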
NASA Astrophysics Data System (ADS)
Hassanzadeh, S.; Hosseinibalam, F.; Omidvari, M.
2008-04-01
Data on seven meteorological variables (relative humidity, wet temperature, dry temperature, maximum temperature, minimum temperature, ground temperature and sun radiation time) and ozone values were used for statistical analysis. The meteorological variables and ozone values were analyzed using both multiple linear regression and principal component methods, jointly for the period 1999-2004. For all periods, the temperature-dependent variables were highly correlated, but all were negatively correlated with relative humidity. Multiple regression analysis was used to fit the ozone values, using the meteorological variables as predictors. A variable selection method based on high loadings on varimax-rotated principal components was used to obtain subsets of the predictor variables to be included in the linear regression model. In 1999, 2001 and 2002, one of the meteorological variables was weakly but predominantly influenced by the ozone concentrations. However, for the year 2000 the model did not indicate a predominant influence of the ozone concentrations on the meteorological variables, which points to variation in sun radiation. This could be due to other factors that were not explicitly considered in this study.
NASA Astrophysics Data System (ADS)
Chavez, Roberto; Lozano, Sergio; Correia, Pedro; Sanz-Rodrigo, Javier; Probst, Oliver
2013-04-01
With the purpose of efficiently and reliably generating long-term wind resource maps for the wind energy industry, this work presents the application and verification of a statistical methodology for the climate downscaling of wind fields at surface level. The procedure is based on the combination of the Monte Carlo and Principal Component Analysis (PCA) statistical methods. First, the Monte Carlo method is used to create a large number of daily-based annual time series, so-called climate representative years, by stratified sampling of a 33-year time series corresponding to the available period of the NCAR/NCEP global reanalysis data set (R-2). Second, the representative years are evaluated so that the best set is chosen according to its capability to recreate the sea level pressure (SLP) temporal and spatial fields of the R-2 data set. The measure of this correspondence is based on the Euclidean distance between the Empirical Orthogonal Function (EOF) spaces generated by the PCA decomposition of the SLP fields from both the long-term and the representative-year data sets. The methodology was verified by comparing the selected 365-day period against a 9-year period of wind fields generated by dynamically downscaling the Global Forecast System data with the mesoscale model SKIRON for the Iberian Peninsula. These results showed that, compared with the traditional method of dynamically downscaling a random 365-day period, the error in the average wind velocity from the PCA representative year was reduced by almost 30%. Moreover, the mean absolute errors (MAE) in the monthly and daily wind profiles were also reduced by almost 25% across all SKIRON grid points. The results also showed that the methodology yields maximum errors in the mean wind speed of 0.8 m/s and maximum MAE in the monthly curves of 0.7 m/s. Beyond the bulk numbers, this work shows the spatial distribution of the errors across the Iberian domain and additional wind statistics such as velocity and directional frequency. Additional repetitions were performed to prove the reliability and robustness of this kind of statistical-dynamical downscaling method.
Steingass, Christof Björn; Jutzi, Manfred; Müller, Jenny; Carle, Reinhold; Schmarr, Hans-Georg
2015-03-01
Ripening-dependent changes of pineapple volatiles were studied in a nontargeted profiling analysis. Volatiles were isolated via headspace solid phase microextraction and analyzed by comprehensive 2D gas chromatography and mass spectrometry (HS-SPME-GC×GC-qMS). Profile patterns presented in the contour plots were evaluated applying image processing techniques and subsequent multivariate statistical data analysis. Statistical methods comprised unsupervised hierarchical cluster analysis (HCA) and principal component analysis (PCA) to classify the samples. Supervised partial least squares discriminant analysis (PLS-DA) and partial least squares (PLS) regression were applied to discriminate different ripening stages and to describe the development of volatiles during postharvest storage, respectively. Substantial chemical markers allowing for class separation were thereby revealed. The workflow permitted rapid distinction between premature green-ripe pineapples and postharvest-ripened sea-freighted fruits. When PCA was restricted to the first two principal components, the volatile profiles of fully ripe air-freighted pineapples were similar to those of green-ripe fruits postharvest-ripened for 6 days after simulated sea freight export. Considering the third principal component as well, however, allowed differentiation between air-freighted fruits and the four progressing postharvest maturity stages of sea-freighted pineapples.
Nariai, N; Kim, S; Imoto, S; Miyano, S
2004-01-01
We propose a statistical method to estimate gene networks from DNA microarray data and protein-protein interactions. Because physical interactions between proteins or multiprotein complexes are likely to regulate biological processes, using only mRNA expression data is not sufficient for estimating a gene network accurately. Our method adds knowledge about protein-protein interactions to the estimation method of gene networks under a Bayesian statistical framework. In the estimated gene network, a protein complex is modeled as a virtual node based on principal component analysis. We show the effectiveness of the proposed method through the analysis of Saccharomyces cerevisiae cell cycle data. The proposed method improves the accuracy of the estimated gene networks, and successfully identifies some biological facts.
NASA Astrophysics Data System (ADS)
O'Shea, Bethany; Jankowski, Jerzy
2006-12-01
The major ion composition of Great Artesian Basin groundwater in the lower Namoi River valley is relatively homogeneous in chemical composition. Traditional graphical techniques have been combined with multivariate statistical methods to determine whether subtle differences in the chemical composition of these waters can be delineated. Hierarchical cluster analysis and principal components analysis were successful in delineating minor variations within the groundwaters of the study area that were not visually identified in the graphical techniques applied. Hydrochemical interpretation allowed geochemical processes to be identified in each statistically defined water type and illustrated how these groundwaters differ from one another. Three main geochemical processes were identified in the groundwaters: ion exchange, precipitation, and mixing between waters from different sources. Both statistical methods delineated an anomalous sample suspected of being influenced by magmatic CO2 input. The use of statistical methods to complement traditional graphical techniques for waters appearing homogeneous is emphasized for all investigations of this type.
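The two multivariate steps used above pair naturally: hierarchical clustering delineates water types, and PCA reveals the low-dimensional structure they occupy. A minimal sketch on synthetic standardized hydrochemical data; the two water types and eight analytes are placeholders for the field measurements.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.decomposition import PCA

rng = np.random.default_rng(10)
# Synthetic major-ion chemistry for 30 samples of two subtly different
# water types (placeholder for the groundwater data).
A = rng.normal(0.0, 1.0, size=(15, 8))
B = rng.normal(0.6, 1.0, size=(15, 8))
X = np.vstack([A, B])
Xz = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize the analytes

# Hierarchical cluster analysis delineates the water types...
Z = linkage(Xz, method="ward")
groups = fcluster(Z, t=2, criterion="maxclust")

# ...and PCA shows the low-dimensional structure the clusters live in.
scores = PCA(n_components=2).fit_transform(Xz)
print(groups)
print(scores[:3])
```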
Non-rigid image registration using a statistical spline deformation model.
Loeckx, Dirk; Maes, Frederik; Vandermeulen, Dirk; Suetens, Paul
2003-07-01
We propose a statistical spline deformation model (SSDM) as a method to solve non-rigid image registration. Within this model, the deformation is expressed using a statistically trained B-spline deformation mesh. The model is trained by principal component analysis of a training set. This approach makes it possible to reduce the number of degrees of freedom needed for non-rigid registration by retaining only the most significant modes of variation observed in the training set. User-defined transformation components, such as affine modes, are merged with the principal components into a unified framework. Optimization proceeds along the transformation components rather than along the individual spline coefficients. The concept of SSDMs is applied to the temporal registration of thorax CR images using pattern intensity as the registration measure. Our results show that, using 30 training pairs, a 33% reduction in the number of degrees of freedom is possible without deterioration of the result. The same accuracy as without SSDMs is still achieved after a reduction of up to 66% of the degrees of freedom.
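The training step described above reduces to PCA on a matrix of example deformations: keep the leading modes and parameterize new deformations by mode weights. A minimal sketch with random placeholder coefficients; the 30 training pairs echo the paper, while the 200 coefficients and k = 10 modes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(6)
n_train, n_coef = 30, 200        # 30 training pairs, 200 B-spline coefficients

# Training set: B-spline deformation coefficients recovered from previous
# registrations (random placeholders here).
D = rng.normal(size=(n_train, n_coef))

# PCA of the training deformations: the leading right-singular vectors are
# the most significant modes of variation.
mean = D.mean(axis=0)
u, s, vt = np.linalg.svd(D - mean, full_matrices=False)
k = 10                           # keep 10 modes, a ~66% cut vs. 29 free modes
modes = vt[:k]

# A new deformation is parameterized by k mode weights instead of the full
# coefficient vector, shrinking the registration optimization problem.
def deformation(weights):
    return mean + weights @ modes

print(deformation(np.zeros(k)).shape)   # (200,) full coefficient vector
```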
Lindsey, David A.; Tysdal, Russell G.; Taggart, Joseph E.
2002-01-01
The principal purpose of this report is to provide a reference archive for results of a statistical analysis of geochemical data for metasedimentary rocks of Mesoproterozoic age of the Salmon River Mountains and Lemhi Range, central Idaho. Descriptions of geochemical data sets, statistical methods, rationale for interpretations, and references to the literature are provided. Three methods of analysis are used: R-mode factor analysis of major oxide and trace element data for identifying petrochemical processes, analysis of variance for effects of rock type and stratigraphic position on chemical composition, and major-oxide ratio plots for comparison with the chemical composition of common clastic sedimentary rocks.
Independent component analysis for automatic note extraction from musical trills
NASA Astrophysics Data System (ADS)
Brown, Judith C.; Smaragdis, Paris
2004-05-01
The method of principal component analysis, which is based on second-order statistics (or linear independence), has long been used for redundancy reduction of audio data. The more recent technique of independent component analysis, enforcing much stricter statistical criteria based on higher-order statistical independence, is introduced and shown to be far superior in separating independent musical sources. This theory has been applied to piano trills and a database of trill rates was assembled from experiments with a computer-driven piano, recordings of a professional pianist, and commercially available compact disks. The method of independent component analysis has thus been shown to be an outstanding, effective means of automatically extracting interesting musical information from a sea of redundant data.
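The contrast drawn above is easy to reproduce: PCA decorrelates the mixtures, while ICA's higher-order independence criterion actually unmixes them. A minimal sketch with two synthetic "notes" and an assumed mixing matrix, not the piano recordings used in the study.

```python
import numpy as np
from sklearn.decomposition import PCA, FastICA

rng = np.random.default_rng(7)
t = np.linspace(0, 1, 4000)
s1 = np.sign(np.sin(2 * np.pi * 440 * t))     # note 1 (square-ish waveform)
s2 = np.sin(2 * np.pi * 523 * t)              # note 2
S = np.c_[s1, s2]
A = np.array([[1.0, 0.6], [0.5, 1.0]])        # mixing matrix
X = S @ A.T                                   # two mixed "recordings"

# PCA only decorrelates (second-order statistics); ICA additionally enforces
# higher-order independence and can recover the individual notes.
pca_est = PCA(n_components=2).fit_transform(X)
ica_est = FastICA(n_components=2, random_state=0).fit_transform(X)

def best_corr(est, source):
    return max(abs(np.corrcoef(est[:, j], source)[0, 1]) for j in range(2))

print("ICA recovery of note 1:", best_corr(ica_est, s1))   # close to 1
print("PCA recovery of note 1:", best_corr(pca_est, s1))   # noticeably lower
```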
Zeng, Irene Sui Lan; Lumley, Thomas
2018-01-01
Integrated omics is becoming a new channel for investigating the complex molecular system in modern biological science and sets a foundation for systematic learning for precision medicine. The statistical/machine learning methods that have emerged in the past decade for integrated omics are not only innovative but also multidisciplinary, with integrated knowledge in biology, medicine, statistics, machine learning, and artificial intelligence. Here, we review the nontrivial classes of learning methods from the statistical aspects and streamline these learning methods within the statistical learning framework. The intriguing findings from the review are that the methods used are generalizable to other disciplines with complex systematic structure, and that integrated omics is part of an integrated information science that collates and integrates different types of information for inference and decision making. We review the statistical learning methods of exploratory and supervised learning from 42 publications. We also discuss the strengths and limitations of the extended principal component analysis, cluster analysis, network analysis, and regression methods. Statistical techniques such as penalization to induce sparsity when there are fewer observations than features, and Bayesian approaches when there is prior knowledge to be integrated, are also included in the commentary. For completeness, a table of currently available software and packages for omics, drawn from 23 publications, is summarized in the appendix.
Nonequilibrium Statistical Operator Method and Generalized Kinetic Equations
NASA Astrophysics Data System (ADS)
Kuzemsky, A. L.
2018-01-01
We consider some principal problems of nonequilibrium statistical thermodynamics in the framework of the Zubarev nonequilibrium statistical operator approach. We present a brief comparative analysis of some approaches to describing irreversible processes based on the concept of nonequilibrium Gibbs ensembles and their applicability to describing nonequilibrium processes. We discuss the derivation of generalized kinetic equations for a system in a heat bath. We obtain and analyze a damped Schrödinger-type equation for a dynamical system in a heat bath. We study the dynamical behavior of a particle in a medium taking the dissipation effects into account. We consider the scattering problem for neutrons in a nonequilibrium medium and derive a generalized Van Hove formula. We show that the nonequilibrium statistical operator method is an effective, convenient tool for describing irreversible processes in condensed matter.
Ghosh, Debasree; Chattopadhyay, Parimal
2012-06-01
The objective of the work was to use the method of quantitative descriptive analysis (QDA) to describe the sensory attributes of fermented food products prepared with the incorporation of lactic cultures. Panellists were selected and trained to evaluate various attributes, especially color and appearance, body texture, flavor, overall acceptability, and acidity, of fermented food products such as cow milk curd and soymilk curd, idli, sauerkraut, and probiotic ice cream. Principal component analysis (PCA) identified six significant principal components that accounted for more than 90% of the variance in the sensory attribute data. Overall product quality was modelled as a function of the principal components using multiple least squares regression (R² = 0.8). The results from PCA were statistically analyzed by analysis of variance (ANOVA). These findings demonstrate the utility of quantitative descriptive analysis for identifying and measuring the attributes of fermented food products that are important for consumer acceptability.
NASA Astrophysics Data System (ADS)
Li, Jiangtong; Luo, Yongdao; Dai, Honglin
2018-01-01
Water is the source of life and the essential foundation of all life. With the development of industrialization, water pollution has become more and more frequent, directly affecting human survival and development, and water quality detection is one of the necessary measures to protect water resources. Ultraviolet (UV) spectral analysis is an important research method in the field of water quality detection, in which partial least squares regression (PLSR) has become the predominant technology; however, in some special cases PLSR produces considerable errors. To solve this problem, the traditional principal component regression (PCR) analysis method is improved in this paper using the principles of PLSR. The experimental results show that for some special experimental data sets, the improved PCR analysis method performs better than PLSR. PCR and PLSR are the focus of this paper. First, principal component analysis (PCA) is performed in MATLAB to reduce the dimensionality of the spectral data; on the basis of a large number of experiments, optimized principal components carrying most of the original data information are extracted using the principles of PLSR. Second, linear regression analysis of the principal components is carried out with the Statistical Package for the Social Sciences (SPSS), from which the coefficients and relations of the principal components are obtained. Finally, the same water spectral data set is analyzed with both PLSR and the improved PCR, and the two results are compared: improved PCR and PLSR are similar for most data, but improved PCR is better than PLSR for data near the detection limit. Both PLSR and improved PCR can be used in UV spectral analysis of water, but for data near the detection limit the improved PCR results are better than those of PLSR.
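The methodological contrast above (PCR picks components by spectral variance, PLSR by covariance with the response) can be compared in a few lines. A minimal sketch on synthetic spectra with a single Gaussian absorption band; the data, component counts, and noise level are illustrative, and the paper's improved PCR variant is not reproduced.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(8)
n, p = 80, 200                          # 80 water samples, 200 UV channels
conc = rng.uniform(0, 1, size=n)        # analyte "concentration" (synthetic)
band = np.exp(-0.5 * ((np.arange(p) - 100) / 10) ** 2)   # absorption band
X = conc[:, None] * band + 0.05 * rng.normal(size=(n, p))

# PCR: components chosen for spectral variance; PLSR: components chosen for
# covariance with the response. Cross-validated R^2 compares the two.
pcr = make_pipeline(PCA(n_components=4), LinearRegression())
pls = PLSRegression(n_components=4)
print("PCR  R^2:", cross_val_score(pcr, X, conc, cv=5).mean())
print("PLSR R^2:", cross_val_score(pls, X, conc, cv=5).mean())
```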
Spectral gene set enrichment (SGSE).
Frost, H Robert; Li, Zhigang; Moore, Jason H
2015-03-03
Gene set testing is typically performed in a supervised context to quantify the association between groups of genes and a clinical phenotype. In many cases, however, a gene set-based interpretation of genomic data is desired in the absence of a phenotype variable. Although methods exist for unsupervised gene set testing, they predominantly compute enrichment relative to clusters of the genomic variables, with performance strongly dependent on the clustering algorithm and number of clusters. We propose a novel method, spectral gene set enrichment (SGSE), for unsupervised competitive testing of the association between gene sets and empirical data sources. SGSE first computes the statistical association between gene sets and principal components (PCs) using our principal component gene set enrichment (PCGSE) method. The overall statistical association between each gene set and the spectral structure of the data is then computed by combining the PC-level p-values using the weighted Z-method with weights set to the PC variance scaled by Tracy-Widom test p-values. Using simulated data, we show that the SGSE algorithm can accurately recover spectral features from noisy data. To illustrate the utility of our method on real data, we demonstrate the superior performance of the SGSE method relative to standard cluster-based techniques for testing the association between MSigDB gene sets and the variance structure of microarray gene expression data. Unsupervised gene set testing can provide important information about the biological signal held in high-dimensional genomic data sets. Because it uses the association between gene sets and sample PCs to generate a measure of unsupervised enrichment, the SGSE method is independent of cluster or network creation algorithms and, most importantly, is able to utilize the statistical significance of PC eigenvalues to ignore elements of the data most likely to represent noise.
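The final combination step, merging PC-level p-values with a weighted Z-method, can be sketched as below. The exact scaling of the weights by Tracy-Widom p-values is an assumption here, and the Tracy-Widom values themselves are placeholders rather than quantities computed from random matrix theory.

```python
# Hedged sketch of the SGSE combination step: PC-level p-values merged by the
# weighted Z-method (Stouffer), with weights based on PC variance and
# Tracy-Widom p-values. The TW values and the scaling are placeholders.
import numpy as np
from scipy.stats import norm

pc_pvals = np.array([0.001, 0.03, 0.4])   # gene set vs each PC (illustrative)
pc_var   = np.array([0.30, 0.10, 0.05])   # variance explained by each PC
tw_pvals = np.array([1e-6, 1e-3, 0.2])    # Tracy-Widom p-values (placeholder)

w = pc_var * -np.log10(tw_pvals)          # one plausible scaling, an assumption
z = norm.isf(pc_pvals)                    # per-PC z-scores
z_comb = (w * z).sum() / np.sqrt((w ** 2).sum())
print("combined p-value:", norm.sf(z_comb))
```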
Correcting for population structure and kinship using the linear mixed model: theory and extensions.
Hoffman, Gabriel E
2013-01-01
Population structure and kinship are widespread confounding factors in genome-wide association studies (GWAS). It has been standard practice to include principal components of the genotypes in a regression model in order to account for population structure. More recently, the linear mixed model (LMM) has emerged as a powerful method for simultaneously accounting for population structure and kinship. The statistical theory underlying the differences in empirical performance between modeling principal components as fixed versus random effects has not been thoroughly examined. We undertake an analysis to formalize the relationship between these widely used methods and elucidate the statistical properties of each. Moreover, we introduce a new statistic, effective degrees of freedom, that serves as a metric of model complexity and a novel low rank linear mixed model (LRLMM) to learn the dimensionality of the correction for population structure and kinship, and we assess its performance through simulations. A comparison of the results of LRLMM and a standard LMM analysis applied to GWAS data from the Multi-Ethnic Study of Atherosclerosis (MESA) illustrates how our theoretical results translate into empirical properties of the mixed model. Finally, the analysis demonstrates the ability of the LRLMM to substantially boost the strength of an association for HDL cholesterol in Europeans.
Joint principal trend analysis for longitudinal high-dimensional data.
Zhang, Yuping; Ouyang, Zhengqing
2018-06-01
We consider a research scenario motivated by integrating multiple sources of information for better knowledge discovery in diverse dynamic biological processes. Given two longitudinal high-dimensional datasets for a group of subjects, we want to extract shared latent trends and identify relevant features. To solve this problem, we present a new statistical method named as joint principal trend analysis (JPTA). We demonstrate the utility of JPTA through simulations and applications to gene expression data of the mammalian cell cycle and longitudinal transcriptional profiling data in response to influenza viral infections. © 2017, The International Biometric Society.
Reese, Sarah E; Archer, Kellie J; Therneau, Terry M; Atkinson, Elizabeth J; Vachon, Celine M; de Andrade, Mariza; Kocher, Jean-Pierre A; Eckel-Passow, Jeanette E
2013-11-15
Batch effects are due to probe-specific systematic variation between groups of samples (batches) resulting from experimental features that are not of biological interest. Principal component analysis (PCA) is commonly used as a visual tool to determine whether batch effects exist after applying a global normalization method. However, PCA yields linear combinations of the variables that contribute maximum variance and thus will not necessarily detect batch effects if they are not the largest source of variability in the data. We present an extension of PCA to quantify the existence of batch effects, called guided PCA (gPCA). We describe a test statistic that uses gPCA to test whether a batch effect exists. We apply our proposed test statistic derived using gPCA to simulated data and to two copy number variation case studies: the first study consisted of 614 samples from a breast cancer family study using Illumina Human 660 bead-chip arrays, whereas the second case study consisted of 703 samples from a family blood pressure study that used Affymetrix SNP Array 6.0. We demonstrate that our statistic has good statistical properties and is able to identify significant batch effects in two copy number variation case studies. We developed a new statistic that uses gPCA to identify whether batch effects exist in high-throughput genomic data. Although our examples pertain to copy number data, gPCA is general and can be used on other data types as well. The gPCA R package (available via CRAN) provides functionality and data to perform the methods in this article. reesese@vcu.edu
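A schematic reading of the guided-PCA idea is sketched below: compare variance along a PC guided by batch means against variance along an unguided PC, with a permutation p-value. This is an illustration of the concept only, not the gPCA R package, and the exact statistic may differ from the published one.

```python
# Simplified sketch of a gPCA-style batch statistic: variance along the first
# batch-guided PC relative to the first unguided PC, permutation p-value.
import numpy as np

def gpca_delta(X, batch):
    H = np.eye(batch.max() + 1)[batch]                       # n x b indicator
    _, _, Vg = np.linalg.svd(H.T @ X, full_matrices=False)   # guided PCs
    _, _, Vu = np.linalg.svd(X, full_matrices=False)         # unguided PCs
    return np.var(X @ Vg[0]) / np.var(X @ Vu[0])

rng = np.random.default_rng(2)
X = rng.normal(size=(50, 300))
batch = np.repeat([0, 1], 25)
X[batch == 1] += 0.5                                         # induce a batch shift

delta = gpca_delta(X, batch)
perm = [gpca_delta(X, rng.permutation(batch)) for _ in range(200)]
print(delta, np.mean(np.array(perm) >= delta))               # statistic, p-value
```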
Spiric, Aurelija; Trbovic, Dejana; Vranic, Danijela; Djinovic, Jasna; Petronijevic, Radivoj; Matekalo-Sverak, Vesna
2010-07-05
Studies on lipid extraction from animal and fish tissues do not provide information on its influence on the fatty acid composition of the extracted lipids or on cholesterol content. The data presented in this paper indicate the impact of extraction procedures on the fatty acid profile of fish lipids extracted by the modified Soxhlet and ASE (accelerated solvent extraction) procedures. Cholesterol was also determined by the direct saponification method. Student's paired t-test, used to compare the total fat content of a carp population obtained by the two extraction methods, shows that the differences between total fat content determined by ASE and by the modified Soxhlet method are not statistically significant. Values obtained by three different methods (direct saponification, ASE and modified Soxhlet), used to determine cholesterol content in carp, were compared by one-way analysis of variance (ANOVA). The results show that the modified Soxhlet method gives results which differ significantly from those obtained by direct saponification and ASE. However, the results obtained by direct saponification and ASE do not differ significantly from each other. The highest cholesterol quantities (37.65 to 65.44 mg/100 g) in the analyzed fish muscle were obtained by the direct saponification method, as the less destructive one, followed by ASE (34.16 to 52.60 mg/100 g) and the modified Soxhlet extraction method (10.73 to 30.83 mg/100 g). The modified Soxhlet method for extraction of fish lipids gives higher values for n-6 fatty acids than the ASE method (t_paired = 3.22, t_c = 2.36), while there is no statistically significant difference in n-3 content levels between the methods (t_paired = 1.31). The UNSFA/SFA ratio obtained using the modified Soxhlet method is also higher than that obtained using the ASE method (t_paired = 4.88, t_c = 2.36). Results of Principal Component Analysis (PCA) showed that the highest positive impact on the second principal component (PC2) comes from C18:3 n-3 and C20:3 n-6, which are present in higher amounts in the samples treated by modified Soxhlet extraction, while C22:5 n-3, C20:3 n-3, C22:1, C20:4, C16 and C18 negatively influence the PC2 scores, showing significantly increased levels in the samples treated by the ASE method. Hotelling's paired T-square test on the first three principal components, used to confirm differences in individual fatty acid content obtained by the ASE and Soxhlet methods in carp muscle, showed a statistically significant difference between these two data sets (T² = 161.308, p < 0.001). Copyright 2010 Elsevier B.V. All rights reserved.
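The paired Hotelling T-square test used above, applied to the first few PC scores of paired measurements, can be written compactly. The sketch below uses synthetic paired score matrices; only the test itself follows the standard textbook form.

```python
# Sketch of a paired Hotelling T^2 test on the first three PC scores, as used
# above to compare ASE vs modified Soxhlet extraction. Data are synthetic.
import numpy as np
from scipy.stats import f

def paired_hotelling_t2(A, B):
    D = A - B                              # paired differences, n x p
    n, p = D.shape
    dbar = D.mean(axis=0)
    S = np.cov(D, rowvar=False)
    t2 = n * dbar @ np.linalg.solve(S, dbar)
    F = (n - p) / (p * (n - 1)) * t2       # exact F transformation
    return t2, f.sf(F, p, n - p)

rng = np.random.default_rng(3)
scores_ase = rng.normal(size=(10, 3))
scores_soxhlet = scores_ase + 0.8 + rng.normal(scale=0.3, size=(10, 3))
print(paired_hotelling_t2(scores_ase, scores_soxhlet))
```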
Online signature recognition using principal component analysis and artificial neural network
NASA Astrophysics Data System (ADS)
Hwang, Seung-Jun; Park, Seung-Je; Baek, Joong-Hwan
2016-12-01
In this paper, we propose an algorithm for on-line signature recognition using the fingertip position in the air, obtained from depth images acquired by a Kinect. We extract 10 statistical features from each of the X, Y, and Z axes, which are invariant to shifting and scaling of the signature trajectories in three-dimensional space. An artificial neural network is adopted to solve the complex signature classification problem. The 30-dimensional features are converted into 10 principal components using principal component analysis, which account for 99.02% of the total variance. We implement the proposed algorithm and test it on actual on-line signatures. In experiments, we verify that the proposed method successfully classifies 15 different on-line signatures. Experimental results show a recognition rate of 98.47% when using only 10 feature vectors.
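An illustrative pipeline for this recognition stage, 30 statistical features reduced to 10 principal components and classified by a small neural network, is sketched below. The features, network size, and class structure are synthetic stand-ins, not the authors' setup.

```python
# Illustrative PCA + neural network pipeline mirroring the stage above.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(4)
X = rng.normal(size=(150, 30)) + np.repeat(np.arange(15), 10)[:, None] * 0.5
y = np.repeat(np.arange(15), 10)           # 15 signers, 10 samples each

clf = make_pipeline(PCA(n_components=10),
                    MLPClassifier(hidden_layer_sizes=(20,), max_iter=2000,
                                  random_state=0))
clf.fit(X, y)
print("training accuracy:", clf.score(X, y))
```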
A Comparison of Analytical and Data Preprocessing Methods for Spectral Fingerprinting
LUTHRIA, DEVANAND L.; MUKHOPADHYAY, SUDARSAN; LIN, LONG-ZE; HARNLY, JAMES M.
2013-01-01
Spectral fingerprinting, as a method of discriminating between plant cultivars and growing treatments for a common set of broccoli samples, was compared for six analytical instruments. Spectra were acquired for finely powdered solid samples using Fourier transform infrared (FT-IR) and Fourier transform near-infrared (NIR) spectrometry. Spectra were also acquired for unfractionated aqueous methanol extracts of the powders using molecular absorption in the ultraviolet (UV) and visible (VIS) regions and mass spectrometry with negative (MS−) and positive (MS+) ionization. The spectra were analyzed using nested one-way analysis of variance (ANOVA) and principal component analysis (PCA) to statistically evaluate the quality of discrimination. All six methods showed statistically significant differences between the cultivars and treatments. The significance of the statistical tests was improved by the judicious selection of spectral regions (IR and NIR), masses (MS+ and MS−), and derivatives (IR, NIR, UV, and VIS). PMID:21352644
Principal Curves on Riemannian Manifolds.
Hauberg, Soren
2016-09-01
Euclidean statistics are often generalized to Riemannian manifolds by replacing straight-line interpolations with geodesic ones. While these Riemannian models are familiar-looking, they are restricted by the inflexibility of geodesics, and they rely on constructions which are optimal only in Euclidean domains. We consider extensions of Principal Component Analysis (PCA) to Riemannian manifolds. Classic Riemannian approaches seek a geodesic curve passing through the mean that optimizes a criterion of interest. The requirements that the solution be geodesic and pass through the mean tend to imply that the methods work well only when the manifold is mostly flat within the support of the generating distribution. We argue that instead of generalizing linear Euclidean models, it is more fruitful to generalize non-linear Euclidean models. Specifically, we extend the classic Principal Curves of Hastie & Stuetzle to data residing on a complete Riemannian manifold. We show that for elliptical distributions in the tangent space of spaces of constant curvature, the standard principal geodesic is a principal curve. The proposed model is simple to compute and avoids many of the pitfalls of traditional geodesic approaches. We empirically demonstrate the effectiveness of Riemannian principal curves on several manifolds and datasets.
Let's Keep Our Quality School Principals on the Job
ERIC Educational Resources Information Center
Norton, M. Scott
2003-01-01
Research studies strongly support the fact that the leadership of the school principal impacts directly on the climate of the school and, in turn, on student achievement. National statistics relating to principal turnover and dwindling supplies of qualified replacements show clearly that principal turnover has reached crisis proportions.…
ERIC Educational Resources Information Center
Muslihah, Oleh Eneng
2015-01-01
The research examines the correlation between the understanding of school-based management, emotional intelligences and headmaster performance. Data was collected, using quantitative methods. The statistical analysis used was the Pearson Correlation, and multivariate regression analysis. The results of this research suggest firstly that there is…
ERIC Educational Resources Information Center
Ross, Christine; Herrmann, Mariesa; Angus, Megan Hague
2015-01-01
The purpose of this study was to describe the measures used to evaluate principals in New Jersey in the first (pilot) year of the new principal evaluation system and examine three of the statistical properties of the measures: their variation among principals, their year-to-year stability, and the associations between these measures and the…
Gabriel, Erin E; Gilbert, Peter B
2014-04-01
Principal surrogate (PS) endpoints are relatively inexpensive and easy to measure study outcomes that can be used to reliably predict treatment effects on clinical endpoints of interest. Few statistical methods for assessing the validity of potential PSs utilize time-to-event clinical endpoint information and to our knowledge none allow for the characterization of time-varying treatment effects. We introduce the time-dependent and surrogate-dependent treatment efficacy curve, $\mathrm{TE}(t|s)$, and a new augmented trial design for assessing the quality of a biomarker as a PS. We propose a novel Weibull model and an estimated maximum likelihood method for estimation of the $\mathrm{TE}(t|s)$ curve. We describe the operating characteristics of our methods via simulations. We analyze data from the Diabetes Control and Complications Trial, in which we find evidence of a biomarker with value as a PS.
Shahlaei, Mohsen; Sabet, Razieh; Ziari, Maryam Bahman; Moeinifard, Behzad; Fassihi, Afshin; Karbakhsh, Reza
2010-10-01
Quantitative relationships between molecular structure and the methionine aminopeptidase-2 inhibitory activity of a series of cytotoxic anthranilic acid sulfonamide derivatives were discovered. We demonstrate the detailed application of two efficient nonlinear methods for evaluating quantitative structure-activity relationships of the studied compounds. Components produced by principal component analysis were used as input to the developed nonlinear models. The performance of the developed models, namely PC-GRNN and PC-LS-SVM, was tested by several validation methods. The resulting PC-LS-SVM model had high statistical quality (R² = 0.91 and cross-validated R² = 0.81) for predicting the cytotoxic activity of the compounds. Comparison of the predictive abilities of PC-GRNN and PC-LS-SVM indicates that the latter method has a higher ability to predict the activity of the studied molecules. Copyright (c) 2010 Elsevier Masson SAS. All rights reserved.
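The "principal components into a nonlinear regressor" pattern used here can be sketched as follows. sklearn's kernel SVR is used as a stand-in for LS-SVM and GRNN, which are not in sklearn; the descriptor matrix and activities are synthetic.

```python
# Hedged sketch of a PC + nonlinear-regressor QSAR pipeline; SVR stands in
# for LS-SVM/GRNN, and all data are synthetic placeholders.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.svm import SVR

rng = np.random.default_rng(17)
descriptors = rng.normal(size=(50, 100))     # molecular descriptors
activity = descriptors[:, 0] ** 2 + 0.1 * rng.normal(size=50)

model = make_pipeline(StandardScaler(), PCA(n_components=8), SVR(kernel="rbf"))
model.fit(descriptors, activity)
print("R^2:", model.score(descriptors, activity))
```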
[Statistical analysis of German radiologic periodicals: developmental trends in the last 10 years].
Golder, W
1999-09-01
To identify which statistical tests are applied in German radiological publications, to what extent their use has changed during the last decade, and which factors might be responsible for this development. The major articles published in "ROFO" and "DER RADIOLOGE" during 1988, 1993 and 1998 were reviewed for statistical content. The contributions were classified by principal focus and radiological subspecialty. The methods used were assigned to descriptive, basal and advanced statistics. Sample size, significance level and power were established. The use of experts' assistance was monitored. Finally, we calculated the so-called cumulative accessibility of the publications. 525 contributions were found to be eligible. In 1988, 87% used descriptive statistics only, 12.5% basal, and 0.5% advanced statistics. The corresponding figures in 1993 and 1998 are 62 and 49%, 32 and 41%, and 6 and 10%, respectively. Statistical techniques were most likely to be used in research on musculoskeletal imaging and in articles dedicated to MRI. Six basic categories of statistical methods account for the complete statistical analysis appearing in 90% of the articles. ROC analysis is the single most common advanced technique. Authors make increasing use of statistical experts' advice and statistical software. During the last decade, the use of statistical methods in German radiological journals has fundamentally improved, both quantitatively and qualitatively. Presently, advanced techniques account for 20% of the pertinent statistical tests. This development seems to be promoted by the increasing availability of statistical analysis software.
ERIC Educational Resources Information Center
Hendrix, Dean
2010-01-01
This study analyzed 2005-2006 Web of Science bibliometric data from institutions belonging to the Association of Research Libraries (ARL) and corresponding ARL statistics to find any associations between indicators from the two data sets. Principal components analysis on 36 variables from 103 universities revealed obvious associations between…
Tsai, Tsung-Yuan; Li, Jing-Sheng; Wang, Shaobai; Li, Pingyue; Kwon, Young-Min; Li, Guoan
2013-01-01
The statistical shape model (SSM) method that uses 2D images of the knee joint to predict the 3D joint surface model has been reported in literature. In this study, we constructed a SSM database using 152 human CT knee joint models, including the femur, tibia and patella and analyzed the characteristics of each principal component of the SSM. The surface models of two in vivo knees were predicted using the SSM and their 2D bi-plane fluoroscopic images. The predicted models were compared to their CT joint models. The differences between the predicted 3D knee joint surfaces and the CT image-based surfaces were 0.30 ± 0.81 mm, 0.34 ± 0.79 mm and 0.36 ± 0.59 mm for the femur, tibia and patella, respectively (average ± standard deviation). The computational time for each bone of the knee joint was within 30 seconds using a personal computer. The analysis of this study indicated that the SSM method could be a useful tool to construct 3D surface models of the knee with sub-millimeter accuracy in real time. Thus it may have a broad application in computer assisted knee surgeries that require 3D surface models of the knee. PMID:24156375
NASA Astrophysics Data System (ADS)
Attallah, Bilal; Serir, Amina; Chahir, Youssef; Boudjelal, Abdelwahhab
2017-11-01
Palmprint recognition systems depend on feature extraction. A feature extraction method using higher discrimination information was developed to characterize palmprint images. In this method, two individual feature extraction techniques are applied to the discrete wavelet transform of a palmprint image, and their outputs are fused. The two techniques used in the fusion are the histogram of gradients and binarized statistical image features. The fused features are then evaluated using an extreme learning machine classifier, after feature selection based on principal component analysis. Three palmprint databases, the Hong Kong Polytechnic University (PolyU) Multispectral Palmprint Database, the Hong Kong PolyU Palmprint Database II, and the Delhi Touchless (IIDT) Palmprint Database, are used in this study. The study shows that our method effectively identifies and verifies palmprints and outperforms other feature-extraction-based methods.
Saliba, Christopher M; Clouthier, Allison L; Brandon, Scott C E; Rainbow, Michael J; Deluzio, Kevin J
2018-05-29
Abnormal loading of the knee joint contributes to the pathogenesis of knee osteoarthritis. Gait retraining is a non-invasive intervention that aims to reduce knee loads by providing audible, visual, or haptic feedback of gait parameters. The computational expense of joint contact force prediction has limited real-time feedback to surrogate measures of the contact force, such as the knee adduction moment. We developed a method to predict knee joint contact forces using motion analysis and a statistical regression model that can be implemented in near real-time. Gait waveform variables were deconstructed using principal component analysis and a linear regression was used to predict the principal component scores of the contact force waveforms. Knee joint contact force waveforms were reconstructed using the predicted scores. We tested our method using a heterogeneous population of asymptomatic controls and subjects with knee osteoarthritis. The reconstructed contact force waveforms had mean (SD) RMS differences of 0.17 (0.05) bodyweight compared to the contact forces predicted by a musculoskeletal model. Our method successfully predicted subject-specific shape features of contact force waveforms and is a potentially powerful tool in biofeedback and clinical gait analysis.
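The deconstruct-regress-reconstruct idea is compact enough to sketch directly: PCA decomposes the force waveforms, a linear model predicts the PC scores from gait variables, and the inverse transform rebuilds the waveform. All arrays below are synthetic stand-ins for the paper's data.

```python
# Sketch of waveform regression: PCA on contact-force waveforms, linear
# prediction of PC scores from gait variables, then reconstruction.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(5)
gait = rng.normal(size=(80, 12))            # gait waveform variables (synthetic)
t = np.linspace(0, 1, 101)
forces = (1 + gait[:, :1]) * np.sin(np.pi * t) + 0.05 * rng.normal(size=(80, 101))

pca = PCA(n_components=4).fit(forces)
scores = pca.transform(forces)
reg = LinearRegression().fit(gait, scores)   # predict force PC scores from gait

pred = pca.inverse_transform(reg.predict(gait))
rms = np.sqrt(((pred - forces) ** 2).mean(axis=1)).mean()
print("mean RMS reconstruction difference:", rms)
```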
Assessment of technological level of stem cell research using principal component analysis.
Do Cho, Sung; Hwan Hyun, Byung; Kim, Jae Kyeom
2016-01-01
In general, technological levels have been assessed based on specialists' opinions through methods such as Delphi. In such cases, however, results can be significantly biased by the study design and the individual experts. In this study, therefore, scientific literature and patents were selected by means of analytic indexes for a statistical approach to the technical assessment of stem cell fields. The analytic indexes, the numbers and impact indexes of scientific literature and patents, were weighted based on principal component analysis and then summed into a single value. Technological obsolescence was calculated through the cited half-life of patents issued by the United States Patent and Trademark Office and was reflected in the technological level assessment. As a result, nations were ranked with reference to technological level by the proposed method, and we were able to evaluate their strengths and weaknesses. Although our empirical research presents faithful results, a further study should compare the existing methods with the suggested method.
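One common way to collapse several indexes into a single PCA-weighted score is sketched below. The use of first-component loadings as weights is an assumption about the general pattern, not the authors' exact scheme, and the index values are invented.

```python
# Hedged sketch of a PCA-weighted composite index over analytic indexes.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(6)
indexes = rng.gamma(2.0, size=(12, 4))      # 12 nations x 4 analytic indexes

Z = StandardScaler().fit_transform(indexes)
pca = PCA(n_components=1).fit(Z)
weights = np.abs(pca.components_[0])        # loading-based weights (assumption)
weights /= weights.sum()

composite = Z @ weights                      # single technological-level score
print(np.argsort(-composite))                # nation ranking
```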
Statistical learning and selective inference.
Taylor, Jonathan; Tibshirani, Robert J
2015-06-23
We describe the problem of "selective inference." This addresses the following challenge: Having mined a set of data to find potential associations, how do we properly assess the strength of these associations? The fact that we have "cherry-picked"--searched for the strongest associations--means that we must set a higher bar for declaring significant the associations that we see. This challenge becomes more important in the era of big data and complex statistical modeling. The cherry tree (dataset) can be very large and the tools for cherry picking (statistical learning methods) are now very sophisticated. We describe some recent new developments in selective inference and illustrate their use in forward stepwise regression, the lasso, and principal components analysis.
A method to estimate the effect of deformable image registration uncertainties on daily dose mapping
Murphy, Martin J.; Salguero, Francisco J.; Siebers, Jeffrey V.; Staub, David; Vaman, Constantin
2012-01-01
Purpose: To develop a statistical sampling procedure for spatially-correlated uncertainties in deformable image registration and then use it to demonstrate their effect on daily dose mapping. Methods: Sequential daily CT studies are acquired to map anatomical variations prior to fractionated external beam radiotherapy. The CTs are deformably registered to the planning CT to obtain displacement vector fields (DVFs). The DVFs are used to accumulate the dose delivered each day onto the planning CT. Each DVF has spatially-correlated uncertainties associated with it. Principal components analysis (PCA) is applied to measured DVF error maps to produce decorrelated principal component modes of the errors. The modes are sampled independently and reconstructed to produce synthetic registration error maps. The synthetic error maps are convolved with dose mapped via deformable registration to model the resulting uncertainty in the dose mapping. The results are compared to the dose mapping uncertainty that would result from uncorrelated DVF errors that vary randomly from voxel to voxel. Results: The error sampling method is shown to produce synthetic DVF error maps that are statistically indistinguishable from the observed error maps. Spatially-correlated DVF uncertainties modeled by our procedure produce patterns of dose mapping error that are different from that due to randomly distributed uncertainties. Conclusions: Deformable image registration uncertainties have complex spatial distributions. The authors have developed and tested a method to decorrelate the spatial uncertainties and make statistical samples of highly correlated error maps. The sample error maps can be used to investigate the effect of DVF uncertainties on daily dose mapping via deformable image registration. An initial demonstration of this methodology shows that dose mapping uncertainties can be sensitive to spatial patterns in the DVF uncertainties. PMID:22320766
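The core of the sampling procedure, PCA-decorrelated error modes sampled independently and reconstructed into synthetic error maps, is sketched below with synthetic data standing in for measured DVF error maps.

```python
# Sketch of the error-sampling procedure: PCA decorrelates observed error
# maps, modes are sampled independently, synthetic maps are reconstructed.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(7)
error_maps = rng.normal(size=(30, 5000))     # 30 observed maps, 5000 voxels

pca = PCA(n_components=10).fit(error_maps)
scores = pca.transform(error_maps)
std = scores.std(axis=0)                     # spread of each decorrelated mode

new_scores = rng.normal(size=(100, 10)) * std          # independent sampling
synthetic = pca.inverse_transform(new_scores)          # synthetic error maps
print(synthetic.shape)
```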
Pathway analysis with next-generation sequencing data.
Zhao, Jinying; Zhu, Yun; Boerwinkle, Eric; Xiong, Momiao
2015-04-01
Although pathway analysis methods have been developed and successfully applied to association studies of common variants, the statistical methods for pathway-based association analysis of rare variants have not been well developed. Many investigators have observed highly inflated false-positive rates and low power in pathway-based tests of association of rare variants. The inflated false-positive rates and low true-positive rates of the current methods are mainly due to their lack of ability to account for gametic phase disequilibrium. To overcome these serious limitations, we develop a novel statistic that is based on the smoothed functional principal component analysis (SFPCA) for pathway association tests with next-generation sequencing data. The developed statistic has the ability to capture position-level variant information and account for gametic phase disequilibrium. By intensive simulations, we demonstrate that the SFPCA-based statistic for testing pathway association with either rare or common or both rare and common variants has the correct type 1 error rates. Also, the power of the SFPCA-based statistic and of 22 additional existing statistics is evaluated. We found that the SFPCA-based statistic has a much higher power than other existing statistics in all the scenarios considered. To further evaluate its performance, the SFPCA-based statistic is applied to pathway analysis of exome sequencing data in the early-onset myocardial infarction (EOMI) project. We identify three pathways significantly associated with EOMI after the Bonferroni correction. In addition, our preliminary results show that the SFPCA-based statistic yields much smaller P-values for identifying pathway associations than other existing methods.
Representation of Probability Density Functions from Orbit Determination using the Particle Filter
NASA Technical Reports Server (NTRS)
Mashiku, Alinda K.; Garrison, James; Carpenter, J. Russell
2012-01-01
Statistical orbit determination enables us to obtain estimates of the state and the statistical information of its region of uncertainty. In order to obtain an accurate representation of the probability density function (PDF) that incorporates higher-order statistical information, we propose the use of nonlinear estimation methods such as the Particle Filter. The Particle Filter (PF) is capable of providing a PDF representation of the state estimates whose accuracy is dependent on the number of particles or samples used. For this method to be applicable to real case scenarios, we need a way of accurately representing the PDF in a compressed manner with little information loss. Hence, we propose using Independent Component Analysis (ICA) as a non-Gaussian dimensionality reduction method that is capable of maintaining the higher-order statistical information obtained using the PF. Methods such as Principal Component Analysis (PCA) utilize only up-to-second-order statistics and hence will not suffice to maintain maximum information content. Both the PCA and the ICA are applied to two scenarios, a highly eccentric orbit with a lower a priori uncertainty covariance and a less eccentric orbit with a higher a priori uncertainty covariance, to illustrate the capability of the ICA in relation to the PCA.
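The PCA-versus-ICA contrast on a particle cloud can be sketched briefly: PCA keeps only second-order structure, while FastICA seeks statistically independent, non-Gaussian directions. The particle cloud below is a synthetic heavy-tailed stand-in for PF state samples.

```python
# Sketch comparing PCA and ICA compression of a particle-filter sample cloud.
import numpy as np
from sklearn.decomposition import PCA, FastICA

rng = np.random.default_rng(8)
particles = rng.laplace(size=(5000, 6))      # heavy-tailed state samples

pca = PCA(n_components=3).fit(particles)
ica = FastICA(n_components=3, random_state=0).fit(particles)

print("PCA variance kept:", pca.explained_variance_ratio_.sum())
print("ICA mixing matrix shape:", ica.mixing_.shape)
```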
Lu, Yuan-Chiao; Untaroiu, Costin D
2013-09-01
During car collisions, the shoulder belt exposes the occupant's clavicle to large loads, which often leads to bone fracture. To better understand the geometric variability of clavicular cortical bone, which may influence its injury tolerance, twenty human clavicles were evaluated using statistical shape analysis. The interior and exterior clavicular cortical bone surfaces were reconstructed from CT-scan images. Registration between one selected template and the remaining 19 clavicle models was conducted to remove translation and rotation differences. The correspondences of landmarks between the models were then established using coordinates and surface normals. Three registration methods were compared: the LM-ICP method, the global method, and the SHREC method. The LM-ICP registration method showed better performance than the global and SHREC registration methods in terms of compactness, generalization, and specificity. The first four principal components obtained using the LM-ICP registration method account for 61% and 67% of the overall anatomical variation for the exterior and interior cortical bone shapes, respectively. Length was found to be the most significant variation mode of the human clavicle. The mean and two boundary shape models were created using the four most significant principal components to investigate the size and shape variation of clavicular cortical bone. In the future, boundary shape models could be used to develop probabilistic finite element models, which may help to better understand the variability in biomechanical responses and injuries to the clavicle. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Buonaccorsi, G A; Rose, C J; O'Connor, J P B; Roberts, C; Watson, Y; Jackson, A; Jayson, G C; Parker, G J M
2010-01-01
Clinical trials of anti-angiogenic and vascular-disrupting agents often use biomarkers derived from DCE-MRI, typically reporting whole-tumor summary statistics and so overlooking spatial parameter variations caused by tissue heterogeneity. We present a data-driven segmentation method comprising tracer-kinetic model-driven registration for motion correction, conversion from MR signal intensity to contrast agent concentration for cross-visit normalization, iterative principal components analysis for imputation of missing data and dimensionality reduction, and statistical outlier detection using the minimum covariance determinant to obtain a robust Mahalanobis distance. After applying these techniques we cluster in the principal components space using k-means. We present results from a clinical trial of a VEGF inhibitor, using time-series data selected because of problems due to motion and outlier time series. We obtained spatially-contiguous clusters that map to regions with distinct microvascular characteristics. This methodology has the potential to uncover localized effects in trials using DCE-MRI-based biomarkers.
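The outlier-detection and clustering stages of such a pipeline can be sketched with standard tools: a minimum covariance determinant fit gives robust Mahalanobis distances in PC space, and k-means clusters the retained voxels. The iterative-PCA imputation step is not reproduced here, and the voxel features are synthetic.

```python
# Sketch of robust outlier screening (MCD Mahalanobis) plus k-means in PC
# space, mirroring the pipeline above; data are synthetic voxel time series.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.covariance import MinCovDet
from sklearn.cluster import KMeans

rng = np.random.default_rng(9)
voxels = rng.normal(size=(2000, 40))
voxels[:20] += 6.0                            # a few outlying time series

pcs = PCA(n_components=5).fit_transform(voxels)
mcd = MinCovDet(random_state=0).fit(pcs)
d2 = mcd.mahalanobis(pcs)                     # robust squared distances
keep = d2 < np.quantile(d2, 0.99)             # drop flagged outliers

labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(pcs[keep])
print(np.bincount(labels))
```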
Priority of VHS Development Based in Potential Area using Principal Component Analysis
NASA Astrophysics Data System (ADS)
Meirawan, D.; Ana, A.; Saripudin, S.
2018-02-01
The current condition of vocational high schools (VHS) is still inadequate in quality, quantity and relevance. The purpose of this research is to analyse the development of VHS based on regional potential by using principal component analysis (PCA) in Bandung, Indonesia. This study used descriptive qualitative data analysis based on principal-component reduction of secondary data. The method used is PCA with the Minitab statistics software. The results of this study indicate that the areas with the lowest scores are the priorities for the construction of new VHS, with degree programs matching the development of regional potential. Based on the PCA scores, the main priority for VHS development in Bandung is Saguling, which has the lowest PCA value of 416.92 in area 1, followed by Cihampelas with the lowest PCA value in region 2 and Padalarang with the lowest PCA value.
Chang, Cheng; Xu, Kaikun; Guo, Chaoping; Wang, Jinxia; Yan, Qi; Zhang, Jian; He, Fuchu; Zhu, Yunping
2018-05-22
Compared with the numerous software tools developed for identification and quantification of -omics data, there remains a lack of suitable tools for both downstream analysis and data visualization. To help researchers better understand the biological meanings in their -omics data, we present an easy-to-use tool, named PANDA-view, for both statistical analysis and visualization of quantitative proteomics data and other -omics data. PANDA-view contains various kinds of analysis methods such as normalization, missing value imputation, statistical tests, clustering and principal component analysis, as well as the most commonly-used data visualization methods including an interactive volcano plot. Additionally, it provides user-friendly interfaces for protein-peptide-spectrum representation of the quantitative proteomics data. PANDA-view is freely available at https://sourceforge.net/projects/panda-view/. 1987ccpacer@163.com and zhuyunping@gmail.com. Supplementary data are available at Bioinformatics online.
ERIC Educational Resources Information Center
Battle, Danielle
2010-01-01
While the National Center for Education Statistics (NCES) has conducted surveys of attrition and mobility among school teachers for two decades, little was known about similar movements among school principals. In order to inform discussions and decisions among policymakers, researchers, and parents, the 2008-09 Principal Follow-up Survey (PFS)…
Mishra-Kalyani, Pallavi S.; Johnson, Brent A.; Glass, Jonathan D.; Long, Qi
2016-01-01
Clinical disease registries offer a rich collection of valuable patient information but also pose challenges that require special care and attention in statistical analyses. The goal of this paper is to propose a statistical framework that allows for estimating the effect of surgical insertion of a percutaneous endogastrostomy (PEG) tube for patients living with amyotrophic lateral sclerosis (ALS) using data from a clinical registry. Although all ALS patients are informed about PEG, only some patients agree to the procedure, which leads to the potential for selection bias. Assessing the effect of PEG is further complicated by the aggressively fatal disease, such that time to death competes directly with both the opportunity to receive PEG and clinical outcome measurements. Our proposed methodology handles the “censoring by death” phenomenon through principal stratification and selection bias for PEG treatment through generalized propensity scores. We develop a fully Bayesian modeling approach to estimate the survivor average causal effect (SACE) of PEG on BMI, a surrogate outcome measure of nutrition and quality of life. The use of propensity score methods within the principal stratification framework demonstrates a significant and positive effect of PEG treatment, particularly when time of treatment is included in the treatment definition. PMID:27640365
Fritscher, Karl; Schuler, Benedikt; Link, Thomas; Eckstein, Felix; Suhm, Norbert; Hänni, Markus; Hengg, Clemens; Schubert, Rainer
2008-01-01
Fractures of the proximal femur are one of the principal causes of mortality among elderly persons. Traditional methods for determining femoral fracture risk rely on measuring bone mineral density (BMD). However, BMD alone is not sufficient to predict bone failure load for an individual patient, and additional parameters have to be determined for this purpose. In this work, an approach is presented that uses statistical models of appearance to identify relevant regions and parameters for the prediction of biomechanical properties of the proximal femur. Using Support Vector Regression, the proposed model-based approach is capable of predicting two different biomechanical parameters accurately and fully automatically in two different testing scenarios.
NASA Astrophysics Data System (ADS)
Farsadnia, Farhad; Ghahreman, Bijan
2016-04-01
Hydrologic homogeneous group identification is considered both fundamental and applied research in hydrology. Clustering methods are among the conventional methods used to assess hydrologically homogeneous regions. Recently, the Self-Organizing feature Map (SOM) method has been applied in some studies. However, the main problem of this method is the interpretation of its output map. Therefore, SOM output is used as input to other clustering algorithms. The aim of this study is to apply a two-level Self-Organizing feature map and the Ward hierarchical clustering method to determine the hydrologically homogeneous regions in North and Razavi Khorasan provinces. First, we reduced the dimension of the SOM input matrix by principal component analysis; then the SOM was used to form a two-dimensional feature map. To determine homogeneous regions for flood frequency analysis, the SOM output nodes were used as input to the Ward method. Generally, the regions identified by clustering algorithms are not statistically homogeneous; consequently, they have to be adjusted to improve their homogeneity. After adjusting the regions' homogeneity by L-moment tests, five hydrologically homogeneous regions were identified. Finally, adjusted regions were created by a two-level SOM, and the best regional distribution function and associated parameters were selected by the L-moment approach. The results showed that the combination of self-organizing maps and Ward hierarchical clustering with principal components as input is more effective at achieving hydrologically homogeneous regions than the hierarchical method with principal components or standardized inputs. A sketch of this two-level approach appears below.
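The sketch assumes the third-party `minisom` package for the SOM level; the attribute data, map size, and cluster count are invented, and the L-moment homogeneity adjustment is not reproduced.

```python
# Two-level sketch: PCA-reduced catchment attributes train a SOM, and Ward's
# hierarchical clustering groups the SOM nodes into candidate regions.
import numpy as np
from sklearn.decomposition import PCA
from scipy.cluster.hierarchy import linkage, fcluster
from minisom import MiniSom   # assumed third-party dependency

rng = np.random.default_rng(16)
sites = rng.normal(size=(120, 10))                 # catchment attributes

pcs = PCA(n_components=4).fit_transform(sites)
som = MiniSom(6, 6, 4, sigma=1.0, learning_rate=0.5, random_seed=0)
som.train_random(pcs, 5000)

codebook = som.get_weights().reshape(-1, 4)        # 36 SOM node vectors
regions = fcluster(linkage(codebook, method="ward"), t=5, criterion="maxclust")
print(np.bincount(regions))
```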
NASA Astrophysics Data System (ADS)
Polat, Esra; Gunay, Suleyman
2013-10-01
One of the problems encountered in Multiple Linear Regression (MLR) is multicollinearity, which causes overestimation of the regression parameters and inflates their variances. Hence, when multicollinearity is present, biased estimation procedures such as classical Principal Component Regression (CPCR) and Partial Least Squares Regression (PLSR) are performed. The SIMPLS algorithm is the leading PLSR algorithm because of its speed and efficiency, and because its results are easier to interpret. However, both CPCR and SIMPLS yield very unreliable results when the data set contains outlying observations. Therefore, Hubert and Vanden Branden (2003) presented a robust PCR (RPCR) method and a robust PLSR (RPLSR) method called RSIMPLS. In RPCR, a robust Principal Component Analysis (PCA) method for high-dimensional data is first applied to the independent variables; the dependent variables are then regressed on the scores using a robust regression method. RSIMPLS is constructed from a robust covariance matrix for high-dimensional data and robust linear regression. The purpose of this study is to show the usage of the RPCR and RSIMPLS methods on an econometric data set, comparing the two methods on an inflation model of Turkey. The considered methods are compared in terms of predictive ability and goodness of fit by using a robust Root Mean Squared Error of Cross-validation (R-RMSECV), a robust R² value and the Robust Component Selection (RCS) statistic.
Nature of Driving Force for Protein Folding: A Result From Analyzing the Statistical Potential
NASA Astrophysics Data System (ADS)
Li, Hao; Tang, Chao; Wingreen, Ned S.
1997-07-01
In a statistical approach to protein structure analysis, Miyazawa and Jernigan derived a 20×20 matrix of inter-residue contact energies between different types of amino acids. Using the method of eigenvalue decomposition, we find that the Miyazawa-Jernigan matrix can be accurately reconstructed from its first two principal component vectors as $M_{ij} = C_0 + C_1(q_i + q_j) + C_2 q_i q_j$, with constant $C$'s and 20 $q$ values associated with the 20 amino acids. This regularity is due to hydrophobic interactions and a force of demixing, the latter obeying Hildebrand's solubility theory of simple liquids.
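The quoted regularity has a simple linear-algebra explanation worth checking numerically: a matrix of the form $C_0 + C_1(q_i + q_j) + C_2 q_i q_j$ lies in the span of the all-ones vector and $q$, so at most two eigenpairs carry all of its structure. The sketch below uses invented constants and $q$ values, not the actual contact energies.

```python
# Numerical check that the quoted functional form is (at most) rank two, so
# two dominant eigenpairs reconstruct it essentially exactly.
import numpy as np

rng = np.random.default_rng(10)
q = rng.normal(size=20)                      # one q value per amino acid
C0, C1, C2 = -2.0, -0.5, 0.3
one = np.ones(20)
M = (C0 * np.outer(one, one)
     + C1 * (np.outer(q, one) + np.outer(one, q))
     + C2 * np.outer(q, q))

vals, vecs = np.linalg.eigh(M)
idx = np.argsort(np.abs(vals))[::-1][:2]     # two dominant eigenpairs
M2 = (vecs[:, idx] * vals[idx]) @ vecs[:, idx].T
print(np.abs(M - M2).max())                  # ~ machine precision
```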
Xiao, Keke; Chen, Yun; Jiang, Xie; Zhou, Yan
2017-03-01
An investigation was conducted on 20 different types of sludge in order to identify the key organic compounds in extracellular polymeric substances (EPS) that are important in assessing variations in sludge filterability. The sludge types varied in initial total solids (TS) content, organic composition and pre-treatment method. For instance, some of the sludges were pre-treated by acid, ultrasonic, thermal, alkaline, or advanced oxidation techniques. Pearson correlation results showed significant correlations between sludge filterability and zeta potential, pH, dissolved organic carbon, protein and polysaccharide in soluble EPS (SB EPS), loosely bound EPS (LB EPS) and tightly bound EPS (TB EPS). The principal component analysis (PCA) method was used to further explore correlations between variables and similarities among EPS fractions of the different sludge types. Two principal components were extracted: principal component 1 accounted for 59.24% of total EPS variation, while principal component 2 accounted for 25.46%. Dissolved organic carbon, protein and polysaccharide in LB EPS showed higher eigenvector projection values on principal component 1 than the corresponding compounds in SB EPS and TB EPS. Further characterization of the fractionized key organic compounds in LB EPS was conducted with size-exclusion chromatography-organic carbon detection-organic nitrogen detection (LC-OCD-OND). A multiple linear regression model was established to describe the relationship between organic compounds in LB EPS and sludge filterability. Copyright © 2016 Elsevier Ltd. All rights reserved.
Field-effect transistors (2nd revised and enlarged edition)
NASA Astrophysics Data System (ADS)
Bocharov, L. N.
The design, principle of operation, and principal technical characteristics of field-effect transistors produced in the USSR are described. Problems related to the use of field-effect transistors in various radioelectronic devices are examined, and tables of parameters and mean statistical characteristics are presented for the main types of field-effect transistors. Methods for calculating various circuit components are discussed and illustrated by numerical examples.
1984-11-01
well. The subspace is found by using the usual linear eigenvector solution in the new enlarged space. This technique was first suggested by Gnanadesikan & Wilk (1966, 1968), and a good description can be found in Gnanadesikan (1977). They suggested using polynomial functions of the original p co… Gnanadesikan, R. (1977), Methods for Statistical Data Analysis of Multivariate Observations, Wiley, New York.
Li, Ke; Zhang, Qiuju; Wang, Kun; Chen, Peng; Wang, Huaqing
2016-01-01
A new fault diagnosis method for rotating machinery based on an adaptive statistic test filter (ASTF) and a Diagnostic Bayesian Network (DBN) is presented in this paper. The ASTF is proposed to obtain weak fault features under background noise; it is based on statistical hypothesis testing in the frequency domain to evaluate the similarity between a reference signal (noise signal) and the original signal, and to remove the components of high similarity. The optimal level of significance α is obtained using particle swarm optimization (PSO). To evaluate the performance of the ASTF, an evaluation factor Ipq is also defined. In addition, a simulation experiment is designed to verify the effectiveness and robustness of the ASTF. A sensitivity evaluation method using principal component analysis (PCA) is proposed to evaluate the sensitiveness of symptom parameters (SPs) for condition diagnosis. In this way, good SPs that have high sensitiveness for condition diagnosis can be selected. A three-layer DBN is developed to identify the condition of rotating machinery based on Bayesian Belief Network (BBN) theory. A condition diagnosis experiment on rolling element bearings demonstrates the effectiveness of the proposed method. PMID:26761006
ERIC Educational Resources Information Center
Ilgan, Abdurrahman; Parylo, Oksana; Sungu, Hilmi
2015-01-01
This quantitative research examined instructional supervision behaviours of school principals as a predictor of teacher job satisfaction through the analysis of Turkish teachers' perceptions of principals' instructional supervision behaviours. There was a statistically significant difference found between the teachers' job satisfaction level and…
ERIC Educational Resources Information Center
Chirichello, Michael
There are about 80,000 public school principals in the United States. The Bureau of Labor Statistics estimates there will be a 10 percent increase in the employment of educational administrators of all types through 2006. The National Association of Elementary School Principals estimates that more than 40 percent of principals will retire or leave…
Reproducibility-optimized test statistic for ranking genes in microarray studies.
Elo, Laura L; Filén, Sanna; Lahesmaa, Riitta; Aittokallio, Tero
2008-01-01
A principal goal of microarray studies is to identify the genes showing differential expression under distinct conditions. In such studies, the selection of an optimal test statistic is a crucial challenge, which depends on the type and amount of data under analysis. While previous studies on simulated or spike-in datasets do not provide practical guidance on how to choose the best method for a given real dataset, we introduce an enhanced reproducibility-optimization procedure, which enables the selection of a suitable gene-ranking statistic directly from the data. In comparison with existing ranking methods, the reproducibility-optimized statistic shows consistently good performance under various simulated conditions and on the Affymetrix spike-in dataset. Further, the feasibility of the novel statistic is confirmed in a practical research setting using data from an in-house cDNA microarray study of asthma-related gene expression changes. These results suggest that the procedure facilitates the selection of an appropriate test statistic for a given dataset without relying on a priori assumptions, which may bias the findings and their interpretation. Moreover, the general reproducibility-optimization procedure is not limited to detecting differential expression only but could be extended to a wide range of other applications as well.
Tsai, Tsung-Yuan; Li, Jing-Sheng; Wang, Shaobai; Li, Pingyue; Kwon, Young-Min; Li, Guoan
2015-01-01
The statistical shape model (SSM) method that uses 2D images of the knee joint to predict the three-dimensional (3D) joint surface model has been reported in the literature. In this study, we constructed a SSM database using 152 human computed tomography (CT) knee joint models, including the femur, tibia and patella and analysed the characteristics of each principal component of the SSM. The surface models of two in vivo knees were predicted using the SSM and their 2D bi-plane fluoroscopic images. The predicted models were compared to their CT joint models. The differences between the predicted 3D knee joint surfaces and the CT image-based surfaces were 0.30 ± 0.81 mm, 0.34 ± 0.79 mm and 0.36 ± 0.59 mm for the femur, tibia and patella, respectively (average ± standard deviation). The computational time for each bone of the knee joint was within 30 s using a personal computer. The analysis of this study indicated that the SSM method could be a useful tool to construct 3D surface models of the knee with sub-millimeter accuracy in real time. Thus, it may have a broad application in computer-assisted knee surgeries that require 3D surface models of the knee.
PCA leverage: outlier detection for high-dimensional functional magnetic resonance imaging data.
Mejia, Amanda F; Nebel, Mary Beth; Eloyan, Ani; Caffo, Brian; Lindquist, Martin A
2017-07-01
Outlier detection for high-dimensional (HD) data is a popular topic in modern statistical research. However, one source of HD data that has received relatively little attention is functional magnetic resonance images (fMRI), which consist of hundreds of thousands of measurements sampled at hundreds of time points. At a time when the availability of fMRI data is rapidly growing, primarily through large, publicly available grassroots datasets, automated quality control and outlier detection methods are greatly needed. We propose principal components analysis (PCA) leverage and demonstrate how it can be used to identify outlying time points in an fMRI run. Furthermore, PCA leverage is a measure of the influence of each observation on the estimation of principal components, which are often of interest in fMRI data. We also propose an alternative measure, PCA robust distance, which is less sensitive to outliers and has controllable statistical properties. The proposed methods are validated through simulation studies and are shown to be highly accurate. We also conduct a reliability study using resting-state fMRI data from the Autism Brain Imaging Data Exchange and find that removal of outliers using the proposed methods results in more reliable estimation of subject-level resting-state networks using independent components analysis. © The Author 2017. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
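PCA leverage has a direct SVD expression: with the centered time-by-voxel matrix factored as X = U S Vᵀ, the leverage of each time point is the squared row norm of the first q columns of U (the diagonal of the hat matrix). The sketch below uses synthetic data and an assumed number of retained components.

```python
# Sketch of PCA leverage for flagging outlying fMRI time points.
import numpy as np

rng = np.random.default_rng(11)
X = rng.normal(size=(200, 10000))            # time points x voxels (synthetic)
X[57] += 2.0                                 # inject an outlying volume
X -= X.mean(axis=0)

U, S, Vt = np.linalg.svd(X, full_matrices=False)
q = 5                                        # retained components (assumption)
leverage = (U[:, :q] ** 2).sum(axis=1)       # diagonal of the hat matrix
print(leverage.argmax())                     # flags time point 57
```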
Sensor Failure Detection of FASSIP System using Principal Component Analysis
NASA Astrophysics Data System (ADS)
Sudarno; Juarsa, Mulya; Santosa, Kussigit; Deswandri; Sunaryo, Geni Rina
2018-02-01
In the Fukushima Daiichi nuclear reactor accident in Japan, damage to the core and pressure vessel was caused by the failure of the active cooling system (the diesel generators were inundated by the tsunami). Research on passive cooling systems for nuclear power plants is therefore performed to improve the safety aspects of nuclear reactors. The FASSIP system (Passive System Simulation Facility) is an installation used to study the characteristics of passive cooling systems at nuclear power plants. Accurate sensor measurement in the FASSIP system is essential, because it is the basis for determining the characteristics of a passive cooling system. In this research, a sensor failure detection method for the FASSIP system is developed, so that indications of sensor failure can be detected early. The method used is Principal Component Analysis (PCA) to reduce the dimension of the sensor data, with the squared prediction error (SPE) and Hotelling's T² statistic as criteria for detecting sensor failure indications. The results show that the PCA method is capable of detecting the occurrence of a failure at any sensor.
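The SPE and Hotelling's T² monitoring indices used in such PCA-based detection are standard and easy to sketch: train a PCA model on healthy data, then score new readings in the residual space (SPE) and the score space (T²). The quantile-based control limits and all data below are assumptions for illustration.

```python
# Sketch of PCA-based sensor monitoring with SPE and Hotelling's T^2.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(12)
normal = rng.normal(size=(500, 8))           # 8 sensors, healthy operation
pca = PCA(n_components=3).fit(normal)

def indices(model, X):
    scores = model.transform(X)
    recon = model.inverse_transform(scores)
    spe = ((X - recon) ** 2).sum(axis=1)                  # residual index
    t2 = (scores ** 2 / model.explained_variance_).sum(axis=1)  # score index
    return spe, t2

spe0, t20 = indices(pca, normal)
faulty = normal[:5].copy()
faulty[:, 2] += 4.0                          # drifted sensor
spe, t2 = indices(pca, faulty)
print(spe > np.quantile(spe0, 0.99), t2 > np.quantile(t20, 0.99))
```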
Statistical shape modeling of human cochlea: alignment and principal component analysis
NASA Astrophysics Data System (ADS)
Poznyakovskiy, Anton A.; Zahnert, Thomas; Fischer, Björn; Lasurashvili, Nikoloz; Kalaidzidis, Yannis; Mürbe, Dirk
2013-02-01
The modeling of the cochlear labyrinth in living subjects is hampered by the insufficient resolution of available clinical imaging methods, which usually provide resolutions no finer than 125 μm. This is too crude to record the position of the basilar membrane and, as a result, to keep apart even the scala tympani from the other scalae. This problem could be avoided by means of atlas-based segmentation. Specimens can endure higher radiation loads and, consequently, provide better-resolved images. The resulting surface can be used as the seed for atlas-based segmentation. To serve this purpose, we have developed a statistical shape model (SSM) of the human scala tympani based on segmentations obtained from 10 μCT image stacks. After segmentation, we aligned the resulting surfaces using Procrustes alignment. This algorithm was slightly modified to accommodate single models whose nodes do not necessarily correspond to salient features and vary in number between models. We established correspondence by mutual proximity between nodes. Rather than using the standard Euclidean norm, we applied an alternative logarithmic norm to improve outlier treatment. The minimization was done using the BFGS method. We also split the surface nodes along an octree to reduce computation cost. Subsequently, we performed principal component analysis of the training set with the Jacobi eigenvalue algorithm. We expect the resulting method to provide not only a better understanding of interindividual variations of cochlear anatomy, but also a step towards individual models for pre-operative diagnostics prior to cochlear implant insertion.
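The generic SSM recipe underlying this work, Procrustes alignment followed by PCA of the aligned coordinates, is sketched below. The paper's mutual-proximity correspondence, logarithmic norm, and octree splitting are not reproduced; the landmark clouds are synthetic and assumed to be in correspondence.

```python
# Sketch of the generic SSM recipe: Procrustes-align corresponding landmark
# sets to a template, then PCA on the flattened coordinates.
import numpy as np
from scipy.spatial import procrustes
from sklearn.decomposition import PCA

rng = np.random.default_rng(13)
template = rng.normal(size=(100, 3))                   # 100 landmarks in 3D
shapes = [template + 0.05 * rng.normal(size=(100, 3)) for _ in range(10)]

aligned = []
for s in shapes:
    _, s_aligned, _ = procrustes(template, s)          # remove pose and scale
    aligned.append(s_aligned.ravel())

ssm = PCA(n_components=5).fit(np.array(aligned))
print(ssm.explained_variance_ratio_)                   # principal shape modes
```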
NASA Astrophysics Data System (ADS)
Jha, S. K.; Brockman, R. A.; Hoffman, R. M.; Sinha, V.; Pilchak, A. L.; Porter, W. J.; Buchanan, D. J.; Larsen, J. M.; John, R.
2018-05-01
Principal component analysis and fuzzy c-means clustering algorithms were applied to slip-induced strain and geometric metric data in an attempt to discover unique microstructural configurations and their frequencies of occurrence in statistically representative instantiations of a titanium alloy microstructure. Grain-averaged fatigue indicator parameters were calculated for the same instantiation. The fatigue indicator parameters strongly correlated with the spatial location of the microstructural configurations in the principal components space. The fuzzy c-means clustering method identified clusters of data that varied in terms of their average fatigue indicator parameters. Furthermore, the number of points in each cluster was inversely correlated to the average fatigue indicator parameter. This analysis demonstrates that data-driven methods have significant potential for providing unbiased determination of unique microstructural configurations and their frequencies of occurrence in a given volume from the point of view of strain localization and fatigue crack initiation.
XCOM intrinsic dimensionality for low-Z elements at diagnostic energies
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bornefalk, Hans
2012-02-15
Purpose: To determine the intrinsic dimensionality of linear attenuation coefficients (LACs) from XCOM for elements with low atomic number (Z = 1-20) at diagnostic x-ray energies (25-120 keV). H_0^q, the hypothesis that the space of LACs is spanned by q bases, is tested for various q-values. Methods: Principal component analysis is first applied and the LACs are projected onto the first q principal component bases. The residuals of the model values vs XCOM data are determined for all energies and atomic numbers. Heteroscedasticity invalidates the prerequisite of i.i.d. errors necessary for bootstrapping residuals. Instead, wild bootstrap is applied, which, by not mixing residuals, allows the effect of the non-i.i.d. residuals to be reflected in the result. Credible regions for the eigenvalues of the correlation matrix for the bootstrapped LAC data are determined. If subsequent credible regions for the eigenvalues overlap, the corresponding principal component is not considered to represent true data structure but noise. If this happens for eigenvalues l and l + 1, for any l ≤ q, H_0^q is rejected. Results: The largest value of q for which H_0^q is nonrejectable at the 5% level is q = 4. This indicates that the statistically significant intrinsic dimensionality of low-Z XCOM data at diagnostic energies is four. Conclusions: The method presented allows determination of the statistically significant dimensionality of any noisy linear subspace. Knowledge of such significant dimensionality is of interest for any method making assumptions on intrinsic dimensionality and evaluating results on noisy reference data. For LACs, knowledge of the low-Z dimensionality might be relevant when parametrization schemes are tuned to XCOM data. For x-ray imaging techniques based on the basis decomposition method (Alvarez and Macovski, Phys. Med. Biol. 21, 733-744, 1976), an underlying dimensionality of two is commonly assigned to the LAC of human tissue at diagnostic energies. The finding of a higher statistically significant dimensionality thus raises the question whether a higher assumed model dimensionality (now feasible with the advent of multibin x-ray systems) might also be practically relevant, i.e., whether better tissue characterization results can be obtained.
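The wild-bootstrap test can be caricatured as follows on synthetic data (not actual XCOM tables); the rank-3 matrix, the Rademacher weights, and the 95% percentile intervals are illustrative choices.

```python
# Schematic wild-bootstrap check of intrinsic dimensionality via eigenvalue overlap.
import numpy as np

rng = np.random.default_rng(3)
n_energies, n_elem, true_q = 60, 20, 3
basis = rng.normal(size=(true_q, n_energies))
X = rng.normal(size=(n_elem, true_q)) @ basis + 0.01 * rng.normal(size=(n_elem, n_energies))

def eigvals_corr(M):
    return np.sort(np.linalg.eigvalsh(np.corrcoef(M, rowvar=False)))[::-1]

def fitted(M, q):                              # rank-q PCA reconstruction
    Mc = M - M.mean(0)
    U, s, Vt = np.linalg.svd(Mc, full_matrices=False)
    return M.mean(0) + (U[:, :q] * s[:q]) @ Vt[:q]

q = 3
Xhat = fitted(X, q)
resid = X - Xhat
boot = np.array([eigvals_corr(Xhat + resid * rng.choice([-1, 1], size=(n_elem, 1)))
                 for _ in range(500)])         # Rademacher weights per observation
lo, hi = np.percentile(boot, [2.5, 97.5], axis=0)
# Loosely following the abstract: if the intervals for successive eigenvalues
# l and l + 1 (l <= q) overlap, the q-dimensional hypothesis is rejected.
print([hi[l + 1] > lo[l] for l in range(q)])
```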
Hayat, Aliasghar; Abdollahi, Bijan; Zainabadi, Hasan Reza; Arasteh, Hamid Reza
2015-07-01
The increased emphasis on standards-based school accountability since the passage of the No Child Left Behind Act of 2001 is focusing critical attention on the professional development of school principals and their ability to meet the challenges of improving student outcomes. Against this background, the current study examined the professional development needs of Shiraz high school principals. The statistical population consisted of 343 principals of Shiraz high schools, of whom 250 were selected using the Krejcie and Morgan (1978) sample size determination table. To collect the data, a questionnaire developed by Salazar (2007) was administered. This questionnaire was designed for professional development in leadership skills/competencies and consisted of 25 items in each leadership performance domain using five-point Likert-type scales. The content validity of the questionnaire was confirmed and the Cronbach's alpha coefficient was α = 0.78. To analyze the data, descriptive statistics and paired-samples t-tests were used; the data were analyzed using SPSS 14. The findings showed that principals' "Importance" ratings were always higher than their "Actual proficiency" ratings. The mean difference between the "Importance" and "Actual proficiency" ratings for "Organizing resources" was 2.11, making it the highest need area. The lowest need area was "Managing the organization and operational procedures" at 0.81. The results also showed a statistically significant difference between the means of the "Importance" ratings and the corresponding means of the "Actual proficiency" ratings (difference of means = 1.48, t = 49.38, p < 0.001). Based on these results, the most important professional development needs of the principals included organizing resources, resolving complex problems, understanding student development and learning, developing the vision and the mission, building team commitment, understanding measurement, evaluation and assessment strategies, facilitating the change process, solving problems and making decisions. In other words, the principals had statistically significant professional development needs in all areas of educational leadership. The results also suggest that today's school principals need to grow and learn throughout their careers by ongoing professional development.
NASA Astrophysics Data System (ADS)
Eltom, Hassan A.; Abdullatif, Osman M.; Makkawi, Mohammed H.; Eltoum, Isam-Eldin A.
2017-03-01
The interpretation of depositional environments provides important information to understand facies distribution and geometry. The classical approach to interpret depositional environments principally relies on the analysis of lithofacies, biofacies and stratigraphic data, among others. An alternative method, based on geochemical data (chemical element data), is advantageous because it can simply, reproducibly and efficiently interpret and refine the interpretation of the depositional environment of carbonate strata. Here we geochemically analyze and statistically model carbonate samples (n = 156) from seven sections of the Arab-D reservoir outcrop analog of central Saudi Arabia, to determine whether the elemental signatures (major, trace and rare earth elements [REEs]) can be effectively used to predict depositional environments. We find that lithofacies associations of the studied outcrop (peritidal to open marine depositional environments) possess altered REE signatures, and that this trend increases stratigraphically from bottom-to-top, which corresponds to an upward shallowing of depositional environments. The relationship between REEs and major, minor and trace elements indicates that contamination by detrital materials is the principal source of REEs, whereas redox condition, marine and diagenetic processes have minimal impact on the relative distribution of REEs in the lithofacies. In a statistical model (factor analysis and logistic regression), REEs, major and trace elements cluster together and serve as markers to differentiate between peritidal and open marine facies and to differentiate between intertidal and subtidal lithofacies within the peritidal facies. The results indicate that statistical modelling of the elemental composition of carbonate strata can be used as a quantitative method to predict depositional environments and regional paleogeography. The significance of this study lies in offering new assessments of the relationships between lithofacies and geochemical elements by using advanced statistical analysis, a method that could be used elsewhere to interpret depositional environment and refine facies models.
15 CFR 758.3 - Responsibilities of parties to the transaction.
Code of Federal Regulations, 2014 CFR
2014-01-01
... principal party in interest the exporter for EAR purposes. One writing may cover multiple transactions between the same principals. See § 748.4(a)(3) of the EAR. Note to paragraph (b): For statistical purposes.... principal party in interest. For purposes of licensing responsibility under the EAR, the U.S. agent of the...
1987-09-01
long been recognized as powerful nonparametric statistical methods since the introduction of the principal ideas by R.A. Fisher in 1935.
Some new mathematical methods for variational objective analysis
NASA Technical Reports Server (NTRS)
Wahba, Grace; Johnson, Donald R.
1994-01-01
Numerous results were obtained relevant to remote sensing, variational objective analysis, and data assimilation. A list of publications relevant in whole or in part is attached. The principal investigator gave many invited lectures, disseminating the results to the meteorological community as well as the statistical community. A list of invited lectures at meetings is attached, as well as a list of departmental colloquia at various universities and institutes.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tardiff, Mark F.; Runkle, Robert C.; Anderson, K. K.
2006-01-23
The goal of primary radiation monitoring in support of routine screening and emergency response is to detect characteristics in vehicle radiation signatures that indicate the presence of potential threats. Two conceptual approaches to analyzing gamma-ray spectra for threat detection are isotope identification and anomaly detection. While isotope identification is the time-honored method, an emerging technique is anomaly detection, which uses benign vehicle gamma-ray signatures to define an expectation of the radiation signature for vehicles that do not pose a threat. Newly acquired spectra are then compared to this expectation using statistical criteria that reflect acceptable false alarm rates and probabilities of detection. The gamma-ray spectra analyzed here were collected at a U.S. land Port of Entry (POE) using a NaI-based radiation portal monitor (RPM). The raw data were analyzed to develop a benign vehicle expectation by decimating the original pulse-height channels to 35 energy bins, extracting composite variables via principal components analysis (PCA), and estimating statistically weighted distances from the mean vehicle spectrum with the Mahalanobis distance (MD) metric. This paper reviews the methods used to establish the anomaly identification criteria and presents a systematic analysis of the response of the combined PCA and MD algorithm to modeled mono-energetic gamma-ray sources.
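In the same spirit, the sketch below screens simulated 35-bin spectra by Mahalanobis distance in a PCA-reduced space; the bin count matches the abstract, but the Poisson spectra, the component count, and the 99.9th-percentile alarm threshold are assumptions.

```python
# Toy PCA + Mahalanobis-distance anomaly screening for coarse-binned spectra.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(4)
benign = rng.poisson(lam=50, size=(1000, 35)).astype(float)   # 35-bin benign spectra

pca = PCA(n_components=5).fit(benign)
scores = pca.transform(benign)
cov_inv = np.linalg.inv(np.cov(scores, rowvar=False))

def mahalanobis(X):
    Z = pca.transform(X)                       # scores are zero-mean by construction
    return np.sqrt(np.einsum('ij,jk,ik->i', Z, cov_inv, Z))   # per-row quadratic form

threshold = np.percentile(mahalanobis(benign), 99.9)   # sets the false-alarm rate
source = benign[:1].copy()
source[0, 20:23] += 40                         # mono-energetic-like excess in 3 bins
print(mahalanobis(source) > threshold)         # anomaly flagged
```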
NASA Astrophysics Data System (ADS)
Salman, Ahmad; Lapidot, Itshak; Pomerantz, Ami; Tsror, Leah; Shufan, Elad; Moreh, Raymond; Mordechai, Shaul; Huleihel, Mahmoud
2012-01-01
The early diagnosis of phytopathogens is of great importance; it could prevent large economic losses due to crops damaged by fungal diseases, and prevent unnecessary soil fumigation or the use of fungicides and bactericides, and thus considerable environmental pollution. In this study, 18 isolates of three different fungal genera were investigated: six isolates of Colletotrichum coccodes, six isolates of Verticillium dahliae and six isolates of Fusarium oxysporum. Our main goal was to differentiate these fungal samples at the level of isolates, based on their infrared absorption spectra obtained using the Fourier transform infrared-attenuated total reflection (FTIR-ATR) sampling technique. Advanced statistical and mathematical methods, namely principal component analysis (PCA), linear discriminant analysis (LDA) and k-means clustering, were applied to the preprocessed spectra. Our results showed significant spectral differences between the various fungal genera examined. The use of k-means enabled classification between the genera with 94.5% accuracy, whereas the use of PCA [3 principal components (PCs)] and LDA achieved a 99.7% success rate. However, at the level of isolates, the best differentiation results were obtained using PCA (9 PCs) and LDA for the lower wavenumber region (800-1775 cm-1), with identification success rates of 87%, 85.5%, and 94.5% for Colletotrichum, Fusarium, and Verticillium strains, respectively.
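A generic PCA-then-LDA classifier of the kind described is easy to assemble with scikit-learn; the synthetic "spectra" below stand in for the FTIR-ATR data, and the 3-PC setting mirrors the genus-level analysis.

```python
# Minimal PCA -> LDA classification pipeline with cross-validated accuracy.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(5)
n_per, n_feat = 30, 500                        # 500 "wavenumbers" per spectrum
means = rng.normal(size=(3, n_feat))           # three simulated fungal genera
X = np.vstack([m + 0.5 * rng.normal(size=(n_per, n_feat)) for m in means])
y = np.repeat([0, 1, 2], n_per)

clf = make_pipeline(PCA(n_components=3), LinearDiscriminantAnalysis())
acc = cross_val_score(clf, X, y, cv=5).mean()  # cf. the 99.7% rate in the study
print(f"cross-validated accuracy: {acc:.3f}")
```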
Value assignment and uncertainty evaluation for single-element reference solutions
NASA Astrophysics Data System (ADS)
Possolo, Antonio; Bodnar, Olha; Butler, Therese A.; Molloy, John L.; Winchester, Michael R.
2018-06-01
A Bayesian statistical procedure is proposed for value assignment and uncertainty evaluation for the mass fraction of the elemental analytes in single-element solutions distributed as NIST standard reference materials. The principal novelty that we describe is the use of information about relative differences observed historically between the measured values obtained via gravimetry and via high-performance inductively coupled plasma optical emission spectrometry, to quantify the uncertainty component attributable to between-method differences. This information is encapsulated in a prior probability distribution for the between-method uncertainty component, and it is then used, together with the information provided by current measurement data, to produce a probability distribution for the value of the measurand from which an estimate and evaluation of uncertainty are extracted using established statistical procedures.
Identification of the isomers using principal component analysis (PCA) method
NASA Astrophysics Data System (ADS)
Kepceoǧlu, Abdullah; Gündoǧdu, Yasemin; Ledingham, Kenneth William David; Kilic, Hamdi Sukur
2016-03-01
In this work, we have carried out a detailed statistical analysis of experimental mass spectra of xylene isomers. Principal Component Analysis (PCA) was used to identify isomers which cannot be distinguished using conventional statistical methods for the interpretation of their mass spectra. Experiments were carried out using a linear TOF-MS coupled to a femtosecond laser system as an energy source for the ionisation processes. We performed experiments and collected data which were analysed and interpreted using PCA as a multivariate analysis of these spectra. This demonstrates the strength of the method in gaining insight into distinguishing isomers which cannot be identified using conventional mass analysis through dissociative ionisation processes on these molecules. The PCA results depending on the laser pulse energy and the background pressure in the spectrometer are presented in this work.
Nature of Driving Force for Protein Folding-- A Result From Analyzing the Statistical Potential
NASA Astrophysics Data System (ADS)
Li, Hao; Tang, Chao; Wingreen, Ned S.
1998-03-01
In a statistical approach to protein structure analysis, Miyazawa and Jernigan (MJ) derived a 20 × 20 matrix of inter-residue contact energies between different types of amino acids. Using the method of eigenvalue decomposition, we find that the MJ matrix can be accurately reconstructed from its first two principal component vectors as M_ij = C_0 + C_1(q_i + q_j) + C_2 q_i q_j, with constant C's, and 20 q values associated with the 20 amino acids. This regularity is due to hydrophobic interactions and a force of demixing, the latter obeying Hildebrand's solubility theory of simple liquids.
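The stated two-component form implies that the matrix has rank two, which a few lines of linear algebra can verify numerically; the q values below are random stand-ins for the amino-acid values fitted in the paper.

```python
# Numerical check: M_ij = C0 + C1 (q_i + q_j) + C2 q_i q_j is exactly rank 2,
# since every term lies in the span of the vectors 1 and q.
import numpy as np

rng = np.random.default_rng(6)
q = rng.normal(size=20)                        # one value per amino-acid type
C0, C1, C2 = -2.0, 1.5, 3.0
M = C0 + C1 * (q[:, None] + q[None, :]) + C2 * np.outer(q, q)

w, V = np.linalg.eigh(M)                       # symmetric eigendecomposition
idx = np.argsort(np.abs(w))[::-1][:2]          # two dominant eigenvalues
M2 = (V[:, idx] * w[idx]) @ V[:, idx].T        # rank-2 reconstruction
print(np.max(np.abs(M - M2)))                  # near machine precision
```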
Hirayama, Jun-ichiro; Hyvärinen, Aapo; Kiviniemi, Vesa; Kawanabe, Motoaki; Yamashita, Okito
2016-01-01
Characterizing the variability of resting-state functional brain connectivity across subjects and/or over time has recently attracted much attention. Principal component analysis (PCA) serves as a fundamental statistical technique for such analyses. However, performing PCA on high-dimensional connectivity matrices yields complicated “eigenconnectivity” patterns, for which systematic interpretation is a challenging issue. Here, we overcome this issue with a novel constrained PCA method for connectivity matrices by extending the idea of the previously proposed orthogonal connectivity factorization method. Our new method, modular connectivity factorization (MCF), explicitly introduces the modularity of brain networks as a parametric constraint on eigenconnectivity matrices. In particular, MCF analyzes the variability in both intra- and inter-module connectivities, simultaneously finding network modules in a principled, data-driven manner. The parametric constraint provides a compact module-based visualization scheme with which the result can be intuitively interpreted. We develop an optimization algorithm to solve the constrained PCA problem and validate our method in simulation studies and with a resting-state functional connectivity MRI dataset of 986 subjects. The results show that the proposed MCF method successfully reveals the underlying modular eigenconnectivity patterns in more general situations and is a promising alternative to existing methods. PMID:28002474
ERIC Educational Resources Information Center
Waswas, Dima; Gasaymeh, Al-Mothana M.
2017-01-01
This study aims at identifying the role played by school principals in the Governorate of Ma'an to strengthen intellectual security of the school students; and identifying whether there are statistically significant differences in the roles of principals attributed to the variables: gender, academic level, and years of experience in…
Shin, S M; Choi, Y-S; Yamaguchi, T; Maki, K; Cho, B-H; Park, S-B
2015-01-01
Objectives: To evaluate axial cervical vertebral (ACV) shape quantitatively and to build a prediction model for skeletal maturation level using statistical shape analysis for Japanese individuals. Methods: The sample included 24 female and 19 male patients with hand–wrist radiographs and CBCT images. Through generalized Procrustes analysis and principal components (PCs) analysis, the meaningful PCs were extracted from each ACV shape and analysed for the estimation regression model. Results: Each ACV shape had meaningful PCs, except for the second axial cervical vertebra. Based on these models, the smallest prediction intervals (PIs) were from the combination of the shape space PCs, age and gender. Overall, the PIs of the male group were smaller than those of the female group. There was no significant correlation between centroid size as a size factor and skeletal maturation level. Conclusions: Our findings suggest that the ACV maturation method, which was applied by statistical shape analysis, could confirm information about skeletal maturation in Japanese individuals as an available quantifier of skeletal maturation and could be as useful a quantitative method as the skeletal maturation index. PMID:25411713
Bostanov, Vladimir; Kotchoubey, Boris
2006-12-01
This study was aimed at developing a method for extraction and assessment of event-related brain potentials (ERP) from single-trials. This method should be applicable in the assessment of single persons' ERPs and should be able to handle both single ERP components and whole waveforms. We adopted a recently developed ERP feature extraction method, the t-CWT, for the purposes of hypothesis testing in the statistical assessment of ERPs. The t-CWT is based on the continuous wavelet transform (CWT) and Student's t-statistics. The method was tested in two ERP paradigms, oddball and semantic priming, by assessing individual-participant data on a single-trial basis, and testing the significance of selected ERP components, P300 and N400, as well as of whole ERP waveforms. The t-CWT was also compared to other univariate and multivariate ERP assessment methods: peak picking, area computation, discrete wavelet transform (DWT) and principal component analysis (PCA). The t-CWT produced better results than all of the other assessment methods it was compared with. The t-CWT can be used as a reliable and powerful method for ERP-component detection and testing of statistical hypotheses concerning both single ERP components and whole waveforms extracted from either single persons' or group data. The t-CWT is the first such method based explicitly on the criteria of maximal statistical difference between two average ERPs in the time-frequency domain and is particularly suitable for ERP assessment of individual data (e.g. in clinical settings), but also for the investigation of small and/or novel ERP effects from group data.
Principal component analysis of the cytokine and chemokine response to human traumatic brain injury.
Helmy, Adel; Antoniades, Chrystalina A; Guilfoyle, Mathew R; Carpenter, Keri L H; Hutchinson, Peter J
2012-01-01
There is a growing realisation that neuro-inflammation plays a fundamental role in the pathology of Traumatic Brain Injury (TBI). This has led to the search for biomarkers that reflect these underlying inflammatory processes using techniques such as cerebral microdialysis. The interpretation of such biomarker data has been limited by the statistical methods used. When analysing data of this sort, the multiple putative interactions between mediators need to be considered, as well as the timing of production and the high degree of statistical co-variance in the levels of these mediators. Here we present a cytokine and chemokine dataset from human brain following traumatic brain injury and use principal component analysis and partial least squares discriminant analysis to demonstrate the pattern of production following TBI, the distinct phases of the humoral inflammatory response, and the differing patterns of response in brain and in peripheral blood. This technique has the added advantage of making no assumptions about the Relative Recovery (RR) of microdialysis-derived parameters. Taken together, these techniques can be used in complex microdialysis datasets to summarise the data succinctly and to generate hypotheses for future study.
Tchabo, William; Ma, Yongkun; Kwaw, Emmanuel; Zhang, Haining; Xiao, Lulu; Apaliya, Maurice T
2018-01-15
The four different methods of color measurement of wine proposed by Boulton, Giusti, Glories and the Commission Internationale de l'Eclairage (CIE) were applied to assess the statistical relationship between the phytochemical profile and the chromatic characteristics of sulfur dioxide-free mulberry (Morus nigra) wine submitted to non-thermal maturation processes. The alterations in the chromatic properties and phenolic composition of non-thermally aged mulberry wine were examined, aided by the use of Pearson correlation, cluster and principal component analysis. The results revealed a positive effect of non-thermal processes on the phytochemical families of the wines. From Pearson correlation analysis, relationships between chromatic indexes and flavonols as well as anthocyanins were established. Cluster analysis highlighted similarities between the Boulton and Giusti parameters, as well as the Glories and CIE parameters, in the assessment of the chromatic properties of wines. Finally, principal component analysis was able to discriminate wines subjected to different maturation techniques on the basis of their chromatic and phenolic characteristics. Copyright © 2017. Published by Elsevier Ltd.
A study on the use of Gumbel approximation with the Bernoulli spatial scan statistic.
Read, S; Bath, P A; Willett, P; Maheswaran, R
2013-08-30
The Bernoulli version of the spatial scan statistic is a well-established method of detecting localised spatial clusters in binary labelled point data, a typical application being the epidemiological case-control study. A recent study suggests the inferential accuracy of several versions of the spatial scan statistic (principally the Poisson version) can be improved, at little computational cost, by using the Gumbel distribution, a method now available in SaTScan™ (www.satscan.org). We study in detail the effect of this technique when applied to the Bernoulli version and demonstrate that it is highly effective, albeit with some increase in false alarm rates at certain significance thresholds. We explain how this increase is due to the discrete nature of the Bernoulli spatial scan statistic and demonstrate that it can affect even small p-values. Despite this, we argue that the Gumbel method is actually preferable for very small p-values. Furthermore, we extend previous research by running benchmark trials on 12 000 synthetic datasets, thus demonstrating that the overall detection capability of the Bernoulli version (i.e. the ratio of power to false alarm rate) is not noticeably affected by the use of the Gumbel method. We also provide an example application of the Gumbel method using data on hospital admissions for chronic obstructive pulmonary disease. Copyright © 2013 John Wiley & Sons, Ltd.
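The core of the Gumbel approach, fitting an extreme-value distribution to Monte Carlo replicates of the maximum statistic, can be sketched as below; the toy windowed scan is a stand-in, not an implementation of the Bernoulli scan statistic.

```python
# Gumbel-smoothed p-values for a maximum statistic vs. the empirical rank p-value.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

def max_window_stat(x, width=10):
    """Toy scan: maximum standardized window sum over a 1-D binary sequence."""
    sums = np.convolve(x, np.ones(width), mode='valid')
    return (sums.max() - width * x.mean()) / np.sqrt(width * x.var())

null = np.array([max_window_stat(rng.binomial(1, 0.1, size=200).astype(float))
                 for _ in range(999)])
observed = max_window_stat(rng.binomial(1, 0.1, size=200).astype(float))

loc, scale = stats.gumbel_r.fit(null)
p_gumbel = stats.gumbel_r.sf(observed, loc, scale)        # smooth tail p-value
p_rank = (np.sum(null >= observed) + 1) / (len(null) + 1)  # granular rank p-value
print(p_gumbel, p_rank)   # the Gumbel fit can resolve p-values below 1/1000
```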
NASA Astrophysics Data System (ADS)
Lomakina, N. Ya.
2017-11-01
This work presents the results of an applied climatic division of the Siberian region into districts, based on an objective classification of atmospheric boundary layer climates by the "temperature-moisture-wind" complex, realized using the method of principal components and special similarity criteria for average profiles and the eigenvalues of correlation matrices. Over the territory of Siberia, 14 homogeneous regions were identified for the winter season and 10 for summer. Local statistical models were constructed for each region. These include vertical profiles of mean values, mean square deviations, and matrices of interlevel correlation of temperature, specific humidity, and zonal and meridional wind velocity. The advantage of the obtained local statistical models over the regional models is shown.
Kong, Jessica; Giridharagopal, Rajiv; Harrison, Jeffrey S; Ginger, David S
2018-05-31
Correlating nanoscale chemical specificity with operational physics is a long-standing goal of functional scanning probe microscopy (SPM). We employ a data analytic approach combining multiple microscopy modes, using compositional information in infrared vibrational excitation maps acquired via photoinduced force microscopy (PiFM) with electrical information from conductive atomic force microscopy. We study a model polymer blend comprising insulating poly(methyl methacrylate) (PMMA) and semiconducting poly(3-hexylthiophene) (P3HT). We show that PiFM spectra are different from FTIR spectra, but can still be used to identify local composition. We use principal component analysis to extract statistically significant principal components and principal component regression to predict local current and identify local polymer composition. In doing so, we observe evidence of semiconducting P3HT within PMMA aggregates. These methods are generalizable to correlated SPM data and provide a meaningful technique for extracting complex compositional information that are impossible to measure from any one technique.
A Generic multi-dimensional feature extraction method using multiobjective genetic programming.
Zhang, Yang; Rockett, Peter I
2009-01-01
In this paper, we present a generic feature extraction method for pattern classification using multiobjective genetic programming. This not only evolves the (near-)optimal set of mappings from a pattern space to a multi-dimensional decision space, but also simultaneously optimizes the dimensionality of that decision space. The presented framework evolves vector-to-vector feature extractors that maximize class separability. We demonstrate the efficacy of our approach by making statistically-founded comparisons with a wide variety of established classifier paradigms over a range of datasets and find that for most of the pairwise comparisons, our evolutionary method delivers statistically smaller misclassification errors. At very worst, our method displays no statistical difference in a few pairwise comparisons with established classifier/dataset combinations; crucially, none of the misclassification results produced by our method is worse than any comparator classifier. Although principally focused on feature extraction, feature selection is also performed as an implicit side effect; we show that both feature extraction and selection are important to the success of our technique. The presented method has the practical consequence of obviating the need to exhaustively evaluate a large family of conventional classifiers when faced with a new pattern recognition problem in order to attain a good classification accuracy.
A power analysis for multivariate tests of temporal trend in species composition.
Irvine, Kathryn M; Dinger, Eric C; Sarr, Daniel
2011-10-01
Long-term monitoring programs emphasize power analysis as a tool to determine the sampling effort necessary to effectively document ecologically significant changes in ecosystems. Programs that monitor entire multispecies assemblages require a method for determining the power of multivariate statistical models to detect trend. We provide a method to simulate presence-absence species assemblage data that are consistent with increasing or decreasing directional change in species composition within multiple sites. This step is the foundation for using Monte Carlo methods to approximate the power of any multivariate method for detecting temporal trends. We focus on comparing the power of the Mantel test, permutational multivariate analysis of variance, and constrained analysis of principal coordinates. We find that the power of the various methods we investigate is sensitive to the number of species in the community, univariate species patterns, and the number of sites sampled over time. For increasing directional change scenarios, constrained analysis of principal coordinates was as or more powerful than permutational multivariate analysis of variance, the Mantel test was the least powerful. However, in our investigation of decreasing directional change, the Mantel test was typically as or more powerful than the other models.
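The simulation-based power estimate can be outlined as follows; only a bare-bones Mantel test is shown, with a mismatch (Hamming-style) distance and parameter values chosen for illustration, while PERMANOVA or constrained ordination would slot into the same Monte Carlo loop.

```python
# Monte Carlo power for detecting a compositional trend with a simple Mantel test.
import numpy as np

rng = np.random.default_rng(8)

def mantel_p(D1, D2, n_perm=199):
    iu = np.triu_indices_from(D1, k=1)
    obs = np.corrcoef(D1[iu], D2[iu])[0, 1]
    count = sum(np.corrcoef(D1[p][:, p][iu], D2[iu])[0, 1] >= obs
                for p in (rng.permutation(len(D1)) for _ in range(n_perm)))
    return (count + 1) / (n_perm + 1)

def simulate(trend, n_years=10, n_species=30):
    t = np.arange(n_years)
    probs = 0.3 + trend * np.linspace(-1, 1, n_species)[None, :] * t[:, None] / n_years
    Y = rng.binomial(1, np.clip(probs, 0.01, 0.99))           # years x species (0/1)
    D = np.array([[np.mean(a != b) for b in Y] for a in Y])    # mismatch distance
    T = np.abs(t[:, None] - t[None, :]).astype(float)          # temporal distance
    return D, T

power = np.mean([mantel_p(*simulate(trend=0.4)) < 0.05 for _ in range(100)])
print(f"approximate power: {power:.2f}")
```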
NASA Astrophysics Data System (ADS)
Panahi, Nima S.
We studied the problem of understanding and computing the essential features and dynamics of molecular motions through the development of two theories for two different systems. First, we studied the Berry pseudorotation of PF5 and the rotations it induces in the molecule through its natural and intrinsic geometric nature, by setting it in the language of fiber bundles and graph theory. With these tools, we successfully extracted the essentials of the process' loops and induced rotations. The infinitely many pseudorotation loops were broken down into a small set of essential loops called "super loops", and their intrinsic properties and link to the physical movements of the molecule were extensively studied. In addition, only the three "self-edge loops" generated any induced rotations, and then only a finite number of classes of them. Second, we studied applying the statistical methods of Principal Components Analysis (PCA) and Principal Coordinate Analysis (PCO) to argon clusters: PCA to capture only the most important changes and so reduce computational costs, and PCO to graph the potential energy surface (PES) in three dimensions. Both methods proved successful, but PCA was only partially successful, since its advantages appear only for PES databases much larger than those currently being studied or computationally tractable in the next few decades.
ERIC Educational Resources Information Center
Perry, Teresa
2012-01-01
This study examined the perceptions of principals and teachers regarding mental health provider's impact on student achievement and behavior in high poverty schools using descriptive statistics, t-test, and two-way ANOVA. Respondents in this study shared similar views concerning principal and teacher satisfaction and levels of support for the…
Principal curvatures and area ratio of propagating surfaces in isotropic turbulence
NASA Astrophysics Data System (ADS)
Zheng, Tianhang; You, Jiaping; Yang, Yue
2017-10-01
We study the statistics of principal curvatures and the surface area ratio of propagating surfaces with a constant or nonconstant propagating velocity in isotropic turbulence using direct numerical simulation. Propagating surface elements initially constitute a plane to model a planar premixed flame front. When the statistics of evolving propagating surfaces reach the stationary stage, the statistical profiles of principal curvatures scaled by the Kolmogorov length scale versus the constant displacement speed scaled by the Kolmogorov velocity scale collapse at different Reynolds numbers. The magnitude of averaged principal curvatures and the number of surviving surface elements without cusp formation decrease with increasing displacement speed. In addition, the effect of surface stretch on the nonconstant displacement speed inhibits the cusp formation on surface elements at negative Markstein numbers. In order to characterize the wrinkling process of the global propagating surface, we develop a model to demonstrate that the increase of the surface area ratio is primarily due to positive Lagrangian time integrations of the area-weighted averaged tangential strain-rate term and propagation-curvature term. The difference between the negative averaged mean curvature and the positive area-weighted averaged mean curvature characterizes the cellular geometry of the global propagating surface.
The Global Oscillation Network Group site survey. 1: Data collection and analysis methods
NASA Technical Reports Server (NTRS)
Hill, Frank; Fischer, George; Grier, Jennifer; Leibacher, John W.; Jones, Harrison B.; Jones, Patricia P.; Kupke, Renate; Stebbins, Robin T.
1994-01-01
The Global Oscillation Network Group (GONG) Project is planning to place a set of instruments around the world to observe solar oscillations as continuously as possible for at least three years. The Project has now chosen the sites that will comprise the network. This paper describes the methods of data collection and analysis that were used to make this decision. Solar irradiance data were collected with a one-minute cadence at fifteen sites around the world and analyzed to produce statistics of cloud cover, atmospheric extinction, and transparency power spectra at the individual sites. Nearly 200 reasonable six-site networks were assembled from the individual stations, and a set of statistical measures of the performance of the networks was analyzed using a principal component analysis. An accompanying paper presents the results of the survey.
Monitoring of an antigen manufacturing process.
Zavatti, Vanessa; Budman, Hector; Legge, Raymond; Tamer, Melih
2016-06-01
Fluorescence spectroscopy in combination with multivariate statistical methods was employed as a tool for monitoring the manufacturing process of pertactin (PRN), one of the virulence factors of Bordetella pertussis utilized in whooping cough vaccines. Fluorophores such as amino acids and co-enzymes were detected throughout the process. The fluorescence data collected at different stages of the fermentation and purification process were treated employing principal component analysis (PCA). Through PCA, it was feasible to identify sources of variability in PRN production. Then, partial least squares (PLS) regression was employed to correlate the fluorescence spectra obtained from pure PRN samples with the final protein content of these samples measured by a Kjeldahl test. Given that a statistically significant correlation was found between fluorescence and PRN levels, this approach could be further used as a method to predict the final protein content.
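A compressed version of this PCA-plus-PLS workflow, on simulated spectra with an artificial protein signal, might look like the following; all names and settings are illustrative.

```python
# PCA to explore batch variability, then PLS to regress a reference assay on spectra.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(9)
n_samples, n_wavelengths = 40, 200
spectra = rng.normal(size=(n_samples, n_wavelengths))
protein = spectra[:, 50:60].mean(axis=1) + 0.05 * rng.normal(size=n_samples)

scores = PCA(n_components=2).fit_transform(spectra)    # map batch-to-batch variability
pred = cross_val_predict(PLSRegression(n_components=3), spectra, protein, cv=5).ravel()
r = np.corrcoef(pred, protein)[0, 1]
print(f"PLS cross-validated correlation with the reference assay: {r:.2f}")
```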
Computerized system for assessing heart rate variability.
Frigy, A; Incze, A; Brânzaniuc, E; Cotoi, S
1996-01-01
The principal theoretical, methodological and clinical aspects of heart rate variability (HRV) analysis are reviewed. This method has been developed over the last 10 years as a useful noninvasive way of measuring the activity of the autonomic nervous system. The main components and the functioning of the computerized rhythm-analyzer system developed by our team are presented. The system is able to perform short-term (maximum 20 minutes) time-domain HRV analysis and statistical analysis of the ventricular rate in any rhythm, particularly in atrial fibrillation. The performance of our system is demonstrated using the graphics (RR histograms, delta-RR histograms, RR scattergrams) and the statistical parameters resulting from the processing of three ECG recordings, obtained from a normal subject, from a patient with advanced heart failure, and from a patient with atrial fibrillation.
Emsley, Richard; Dunn, Graham; White, Ian R
2010-06-01
Complex intervention trials should be able to answer both pragmatic and explanatory questions in order to test the theories motivating the intervention and help understand the underlying nature of the clinical problem being tested. Key to this is the estimation of direct effects of treatment and indirect effects acting through intermediate variables which are measured post-randomisation. Using psychological treatment trials as an example of complex interventions, we review statistical methods which crucially evaluate both direct and indirect effects in the presence of hidden confounding between mediator and outcome. We review the historical literature on mediation and moderation of treatment effects. We introduce two methods from within the existing causal inference literature, principal stratification and structural mean models, and demonstrate how these can be applied in a mediation context before discussing approaches and assumptions necessary for attaining identifiability of key parameters of the basic causal model. Assuming that there is modification by baseline covariates of the effect of treatment (i.e. randomisation) on the mediator (i.e. covariate by treatment interactions), but no direct effect on the outcome of these treatment by covariate interactions leads to the use of instrumental variable methods. We describe how moderation can occur through post-randomisation variables, and extend the principal stratification approach to multiple group methods with explanatory models nested within the principal strata. We illustrate the new methodology with motivating examples of randomised trials from the mental health literature.
NASA Astrophysics Data System (ADS)
Nemoto, Mitsutaka; Nomura, Yukihiro; Hanaoka, Shohei; Masutani, Yoshitaka; Yoshikawa, Takeharu; Hayashi, Naoto; Yoshioka, Naoki; Ohtomo, Kuni
Anatomical point landmarks, as the most primitive form of anatomical knowledge, are useful for medical image understanding. In this study, we propose a detection method for anatomical point landmarks based on appearance models, which capture the gray-level statistical variations at the landmarks and in their surrounding areas. The models are built from the results of Principal Component Analysis (PCA) of sample data sets. In addition, we employed a generative learning method that transforms the ROIs of the sample data. We evaluated our method on 24 data sets of body trunk CT images and obtained an average sensitivity of 95.8 ± 7.3% over 28 landmarks.
Multivariate assessment of event-related potentials with the t-CWT method.
Bostanov, Vladimir
2015-11-05
Event-related brain potentials (ERPs) are usually assessed with univariate statistical tests although they are essentially multivariate objects. Brain-computer interface applications are a notable exception to this practice, because they are based on multivariate classification of single-trial ERPs. Multivariate ERP assessment can be facilitated by feature extraction methods. One such method is t-CWT, a mathematical-statistical algorithm based on the continuous wavelet transform (CWT) and Student's t-test. This article begins with a geometric primer on some basic concepts of multivariate statistics as applied to ERP assessment in general and to the t-CWT method in particular. Further, it presents for the first time a detailed, step-by-step, formal mathematical description of the t-CWT algorithm. A new multivariate outlier rejection procedure based on principal component analysis in the frequency domain is presented as an important pre-processing step. The MATLAB and GNU Octave implementation of t-CWT is also made publicly available for the first time as free and open source code. The method is demonstrated on some example ERP data obtained in a passive oddball paradigm. Finally, some conceptually novel applications of the multivariate approach in general and of the t-CWT method in particular are suggested and discussed. Hopefully, the publication of both the t-CWT source code and its underlying mathematical algorithm along with a didactic geometric introduction to some basic concepts of multivariate statistics would make t-CWT more accessible to both users and developers in the field of neuroscience research.
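A toy version of the t-CWT core, wavelet-transform each trial and then compute a t-surface between conditions and locate its extrema, is sketched below; the hand-rolled Ricker CWT and the simulated P300-like data are assumptions, and the published algorithm adds steps (e.g. the PCA-based outlier rejection) omitted here.

```python
# Toy t-CWT: per-trial CWT, then Student's t at every (scale, time) point.
import numpy as np
from scipy import stats

def ricker(n, a):
    """Unnormalized Ricker (Mexican-hat) wavelet on n points, width a."""
    t = np.arange(n) - (n - 1) / 2.0
    return (1 - (t / a) ** 2) * np.exp(-0.5 * (t / a) ** 2)

def trial_cwt(x, widths):
    """Plain CWT by convolution; one row of coefficients per scale."""
    return np.array([np.convolve(x, ricker(10 * w, w), mode='same') for w in widths])

rng = np.random.default_rng(10)
n_trials, n_time = 60, 256
widths = np.arange(2, 32, 4)

erp = np.exp(-0.5 * ((np.arange(n_time) - 128) / 12.0) ** 2)   # P300-like bump
A = np.array([trial_cwt(erp + rng.normal(size=n_time), widths) for _ in range(n_trials)])
B = np.array([trial_cwt(rng.normal(size=n_time), widths) for _ in range(n_trials)])

t_surface, _ = stats.ttest_ind(A, B, axis=0)        # t value at each (scale, time)
scale_i, time_i = np.unravel_index(np.argmax(np.abs(t_surface)), t_surface.shape)
print(f"largest |t| at width {widths[scale_i]}, sample {time_i}")
```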
Middleton, David A; Hughes, Eleri; Madine, Jillian
2004-08-11
We describe an NMR approach for detecting the interactions between phospholipid membranes and proteins, peptides, or small molecules. First, 1H-13C dipolar coupling profiles are obtained from hydrated lipid samples at natural isotope abundance using cross-polarization magic-angle spinning NMR methods. Principal component analysis of dipolar coupling profiles for synthetic lipid membranes in the presence of a range of biologically active additives reveals clusters that relate to different modes of interaction of the additives with the lipid bilayer. Finally, by representing profiles from multiple samples in the form of contour plots, it is possible to reveal statistically significant changes in dipolar couplings, which reflect perturbations in the lipid molecules at the membrane surface or within the hydrophobic interior.
NASA Astrophysics Data System (ADS)
Vítková, Gabriela; Prokeš, Lubomír; Novotný, Karel; Pořízka, Pavel; Novotný, Jan; Všianský, Dalibor; Čelko, Ladislav; Kaiser, Jozef
2014-11-01
From a historical perspective, during archeological excavations or restoration work on buildings or other structures built from bricks, it is important to determine, preferably in situ and in real time, the locality of the bricks' origin. Fast classification of bricks on the basis of Laser-Induced Breakdown Spectroscopy (LIBS) spectra is possible using multivariate statistical methods; a combination of principal component analysis (PCA) and linear discriminant analysis (LDA) was applied in this case. LIBS was used to classify 29 brick samples from 7 different localities. A comparative study using two different LIBS setups, stand-off and table-top, shows that stand-off LIBS has great potential for in-field archeological measurements.
Salvatore, Stefania; Røislien, Jo; Baz-Lomba, Jose A; Bramness, Jørgen G
2017-03-01
Wastewater-based epidemiology is an alternative method for estimating the collective drug use in a community. We applied functional data analysis, a statistical framework developed for analysing curve data, to investigate weekly temporal patterns in wastewater measurements of three prescription drugs with known abuse potential: methadone, oxazepam and methylphenidate, comparing them to positive and negative control drugs. Sewage samples were collected in February 2014 from a wastewater treatment plant in Oslo, Norway. The weekly pattern of each drug was extracted by fitting generalized additive models, using trigonometric functions to model the cyclic behaviour. From the weekly component, the main temporal features were then extracted using functional principal component analysis. Results are presented through the functional principal components (FPCs) and the corresponding FPC scores. Clinically, the most important weekly feature of the wastewater-based epidemiology data was the second FPC, representing the difference between the average midweek level and a peak during the weekend, indicating possible recreational use of a drug at the weekend. Estimated scores on this FPC indicated recreational use of methylphenidate, with a high weekend peak, but not of methadone or oxazepam. The functional principal component analysis uncovered clinically important temporal features of the weekly patterns of use of prescription drugs detected from wastewater analysis. This may be used as a post-marketing surveillance method to monitor prescription drugs with abuse potential. Copyright © 2016 John Wiley & Sons, Ltd.
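The harmonic-fit-then-FPCA pipeline can be mimicked on simulated loads as follows; with only three toy "drugs" the PCA step is schematic, but it shows how a weekend-peaking pattern separates on the leading component.

```python
# Harmonic (trigonometric) fit of each weekly cycle, then PCA across fitted curves
# as a discrete stand-in for functional PCA.
import numpy as np

rng = np.random.default_rng(11)
days = np.arange(28)                                   # four observed weeks

def harmonic_design(t, period=7.0):
    w = 2 * np.pi * t / period
    return np.column_stack([np.ones_like(t), np.sin(w), np.cos(w)])

curves = []
for weekend_peak in (0.0, 0.1, 1.5):                   # third drug: weekend pattern
    y = 1.0 + weekend_peak * (days % 7 >= 5) + 0.1 * rng.normal(size=days.size)
    beta, *_ = np.linalg.lstsq(harmonic_design(days.astype(float)), y, rcond=None)
    grid = np.linspace(0, 7, 50)
    curves.append(harmonic_design(grid) @ beta)        # fitted weekly curve

X = np.array(curves) - np.mean(curves, axis=0)
_, s, Vt = np.linalg.svd(X, full_matrices=False)
scores = X @ Vt[0]                                     # score on the leading component
print(scores)            # the weekend-peaking drug separates on this component
```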
Defining the ecological hydrology of Taiwan Rivers using multivariate statistical methods
NASA Astrophysics Data System (ADS)
Chang, Fi-John; Wu, Tzu-Ching; Tsai, Wen-Ping; Herricks, Edwin E.
2009-09-01
The identification and verification of ecohydrologic flow indicators has found new support as the importance of ecological flow regimes is recognized in modern water resources management, particularly in river restoration and reservoir management. An ecohydrologic indicator system reflecting the unique characteristics of Taiwan's water resources and hydrology has been developed: the Taiwan ecohydrological indicator system (TEIS). A major challenge for the water resources community is using the TEIS to provide environmental flow rules that improve existing water resources management. This paper examines data from the extensive network of flow monitoring stations in Taiwan, using TEIS statistics to define and refine environmental flow options. Multivariate statistical methods were used to examine TEIS statistics for 102 stations representing the geographic and land use diversity of Taiwan. The Pearson correlation coefficient showed high multicollinearity between the TEIS statistics. Watersheds were separated into upper- and lower-watershed locations. An analysis of variance indicated significant differences between upstream (more natural) and downstream (more developed) locations in the same basin, with hydrologic indicator redundancy in flow change and magnitude statistics. Issues of multicollinearity were examined using a Principal Component Analysis (PCA), with the first three components related to general flow and high/low flow statistics, frequency and time statistics, and quantity statistics; these principal components explain about 85% of the total variation. A major conclusion is that managers must be aware of differences among basins, as well as differences within basins, that will require careful selection of management procedures to achieve needed flow regimes.
ERIC Educational Resources Information Center
DiPonio, Joseph M.
2010-01-01
The primary objective of this study was to determine whether racial and/or gender bias was evidenced in the use of the ICIS-Principal; specifically, whether the use of the ICIS-Principal results in biased scores at a statistically significant level when rating current practicing administrators of varying gender and race. The study involved simulated…
ERIC Educational Resources Information Center
Sharief, Mostafa; Naderi, Mahin; Hiedari, Maryam Shoja; Roodbari, Omolbanin; Jalilvand, Mohammad Reza
2012-01-01
The aim of the current study is to determine the strengths and weaknesses of descriptive evaluation from the viewpoint of principals, teachers and experts of Chaharmahal and Bakhtiari province. A descriptive survey was performed. The statistical population includes 208 principals, 303 teachers, and 100 executive experts of the descriptive evaluation scheme in…
ERIC Educational Resources Information Center
Omidian, Faranak; Nedayeh Ali, Farzaneh
2015-01-01
The aim of this study was to investigate the attitudes of students, instructors, and educational principals to electronic administration of final-semester examinations at undergraduate and post- graduate levels in Payame Noor University in Khuzestan. The statistical population of this study consisted of all educational principals, instructors, of…
Statistics for laminar flamelet modeling
NASA Technical Reports Server (NTRS)
Cant, R. S.; Rutland, C. J.; Trouve, A.
1990-01-01
Statistical information required to support modeling of turbulent premixed combustion by laminar flamelet methods is extracted from a database of the results of Direct Numerical Simulation of turbulent flames. The simulations were carried out previously by Rutland (1989) using a pseudo-spectral code on a three dimensional mesh of 128 points in each direction. One-step Arrhenius chemistry was employed together with small heat release. A framework for the interpretation of the data is provided by the Bray-Moss-Libby model for the mean turbulent reaction rate. Probability density functions are obtained over surfaces of the constant reaction progress variable for the tangential strain rate and the principal curvature. New insights are gained which will greatly aid the development of modeling approaches.
Multivariate statistical approach to estimate mixing proportions for unknown end members
Valder, Joshua F.; Long, Andrew J.; Davis, Arden D.; Kenner, Scott J.
2012-01-01
A multivariate statistical method is presented, which includes principal components analysis (PCA) and an end-member mixing model to estimate unknown end-member hydrochemical compositions and the relative mixing proportions of those end members in mixed waters. PCA, together with the Hotelling T² statistic and a conceptual model of groundwater flow and mixing, was used in selecting samples that best approximate end members, which then were used as initial values in optimization of the end-member mixing model. This method was tested on controlled datasets (i.e., true values of estimates were known a priori) and found effective in estimating these end members and mixing proportions. The controlled datasets included synthetically generated hydrochemical data, synthetically generated mixing proportions, and laboratory analyses of sample mixtures, which were used in an evaluation of the effectiveness of this method for potential use in actual hydrological settings. For three different scenarios tested, correlation coefficients (R²) for linear regression between the estimated and known values ranged from 0.968 to 0.993 for mixing proportions and from 0.839 to 0.998 for end-member compositions. The method also was applied to field data from a study of end-member mixing in groundwater as a field example and partial method validation.
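A minimal mixing-model step, given candidate end-member compositions, is sketched below using non-negative least squares with an appended sum-to-one row; the analyte values are invented, and the PCA/Hotelling selection of end-member samples is not reproduced.

```python
# Recover mixing proportions from end-member compositions by constrained least squares.
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(12)
E = np.array([[120., 5., 0.3],            # end member 1: 3 hydrochemical analytes
              [ 10., 40., 2.0],           # end member 2
              [ 60., 15., 9.0]])          # end member 3
true_p = np.array([0.5, 0.3, 0.2])
sample = true_p @ E + 0.1 * rng.normal(size=3)

# Appending a row of ones softly enforces sum-to-one; in practice this row would
# be scaled up to weight the constraint more strongly.
A = np.vstack([E.T, np.ones(3)])          # analytes x end members, plus sum row
b = np.append(sample, 1.0)
p, _ = nnls(A, b)                         # non-negative proportions, summing ~1
print(p.round(3))                         # close to [0.5, 0.3, 0.2]
```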
Climatic change projections for winter streamflow in Guadalquivir river
NASA Astrophysics Data System (ADS)
Jesús Esteban Parra, María; Hidalgo Muñoz, José Manuel; García-Valdecasas-Ojeda, Matilde; Raquel Gámiz Fortis, Sonia; Castro Díez, Yolanda
2015-04-01
In this work we have obtained climate change projections for the winter streamflow of the Guadalquivir River in the period 2071-2100 using the Principal Component Regression (PCR) method. The streamflow database used was provided by the Center for Studies and Experimentation of Public Works (CEDEX). Series from gauging stations and reservoirs with less than 10% missing data (filled by regression with well-correlated neighboring stations) were considered. The homogeneity of these series was evaluated with the Pettitt test and the degree of human alteration by the Common Area Index. The application of these criteria led to the selection of 13 streamflow time series homogeneously distributed over the basin, covering the period 1952-2011. For these streamflow data, winter seasonal values were obtained by averaging the monthly values from January to March. The PCR method was applied using the Principal Components of the mean anomalies of sea level pressure (SLP) in winter (December to February averages) as predictors of streamflow for the development of a statistically downscaled model. The SLP database is the NCEP reanalysis covering the North Atlantic region, and the calibration and validation periods used for fitting and evaluating the ability of the model are 1952-1992 and 1993-2011, respectively. In general, using four Principal Components, the regression models are able to explain up to 70% of the variance of the streamflow data. Finally, the statistical model obtained for the observational data was applied to the SLP data for the period 2071-2100, using the outputs of different CMIP5 GCMs under the RCP8.5 scenario. The results found for the end of the century show no significant changes or a moderate decrease in the streamflow of this river for most GCMs in winter, but for some of them the decrease is very strong. Keywords: Statistical downscaling, streamflow, Guadalquivir River, climate change. ACKNOWLEDGEMENTS This work has been financed by the projects P11-RNM-7941 (Junta de Andalucía-Spain) and CGL2013-48539-R (MINECO-Spain, FEDER).
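Schematically, PCR downscaling reduces the SLP field to a few PCs and regresses streamflow on them, as in this sketch; the four-component choice and the linear model follow the abstract's outline, while the simulated fields are stand-ins for NCEP and GCM output.

```python
# Principal component regression (PCR) downscaling of streamflow from an SLP field.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(13)
n_years, n_gridpoints = 41, 600                 # e.g., 1952-1992 calibration winters
slp = rng.normal(size=(n_years, n_gridpoints))  # SLP anomaly field per winter
flow = 2.0 * slp[:, :50].mean(1) + 0.3 * rng.normal(size=n_years)

pca = PCA(n_components=4).fit(slp)              # four PCs, as in the abstract
pcs = pca.transform(slp)
model = LinearRegression().fit(pcs, flow)
print(f"explained variance (R^2): {model.score(pcs, flow):.2f}")

future_slp = rng.normal(size=(30, n_gridpoints))      # stand-in for GCM output
projected_flow = model.predict(pca.transform(future_slp))
```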
Modified neural networks for rapid recovery of tokamak plasma parameters for real time control
NASA Astrophysics Data System (ADS)
Sengupta, A.; Ranjan, P.
2002-07-01
Two modified neural network techniques are used for the identification of the equilibrium plasma parameters of the Superconducting Steady State Tokamak I from external magnetic measurements. This is expected to ultimately assist in a real time plasma control. As different from the conventional network structure where a single network with the optimum number of processing elements calculates the outputs, a multinetwork system connected in parallel does the calculations here in one of the methods. This network is called the double neural network. The accuracy of the recovered parameters is clearly more than the conventional network. The other type of neural network used here is based on the statistical function parametrization combined with a neural network. The principal component transformation removes linear dependences from the measurements and a dimensional reduction process reduces the dimensionality of the input space. This reduced and transformed input set, rather than the entire set, is fed into the neural network input. This is known as the principal component transformation-based neural network. The accuracy of the recovered parameters in the latter type of modified network is found to be a further improvement over the accuracy of the double neural network. This result differs from that obtained in an earlier work where the double neural network showed better performance. The conventional network and the function parametrization methods have also been used for comparison. The conventional network has been used for an optimization of the set of magnetic diagnostics. The effective set of sensors, as assessed by this network, are compared with the principal component based network. Fault tolerance of the neural networks has been tested. The double neural network showed the maximum resistance to faults in the diagnostics, while the principal component based network performed poorly. Finally the processing times of the methods have been compared. The double network and the principal component network involve the minimum computation time, although the conventional network also performs well enough to be used in real time.
Donato, Gianluca; Bartlett, Marian Stewart; Hager, Joseph C.; Ekman, Paul; Sejnowski, Terrence J.
2010-01-01
The Facial Action Coding System (FACS) [23] is an objective method for quantifying facial movement in terms of component actions. This system is widely used in behavioral investigations of emotion, cognitive processes, and social interaction. The coding is presently performed by highly trained human experts. This paper explores and compares techniques for automatically recognizing facial actions in sequences of images. These techniques include analysis of facial motion through estimation of optical flow; holistic spatial analysis, such as principal component analysis, independent component analysis, local feature analysis, and linear discriminant analysis; and methods based on the outputs of local filters, such as Gabor wavelet representations and local principal components. Performance of these systems is compared to naive and expert human subjects. Best performances were obtained using the Gabor wavelet representation and the independent component representation, both of which achieved 96 percent accuracy for classifying 12 facial actions of the upper and lower face. The results provide converging evidence for the importance of using local filters, high spatial frequencies, and statistical independence for classifying facial actions. PMID:21188284
Yu, Marcia M L; Sandercock, P Mark L
2012-01-01
During forensic examinations, textile fibers are usually mounted on glass slides for visual inspection and identification under the microscope. Raman microspectroscopy is one method capable of accurately identifying single textile fibers without subsequent demounting. The effect of the mountant Entellan New on the Raman spectra of fibers was investigated to determine whether it is suitable for fiber analysis. Raman spectra of synthetic fibers mounted in three different ways were collected and subjected to multivariate analysis. Principal component analysis score plots revealed that while spectra from different fiber classes formed distinct groups, fibers of the same class formed a single group regardless of the mounting method. The spectra of bare fibers and those mounted in Entellan New were found to be statistically indistinguishable by analysis of variance calculations. These results demonstrate that fibers mounted in Entellan New may be identified directly by Raman microspectroscopy without further sample preparation. © 2011 American Academy of Forensic Sciences.
Gouvinhas, Irene; Machado, Nelson; Carvalho, Teresa; de Almeida, José M M M; Barros, Ana I R N A
2015-01-01
Extra virgin olive oils produced from three cultivars at different maturation stages were characterized using Raman spectroscopy. Chemometric methods (principal component analysis, discriminant analysis, principal component regression, and partial least squares regression) applied to the Raman spectral data were used to evaluate and quantify the statistical differences between cultivars and their ripening process. The models for predicting the peroxide value and free acidity of olive oils showed good calibration and prediction values and presented high coefficients of determination (>0.933). Both the R(2) values and the correlation equations between the measured chemical parameters and the values predicted by each approach are presented; these comprise both PCR and PLS, applied to SNV-normalized Raman data as well as to the first and second derivatives of the spectra. This study demonstrates that combining Raman spectroscopy with multivariate analysis methods can be useful for rapidly predicting olive oil chemical characteristics during the maturation process. Copyright © 2014 Elsevier B.V. All rights reserved.
Spectral discrimination of serum from liver cancer and liver cirrhosis using Raman spectroscopy
NASA Astrophysics Data System (ADS)
Yang, Tianyue; Li, Xiaozhou; Yu, Ting; Sun, Ruomin; Li, Siqi
2011-07-01
In this paper, Raman spectra of human serum were measured and then analyzed by the multivariate statistical method of principal component analysis (PCA). Linear discriminant analysis (LDA) was then applied to the loading scores of the different disease groups as the diagnostic algorithm. An artificial neural network (ANN) was used for cross-validation. The diagnostic sensitivity and specificity of PCA-LDA are 88% and 79%, while those of PCA-ANN are 89% and 95%. These results show that modern analysis methods are useful tools in the analysis of serum spectra for disease diagnosis.
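A PCA-LDA pipeline of the kind used above can be sketched as follows, assuming synthetic stand-in spectra and binary labels; the channel and component counts are hypothetical, not those of the study.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# Hypothetical stand-in: 60 serum Raman spectra (1000 wavenumber channels)
# with binary disease labels.
rng = np.random.default_rng(2)
spectra = rng.standard_normal((60, 1000))
labels = rng.integers(0, 2, 60)

# PCA compresses each spectrum to a few loading scores; LDA separates classes.
clf = make_pipeline(PCA(n_components=5), LinearDiscriminantAnalysis())
print(cross_val_score(clf, spectra, labels, cv=5).mean())
```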
Across-cohort QC analyses of GWAS summary statistics from complex traits.
Chen, Guo-Bo; Lee, Sang Hong; Robinson, Matthew R; Trzaskowski, Maciej; Zhu, Zhi-Xiang; Winkler, Thomas W; Day, Felix R; Croteau-Chonka, Damien C; Wood, Andrew R; Locke, Adam E; Kutalik, Zoltán; Loos, Ruth J F; Frayling, Timothy M; Hirschhorn, Joel N; Yang, Jian; Wray, Naomi R; Visscher, Peter M
2016-01-01
Genome-wide association studies (GWASs) have been successful in discovering SNP trait associations for many quantitative traits and common diseases. Typically, the effect sizes of SNP alleles are very small and this requires large genome-wide association meta-analyses (GWAMAs) to maximize statistical power. A trend towards ever-larger GWAMA is likely to continue, yet dealing with summary statistics from hundreds of cohorts increases logistical and quality control problems, including unknown sample overlap, and these can lead to both false positive and false negative findings. In this study, we propose four metrics and visualization tools for GWAMA, using summary statistics from cohort-level GWASs. We propose methods to examine the concordance between demographic information and summary statistics, and methods to investigate sample overlap. (I) We use the population genetics Fst statistic to verify the genetic origin of each cohort and their geographic location, and demonstrate using GWAMA data from the GIANT Consortium that geographic locations of cohorts can be recovered and outlier cohorts can be detected. (II) We conduct principal component analysis based on reported allele frequencies, and are able to recover the ancestral information for each cohort. (III) We propose a new statistic that uses the reported allelic effect sizes and their standard errors to identify significant sample overlap or heterogeneity between pairs of cohorts. (IV) To quantify unknown sample overlap across all pairs of cohorts, we propose a method that uses randomly generated genetic predictors that does not require the sharing of individual-level genotype data and does not breach individual privacy.
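Step (II), the principal component analysis of reported allele frequencies, can be sketched as below on a hypothetical cohort-by-SNP frequency matrix; the dimensions are invented, and the real analysis involves additional QC steps not shown.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical stand-in: reported allele frequencies for 30 cohorts at
# 5000 SNPs (rows = cohorts, columns = SNPs).
rng = np.random.default_rng(3)
freqs = rng.uniform(0.05, 0.95, size=(30, 5000))

# Center each SNP and project cohorts onto the leading components; in the
# paper's setting the first PCs recover ancestry/geography of the cohorts.
pcs = PCA(n_components=2).fit_transform(freqs - freqs.mean(axis=0))
print(pcs[:5])  # cohort coordinates used to flag outliers
```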
Principal components of wrist circumduction from electromagnetic surgical tracking.
Rasquinha, Brian J; Rainbow, Michael J; Zec, Michelle L; Pichora, David R; Ellis, Randy E
2017-02-01
An electromagnetic (EM) surgical tracking system was used for a functionally calibrated kinematic analysis of wrist motion. Circumduction motions were tested for differences in subject gender and for differences in the sense of the circumduction as clockwise or counter-clockwise motion. Twenty subjects were instrumented for EM tracking. Flexion-extension motion was used to identify the functional axis. Subjects performed unconstrained wrist circumduction in a clockwise and counter-clockwise sense. Data were decomposed into orthogonal flexion-extension motions and radial-ulnar deviation motions. PCA was used to concisely represent motions. Nonparametric Wilcoxon tests were used to distinguish the groups. Flexion-extension motions were projected onto a direction axis with a root-mean-square error of [Formula: see text]. Using the first three principal components, there was no statistically significant difference in gender (all [Formula: see text]). For motion sense, radial-ulnar deviation distinguished the sense of circumduction in the first principal component ([Formula: see text]) and in the third principal component ([Formula: see text]); flexion-extension distinguished the sense in the second principal component ([Formula: see text]). The clockwise sense of circumduction could be distinguished by a multifactorial combination of components; there were no gender differences in this small population. These data constitute a baseline for normal wrist circumduction. The multifactorial PCA findings suggest that a higher-dimensional method, such as manifold analysis, may be a more concise way of representing circumduction in human joints.
Investigation of domain walls in PPLN by confocal raman microscopy and PCA analysis
NASA Astrophysics Data System (ADS)
Shur, Vladimir Ya.; Zelenovskiy, Pavel; Bourson, Patrice
2017-07-01
Confocal Raman microscopy (CRM) is a powerful tool for the investigation of ferroelectric domains. Mechanical stresses and electric fields existing in the vicinity of neutral and charged domain walls modify the frequency, intensity, and width of spectral lines [1], thus allowing visualization of micro- and nanodomain structures both at the surface and in the bulk of the crystal [2,3]. Stresses and fields are naturally coupled in ferroelectrics through the inverse piezoelectric effect and can hardly be separated in Raman spectra. PCA is a powerful statistical method for the analysis of large data matrices, providing a set of orthogonal variables called principal components (PCs). PCA is widely used for the classification of experimental data, for example in crystallization experiments, for the detection of small amounts of components in solid mixtures, etc. [4,5]. In Raman spectroscopy, PCA has been applied to the analysis of phase transitions and provided the critical pressure with good accuracy [6]. In the present work we applied the Principal Component Analysis (PCA) method for the first time to Raman spectra measured in periodically poled lithium niobate (PPLN). We found that the principal components demonstrate different sensitivities to the mechanical stresses and electric fields in the vicinity of the domain walls. This allowed us to separately visualize the spatial distributions of mechanical stresses and electric fields at the surface and in the bulk of the PPLN.
Vina, Andres; Peters, Albert J.; Ji, Lei
2003-01-01
There is a global concern about the increase in atmospheric concentrations of greenhouse gases. One method being discussed to encourage greenhouse gas mitigation efforts is based on a trading system whereby carbon emitters can buy effective mitigation efforts from farmers implementing conservation tillage practices. These practices sequester carbon from the atmosphere, and such a trading system would require a low-cost and accurate method of verification. Remote sensing technology can offer such a verification technique. This paper is focused on the use of standard image processing procedures applied to a multispectral Ikonos image, to determine whether it is possible to validate that farmers have complied with agreements to implement conservation tillage practices. A principal component analysis (PCA) was performed in order to isolate image variance in cropped fields. Analyses of variance (ANOVA) statistical procedures were used to evaluate the capability of each Ikonos band and each principal component to discriminate between conventional and conservation tillage practices. A logistic regression model was implemented on the principal component most effective in discriminating between conventional and conservation tillage, in order to produce a map of the probability of conventional tillage. The Ikonos imagery, in combination with ground-reference information, proved to be a useful tool for verification of conservation tillage practices.
Clark, Neil R.; Szymkiewicz, Maciej; Wang, Zichen; Monteiro, Caroline D.; Jones, Matthew R.; Ma’ayan, Avi
2016-01-01
Gene set analysis of differential expression, which identifies collectively differentially expressed gene sets, has become an important tool for biology. The power of this approach lies in its reduction of the dimensionality of the statistical problem and its incorporation of biological interpretation by construction. Many approaches to gene set analysis have been proposed, but benchmarking their performance in the setting of real biological data is difficult due to the lack of a gold standard. In a previously published work we proposed a geometrical approach to differential expression which performed highly in benchmarking tests and compared well to the most popular methods of differential gene expression. As reported, this approach has a natural extension to gene set analysis which we call Principal Angle Enrichment Analysis (PAEA). PAEA employs dimensionality reduction and a multivariate approach for gene set enrichment analysis. However, the performance of this method has not been assessed nor its implementation as a web-based tool. Here we describe new benchmarking protocols for gene set analysis methods and find that PAEA performs highly. The PAEA method is implemented as a user-friendly web-based tool, which contains 70 gene set libraries and is freely available to the community. PMID:26848405
Kalegowda, Yogesh; Harmer, Sarah L
2012-03-20
Time-of-flight secondary ion mass spectrometry (TOF-SIMS) spectra of mineral samples are complex, comprising large mass ranges and many peaks. Consequently, characterization and classification analysis of these systems is challenging. In this study, different chemometric and statistical data evaluation methods, based on monolayer-sensitive TOF-SIMS data, have been tested for the characterization and classification of copper-iron sulfide minerals (chalcopyrite, chalcocite, bornite, and pyrite) at different flotation pulp conditions (feed, conditioned feed, and Eh modified). The complex mass spectral data sets were analyzed using the following chemometric and statistical techniques: principal component analysis (PCA); principal component-discriminant functional analysis (PC-DFA); soft independent modeling of class analogy (SIMCA); and k-Nearest Neighbor (k-NN) classification. PCA was found to be an important first step in multivariate analysis, providing insight into both the relative grouping of samples and the elemental/molecular basis for those groupings. For samples exposed to oxidizing conditions (Eh ~430 mV), each technique (PCA, PC-DFA, SIMCA, and k-NN) was found to produce excellent classification. For samples under reducing conditions (Eh ~ -200 mV SHE), k-NN and SIMCA produced the most accurate classification. Phase identification of particles that contain the same elements but a different crystal structure in a mixed multimetal mineral system has been achieved.
Statistical process control of cocrystallization processes: A comparison between OPLS and PLS.
Silva, Ana F T; Sarraguça, Mafalda Cruz; Ribeiro, Paulo R; Santos, Adenilson O; De Beer, Thomas; Lopes, João Almeida
2017-03-30
Orthogonal partial least squares regression (OPLS) is being increasingly adopted as an alternative to partial least squares (PLS) regression due to the better generalization that can be achieved. Particularly in multivariate batch statistical process control (BSPC), the use of OPLS for estimating nominal trajectories is advantageous. In OPLS, the nominal process trajectories are expected to be captured in a single predictive principal component, while uncorrelated variations are filtered out to orthogonal principal components. In theory, OPLS will yield a better estimation of Hotelling's T² statistic and the corresponding control limits, thus lowering the number of false positives and false negatives when assessing process disturbances. Although the advantages of OPLS have been demonstrated in the context of regression, its use in BSPC has seldom been reported. This study proposes an OPLS-based approach for BSPC of a cocrystallization process between hydrochlorothiazide and p-aminobenzoic acid monitored on-line with near infrared spectroscopy, and compares its fault detection performance with the same approach based on PLS. A series of cocrystallization batches with imposed disturbances were used to test the ability of the OPLS- and PLS-based BSPC methods to detect abnormal situations. Results demonstrated that OPLS was generally superior in terms of sensitivity and specificity in most situations. In some abnormal batches, the imposed disturbances were detected only with OPLS. Copyright © 2017 Elsevier B.V. All rights reserved.
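The Hotelling's T² monitoring statistic at the heart of this comparison can be computed from the scores of a latent-variable model. The sketch below uses ordinary PCA on synthetic stand-in data rather than PLS or OPLS, purely to show the mechanics; in practice, control limits would come from a reference distribution such as an F-distribution.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical stand-in: spectra from nominal batches (rows) define the
# in-control model; new observations are scored against it.
rng = np.random.default_rng(4)
nominal = rng.standard_normal((100, 200))
new_obs = rng.standard_normal((5, 200))

pca = PCA(n_components=3).fit(nominal)
scores = pca.transform(new_obs)

# Hotelling's T^2: squared scores scaled by each component's variance.
t2 = (scores**2 / pca.explained_variance_).sum(axis=1)
print(t2)  # compare against a control limit to flag disturbances
```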
NASA Astrophysics Data System (ADS)
Zhang, Zhu; Li, Hongbin; Tang, Dengping; Hu, Chen; Jiao, Yang
2017-10-01
Metering performance is the key parameter of an electronic voltage transformer (EVT), and it requires high accuracy. The conventional off-line calibration method using a standard voltage transformer is not suitable for this key equipment in a smart substation, which needs on-line monitoring. In this article, we propose a method for on-line monitoring of the metering performance of an EVT based on cyber-physics correlation analysis. Exploiting the electrical and physical properties of a substation operating in three-phase symmetry, the principal component analysis method is used to separate the metering deviation caused by primary-side fluctuations from that caused by an EVT anomaly. The characteristic statistics of the measured data during operation are extracted, and the metering performance of the EVT is evaluated by analyzing changes in these statistics. The experimental results show that the method accurately monitors the metering deviation of a Class 0.2 EVT. The method demonstrates accurate on-line evaluation of the metering performance of an EVT without a standard voltage transformer.
Use of multivariate statistics to identify unreliable data obtained using CASA.
Martínez, Luis Becerril; Crispín, Rubén Huerta; Mendoza, Maximino Méndez; Gallegos, Oswaldo Hernández; Martínez, Andrés Aragón
2013-06-01
In order to identify unreliable data in a dataset of motility parameters obtained from a pilot study acquired by a veterinarian with experience in boar semen handling, but without experience in the operation of a computer-assisted sperm analysis (CASA) system, a multivariate graphical and statistical analysis was performed. Sixteen boar semen samples were aliquoted, incubated with varying concentrations of progesterone from 0 to 3.33 µg/ml, and analyzed in a CASA system. After standardization of the data, Chernoff faces were drawn for each measurement, and principal component analysis (PCA) was used to reduce the dimensionality and pre-process the data before hierarchical clustering. The first twelve individual measurements showed abnormal features when Chernoff faces were drawn. PCA revealed that principal components 1 and 2 explained 63.08% of the variance in the dataset. Values of the principal components for each individual measurement were mapped to identify differences among treatments or among boars. Twelve individual measurements presented low values of principal component 1. Confidence ellipses on the map of principal components showed no statistically significant effects of treatment or boar. Hierarchical clustering performed on the first two principal components produced three clusters. Cluster 1 contained evaluations of the first two samples in each treatment, each one from a different boar. With the exception of one individual measurement, all other measurements in cluster 1 were the same as observed in abnormal Chernoff faces. The unreliable data in cluster 1 are probably related to the operator's inexperience with a CASA system. These findings could be used to objectively evaluate the skill level of an operator of a CASA system. This may be particularly useful in the quality control of semen analysis using CASA systems.
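The PCA-plus-hierarchical-clustering step can be sketched as follows on synthetic stand-in motility data; the Ward linkage and three-cluster cut mirror the abstract's outcome, but all data and parameter choices here are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in: CASA motility parameters (rows = measurements).
rng = np.random.default_rng(5)
motility = rng.standard_normal((48, 10))

# Standardize, keep the first two PCs, then cluster in the PC plane,
# mirroring the workflow for flagging unreliable measurements.
pcs = PCA(n_components=2).fit_transform(
    StandardScaler().fit_transform(motility))
clusters = fcluster(linkage(pcs, method="ward"), t=3, criterion="maxclust")
print(clusters)
```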
Zhang, Hong-Guang; Yang, Qin-Min; Lu, Jian-Gang
2014-04-01
In this paper, a novel discriminant methodology based on near infrared spectroscopy and the least squares support vector machine was proposed for rapid and nondestructive discrimination of different types of polyacrylamide. The diffuse reflectance spectra of samples of non-ionic, anionic, and cationic polyacrylamide were measured. Principal component analysis was then applied to reduce the dimensionality of the spectral data and extract the principal components. The first three principal components were used for cluster analysis of the three different types of polyacrylamide. These principal components were also used as inputs to the least squares support vector machine model. The parameters and the number of principal components used as model inputs were optimized through cross-validation based on grid search. 60 samples of each type of polyacrylamide were collected, giving a total of 180 samples. 135 samples, 45 for each type, were randomly split into a training set to build the calibration model, and the remaining 45 samples were used as a test set to evaluate the performance of the developed model. In addition, 5 cationic and 5 anionic polyacrylamide samples adulterated with different proportions of non-ionic polyacrylamide were prepared to show the feasibility of the proposed method for discriminating adulterated polyacrylamide samples. The prediction error threshold for each type of polyacrylamide was determined by an F-test based on the cross-validation prediction error of the training set for the corresponding type. The discrimination accuracy of the built model was 100% on the test set. The predictions of the model for the 10 adulterated samples are also presented, and all were accurately discriminated as adulterated. The overall results demonstrate that the proposed method can rapidly and nondestructively discriminate the different types of polyacrylamide as well as adulterated samples, and offers a new approach to discriminating types of polyacrylamide.
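A rough sketch of the PCA plus grid-searched classifier workflow follows. Note that scikit-learn has no least squares SVM, so a standard SVM stands in for the paper's LS-SVM; the spectra, labels, and grid values are synthetic stand-ins.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Hypothetical stand-in: NIR diffuse-reflectance spectra of three
# polyacrylamide types (0 = non-ionic, 1 = anionic, 2 = cationic).
rng = np.random.default_rng(14)
spectra = rng.standard_normal((180, 700))
types = np.repeat([0, 1, 2], 60)

# Tune both the number of PCs and the classifier by cross-validated
# grid search, as in the abstract's optimization step.
pipe = Pipeline([("pca", PCA()), ("svm", SVC())])
grid = {"pca__n_components": [3, 5, 10], "svm__C": [1, 10, 100]}
search = GridSearchCV(pipe, grid, cv=5).fit(spectra, types)
print(search.best_params_, search.best_score_)
```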
Cole, Jacqueline M; Cheng, Xie; Payne, Michael C
2016-11-07
The use of principal component analysis (PCA) to statistically infer features of local structure from experimental pair distribution function (PDF) data is assessed on a case study of rare-earth phosphate glasses (REPGs). Such glasses, codoped with two rare-earth ions (R and R') of different sizes and optical properties, are of interest to the laser industry. The determination of structure-property relationships in these materials is an important aspect of their technological development. Yet, realizing the local structure of codoped REPGs presents significant challenges relative to their singly doped counterparts; specifically, R and R' are difficult to distinguish in terms of establishing relative material compositions, identifying atomic pairwise correlation profiles in a PDF that are associated with each ion, and resolving peak overlap of such profiles in PDFs. This study demonstrates that PCA can be employed to help overcome these structural complications, by statistically inferring trends in PDFs that exist for a restricted set of experimental data on REPGs, and using these as training data to predict material compositions and PDF profiles in unknown codoped REPGs. The application of these PCA methods to resolve individual atomic pairwise correlations in t(r) signatures is also presented. The training methods developed for these structural predictions are prevalidated by testing their ability to reproduce known physical phenomena, such as the lanthanide contraction, on PDF signatures of the structurally simpler singly doped REPGs. The intrinsic limitations of applying PCA to analyze PDFs relative to the quality control of source data, data processing, and sample definition, are also considered. While this case study is limited to lanthanide-doped REPGs, this type of statistical inference may easily be extended to other inorganic solid-state materials and be exploited in large-scale data-mining efforts that probe many t(r) functions.
Li, Jinling; He, Ming; Han, Wei; Gu, Yifan
2009-05-30
An investigation of heavy metal sources, i.e., Cu, Zn, Ni, Pb, Cr, and Cd, in the coastal soils of Shanghai, China, was conducted using multivariate statistical methods (principal component analysis, clustering analysis, and correlation analysis). All the results of the multivariate analysis showed that: (i) Cu, Ni, Pb, and Cd had anthropogenic sources (e.g., overuse of chemical fertilizers and pesticides, industrial and municipal discharges, animal wastes, sewage irrigation, etc.); (ii) Zn and Cr were associated with parent materials and therefore had natural sources (e.g., the weathering of parent materials and subsequent pedogenesis due to the alluvial deposits). The distribution of heavy metals in the soils was greatly affected by soil formation, atmospheric deposition, and human activities. These findings provide essential information on the possible sources of heavy metals, which will contribute to the monitoring and assessment of agricultural soils worldwide.
Rapid analysis of pharmaceutical drugs using LIBS coupled with multivariate analysis.
Tiwari, P K; Awasthi, S; Kumar, R; Anand, R K; Rai, P K; Rai, A K
2018-02-01
Type 2 diabetes drug tablets containing voglibose, with dose strengths of 0.2 and 0.3 mg and of various brands, have been examined using the laser-induced breakdown spectroscopy (LIBS) technique. Statistical methods such as principal component analysis (PCA) and partial least squares regression (PLSR) were employed on the LIBS spectral data to classify the drug samples and develop calibration models. We developed a ratio-based calibration model applying PLSR, in which the relative spectral intensity ratios H/C, H/N, and O/N are used. The developed model was then employed to predict the relative concentrations of elements in unknown drug samples. The experiment was performed in air and in an argon atmosphere, and the results obtained were compared. The present model provides a rapid spectroscopic method for drug analysis with high statistical significance for on-line control and measurement processes in a wide variety of pharmaceutical industrial applications.
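The PLSR calibration on intensity ratios can be sketched as below; the ratio values and regression coefficients are synthetic stand-ins, not the measured H/C, H/N, and O/N data.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

# Hypothetical stand-in: LIBS intensity ratios (H/C, H/N, O/N) for a set
# of tablets, with a known concentration as the response.
rng = np.random.default_rng(6)
ratios = rng.uniform(0.5, 2.0, size=(30, 3))
conc = ratios @ np.array([0.4, 0.3, 0.2]) + 0.02 * rng.standard_normal(30)

# Fit the calibration model, then predict "unknown" samples.
pls = PLSRegression(n_components=2).fit(ratios, conc)
print(pls.predict(ratios[:3]))
```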
Statistics as Tools in Library Planning: On the State and Institutional Level.
ERIC Educational Resources Information Center
Trezza, Alphonse F.
The principal uses of statistics in library planning may be illustrated by examples from the state of Illinois. State law specifies that the Illinois State Library compile and publish statistics on libraries. State agencies also play an important and expanding role in this effort. The state library now compiles statistics on all types of…
Karain, Wael I
2017-11-28
Proteins undergo conformational transitions over different time scales. These transitions are closely intertwined with the protein's function. Numerous standard techniques, such as principal component analysis, are used to detect these transitions in molecular dynamics simulations. In this work, we add a new method that detects transitions in dynamics based on the recurrences in the dynamical system. It combines bootstrapping and recurrence quantification analysis. We start from the assumption that a protein has a "baseline" recurrence structure over a given period of time. Any statistically significant deviation from this recurrence structure, as inferred from complexity measures provided by recurrence quantification analysis, is considered a transition in the dynamics of the protein. We apply this technique to a 132 ns long molecular dynamics simulation of the β-Lactamase Inhibitory Protein BLIP. We are able to detect conformational transitions in the nanosecond range in the recurrence dynamics of the BLIP protein during the simulation. The results compare favorably to those extracted using the principal component analysis technique. The recurrence quantification analysis-based bootstrap technique is able to detect transitions between different dynamical states of a protein over different time scales. It is not limited to linear dynamics regimes, and can be generalized to any time scale. It also has the potential to be used to cluster frames in molecular dynamics trajectories according to the nature of their recurrence dynamics. One shortcoming of this method is the need for time windows large enough to ensure good statistical quality of the recurrence complexity measures needed to detect the transitions.
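A minimal recurrence computation, the raw material for the complexity measures mentioned above, can be sketched as follows on a synthetic one-dimensional trajectory; the threshold and the recurrence-rate measure are illustrative only, and full recurrence quantification analysis involves further measures (determinism, laminarity) not computed here.

```python
import numpy as np

# Hypothetical stand-in: a scalar reaction coordinate from a trajectory.
rng = np.random.default_rng(13)
x = np.cumsum(rng.standard_normal(500))

# Recurrence matrix: two time points recur if their states lie within eps.
eps = 1.0
R = (np.abs(x[:, None] - x[None, :]) < eps).astype(int)
print(f"recurrence rate: {R.mean():.3f}")  # tracked per window to spot shifts
```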
Infrared micro-spectroscopic studies of epithelial cells
Romeo, Melissa; Mohlenhoff, Brian; Jennings, Michael; Diem, Max
2009-01-01
We report results from a study of human and canine mucosal cells, investigated by infrared micro-spectroscopy, and analyzed by methods of multivariate statistics. We demonstrate that the infrared spectra of individual cells are sensitive to the stage of maturation, and that a distinction between healthy and diseased cells will be possible. Since this report is written for an audience not familiar with infrared micro-spectroscopy, a short introduction into this field is presented along with a summary of principal component analysis. PMID:16797481
Improving the Principal Selection Process to Enhance the Opportunities for Women.
ERIC Educational Resources Information Center
Chapman, Judith
1986-01-01
Presents statistical profiles of Australian women principals and reviews research on school administrator selection in Australia, the United Kingdom, and the United States. To ensure equity, specific recommendations are given concerning vacancy announcements, criteria identification, consideration of evidence, and interviewing and decision-making…
Teacher Contract Non-Renewal: Midwest, Rocky Mountains, and Southeast
ERIC Educational Resources Information Center
Nixon, Andy; Dam, Margaret; Packard, Abbot L.
2012-01-01
This quantitative study investigated reasons that school principals recommend non-renewal of probationary teachers' contracts. Principal survey results from three regions of the US (Midwest, Rocky Mountains, & Southeast) were analyzed using the Kruskal-Wallis and Mann-Whitney U statistical procedures, while significance was tested applying a…
Machine learning of frustrated classical spin models. I. Principal component analysis
NASA Astrophysics Data System (ADS)
Wang, Ce; Zhai, Hui
2017-10-01
This work aims at determining whether artificial intelligence can recognize a phase transition without prior human knowledge. If this were successful, it could be applied to, for instance, analyzing data from the quantum simulation of unsolved physical models. Toward this goal, we first need to apply the machine learning algorithm to well-understood models and see whether the outputs are consistent with our prior knowledge, which serves as the benchmark for this approach. In this work, we feed the computer data generated by the classical Monte Carlo simulation for the XY model in frustrated triangular and union jack lattices, which has two order parameters and exhibits two phase transitions. We show that the outputs of the principal component analysis agree very well with our understanding of different orders in different phases, and the temperature dependences of the major components detect the nature and the locations of the phase transitions. Our work offers promise for using machine learning techniques to study sophisticated statistical models, and our results can be further improved by using principal component analysis with kernel tricks and the neural network method.
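A sketch of the PCA step on stand-in spin configurations is given below; representing each spin by its (cos θ, sin θ) components follows the general approach of such studies, but the lattice size, sample count, and uniformly random angles are hypothetical placeholders rather than actual Monte Carlo output.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical stand-in for Monte Carlo data: XY spin angles on an L x L
# lattice, one flattened configuration per row, across temperatures.
rng = np.random.default_rng(7)
L, n_samples = 12, 300
angles = rng.uniform(0, 2 * np.pi, size=(n_samples, L * L))

# Represent each spin by (cos theta, sin theta) so PCA can pick out
# order-parameter-like directions, then inspect the leading components.
features = np.hstack([np.cos(angles), np.sin(angles)])
pca = PCA(n_components=4).fit(features)
print(pca.explained_variance_ratio_)  # jumps signal candidate order parameters
```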
Conlon, Anna S C; Taylor, Jeremy M G; Elliott, Michael R
2014-04-01
In clinical trials, a surrogate outcome variable (S) can be measured before the outcome of interest (T) and may provide early information regarding the treatment (Z) effect on T. Using the principal surrogacy framework introduced by Frangakis and Rubin (2002. Principal stratification in causal inference. Biometrics 58, 21-29), we consider an approach that has a causal interpretation and develop a Bayesian estimation strategy for surrogate validation when the joint distribution of potential surrogate and outcome measures is multivariate normal. From the joint conditional distribution of the potential outcomes of T, given the potential outcomes of S, we propose surrogacy validation measures from this model. As the model is not fully identifiable from the data, we propose some reasonable prior distributions and assumptions that can be placed on weakly identified parameters to aid in estimation. We explore the relationship between our surrogacy measures and the surrogacy measures proposed by Prentice (1989. Surrogate endpoints in clinical trials: definition and operational criteria. Statistics in Medicine 8, 431-440). The method is applied to data from a macular degeneration study and an ovarian cancer study.
Zhang, Jian; Hou, Dibo; Wang, Ke; Huang, Pingjie; Zhang, Guangxin; Loáiciga, Hugo
2017-05-01
The detection of organic contaminants in water distribution systems is essential to protect public health from potentially harmful compounds resulting from accidental spills or intentional releases. Existing methods for detecting organic contaminants are based on quantitative analyses such as chemical testing and gas/liquid chromatography, which are time- and reagent-consuming and involve costly maintenance. This study proposes a novel procedure based on the discrete wavelet transform and principal component analysis for detecting organic contamination events from ultraviolet spectral data. Firstly, the spectrum of each observation is transformed using the discrete wavelet transform with a coiflet mother wavelet to capture abrupt changes along the wavelength axis. Principal component analysis is then employed to approximate the spectra based on the captured and fused features. Hotelling's T² statistic is calculated and its significance is used to detect outliers. An alarm of a contamination event is triggered by sequential Bayesian analysis when outliers appear continuously in several observations. The effectiveness of the proposed procedure is tested on-line using a pilot-scale setup and experimental data.
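The wavelet-plus-PCA stage can be sketched as follows using the PyWavelets package; the coiflet order, decomposition level, and spectra are assumptions, and the sequential Bayesian alarm stage is omitted.

```python
import numpy as np
import pywt
from sklearn.decomposition import PCA

# Hypothetical stand-in: UV absorbance spectra over time (rows = scans).
rng = np.random.default_rng(8)
spectra = rng.standard_normal((200, 256))

# Coiflet wavelet decomposition of each spectrum captures abrupt changes
# along the wavelength axis; concatenated coefficients feed the PCA.
coeffs = [np.concatenate(pywt.wavedec(s, "coif1", level=3)) for s in spectra]
X = np.vstack(coeffs)

pca = PCA(n_components=5).fit(X)
t2 = (pca.transform(X)**2 / pca.explained_variance_).sum(axis=1)
print(t2.max())  # large T^2 values mark candidate contamination events
```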
ERIC Educational Resources Information Center
Martin, Tammy Faith
2012-01-01
The purpose of this study was to examine principal leadership styles and their influence on school performance as measured by adequate yearly progress at selected Title I schools in South Carolina. The main focus of the research study was to complete descriptive statistics on principal leadership styles in schools that met or did not meet adequate…
Dihedral angle principal component analysis of molecular dynamics simulations.
Altis, Alexandros; Nguyen, Phuong H; Hegger, Rainer; Stock, Gerhard
2007-06-28
It has recently been suggested by Mu et al. [Proteins 58, 45 (2005)] to use backbone dihedral angles instead of Cartesian coordinates in a principal component analysis of molecular dynamics simulations. Dihedral angles may be advantageous because internal coordinates naturally provide a correct separation of internal and overall motion, which was found to be essential for the construction and interpretation of the free energy landscape of a biomolecule undergoing large structural rearrangements. To account for the circular statistics of angular variables, a transformation from the space of dihedral angles {φn} to the metric coordinate space {xn = cos φn, yn = sin φn} was employed. To study the validity and the applicability of the approach, in this work the theoretical foundations underlying the dihedral angle principal component analysis (dPCA) are discussed. It is shown that the dPCA amounts to a one-to-one representation of the original angle distribution and that its principal components can readily be characterized by the corresponding conformational changes of the peptide. Furthermore, a complex version of the dPCA is introduced, in which N angular variables naturally lead to N eigenvalues and eigenvectors. Applying the methodology to the construction of the free energy landscape of decaalanine from a 300 ns molecular dynamics simulation, a critical comparison of the various methods is given.
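The dPCA transformation itself is compact enough to sketch directly, as below on hypothetical stand-in dihedral data; real applications would extract the angles from an MD trajectory first.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical stand-in: backbone dihedral angles (radians) from an MD
# trajectory, shape (n_frames, n_angles).
rng = np.random.default_rng(9)
dihedrals = rng.uniform(-np.pi, np.pi, size=(1000, 18))

# dPCA: map each angle phi_n to (cos phi_n, sin phi_n) so the circular
# variables live in a metric space, then run ordinary PCA.
X = np.hstack([np.cos(dihedrals), np.sin(dihedrals)])
pca = PCA(n_components=5).fit(X)
print(pca.explained_variance_ratio_)
```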
Shin, S M; Kim, Y-I; Choi, Y-S; Yamaguchi, T; Maki, K; Cho, B-H; Park, S-B
2015-01-01
To evaluate axial cervical vertebral (ACV) shape quantitatively and to build a prediction model for skeletal maturation level using statistical shape analysis in Japanese individuals. The sample included 24 female and 19 male patients with hand-wrist radiographs and CBCT images. Through generalized Procrustes analysis and principal component (PC) analysis, the meaningful PCs were extracted from each ACV shape and analysed for the estimation regression model. Each ACV shape had meaningful PCs, except for the second axial cervical vertebra. Based on these models, the smallest prediction intervals (PIs) were obtained from the combination of the shape-space PCs, age, and gender. Overall, the PIs of the male group were smaller than those of the female group. There was no significant correlation between centroid size as a size factor and skeletal maturation level. Our findings suggest that the ACV maturation method, applied through statistical shape analysis, can provide information about skeletal maturation in Japanese individuals and could be as useful a quantitative method as the skeletal maturation index.
NASA Technical Reports Server (NTRS)
Aires, Filipe; Rossow, William B.; Chedin, Alain; Hansen, James E. (Technical Monitor)
2001-01-01
Independent Component Analysis (ICA) is a recently developed technique for component extraction. This new method requires statistical independence of the extracted components, a stronger constraint that uses higher-order statistics, instead of the classical decorrelation, a weaker constraint that uses only second-order statistics. This technique has recently been used for the analysis of geophysical time series with the goal of investigating the causes of variability in observed data (i.e., an exploratory approach). We demonstrate with a data simulation experiment that, if initialized with a Principal Component Analysis, the Independent Component Analysis performs a rotation of the classical PCA (or EOF) solution. Unlike other rotation techniques (RT), this rotation uses no localization criterion; only the global generalization of decorrelation by statistical independence is used. This rotation of the PCA solution appears able to overcome the tendency of PCA to mix several physical phenomena, even when the signal is just their linear sum.
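The PCA-then-ICA rotation discussed above can be sketched as follows on a synthetic two-source mixture; FastICA here stands in for whatever ICA algorithm the authors used, and the sources are invented.

```python
import numpy as np
from sklearn.decomposition import PCA, FastICA

# Hypothetical stand-in: a geophysical-style data matrix
# (rows = times, columns = grid points) built from two mixed sources.
rng = np.random.default_rng(10)
t = np.linspace(0, 10, 500)
sources = np.c_[np.sin(3 * t), np.sign(np.sin(5 * t))]
X = sources @ rng.standard_normal((2, 40))

# PCA gives the decorrelated (EOF) solution; ICA then rotates it so the
# recovered components are statistically independent, not just uncorrelated.
pcs = PCA(n_components=2).fit_transform(X)
ics = FastICA(n_components=2, random_state=0).fit_transform(pcs)
print(np.corrcoef(ics.T))  # near-diagonal: recovered components separate
```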
NASA Astrophysics Data System (ADS)
He, Shiyuan; Wang, Lifan; Huang, Jianhua Z.
2018-04-01
With growing data from ongoing and future supernova surveys, it is possible to empirically quantify the shapes of SNIa light curves in more detail, and to quantitatively relate the shape parameters with the intrinsic properties of SNIa. Building such relationships is critical in controlling systematic errors associated with supernova cosmology. Based on a collection of well-observed SNIa samples accumulated in the past years, we construct an empirical SNIa light curve model using a statistical method called the functional principal component analysis (FPCA) for sparse and irregularly sampled functional data. Using this method, the entire light curve of an SNIa is represented by a linear combination of principal component functions, and the SNIa is represented by a few numbers called “principal component scores.” These scores are used to establish relations between light curve shapes and physical quantities such as intrinsic color, interstellar dust reddening, spectral line strength, and spectral classes. These relations allow for descriptions of some critical physical quantities based purely on light curve shape parameters. Our study shows that some important spectral feature information is being encoded in the broad band light curves; for instance, we find that the light curve shapes are correlated with the velocity and velocity gradient of the Si II λ6355 line. This is important for supernova surveys (e.g., LSST and WFIRST). Moreover, the FPCA light curve model is used to construct the entire light curve shape, which in turn is used in a functional linear form to adjust intrinsic luminosity when fitting distance models.
Baseline estimation in flame's spectra by using neural networks and robust statistics
NASA Astrophysics Data System (ADS)
Garces, Hugo; Arias, Luis; Rojas, Alejandro
2014-09-01
This work presents a baseline estimation method for flame spectra based on an artificial intelligence structure, a neural network, combining robust statistics with multivariate analysis to automatically discriminate the measured wavelengths belonging to the continuous feature for model adaptation, thereby removing the restriction of having to measure the target baseline for training. The main contributions of this paper are: to analyze a flame spectra database by computing Jolliffe statistics from Principal Component Analysis, detecting wavelengths that are not correlated with most of the measured data and therefore correspond to the baseline; to systematically determine the optimal number of neurons in the hidden layers based on Akaike's Final Prediction Error; to estimate the baseline over the full wavelength range from sampled measured spectra; and to train a neural network that generalizes the relation between measured and baseline spectra. The main application of our research is to compute total radiation with baseline information, allowing diagnosis of the combustion process state for optimization in early stages.
The Trauma of Adolescent Suicide: A Time for Special Leadership by Principals.
ERIC Educational Resources Information Center
Dempsey, Richard A.
This monograph provides principals and school officials with information about coping with adolescent suicide. Section 1, "Introduction," discusses the uncomfortable nature of the topic, cites statistics, and recommends that preventive programs be developed. Section 2, "Causes of Suicide," analyzes stress and depression among youth and suggests…
50 CFR 300.183 - Permit holder reporting and recordkeeping requirements.
Code of Federal Regulations, 2010 CFR
2010-10-01
... person required to obtain a trade permit under § 300.182 retains, at his/her principal place of business... his/her principal place of business, a copy of each biweekly report and all supporting records for a... regulated under this subpart, biweekly reports, statistical documents, catch documents, re-export...
Connecting Principal Leadership, Teacher Collaboration, and Student Achievement
ERIC Educational Resources Information Center
Goddard, Yvonne L.; Miller, Robert; Larsen, Ross; Goddard, Roger; Madsen, Jean; Schroeder, Patricia
2010-01-01
The purpose of this paper was to test the relationship between principal leadership and teacher collaboration around instructional improvement to determine whether these measures were statistically related and whether, together, they were associated with academic achievement in elementary schools. Data were obtained from 1,600 teachers in 96…
Smith, Joseph M.; Mather, Martha E.
2012-01-01
Ecological indicators are science-based tools used to assess how human activities have impacted environmental resources. For monitoring and environmental assessment, existing species assemblage data can be used to make these comparisons through time or across sites. An impediment to using assemblage data, however, is that these data are complex and need to be simplified in an ecologically meaningful way. Because multivariate statistics are mathematical relationships, statistical groupings may not make ecological sense and will not have utility as indicators. Our goal was to define a process to select defensible and ecologically interpretable statistical simplifications of assemblage data in which researchers and managers can have confidence. For this, we chose a suite of statistical methods, compared the groupings that resulted from these analyses, identified convergence among groupings, and then interpreted the groupings using species and ecological guilds. When we tested this approach using a statewide stream fish dataset, not all statistical methods worked equally well. For our dataset, logistic regression (Log), detrended correspondence analysis (DCA), cluster analysis (CL), and non-metric multidimensional scaling (NMDS) provided consistent, simplified output. Specifically, the Log, DCA, CL-1, and NMDS-1 groupings were ≥60% similar to each other, overlapped with the fluvial-specialist ecological guild, and contained a common subset of species. Groupings based on number of species (e.g., Log, DCA, CL and NMDS) outperformed groupings based on abundance [e.g., principal components analysis (PCA) and Poisson regression]. Although the specific methods that worked on our test dataset have generality, here we are advocating a process (e.g., identifying convergent groupings with redundant species composition that are ecologically interpretable) rather than the automatic use of any single statistical tool. We summarize this process in step-by-step guidance for the future use of these commonly available ecological and statistical methods in preparing assemblage data for use in ecological indicators.
NASA Astrophysics Data System (ADS)
Chlebda, Damian K.; Majda, Alicja; Łojewski, Tomasz; Łojewska, Joanna
2016-11-01
Differentiation of written text can be performed with a non-invasive and non-contact tool that combines conventional imaging methods with spectroscopy. Hyperspectral imaging (HSI) is a relatively new and rapid analytical technique that can be applied in the forensic science disciplines. It allows an image of the sample to be acquired with full spectral information within every pixel. For this paper, HSI and three statistical methods (hierarchical cluster analysis, principal component analysis, and spectral angle mapper) were used to distinguish between traces of modern black gel pen inks. Non-invasiveness and high efficiency are among the unquestionable advantages of ink differentiation using HSI. It is also less time-consuming than traditional methods such as chromatography. In this study, a set of 45 modern gel pen ink marks deposited on a paper sheet was recorded. The spectral characteristics embodied in every pixel were extracted from the image and analysed using the statistical methods, both externally and directly on the hypercube. As a result, different black gel inks deposited on paper can be distinguished and classified into several groups in a non-invasive manner.
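Of the three statistical methods named, the spectral angle mapper is simple enough to sketch directly; the function below and its stand-in spectra are hypothetical, with classification done by thresholding the angle against reference ink spectra.

```python
import numpy as np

def spectral_angle(a: np.ndarray, b: np.ndarray) -> float:
    """Angle (radians) between two spectra; a small angle means similar ink."""
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

# Hypothetical stand-in: a reference spectrum of one ink and a pixel
# spectrum from the hypercube; thresholding assigns the pixel to a group.
rng = np.random.default_rng(11)
reference = rng.random(120)
pixel = reference + 0.05 * rng.standard_normal(120)
print(spectral_angle(reference, pixel))
```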
Rollins, Derrick K; Teh, Ailing
2010-12-17
Microarray data sets provide relative expression levels for thousands of genes for a small number, in comparison, of different experimental conditions called assays. Data mining techniques are used to extract specific information about genes as they relate to the assays. The multivariate statistical technique of principal component analysis (PCA) has proven useful in providing effective data mining methods. This article extends the PCA approach of Rollins et al. to ranking the genes of microarray data sets that are most differentially expressed between two biologically different groupings of assays. The method is evaluated on real and simulated data and compared to a current approach on the basis of false discovery rate (FDR) and statistical power (SP), which is the ability to correctly identify important genes. This work developed and evaluated two new test statistics based on PCA and compared them to a popular method that is not PCA based. Both test statistics were found to be effective as evaluated in three case studies: (i) exposing E. coli cells to two different ethanol levels; (ii) application of myostatin to two groups of mice; and (iii) a simulated data study derived from the properties of (ii). The proposed method (PM) effectively identified critical genes in these studies based on comparison with the current method (CM). The simulation study supports higher identification accuracy for PM over CM for both proposed test statistics when the gene variance is constant, and for one of the test statistics when the gene variance is non-constant. PM compares quite favorably to CM in terms of lower FDR and much higher SP. Thus, PM can be quite effective in producing accurate signatures from large microarray data sets for differential expression between assay groups identified in a preliminary step of the PCA procedure and is, therefore, recommended for use in these applications.
Chapman, Benjamin P.; Weiss, Alexander; Duberstein, Paul
2016-01-01
Statistical learning theory (SLT) is the statistical formulation of machine learning theory, a body of analytic methods common in "big data" problems. Regression-based SLT algorithms seek to maximize predictive accuracy for some outcome, given a large pool of potential predictors, without overfitting the sample. Research goals in psychology may sometimes call for high-dimensional regression. One example is criterion-keyed scale construction, where a scale with maximal predictive validity must be built from a large item pool. Using this as a working example, we first introduce a core principle of SLT methods: minimization of expected prediction error (EPE). Minimizing EPE is fundamentally different from maximizing the within-sample likelihood, and hinges on building a predictive model of sufficient complexity to predict the outcome well, without undue complexity leading to overfitting. We describe how such models are built and refined via cross-validation. We then illustrate how three common SLT algorithms (Supervised Principal Components, Regularization, and Boosting) can be used to construct a criterion-keyed scale predicting all-cause mortality, using a large personality item pool within a population cohort. Each algorithm illustrates a different approach to minimizing EPE. Finally, we consider broader applications of SLT predictive algorithms, both as supportive analytic tools for conventional methods and as primary analytic tools in discovery-phase research. We conclude that, despite their differences from the classic null-hypothesis testing approach, or perhaps because of them, SLT methods may hold value as a statistically rigorous approach to exploratory regression. PMID:27454257
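As a concrete instance of EPE-minimizing regularization (one of the three algorithm families named above), the sketch below selects a sparse criterion-keyed scale with cross-validated lasso; the item pool and criterion are synthetic stand-ins.

```python
import numpy as np
from sklearn.linear_model import LassoCV

# Hypothetical stand-in: a large personality item pool (columns) predicting
# a continuous criterion; only a few items carry true signal.
rng = np.random.default_rng(12)
items = rng.standard_normal((300, 150))
criterion = items[:, :10].sum(axis=1) + rng.standard_normal(300)

# The regularization strength is chosen by cross-validation, so model
# complexity is tuned to minimize estimated expected prediction error
# rather than in-sample fit.
scale = LassoCV(cv=5).fit(items, criterion)
print((scale.coef_ != 0).sum(), "items retained for the criterion-keyed scale")
```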
Damage localization by statistical evaluation of signal-processed mode shapes
NASA Astrophysics Data System (ADS)
Ulriksen, M. D.; Damkilde, L.
2015-07-01
Due to their inherent ability to provide structural information on a local level, mode shapes and their derivatives are utilized extensively for structural damage identification. Typically, more or less advanced mathematical methods are implemented to identify damage-induced discontinuities in the spatial mode shape signals, hereby potentially facilitating damage detection and/or localization. However, by being based on distinguishing damage-induced discontinuities from other signal irregularities, an intrinsic deficiency in these methods is the high sensitivity towards measurement noise. The present article introduces a damage localization method which, compared to the conventional mode shape-based methods, has greatly enhanced robustness towards measurement noise. The method is based on signal processing of spatial mode shapes by means of continuous wavelet transformation (CWT) and subsequent application of a generalized discrete Teager-Kaiser energy operator (GDTKEO) to identify damage-induced mode shape discontinuities. In order to evaluate whether the identified discontinuities are in fact damage-induced, outlier analysis of principal components of the signal-processed mode shapes is conducted on the basis of T2-statistics. The proposed method is demonstrated in the context of analytical work with a free-vibrating Euler-Bernoulli beam under noisy conditions.
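A minimal sketch of the processing chain (CWT, a discrete Teager-Kaiser energy operator, then a T2 statistic on principal components), assuming the PyWavelets package and an illustrative beam, damage model, and noise level:

```python
# Illustrative sketch: CWT of a noisy mode shape, discrete Teager-Kaiser
# energy operator, and a Hotelling-style T2 statistic to flag the damage zone.
import numpy as np
import pywt

rng = np.random.default_rng(2)
n = 200
x = np.linspace(0.0, 1.0, n)
mode = np.sin(np.pi * x)                        # first bending mode of the beam
mode[x > 0.6] += 0.002 * (x[x > 0.6] - 0.6)     # small damage-induced slope change
mode += rng.normal(scale=1e-4, size=n)          # measurement noise

scales = np.arange(2, 12)
coef, _ = pywt.cwt(mode, scales, "mexh")        # shape: (n_scales, n_points)

def tkeo(sig):
    """Discrete Teager-Kaiser operator: psi[k] = s[k]^2 - s[k-1]*s[k+1]."""
    out = np.zeros_like(sig)
    out[1:-1] = sig[1:-1] ** 2 - sig[:-2] * sig[2:]
    return out

energy = np.apply_along_axis(tkeo, 1, coef)     # TKEO along the spatial axis

# Outlier analysis: spatial points are observations, scales are features.
feats = energy.T - energy.T.mean(axis=0)
vals, vecs = np.linalg.eigh(np.cov(feats, rowvar=False))
pcs = feats @ vecs[:, -3:]                      # three leading components
t2 = np.sum(pcs ** 2 / vals[-3:], axis=1)       # T2 per spatial point
print("suspected damage location: x =", x[5:-5][np.argmax(t2[5:-5])])
```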
Heath, Anna; Manolopoulou, Ioanna; Baio, Gianluca
2016-10-15
The Expected Value of Perfect Partial Information (EVPPI) is a decision-theoretic measure of the 'cost' of parametric uncertainty in decision making used principally in health economic decision making. Despite this decision-theoretic grounding, the uptake of EVPPI calculations in practice has been slow. This is in part due to the prohibitive computational time required to estimate the EVPPI via Monte Carlo simulations. However, recent developments have demonstrated that the EVPPI can be estimated by non-parametric regression methods, which have significantly decreased the computation time required to approximate the EVPPI. Under certain circumstances, high-dimensional Gaussian Process (GP) regression is suggested, but this can still be prohibitively expensive. Applying fast computation methods developed in spatial statistics using Integrated Nested Laplace Approximations (INLA) and projecting from a high-dimensional into a low-dimensional input space allows us to decrease the computation time for fitting these high-dimensional GP, often substantially. We demonstrate that the EVPPI calculated using our method for GP regression is in line with the standard GP regression method and that despite the apparent methodological complexity of this new method, R functions are available in the package BCEA to implement it simply and efficiently. © 2016 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.
Gabriel, Erin E.; Gilbert, Peter B.
2014-01-01
Principal surrogate (PS) endpoints are relatively inexpensive and easy to measure study outcomes that can be used to reliably predict treatment effects on clinical endpoints of interest. Few statistical methods for assessing the validity of potential PSs utilize time-to-event clinical endpoint information, and to our knowledge none allow for the characterization of time-varying treatment effects. We introduce the time-dependent and surrogate-dependent treatment efficacy curve, ${\mathrm{TE}}(t|s)$, and a new augmented trial design for assessing the quality of a biomarker as a PS. We propose a novel Weibull model and an estimated maximum likelihood method for estimation of the ${\mathrm{TE}}(t|s)$ curve. We describe the operating characteristics of our methods via simulations. We analyze data from the Diabetes Control and Complications Trial, in which we find evidence of a biomarker with value as a PS. PMID:24337534
NASA Technical Reports Server (NTRS)
Rampe, E. B.; Lanza, N. L.
2012-01-01
Orbital near-infrared (NIR) reflectance spectra of the martian surface from the OMEGA and CRISM instruments have identified a variety of phyllosilicates in Noachian terrains. The types of phyllosilicates present on Mars have important implications for the aqueous environments in which they formed, and, thus, for recognizing locales that may have been habitable. Current identifications of phyllosilicates from martian NIR data are based on the positions of spectral absorptions relative to laboratory data of well-characterized samples and from spectral ratios; however, some phyllosilicates can be difficult to distinguish from one another with these methods (i.e. illite vs. muscovite). Here we employ a multivariate statistical technique, principal component analysis (PCA), to differentiate between spectrally similar phyllosilicate minerals. PCA is commonly used in a variety of industries (pharmaceutical, agricultural, viticultural) to discriminate between samples. Previous work using PCA to analyze raw NIR reflectance data from mineral mixtures has shown that this is a viable technique for identifying mineral types, abundances, and particle sizes. Here, we evaluate PCA of second-derivative NIR reflectance data as a method for classifying phyllosilicates and test whether this method can be used to identify phyllosilicates on Mars.
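A minimal sketch of the second-derivative-plus-PCA idea on synthetic reflectance spectra (Savitzky-Golay differentiation is an assumed choice; band positions and noise are illustrative):

```python
# Illustrative sketch: second derivatives sharpen band positions before PCA.
# Synthetic Gaussian bands stand in for phyllosilicate absorptions.
import numpy as np
from scipy.signal import savgol_filter
from sklearn.decomposition import PCA

rng = np.random.default_rng(3)
wav = np.linspace(1.0, 2.5, 300)                # wavelength axis, micrometers

def spectrum(center):
    band = np.exp(-0.5 * ((wav - center) / 0.02) ** 2)
    return 1.0 - 0.2 * band + rng.normal(scale=0.002, size=wav.size)

# Two spectrally similar "minerals" with slightly shifted absorption bands.
spectra = np.array([spectrum(2.20) for _ in range(20)] +
                   [spectrum(2.21) for _ in range(20)])

# Second derivative suppresses continuum slope and emphasizes band position.
d2 = savgol_filter(spectra, window_length=15, polyorder=3, deriv=2, axis=1)

scores = PCA(n_components=2).fit_transform(d2)
print("PC1 group separation:", scores[:20, 0].mean() - scores[20:, 0].mean())
```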
Quantification of intensity variations in functional MR images using rotated principal components
NASA Astrophysics Data System (ADS)
Backfrieder, W.; Baumgartner, R.; Sámal, M.; Moser, E.; Bergmann, H.
1996-08-01
In functional MRI (fMRI), the changes in cerebral haemodynamics related to stimulated neural brain activity are measured using standard clinical MR equipment. Small intensity variations in fMRI data have to be detected and distinguished from non-neural effects by careful image analysis. Based on multivariate statistics we describe an algorithm involving oblique rotation of the most significant principal components for an estimation of the temporal and spatial distribution of the stimulated neural activity over the whole image matrix. This algorithm takes advantage of strong local signal variations. A mathematical phantom was designed to generate simulated data for the evaluation of the method. In simulation experiments, the potential of the method to quantify small intensity changes, especially when processing data sets containing multiple sources of signal variations, was demonstrated. In vivo fMRI data collected in both visual and motor stimulation experiments were analysed, showing a proper location of the activated cortical regions within well known neural centres and an accurate extraction of the activation time profile. The suggested method yields accurate absolute quantification of in vivo brain activity without the need of extensive prior knowledge and user interaction.
A Novel Acoustic Sensor Approach to Classify Seeds Based on Sound Absorption Spectra
Gasso-Tortajada, Vicent; Ward, Alastair J.; Mansur, Hasib; Brøchner, Torben; Sørensen, Claus G.; Green, Ole
2010-01-01
A non-destructive and novel in situ acoustic sensor approach based on the sound absorption spectra was developed for identifying and classifying different seed types. The absorption coefficient spectra were determined by using the impedance tube measurement method. Subsequently, a multivariate statistical analysis, i.e., principal component analysis (PCA), was performed as a way to generate a classification of the seeds based on the soft independent modelling of class analogy (SIMCA) method. The results show that the sound absorption coefficient spectra of different seed types present characteristic patterns which are highly dependent on seed size and shape. In general, seed particle size and sphericity were inversely related with the absorption coefficient. PCA presented reliable grouping capabilities within the diverse seed types, since the 95% of the total spectral variance was described by the first two principal components. Furthermore, the SIMCA classification model based on the absorption spectra achieved optimal results as 100% of the evaluation samples were correctly classified. This study contains the initial structuring of an innovative method that will present new possibilities in agriculture and industry for classifying and determining physical properties of seeds and other materials. PMID:22163455
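A minimal SIMCA-style sketch (one PCA model per class, classification by orthogonal residual) on simulated spectra; the class spectra, component counts, and residual rule are illustrative assumptions:

```python
# Illustrative sketch: fit one PCA model per seed class; assign a test
# spectrum to the class with the smallest orthogonal residual.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(4)
n_freqs = 60

def make_class(level):
    base = level * np.linspace(0.2, 0.8, n_freqs)   # class-specific spectrum
    return base + rng.normal(scale=0.01, size=(30, n_freqs))

classes = {"rapeseed": make_class(0.8), "barley": make_class(1.0)}

models = {}
for name, spectra in classes.items():
    mu = spectra.mean(axis=0)
    models[name] = (mu, PCA(n_components=2).fit(spectra - mu))

def residual(spec, mu, pca):
    t = pca.transform((spec - mu)[None, :])
    recon = pca.inverse_transform(t)[0] + mu
    return np.sum((spec - recon) ** 2)              # orthogonal (Q) residual

test = classes["barley"][0] + rng.normal(scale=0.01, size=n_freqs)
print(min(models, key=lambda name: residual(test, *models[name])))  # "barley"
```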
Conditional Random Fields for Fast, Large-Scale Genome-Wide Association Studies
Huang, Jim C.; Meek, Christopher; Kadie, Carl; Heckerman, David
2011-01-01
Understanding the role of genetic variation in human diseases remains an important problem to be solved in genomics. An important component of such variation consists of variations at single sites in DNA, or single nucleotide polymorphisms (SNPs). Typically, the problem of associating particular SNPs to phenotypes has been confounded by hidden factors such as the presence of population structure, family structure or cryptic relatedness in the sample of individuals being analyzed. Such confounding factors lead to a large number of spurious associations and missed associations. Various statistical methods have been proposed to account for such confounding factors, such as linear mixed-effect models (LMMs) or methods that adjust data based on a principal components analysis (PCA), but these methods either suffer from low power or cease to be tractable for larger numbers of individuals in the sample. Here we present a statistical model for conducting genome-wide association studies (GWAS) that accounts for such confounding factors. Our method's runtime scales quadratically in the number of individuals being studied, with only a modest loss in statistical power as compared to LMM-based and PCA-based methods when testing on synthetic data that was generated from a generalized LMM. Applying our method to both real and synthetic human genotype/phenotype data, we demonstrate the ability of our model to correct for confounding factors while requiring significantly less runtime relative to LMMs. We have implemented methods for fitting these models, which are available at http://www.microsoft.com/science. PMID:21765897
Whole vertebral bone segmentation method with a statistical intensity-shape model based approach
NASA Astrophysics Data System (ADS)
Hanaoka, Shouhei; Fritscher, Karl; Schuler, Benedikt; Masutani, Yoshitaka; Hayashi, Naoto; Ohtomo, Kuni; Schubert, Rainer
2011-03-01
An automatic segmentation algorithm for the vertebrae in human body CT images is presented. In particular, we focused on constructing and utilizing 4 different statistical intensity-shape combined models for the cervical, upper / lower thoracic and lumbar vertebrae, respectively. For this purpose, two previously reported methods were combined: a deformable model-based initial segmentation method and a statistical shape-intensity model-based precise segmentation method. The former is used as a pre-processing step to detect the position and orientation of each vertebra, which determines the initial condition for the latter precise segmentation method. The precise segmentation method needs prior knowledge of both the intensities and the shapes of the objects. After principal component analysis of such shape-intensity representations obtained from training image sets, vertebrae were parametrically modeled as a linear combination of the principal component vectors. The segmentation of each target vertebra was performed by fitting this parametric model to the target image by maximum a posteriori estimation, combined with the geodesic active contour method. In an experiment using 10 cases, the initial segmentation was successful in 6 cases and only partially failed in 4 cases (2 in the cervical area and 2 in the lumbo-sacral). In the precise segmentation, the mean error distances were 2.078, 1.416, 0.777, and 0.939 mm for the cervical, upper thoracic, lower thoracic, and lumbar spines, respectively. In conclusion, our automatic segmentation algorithm for the vertebrae in human body CT images showed a fair performance for cervical, thoracic and lumbar vertebrae.
ERIC Educational Resources Information Center
Engel, Mimi
2013-01-01
Purpose: Relatively little is known about how principals make decisions about teacher hiring. This article uses mixed methods to examine what characteristics principals look for in teachers. Research Methods: Data were gathered using a mixed method approach, including in-depth interviews with a representative sample of 31 principals as well as an…
What Are the Characteristics of Principals Identified As Effective by Teachers?
ERIC Educational Resources Information Center
Fowler, William J., Jr.
This exploratory study investigated which characteristics of a principal are identified as effective by teachers in the same school setting. The data were obtained from the Schools and Staffing Study of 1988, from the National Center for Education Statistics (NCES). The Teacher Questionnaire of the Schools and Staffing Survey (SASS) questioned…
An Independent Filter for Gene Set Testing Based on Spectral Enrichment.
Frost, H Robert; Li, Zhigang; Asselbergs, Folkert W; Moore, Jason H
2015-01-01
Gene set testing has become an indispensable tool for the analysis of high-dimensional genomic data. An important motivation for testing gene sets, rather than individual genomic variables, is to improve statistical power by reducing the number of tested hypotheses. Given the dramatic growth in common gene set collections, however, testing is often performed with nearly as many gene sets as underlying genomic variables. To address the challenge to statistical power posed by large gene set collections, we have developed spectral gene set filtering (SGSF), a novel technique for independent filtering of gene set collections prior to gene set testing. The SGSF method uses as a filter statistic the p-value measuring the statistical significance of the association between each gene set and the sample principal components (PCs), taking into account the significance of the associated eigenvalues. Because this filter statistic is independent of standard gene set test statistics under the null hypothesis but dependent under the alternative, the proportion of enriched gene sets is increased without impacting the type I error rate. As shown using simulated and real gene expression data, the SGSF algorithm accurately filters gene sets unrelated to the experimental outcome resulting in significantly increased gene set testing power.
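A simplified sketch of the filtering idea on simulated expression data; it scores each gene set by the significance of its association with the leading sample PCs and omits SGSF's eigenvalue-significance weighting:

```python
# Illustrative sketch: gene sets weakly associated with the leading sample
# principal components would be filtered out before gene set testing.
import numpy as np
from scipy import stats
from sklearn.decomposition import PCA

rng = np.random.default_rng(5)
n_samples, n_genes = 40, 1000
expr = rng.normal(size=(n_samples, n_genes))
expr[:, :30] += rng.normal(size=(n_samples, 1))     # correlated gene block

pcs = PCA(n_components=3).fit_transform(expr)       # sample principal components

gene_sets = {"set_signal": np.arange(30), "set_null": np.arange(500, 530)}
for name, idx in gene_sets.items():
    set_score = expr[:, idx].mean(axis=1)           # per-sample set summary
    pvals = [stats.pearsonr(pcs[:, k], set_score)[1] for k in range(3)]
    print(name, "min PC-association p-value: %.3g" % min(pvals))
# Sets whose minimum p-value exceeds a chosen threshold would be filtered out.
```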
NASA Astrophysics Data System (ADS)
Díaz-Ayil, Gilberto; Amouroux, Marine; Clanché, Fabien; Granjon, Yves; Blondel, Walter C. P. M.
2009-07-01
Spatially-resolved bimodal spectroscopy (multiple-excitation AutoFluorescence AF and Diffuse Reflectance DR) was used in vivo to discriminate various healthy and precancerous skin stages in a pre-clinical model (UV-irradiated mouse): Compensatory Hyperplasia CH, Atypical Hyperplasia AH and Dysplasia D. A specific data preprocessing scheme was applied to the intensity spectra (filtering, spectral correction and intensity normalization), and several sets of spectral characteristics were automatically extracted and selected based on their discrimination power, statistically tested for every pair-wise comparison of histological classes. Data reduction with Principal Components Analysis (PCA) was performed and 3 classification methods were implemented (k-NN, LDA and SVM) in order to compare the diagnostic performance of each method. Diagnostic performance was assessed in terms of Sensitivity (Se) and Specificity (Sp) as a function of the selected features, of the combinations of 3 different inter-fibre distances and of the number of principal components, such that: Se and Sp ~ 100% when discriminating CH vs. others; Sp ~ 100% and Se > 95% when discriminating Healthy vs. AH or D; Sp ~ 74% and Se ~ 63% for AH vs. D.
Ma, Li; Sun, Jing; Yang, Zhaoguang; Wang, Lin
2015-12-01
Heavy metal contamination has attracted widespread attention due to the strong toxicity and persistence of these metals. The Ganxi River, located in Chenzhou City, Southern China, has been severely polluted by lead/zinc ore mining activities. This work investigated heavy metal pollution in agricultural soils around the Ganxi River. The total concentrations of heavy metals were determined by inductively coupled plasma-mass spectrometry. The potential risk associated with the heavy metals in soil was assessed by the Nemerow comprehensive index and the potential ecological risk index. In both methods, the study area was rated as very high risk. Multivariate statistical methods, including Pearson's correlation analysis, hierarchical cluster analysis, and principal component analysis, were employed to evaluate the relationships between heavy metals, as well as the correlation between heavy metals and pH, to identify the metal sources. Three distinct clusters were observed by hierarchical cluster analysis. In principal component analysis, two components were extracted that together explain over 90% of the total variance, both of which were associated with anthropogenic sources.
A Principal Component Analysis/Fuzzy Comprehensive Evaluation for Rockburst Potential in Kimberlite
NASA Astrophysics Data System (ADS)
Pu, Yuanyuan; Apel, Derek; Xu, Huawei
2018-02-01
Kimberlite is an igneous rock which sometimes bears diamonds. Most of the diamonds mined in the world today are found in kimberlite ores. Burst potential in kimberlite has not been investigated, because kimberlite is mostly mined using open-pit methods, which pose very little threat of rock bursting. However, as mining depth keeps increasing, mines convert to underground mining methods, which can pose a threat of rock bursting in kimberlite. This paper focuses on the burst potential of kimberlite at a diamond mine in northern Canada. A combined model using the methods of principal component analysis (PCA) and fuzzy comprehensive evaluation (FCE) is developed to process data from 12 different locations in kimberlite pipes. Based on the 12 calculated fuzzy evaluation vectors, 8 locations show a moderate burst potential, 2 locations show no burst potential, and 2 locations show strong and violent burst potential, respectively. Using statistical principles, a Mahalanobis distance is adopted to build a comprehensive fuzzy evaluation vector for the whole mine; the final evaluation of burst potential is moderate, which is verified by the practical rockbursting situation at the mine site.
Statistical analysis of fNIRS data: a comprehensive review.
Tak, Sungho; Ye, Jong Chul
2014-01-15
Functional near-infrared spectroscopy (fNIRS) is a non-invasive method to measure brain activities using the changes of optical absorption in the brain through the intact skull. fNIRS has many advantages over other neuroimaging modalities such as positron emission tomography (PET), functional magnetic resonance imaging (fMRI), or magnetoencephalography (MEG), since it can directly measure blood oxygenation level changes related to neural activation with high temporal resolution. However, fNIRS signals are highly corrupted by measurement noises and physiology-based systemic interference. Careful statistical analyses are therefore required to extract neuronal activity-related signals from fNIRS data. In this paper, we provide an extensive review of historical developments of statistical analyses of fNIRS signal, which include motion artifact correction, short source-detector separation correction, principal component analysis (PCA)/independent component analysis (ICA), false discovery rate (FDR), serially-correlated errors, as well as inference techniques such as the standard t-test, F-test, analysis of variance (ANOVA), and statistical parameter mapping (SPM) framework. In addition, to provide a unified view of various existing inference techniques, we explain a linear mixed effect model with restricted maximum likelihood (ReML) variance estimation, and show that most of the existing inference methods for fNIRS analysis can be derived as special cases. Some of the open issues in statistical analysis are also described. Copyright © 2013 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Winn, Kathleen Mary
The Next Generation Science Standards (NGSS) are the newest K-12 science content standards created by a coalition of educators, scientists, and researchers available for adoption by states and schools. Principals are important actors during policy implementation, especially since principals are charged with assuming the role of instructional leader for their teachers in all subject areas. Science poses a unique challenge to the elementary curricular landscape because, traditionally, elementary teachers report low levels of self-efficacy in the subject. Support in this area therefore becomes important for a successful integration of a new science education agenda. This study analyzed self-reported survey data from public elementary principals (N=667) to address the following three research questions: (1) What type of science backgrounds do elementary principals have? (2) What indicators predict whether elementary principals will engage in instructional leadership behaviors in science? (3) Does self-efficacy mediate the relationship between science background and a capacity for instructional leadership in science? The survey data were analyzed quantitatively. Descriptive statistics address the first research question, and inferential statistics (hierarchical regression analysis and a mediation analysis) answer the second and third research questions. The sample data show that about 21% of elementary principals have a formal science degree and 26% have a degree in a STEM field. Most principals have not had recent experience teaching science, nor were they ever exclusively science teachers. The analyses suggest that demographic, experiential, and self-efficacy variables predict instructional leadership practices in science.
What Should Be Done with "Fit" in Principal Selection?
ERIC Educational Resources Information Center
Palmer, Brandon; Kelly, Joseph; Mullooly, James
2016-01-01
Although the school principal's role has been growing in importance, the methods used to select principals have not changed much since the 1950s. Moreover, researchers have seldom scrutinized principal selection methods; yet, significant procedural issues exist. The concept of "fit" has been used within principal selection for decades,…
Infrared face recognition based on LBP histogram and KW feature selection
NASA Astrophysics Data System (ADS)
Xie, Zhihua
2014-07-01
The conventional local binary pattern (LBP) feature, as represented by the LBP histogram, still has room for performance improvements. This paper focuses on the dimension reduction of LBP micro-patterns and proposes an improved infrared face recognition method based on LBP histogram representation. To extract local robust features in infrared face images, LBP is chosen to get the composition of micro-patterns of sub-blocks. Based on statistical test theory, a Kruskal-Wallis (KW) feature selection method is proposed to get the LBP patterns which are suitable for infrared face recognition. The experimental results show that the combination of LBP and KW feature selection improves the performance of infrared face recognition; the proposed method outperforms traditional methods based on the LBP histogram, the discrete cosine transform (DCT) or principal component analysis (PCA).
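A minimal sketch of the pipeline (uniform LBP histograms, then a Kruskal-Wallis test per histogram bin), assuming scikit-image and SciPy and using random textures as stand-ins for infrared face sub-blocks:

```python
# Illustrative sketch: uniform-LBP histograms per image, then a Kruskal-Wallis
# test per bin to keep only the discriminative histogram bins.
import numpy as np
from scipy.ndimage import gaussian_filter
from scipy.stats import kruskal
from skimage.feature import local_binary_pattern

rng = np.random.default_rng(6)
P, R, n_bins = 8, 1, 10                      # "uniform" LBP gives P + 2 bins

def lbp_hist(img):
    codes = local_binary_pattern(img, P, R, method="uniform")
    hist, _ = np.histogram(codes, bins=n_bins, range=(0, n_bins), density=True)
    return hist

# Two "subjects" with different textures (random vs. smoothed random images
# as placeholders for infrared face sub-blocks).
subj_a = [lbp_hist(rng.random((32, 32))) for _ in range(10)]
subj_b = [lbp_hist(gaussian_filter(rng.random((32, 32)), sigma=1.0))
          for _ in range(10)]

keep = []
for b in range(n_bins):
    stat, p = kruskal([h[b] for h in subj_a], [h[b] for h in subj_b])
    if p < 0.05:                             # bin distribution differs by subject
        keep.append(b)
print("LBP bins retained for recognition:", keep)
```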
Nonlinear multivariate and time series analysis by neural network methods
NASA Astrophysics Data System (ADS)
Hsieh, William W.
2004-03-01
Methods in multivariate statistical analysis are essential for working with large amounts of geophysical data, data from observational arrays, from satellites, or from numerical model output. In classical multivariate statistical analysis, there is a hierarchy of methods, starting with linear regression at the base, followed by principal component analysis (PCA) and finally canonical correlation analysis (CCA). A multivariate time series method, the singular spectrum analysis (SSA), has been a fruitful extension of the PCA technique. The common drawback of these classical methods is that only linear structures can be correctly extracted from the data. Since the late 1980s, neural network methods have become popular for performing nonlinear regression and classification. More recently, neural network methods have been extended to perform nonlinear PCA (NLPCA), nonlinear CCA (NLCCA), and nonlinear SSA (NLSSA). This paper presents a unified view of the NLPCA, NLCCA, and NLSSA techniques and their applications to various data sets of the atmosphere and the ocean (especially for the El Niño-Southern Oscillation and the stratospheric quasi-biennial oscillation). These data sets reveal that the linear methods are often too simplistic to describe real-world systems, with a tendency to scatter a single oscillatory phenomenon into numerous unphysical modes or higher harmonics, which can be largely alleviated in the new nonlinear paradigm.
Estimation of In Situ Stresses with Hydro-Fracturing Tests and a Statistical Method
NASA Astrophysics Data System (ADS)
Lee, Hikweon; Ong, See Hong
2018-03-01
At great depths, where borehole-based field stress measurements such as hydraulic fracturing are challenging due to difficult downhole conditions or prohibitive costs, in situ stresses can be indirectly estimated using wellbore failures such as borehole breakouts and/or drilling-induced tensile failures detected by an image log. As part of such efforts, a statistical method has been developed in which borehole breakouts detected on an image log are used for this purpose (Song et al. in Proceedings on the 7th international symposium on in situ rock stress, 2016; Song and Chang in J Geophys Res Solid Earth 122:4033-4052, 2017). The method employs a grid-searching algorithm in which the least and maximum horizontal principal stresses (Sh and SH) are varied, and the corresponding simulated depth-related breakout width distribution as a function of the breakout angle (θB = 90° − half of the breakout width) is compared to that observed along the borehole to determine the set of Sh and SH having the lowest misfit between them. An important advantage of the method is that Sh and SH can be estimated simultaneously in vertical wells. To validate the statistical approach, the method is applied to a vertical hole where a set of field hydraulic fracturing tests has been carried out. The stress estimates from the proposed method were found to be in good agreement with the results interpreted from the hydraulic fracturing test measurements.
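A minimal sketch of the grid-search idea under a simplified Kirsch-type forward model (pore and mud pressures ignored; the strength value and observed widths are invented for illustration, and this is not the published implementation):

```python
# Illustrative sketch: for a vertical borehole, assume wall hoop stress
# sigma_theta(theta) = SH + Sh - 2*(SH - Sh)*cos(2*theta), theta measured from
# the SH azimuth; breakouts form where it exceeds the strength UCS.
import numpy as np

UCS = 80.0                                      # assumed rock strength, MPa
observed_widths = np.array([38.0, 42.0, 40.0])  # invented breakout widths, deg

def breakout_width(sh, sH, ucs=UCS):
    if sH <= sh:
        return 0.0
    c = (sH + sh - ucs) / (2.0 * (sH - sh))     # cos(2*theta) at breakout edge
    if c >= 1.0:
        return 180.0                            # whole wall fails (reject fit)
    if c <= -1.0:
        return 0.0                              # wall never reaches strength
    theta_edge = 0.5 * np.degrees(np.arccos(c))
    return 2.0 * (90.0 - theta_edge)            # breakout centered on Sh azimuth

# Grid search: keep the (Sh, SH) pair with the lowest misfit to observations.
best = None
for sh in np.arange(20.0, 60.0, 0.5):
    for sH in np.arange(sh + 0.5, 90.0, 0.5):
        misfit = np.sum((breakout_width(sh, sH) - observed_widths) ** 2)
        if best is None or misfit < best[0]:
            best = (misfit, sh, sH)
print("best-fit Sh = %.1f MPa, SH = %.1f MPa" % best[1:])
```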
The Higher Education System in Israel: Statistical Abstract and Analysis.
ERIC Educational Resources Information Center
Herskovic, Shlomo
This edition of a statistical abstract published every few years on the higher education system in Israel presents the most recent data available through 1990-91. The data were gathered through the cooperation of the Central Bureau of Statistics and institutions of higher education. Chapter 1 presents a summary of principal findings covering the…
Integrated Approaches On Archaeo-Geophysical Data
NASA Astrophysics Data System (ADS)
Kucukdemirci, M.; Piro, S.; Zamuner, D.; Ozer, E.
2015-12-01
Key words: Ground Penetrating Radar (GPR), Magnetometry, Geophysical Data Integration, Principal Component Analysis (PCA), Aizanoi Archaeological Site. Commonly applied geophysical data integration methods are divided into two classes: qualitative and quantitative approaches. This work focused on the application of quantitative integration approaches, which involve mathematical and statistical integration techniques, to archaeo-geophysical data obtained at the Aizanoi archaeological site, Turkey. Two geophysical methods, Ground Penetrating Radar (GPR) and magnetometry, were applied for archaeological prospection at the selected site. After basic data processing for each geophysical method, the mathematical approaches of sums and products and the statistical approach of Principal Component Analysis (PCA) were applied for the integration. These integration approaches were first tested on synthetic digital images before application to field data. The same approaches were then applied to 2D magnetic maps and 2D GPR time slices obtained on the same unit grids at the archaeological site. Initially, the geophysical data were examined individually by reference to archaeological maps and information obtained from archaeologists, and some important structures, such as possible walls, roads and relics, were identified. The results of all integration approaches provided important and distinct details about the anomalies related to archaeological features. Integrated images can thus provide complementary information about the archaeological relics under the ground. Acknowledgements: The authors would like to thank the Scientific and Technological Research Council of Turkey (TUBITAK) Fellowship for Visiting Scientists Programme for its support, the Istanbul University Scientific Research Project Fund (Project No: 12302), and the archaeologist team of the Aizanoi archaeological site for their support during the field work.
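A minimal sketch of the quantitative integration approaches named above (sum, product, and PCA fusion) on two synthetic co-registered grids standing in for a magnetic map and a GPR time slice:

```python
# Illustrative sketch of sum, product, and PCA fusion of two 2D grids.
import numpy as np

rng = np.random.default_rng(7)
ny, nx = 50, 50
wall = np.zeros((ny, nx))
wall[20:23, 5:45] = 1.0                            # buried "wall" anomaly

mag = wall + 0.3 * rng.normal(size=(ny, nx))       # magnetometry grid
gpr = wall + 0.3 * rng.normal(size=(ny, nx))       # GPR amplitude time slice

def norm01(a):
    return (a - a.min()) / (a.max() - a.min())

fused_sum = norm01(mag) + norm01(gpr)              # additive integration
fused_prod = norm01(mag) * norm01(gpr)             # multiplicative integration

# PCA fusion: each pixel is a (mag, gpr) vector; PC1 carries the common signal.
stack = np.column_stack([mag.ravel(), gpr.ravel()])
stack = stack - stack.mean(axis=0)
_, _, vt = np.linalg.svd(stack, full_matrices=False)
fused_pca = (stack @ vt[0]).reshape(ny, nx)
print("PC1 anomaly-to-background contrast:",
      abs(fused_pca[20:23, 5:45].mean() - fused_pca.mean()))
```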
Multivariate Analysis and Prediction of Dioxin-Furan ...
Peer Review Draft of Regional Methods Initiative Final Report. Dioxins, which are bioaccumulative and environmentally persistent, pose an ongoing risk to human and ecosystem health. Fish constitute a significant source of dioxin exposure for humans and fish-eating wildlife. Current dioxin analytical methods are costly, time-consuming, and produce hazardous by-products. A Danish team developed a novel, multivariate statistical methodology based on the covariance of dioxin-furan congener Toxic Equivalences (TEQs) and fatty acid methyl esters (FAMEs) and applied it to North Atlantic Ocean fishmeal samples. The goal of the current study was to attempt to extend this Danish methodology to 77 whole and composite fish samples from three trophic groups: predator (whole largemouth bass), benthic (whole flathead and channel catfish), and forage fish (composite bluegill, pumpkinseed, and green sunfish) from two dioxin-contaminated rivers (Pocatalico R. and Kanawha R.) in West Virginia, USA. Multivariate statistical analyses, including Principal Components Analysis (PCA), Hierarchical Clustering, and Partial Least Squares Regression (PLS), were used to assess the relationship between the FAMEs and TEQs in these dioxin-contaminated freshwater fish from the Kanawha and Pocatalico Rivers. These three multivariate statistical methods all confirm that the pattern of Fatty Acid Methyl Esters (FAMEs) in these freshwater fish covaries with and is predictive of the WHO TE
Quantitation of flavonoid constituents in citrus fruits.
Kawaii, S; Tomono, Y; Katase, E; Ogawa, K; Yano, M
1999-09-01
Twenty-four flavonoids have been determined in 66 Citrus species and near-citrus relatives, grown in the same field and year, by means of reversed phase high-performance liquid chromatography analysis. Statistical methods have been applied to find relations among the species. The F ratios of 21 flavonoids obtained by applying ANOVA analysis are significant, indicating that a classification of the species using these variables is reasonable to pursue. Principal component analysis revealed that the distributions of Citrus species belonging to different classes were largely in accordance with Tanaka's classification system.
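A minimal sketch of the statistical workflow (one-way ANOVA F ratio per compound, then PCA on the concentration matrix), with simulated concentrations standing in for the HPLC data:

```python
# Illustrative sketch: per-compound ANOVA F ratios across classes, then PCA.
import numpy as np
from scipy.stats import f_oneway
from sklearn.decomposition import PCA

rng = np.random.default_rng(8)
n_per_class, n_flav = 22, 24
classes = [rng.normal(loc=mu, size=(n_per_class, n_flav))
           for mu in (0.0, 0.5, 1.0)]              # three species classes

for j in range(3):                                 # first few compounds only
    f_ratio, p = f_oneway(*(c[:, j] for c in classes))
    print(f"flavonoid {j}: F = {f_ratio:.1f}, p = {p:.2g}")

data = np.vstack(classes)
scores = PCA(n_components=2).fit_transform(data)
print("PC1 class means:",
      [round(scores[i * n_per_class:(i + 1) * n_per_class, 0].mean(), 2)
       for i in range(3)])
```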
Playford, Denese; Power, Phoebe; Boothroyd, Alarna; Manickavasagar, Usha; Ng, Wen Qi; Riley, Geoff
2013-10-01
This study compared rural location identified through the National Registration (AHPRA) registry with location obtained through labour-intensive personal contact. Longitudinal cohort study using two methods to identify the work locations of medical graduates from The Rural Clinical School of Western Australia (RCSWA). Consenting alumni from the University of Western Australia and the University of Notre Dame Fremantle participating in RCSWA between 2002 and 2009 inclusive and available to contact in 2011. Percentage location matches between the two contact methods. There was 80% agreement for principal suburb, 92% agreement for principal city and 94% agreement for principal state between RCSWA personal contact and the AHPRA registry. AHPRA identified nearly two times as many graduate locations. However, there was only 31% agreement for a rural placement location (of any length). In more detail, for year-long rural placements, personal contact was 88% concordant with AHPRA; placements of six months or more were less concordant (44% agreement); placements of less than six months were not concordant (4% agreement). AHPRA data matched RCSWA alumni data only for graduates in full-time rural work. Since medical alumni spend up to 10 years in pre-vocational and vocational training, which includes many rural options, personal contact was able to pick up the myriad of rural choices, whereas the AHPRA database was not sensitive enough to identify them. Until graduates have stably finished training, the optimal method to identify rural work is through personal contact, but statistical correction for missing data needs to be considered. © 2013 The Authors. Australian Journal of Rural Health © National Rural Health Alliance Inc.
NASA Astrophysics Data System (ADS)
Nagai, Toshiki; Mitsutake, Ayori; Takano, Hiroshi
2013-02-01
A new relaxation mode analysis method, which is referred to as the principal component relaxation mode analysis method, has been proposed to handle a large number of degrees of freedom of protein systems. In this method, principal component analysis is carried out first and then relaxation mode analysis is applied to a small number of principal components with large fluctuations. To reduce the contribution of fast relaxation modes in these principal components efficiently, we have also proposed a relaxation mode analysis method using multiple evolution times. The principal component relaxation mode analysis method using two evolution times has been applied to an all-atom molecular dynamics simulation of human lysozyme in aqueous solution. Slow relaxation modes and corresponding relaxation times have been appropriately estimated, demonstrating that the method is applicable to protein systems.
Rank estimation and the multivariate analysis of in vivo fast-scan cyclic voltammetric data
Keithley, Richard B.; Carelli, Regina M.; Wightman, R. Mark
2010-01-01
Principal component regression has been used in the past to separate current contributions from different neuromodulators measured with in vivo fast-scan cyclic voltammetry. Traditionally, a percent-cumulative-variance approach has been used to determine the rank of the training set voltammetric matrix during model development; however, this approach suffers from several disadvantages, including the use of arbitrary percentages and the requirement of extreme precision of training sets. Here we propose that Malinowski's F-test, a method based on a statistical analysis of the variance contained within the training set, can be used to improve factor selection for the analysis of in vivo fast-scan cyclic voltammetric data. These two methods of rank estimation were compared at all steps in the calibration protocol, including the number of principal components retained, overall noise levels, model validation as determined using a residual analysis procedure, and predicted concentration information. By analyzing 119 training sets from two different laboratories amassed over several years, we were able to gain insight into the heterogeneity of in vivo fast-scan cyclic voltammetric data and study how differences in factor selection propagate throughout the entire principal component regression analysis procedure. Visualizing cyclic voltammetric representations of the data contained in the retained and discarded principal components showed that using Malinowski's F-test for rank estimation of in vivo training sets allowed noise to be removed more accurately. Malinowski's F-test also improved the robustness of our criterion for judging multivariate model validity, even though signal-to-noise ratios of the data varied. In addition, pH change was the majority noise carrier of in vivo training sets, while dopamine prediction was more sensitive to noise. PMID:20527815
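A minimal sketch contrasting the two rank-estimation rules on a simulated low-rank data matrix; the reduced-eigenvalue F-test shown follows one common formulation of Malinowski's test and may differ in detail from the authors' implementation:

```python
# Illustrative sketch of rank estimation. Assumed formulation: reduced
# eigenvalues REV_j = ev_j / ((r - j + 1) * (c - j + 1)), tested against the
# pooled mean of the remaining REVs with an F(1, m - n) distribution.
import numpy as np
from scipy.stats import f as f_dist

rng = np.random.default_rng(9)
r, c, true_rank = 100, 30, 3
data = rng.normal(size=(r, true_rank)) @ rng.normal(size=(true_rank, c))
data += 0.05 * rng.normal(size=(r, c))             # measurement noise

ev = np.linalg.svd(data, compute_uv=False) ** 2    # eigenvalues of data' * data
m = min(r, c)
rev = np.array([ev[j - 1] / ((r - j + 1) * (c - j + 1))
                for j in range(1, m + 1)])

rank = m
for n_comp in range(1, m):                         # test the n-th component
    f_stat = rev[n_comp - 1] / (rev[n_comp:].sum() / (m - n_comp))
    p = 1.0 - f_dist.cdf(f_stat, 1, m - n_comp)
    if p > 0.05:                                   # indistinct from noise
        rank = n_comp - 1
        break
print("rank by F-test:", rank)                     # expected: 3

cum = np.cumsum(ev) / ev.sum()                     # cumulative-variance rule
print("rank by 99% cumulative variance:", int(np.searchsorted(cum, 0.99) + 1))
```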
Characteristics of Teachers Nominated for an Accelerated Principal Preparation Program
ERIC Educational Resources Information Center
Rios, Steve J.; Reyes-Guerra, Daniel
2012-01-01
This article reports the initial evaluation results of a new accelerated, job-embedded principal preparation program funded by a Race to the Top Grant (U.S. Department of Education, 2012a) in Florida. Descriptive statistics, t-tests, and chi-square analyses were used to describe the characteristics of a group of potential applicants nominated to…
ERIC Educational Resources Information Center
Center for Education Statistics (ED/OERI), Washington, DC.
A survey of public high school principals asked which policies, programs, and practices designed to improve learning were currently in operation at their schools, and whether these policies were instituted or substantially strengthened in the past 5 years. These policies reflect the school-level recommendations for education reform made in "A…
Adolescent Suicide. The Trauma of Adolescent Suicide. A Time for Special Leadership by Principals.
ERIC Educational Resources Information Center
Dempsey, Richard A.
This monograph was written to help principals and other school personnel explore the issues of adolescent suicide and its prevention. Chapter 1 presents statistics on the incidence of adolescent suicides and suicide attempts. Chapter 2, Causes of Suicide, reviews developmental tasks of adolescence, lists several contributors to adolescent suicide,…
A Correlational Study of Principals' Leadership Style and Teacher Absenteeism
ERIC Educational Resources Information Center
Carter, Jason
2010-01-01
The purpose of this study was to determine whether McGregor's Theory X and Theory Y, gender, age, and years of experience of principals form a composite explaining the variation in teacher absences. It sought to determine whether all or any of these variables would be statistically significant in explaining the variance in absences for teachers.…
ERIC Educational Resources Information Center
Brusco, Michael J.; Singh, Renu; Steinley, Douglas
2009-01-01
The selection of a subset of variables from a pool of candidates is an important problem in several areas of multivariate statistics. Within the context of principal component analysis (PCA), a number of authors have argued that subset selection is crucial for identifying those variables that are required for correct interpretation of the…
Cocco, Simona; Monasson, Remi; Weigt, Martin
2013-01-01
Various approaches have explored the covariation of residues in multiple-sequence alignments of homologous proteins to extract functional and structural information. Among those are principal component analysis (PCA), which identifies the most correlated groups of residues, and direct coupling analysis (DCA), a global inference method based on the maximum entropy principle, which aims at predicting residue-residue contacts. In this paper, inspired by the statistical physics of disordered systems, we introduce the Hopfield-Potts model to naturally interpolate between these two approaches. The Hopfield-Potts model allows us to identify relevant ‘patterns’ of residues from the knowledge of the eigenmodes and eigenvalues of the residue-residue correlation matrix. We show how the computation of such statistical patterns makes it possible to accurately predict residue-residue contacts with a much smaller number of parameters than DCA. This dimensional reduction allows us to avoid overfitting and to extract contact information from multiple-sequence alignments of reduced size. In addition, we show that low-eigenvalue correlation modes, discarded by PCA, are important to recover structural information: the corresponding patterns are highly localized, that is, they are concentrated in few sites, which we find to be in close contact in the three-dimensional protein fold. PMID:23990764
Liu, Zechang; Wang, Liping; Liu, Yumei
2018-01-18
Hops impart flavor to beer, with the volatile components characterizing the various hop varieties and qualities. Fingerprinting, especially flavor fingerprinting, is often used to identify 'flavor products' because inconsistencies in the description of flavor may lead to an incorrect definition of beer quality. Compared to flavor fingerprinting, volatile fingerprinting is simpler and easier. We performed volatile fingerprinting using head space-solid phase micro-extraction gas chromatography-mass spectrometry combined with similarity analysis and principal component analysis (PCA) for evaluating and distinguishing between three major Chinese hops. Eighty-four volatiles were identified, which were classified into seven categories. Volatile fingerprinting based on similarity analysis did not yield any obvious result. By contrast, hop varieties and qualities were identified using volatile fingerprinting based on PCA. The potential variables explained the variance in the three hop varieties. In addition, the dendrogram and principal component score plot described the differences and classifications of hops. Volatile fingerprinting plus multivariate statistical analysis can rapidly differentiate between the different varieties and qualities of the three major Chinese hops. Furthermore, this method can be used as a reference in other fields. © 2018 Society of Chemical Industry.
Modeling vertebrate diversity in Oregon using satellite imagery
NASA Astrophysics Data System (ADS)
Cablk, Mary Elizabeth
Vertebrate diversity was modeled for the state of Oregon using a parametric approach to regression tree analysis. This exploratory data analysis effectively modeled the non-linear relationships between vertebrate richness and phenology, terrain, and climate. Phenology was derived from time-series NOAA-AVHRR satellite imagery for the year 1992 using two methods: principal component analysis and derivation of EROS Data Center greenness metrics. These two measures of spatial and temporal vegetation condition incorporated the critical temporal element in this analysis. The first three principal components were shown to contain spatial and temporal information about the landscape and discriminated phenologically distinct regions in Oregon. Principal components 2 and 3, 6 greenness metrics, elevation, slope, aspect, annual precipitation, and annual seasonal temperature difference were investigated as correlates of amphibian, bird, all-vertebrate, reptile, and mammal richness. Variation explained by the regression tree for each taxon was: amphibians (91%), birds (67%), all vertebrates (66%), reptiles (57%), and mammals (55%). Spatial statistics were used to quantify the pattern of each taxon and assess the validity of resulting predictions from the regression tree models. Regression tree analysis was relatively robust against spatial autocorrelation in the response data, and graphical results indicated the models were well fit to the data.
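A minimal sketch of the regression tree step on simulated covariates (sklearn's DecisionTreeRegressor stands in for the implementation used in the study):

```python
# Illustrative sketch: a regression tree relating species richness to terrain
# and phenology covariates (all values simulated).
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(10)
n = 1000
elevation = rng.uniform(0, 3000, n)
precip = rng.uniform(200, 3000, n)
greenness = rng.uniform(0, 1, n)                   # phenology-derived metric
richness = (50 - 0.01 * elevation + 10 * greenness
            + 5 * (precip > 1500) + rng.normal(scale=2, size=n))

X = np.column_stack([elevation, precip, greenness])
tree = DecisionTreeRegressor(max_depth=4).fit(X, richness)
print("variation explained (R^2):", round(tree.score(X, richness), 2))
```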
Chapman, Benjamin P; Weiss, Alexander; Duberstein, Paul R
2016-12-01
Statistical learning theory (SLT) is the statistical formulation of machine learning theory, a body of analytic methods common in "big data" problems. Regression-based SLT algorithms seek to maximize predictive accuracy for some outcome, given a large pool of potential predictors, without overfitting the sample. Research goals in psychology may sometimes call for high dimensional regression. One example is criterion-keyed scale construction, where a scale with maximal predictive validity must be built from a large item pool. Using this as a working example, we first introduce a core principle of SLT methods: minimization of expected prediction error (EPE). Minimizing EPE is fundamentally different from maximizing the within-sample likelihood, and hinges on building a predictive model of sufficient complexity to predict the outcome well, without undue complexity leading to overfitting. We describe how such models are built and refined via cross-validation. We then illustrate how 3 common SLT algorithms (supervised principal components, regularization, and boosting) can be used to construct a criterion-keyed scale predicting all-cause mortality, using a large personality item pool within a population cohort. Each algorithm illustrates a different approach to minimizing EPE. Finally, we consider broader applications of SLT predictive algorithms, both as supportive analytic tools for conventional methods, and as primary analytic tools in discovery phase research. We conclude that despite their differences from the classic null-hypothesis testing approach, or perhaps because of them, SLT methods may hold value as a statistically rigorous approach to exploratory regression. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Using radar imagery for crop discrimination: a statistical and conditional probability study
Haralick, R.M.; Caspall, F.; Simonett, D.S.
1970-01-01
A number of the constraints with which remote sensing must contend in crop studies are outlined. They include sensor, identification accuracy, and congruencing constraints; the nature of the answers demanded of the sensor system; and the complex temporal variances of crops in large areas. Attention is then focused on several methods which may be used in the statistical analysis of multidimensional remote sensing data. Crop discrimination for radar K-band imagery is investigated by three methods. The first uses a Bayes decision rule, the second a nearest-neighbor spatial conditional probability approach, and the third the standard statistical techniques of cluster analysis and principal axes representation. Results indicate that crop type and percent of cover significantly affect the strength of the radar return signal. Sugar beets, corn, and very bare ground are easily distinguishable; sorghum, alfalfa, and young wheat are harder to distinguish. Distinguishability will be improved if the imagery is examined in time sequence so that changes between times of planting, maturation, and harvest provide additional discriminant tools. A comparison between radar and photography indicates that radar performed surprisingly well in crop discrimination in western Kansas and warrants further study.
Medial-based deformable models in nonconvex shape-spaces for medical image segmentation.
McIntosh, Chris; Hamarneh, Ghassan
2012-01-01
We explore the application of genetic algorithms (GA) to deformable models through the proposition of a novel method for medical image segmentation that combines GA with nonconvex, localized, medial-based shape statistics. We replace the more typical gradient descent optimizer used in deformable models with GA, and the convex, implicit, global shape statistics with nonconvex, explicit, localized ones. Specifically, we propose GA to reduce typical deformable model weaknesses pertaining to model initialization, pose estimation and local minima, through the simultaneous evolution of a large number of models. Furthermore, we constrain the evolution, and thus reduce the size of the search-space, by using statistically-based deformable models whose deformations are intuitive (stretch, bulge, bend) and are driven in terms of localized principal modes of variation, instead of modes of variation across the entire shape that often fail to capture localized shape changes. Although GA are not guaranteed to achieve the global optima, our method compares favorably to the prevalent optimization techniques, convex/nonconvex gradient-based optimizers and to globally optimal graph-theoretic combinatorial optimization techniques, when applied to the task of corpus callosum segmentation in 50 mid-sagittal brain magnetic resonance images.
Sun, Gang; Hoff, Steven J; Zelle, Brian C; Nelson, Minda A
2008-12-01
It is vital to forecast gas and particulate matter concentrations and emission rates (GPCER) from livestock production facilities to assess the impact of airborne pollutants on human health, the ecological environment, and global warming. Modeling source air quality is a complex process because of abundant nonlinear interactions between GPCER and other factors. The objective of this study was to introduce statistical methods and a radial basis function (RBF) neural network to predict daily source air quality in Iowa swine deep-pit finishing buildings. The results show that four variables (outdoor and indoor temperature, animal units, and ventilation rates) were identified as relatively important model inputs using statistical methods. It can be further demonstrated that only two factors, the environment factor and the animal factor, were capable of explaining more than 94% of the total variability after performing principal component analysis. The introduction of fewer uncorrelated variables to the neural network would result in the reduction of the model structure complexity, minimize computation cost, and eliminate model overfitting problems. The obtained results of RBF network prediction were in good agreement with the actual measurements, with values of the correlation coefficient between 0.741 and 0.995 and very low values of systemic performance indexes for all the models. The good results indicated that the RBF network could be trained to model these highly nonlinear relationships. Thus, the RBF neural network technology combined with multivariate statistical methods is a promising tool for modeling air pollutant emissions.
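A minimal sketch of the two-stage idea (PCA on correlated inputs, then a nonlinear RBF model); kernel ridge regression with an RBF kernel stands in for the RBF neural network, and all values are simulated:

```python
# Illustrative sketch: PCA compresses correlated inputs into two factors, then
# an RBF-kernel model predicts emissions from those factors.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(11)
n = 300
temp_out = rng.uniform(-10, 30, n)
temp_in = 0.3 * temp_out + 20 + rng.normal(scale=1.0, size=n)
ventilation = 1 + 0.25 * (temp_out + 10) + rng.normal(scale=0.5, size=n)
animal_units = rng.uniform(50, 150, n)             # independent "animal factor"
X = np.column_stack([temp_out, temp_in, ventilation, animal_units])
y = 0.5 * animal_units + 5 * np.tanh(ventilation - 5) + rng.normal(size=n)

# Two components: an environment factor and an animal factor.
Z = PCA(n_components=2).fit_transform((X - X.mean(0)) / X.std(0))

model = KernelRidge(kernel="rbf", alpha=1.0, gamma=0.5).fit(Z[:200], y[:200])
pred = model.predict(Z[200:])
print("correlation(predicted, actual):",
      round(np.corrcoef(pred, y[200:])[0, 1], 3))
```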
Statistical Inference for Porous Materials using Persistent Homology.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Moon, Chul; Heath, Jason E.; Mitchell, Scott A.
2017-12-01
We propose a porous materials analysis pipeline using persistent homology. We first compute persistent homology of binarized 3D images of sampled material subvolumes. For each image we compute sets of homology intervals, which are represented as summary graphics called persistence diagrams. We convert persistence diagrams into image vectors in order to analyze the similarity of the homology of the material images using the mature tools for image analysis. Each image is treated as a vector and we compute its principal components to extract features. We fit a statistical model using the loadings of principal components to estimate material porosity, permeability, anisotropy, and tortuosity. We also propose an adaptive version of the structural similarity index (SSIM), a similarity metric for images, as a measure to determine the statistical representative elementary volumes (sREV) for persistent homology. Thus we provide a capability for making a statistical inference of the fluid flow and transport properties of porous materials based on their geometry and connectivity.
Quality of stormwater runoff discharged from Massachusetts highways, 2005-07
Smith, Kirk P.; Granato, Gregory E.
2010-01-01
The U.S. Geological Survey (USGS), in cooperation with U.S. Department of Transportation Federal Highway Administration and the Massachusetts Department of Transportation, conducted a field study from September 2005 through September 2007 to characterize the quality of highway runoff for a wide range of constituents. The highways studied had annual average daily traffic (AADT) volumes from about 3,000 to more than 190,000 vehicles per day. Highway-monitoring stations were installed at 12 locations in Massachusetts on 8 highways. The 12 monitoring stations were subdivided into 4 primary, 4 secondary, and 4 test stations. Each site contained a 100-percent impervious drainage area that included two or more catch basins sharing a common outflow pipe. Paired primary and secondary stations were located within a few miles of each other on a limited-access section of the same highway. Most of the data were collected at the primary and secondary stations, which were located on four principal highways (Route 119, Route 2, Interstate 495, and Interstate 95). The secondary stations were operated simultaneously with the primary stations for at least a year. Data from the four test stations (Route 8, Interstate 195, Interstate 190, and Interstate 93) were used to determine the transferability of the data collected from the principal highways to other highways characterized by different construction techniques, land use, and geography. Automatic-monitoring techniques were used to collect composite samples of highway runoff and make continuous measurements of several physical characteristics. Flow-weighted samples of highway runoff were collected automatically during approximately 140 rain and mixed rain, sleet, and snowstorms. These samples were analyzed for physical characteristics and concentrations of 6 dissolved major ions, total nutrients, 8 total-recoverable metals, suspended sediment, and 85 semivolatile organic compounds (SVOCs), which include priority polyaromatic hydrocarbons (PAHs), phthalate esters, and other anthropogenic or naturally occurring organic compounds. The distribution of particle size of suspended sediment also was determined for composite samples of highway runoff. Samples of highway runoff were collected year round and under various dry antecedent conditions throughout the 2-year sampling period. In addition to samples of highway runoff, supplemental samples also were collected of sediment in highway runoff, background soils, berm materials, maintenance sands, deicing compounds, and vegetation matter. These additional samples were collected near or on the highways to support data analysis. There were few statistically significant differences between populations of constituent concentrations in samples from the primary and secondary stations on the same principal highways (Mann-Whitney test, 95-percent confidence level). Similarly, there were few statistically significant differences between populations of constituent concentrations for the four principal highways (data from the paired primary and secondary stations for each principal highway) and populations for test stations with similar AADT volumes. Exceptions to this include several total-recoverable metals for stations on Route 2 and Interstate 195 (highways with moderate AADT volumes), and for stations on Interstate 95 and Interstate 93 (highways with high AADT volumes).
Supplemental data collected during this study indicate that many of these differences may be explained by the quantity, as well as the quality, of the sediment in samples of highway runoff. Nonparametric statistical methods also were used to test for differences between populations of sample constituent concentrations among the four principal highways that differed mainly in traffic volume. These results indicate that there were few statistically significant differences (Mann-Whitney test, 95-percent confidence level) for populations of concentrations of most total-recoverable metals
RepExplore: addressing technical replicate variance in proteomics and metabolomics data analysis.
Glaab, Enrico; Schneider, Reinhard
2015-07-01
High-throughput omics datasets often contain technical replicates included to account for technical sources of noise in the measurement process. Although summarizing these replicate measurements by using robust averages may help to reduce the influence of noise on downstream data analysis, the information on the variance across the replicate measurements is lost in the averaging process and is therefore typically disregarded in subsequent statistical analyses. We introduce RepExplore, a web-service dedicated to exploiting the information captured in the technical replicate variance to provide more reliable and informative differential expression and abundance statistics for omics datasets. The software builds on previously published statistical methods, which have been applied successfully to biomedical omics data but are difficult to use without prior experience in programming or scripting. RepExplore facilitates the analysis by providing fully automated data processing and interactive ranking tables, whisker plot, heat map and principal component analysis visualizations to interpret omics data and derived statistics. Availability: freely available at http://www.repexplore.tk. Contact: enrico.glaab@uni.lu. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.
NASA Astrophysics Data System (ADS)
Senkbeil, J. C.; Brommer, D. M.; Comstock, I. J.; Loyd, T.
2012-07-01
Extratropical cyclones (ETCs) in the southern United States are often overlooked compared with tropical cyclones in the region and ETCs in the northern United States. Although southern ETCs are significant weather events, there is currently no operational scheme for identifying and discussing these nameless storms. In this research, we classified 84 ETCs (1970-2009). We manually identified five distinct formation regions, and statistical classification identified seven unique ETC types. The statistical classification employed principal components analysis and two methods of cluster analysis. Both manual and statistical storm types generally showed positive (negative) relationships with El Niño (La Niña). Manual storm types displayed precipitation swaths consistent with discrete storm tracks, which further legitimizes the existence of multiple modes of southern ETCs. Statistical storm types also displayed unique precipitation intensity swaths, but these swaths were less indicative of track location. It is hoped that by classifying southern ETCs into types, forecasters, hydrologists, and broadcast meteorologists might better anticipate projected amounts of precipitation at their locations.
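A generic version of such a PCA-plus-clustering workflow might look as follows; the storm descriptors, component count, and cluster count are illustrative assumptions rather than the authors' exact choices:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans, AgglomerativeClustering

rng = np.random.default_rng(1)
# Hypothetical descriptors for 84 storms: central pressure, translation
# speed, genesis longitude/latitude, trough amplitude, shear, ...
X = rng.normal(size=(84, 6))

# Standardize, then reduce to the leading components before clustering.
scores = PCA(n_components=3).fit_transform(StandardScaler().fit_transform(X))

hier = AgglomerativeClustering(n_clusters=7, linkage="ward").fit_predict(scores)
km = KMeans(n_clusters=7, n_init=10, random_state=1).fit_predict(scores)
print(np.bincount(hier), np.bincount(km))  # storm counts per type
```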
Zhou, Xiangrong; Xu, Rui; Hara, Takeshi; Hirano, Yasushi; Yokoyama, Ryujiro; Kanematsu, Masayuki; Hoshi, Hiroaki; Kido, Shoji; Fujita, Hiroshi
2014-07-01
The shapes of the inner organs are important information for medical image analysis. Statistical shape modeling provides a way of quantifying and measuring shape variations of the inner organs in different patients. In this study, we developed a universal scheme that can be used for building statistical shape models for different inner organs efficiently. This scheme combines traditional point distribution modeling with a group-wise optimization method based on a measure called minimum description length to provide a practical means for 3D organ shape modeling. In experiments, the proposed scheme was applied to the building of five statistical shape models for hearts, livers, spleens, and right and left kidneys, using 50 cases of 3D torso CT images. The performance of these models was evaluated by three measures: model compactness, model generalization, and model specificity. The experimental results showed that the constructed shape models have good "compactness" and satisfactory "generalization" performance for different organ shape representations; however, the "specificity" of these models should be improved in the future.
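Once point correspondences are fixed (the step the minimum-description-length group-wise optimization addresses), point distribution modeling reduces to PCA on flattened shape vectors, and the compactness measure falls out of the PCA eigenvalues. A minimal sketch with invented data sizes:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(2)
# 50 training shapes, each a pre-aligned set of 300 corresponding surface
# points in 3D, flattened to 900-dimensional shape vectors.
shapes = rng.normal(size=(50, 300 * 3))

pca = PCA().fit(shapes)

# Compactness: cumulative variance captured by the first m modes.
compactness = np.cumsum(pca.explained_variance_ratio_)
m = int(np.searchsorted(compactness, 0.95) + 1)
print(f"{m} modes explain 95% of shape variance")

# New shapes are generated as mean + sum_k b_k * mode_k, with each b_k
# typically restricted to +/- 3 standard deviations of its mode.
b = np.zeros(pca.n_components_)
b[0] = 3 * np.sqrt(pca.explained_variance_[0])
new_shape = pca.mean_ + b @ pca.components_
```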
Dong, Skye T; Costa, Daniel S J; Butow, Phyllis N; Lovell, Melanie R; Agar, Meera; Velikova, Galina; Teckle, Paulos; Tong, Allison; Tebbutt, Niall C; Clarke, Stephen J; van der Hoek, Kim; King, Madeleine T; Fayers, Peter M
2016-01-01
Symptom clusters in advanced cancer can influence patient outcomes. There is large heterogeneity in the methods used to identify symptom clusters. To investigate the consistency of symptom cluster composition in advanced cancer patients using different statistical methodologies for all patients across five primary cancer sites, and to examine which clusters predict functional status, a global assessment of health and global quality of life. Principal component analysis and exploratory factor analysis (with different rotation and factor selection methods) and hierarchical cluster analysis (with different linkage and similarity measures) were used on a data set of 1562 advanced cancer patients who completed the European Organization for the Research and Treatment of Cancer Quality of Life Questionnaire-Core 30. Four clusters consistently formed for many of the methods and cancer sites: tense-worry-irritable-depressed (emotional cluster), fatigue-pain, nausea-vomiting, and concentration-memory (cognitive cluster). The emotional cluster was a stronger predictor of overall quality of life than the other clusters. Fatigue-pain was a stronger predictor of overall health than the other clusters. The cognitive cluster and fatigue-pain predicted physical functioning, role functioning, and social functioning. The four identified symptom clusters were consistent across statistical methods and cancer types, although there were some noteworthy differences. Statistical derivation of symptom clusters is in need of greater methodological guidance. A psychosocial pathway in the management of symptom clusters may improve quality of life. Biological mechanisms underpinning symptom clusters need to be delineated by future research. A framework for evidence-based screening, assessment, treatment, and follow-up of symptom clusters in advanced cancer is essential. Copyright © 2016 American Academy of Hospice and Palliative Medicine. Published by Elsevier Inc. All rights reserved.
Vision-based localization of the center of mass of large space debris via statistical shape analysis
NASA Astrophysics Data System (ADS)
Biondi, G.; Mauro, S.; Pastorelli, S.
2017-08-01
The current overpopulation of artificial objects orbiting the Earth has increased the interest of the space agencies in planning missions for de-orbiting the largest inoperative satellites. Since this kind of operation involves the capture of the debris, accurate knowledge of the position of their center of mass is a fundamental safety requirement. As ground observations are not sufficient to reach the required accuracy level, this information should be acquired in situ just before any contact between the chaser and the target. Some estimation methods in the literature rely on the usage of stereo cameras for tracking several features of the target surface. The actual positions of these features are estimated together with the location of the center of mass by state observers. The principal drawback of these methods is related to possible sudden disappearances of one or more features from the field of view of the cameras. An alternative method based on 3D kinematic registration is presented in this paper. The method, which does not suffer from the mentioned drawback, includes a preliminary reduction of the inaccuracies in detecting features by means of statistical shape analysis.
Statistical downscaling modeling with quantile regression using lasso to estimate extreme rainfall
NASA Astrophysics Data System (ADS)
Santri, Dewi; Wigena, Aji Hamim; Djuraidah, Anik
2016-02-01
Rainfall is one of the climatic elements with high variability, and extreme rainfall in particular has many negative impacts, so methods are required to minimize the damage that may occur. So far, global circulation models (GCMs) are the best method to forecast global climate changes, including extreme rainfall. Statistical downscaling (SD) is a technique for developing the relationship between GCM output, as global-scale independent variables, and rainfall, as a local-scale response variable. Using GCM output directly is difficult when assessed against observations because GCM data have high dimension and multicollinearity between the variables. The methods commonly used to handle this problem are principal components analysis (PCA) and partial least squares regression. A newer method that can be used is the lasso, which has the advantage of simultaneously controlling the variance of the fitted coefficients and performing automatic variable selection. Quantile regression is a method that can be used to detect extreme rainfall at both the dry and wet extremes. The objective of this study is to model SD using quantile regression with the lasso to predict extreme rainfall in Indramayu. The results showed that extreme rainfall (extreme wet in January, February and December) in Indramayu could be predicted properly by the model at the 90th quantile.
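As one concrete, hedged illustration: scikit-learn's QuantileRegressor minimizes the pinball loss at a chosen quantile with an L1 penalty, which is exactly lasso-penalized quantile regression. The predictors and rainfall below are simulated stand-ins for the GCM output and station data:

```python
import numpy as np
from sklearn.linear_model import QuantileRegressor

rng = np.random.default_rng(3)
# Hypothetical GCM predictors (high-dimensional, collinear) and local rainfall.
n, p = 360, 40
X = rng.normal(size=(n, p))
X[:, 1] = X[:, 0] + 0.05 * rng.normal(size=n)        # induce multicollinearity
y = 4.0 * X[:, 0] + rng.gamma(shape=2.0, scale=3.0, size=n)

# Pinball loss at the 90th quantile plus an L1 penalty: the lasso shrinks
# redundant GCM grid-point coefficients to exactly zero.
model = QuantileRegressor(quantile=0.9, alpha=0.1, solver="highs").fit(X, y)
print("nonzero coefficients:", int(np.sum(model.coef_ != 0)))
print("predicted 90th-quantile rainfall:", model.predict(X[:3]))
```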
ACCOUNTING FOR CALIBRATION UNCERTAINTIES IN X-RAY ANALYSIS: EFFECTIVE AREAS IN SPECTRAL FITTING
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lee, Hyunsook; Kashyap, Vinay L.; Drake, Jeremy J.
2011-04-20
While considerable advances have been made in accounting for statistical uncertainties in astronomical analyses, systematic instrumental uncertainties have generally been ignored. This can be crucial to a proper interpretation of analysis results because instrumental calibration uncertainty is a form of systematic uncertainty. Ignoring it can underestimate error bars and introduce bias into the fitted values of model parameters. Accounting for such uncertainties currently requires extensive case-specific simulations if using existing analysis packages. Here, we present general statistical methods that incorporate calibration uncertainties into spectral analysis of high-energy data. We first present a method based on multiple imputation that can be applied with any fitting method, but is necessarily approximate. We then describe a more exact Bayesian approach that works in conjunction with Markov chain Monte Carlo based fitting. We explore methods for improving computational efficiency, and in particular detail a method of summarizing calibration uncertainties with a principal component analysis of samples of plausible calibration files. This method is implemented using recently codified Chandra effective area uncertainties for low-resolution spectral analysis and is verified using both simulated and actual Chandra data. Our procedure for incorporating effective area uncertainty is easily generalized to other types of calibration uncertainties.
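The PCA summary can be sketched as follows: fit PCA to an ensemble of plausible effective-area curves, then draw new realizations as the mean curve plus randomly weighted leading components. The ensemble below is synthetic; the real procedure draws on the Chandra calibration-sample library:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(4)
# Hypothetical ensemble: 1000 plausible effective-area curves sampled on
# a 500-channel energy grid (a stand-in for a calibration-file library).
energy = np.linspace(0.3, 7.0, 500)
base = 400 * np.exp(-0.5 * ((energy - 1.5) / 1.2) ** 2)
ensemble = (base
            + rng.normal(size=(1000, 1)) * 10 * np.sin(energy)  # coherent wiggle
            + rng.normal(size=(1000, 500)))                     # channel noise

pca = PCA(n_components=5).fit(ensemble)
sd = np.sqrt(pca.explained_variance_)

def draw_effective_area(rng):
    # One plausible calibration realization: mean curve plus independent
    # standard-normal weights on the leading principal components.
    w = rng.normal(size=pca.n_components_)
    return pca.mean_ + (w * sd) @ pca.components_

# Each MCMC iteration (or each imputation) can use a fresh draw.
arf = draw_effective_area(rng)
```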
ERIC Educational Resources Information Center
Harris, Cydnie Ellen Smith
2012-01-01
The effect of the leadership style of the secondary school principal on student achievement in select public schools in Louisiana was examined in this study. The null hypothesis was that there was no statistically significant relationship between principal leadership style and student academic achievement. The researcher submitted the LEAD-Self…
ERIC Educational Resources Information Center
Lavigne, Heather J.; Shakman, Karen; Zweig, Jacqueline; Greller, Sara L.
2016-01-01
This study describes how principals reported spending their time and what professional development they reported participating in, based on data collected through the Schools and Staffing Survey by the National Center for Education Statistics during the 2011/12 school year. The study analyzes schools by grade level, poverty level, and within…
ERIC Educational Resources Information Center
Khammar, Zahra; Heidarzadegan, Alireza; Balaghat, Seyed Reza; Salehi, Hadi
2013-01-01
This study aimed to investigate the relationship between knowledge management and organizational learning and the effectiveness of ordinary and smart high school principals in Zahedan Pre-province. The statistical population of this research comprised 1350 male and female teachers teaching in ordinary and smart high schools, of whom 300…
NASA Astrophysics Data System (ADS)
Hoell, Simon; Omenzetter, Piotr
2015-03-01
The development of large wind turbines that enable energy to be harvested more efficiently is a consequence of the increasing demand for renewables in the world. To optimize the potential energy output, light and flexible wind turbine blades (WTBs) are designed. However, the higher flexibilities and lower buckling capacities adversely affect the long-term safety and reliability of WTBs, and the increased operation and maintenance costs thus reduce the expected revenue. Effective structural health monitoring techniques can help to counteract this by limiting inspection efforts and avoiding unplanned maintenance actions. Vibration-based methods deserve high attention due to the moderate instrumentation effort and their applicability for in-service measurements. The present paper proposes the use of cross-correlations (CCs) of acceleration responses between sensors at different locations for structural damage detection in WTBs. CCs have previously been applied successfully for damage detection in numerical and experimental beam structures, but utilizing only single lags between the signals. The present approach uses vectors of CC coefficients for multiple lags between measurements of two selected sensors, taken from multiple possible combinations of sensors. To reduce the dimensionality of the damage-sensitive feature (DSF) vectors, principal component analysis is performed. The optimal number of principal components (PCs) is chosen with respect to a statistical threshold. Finally, the detection phase uses the selected PCs of the healthy structure to calculate scores from a current DSF vector, and statistical hypothesis testing is performed to make a decision about the current structural state. The method is applied to laboratory experiments conducted on a small WTB with non-destructive damage scenarios.
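A condensed sketch of this pipeline — multi-lag CC features, PCA compression, and a decision rule on the healthy-state scores — is given below. The signals are synthetic, and the simple Mahalanobis-distance control chart stands in for the paper's formal hypothesis test:

```python
import numpy as np
from sklearn.decomposition import PCA

def cc_feature(x, y, max_lag=20):
    """Damage-sensitive feature: normalized cross-correlation of two
    acceleration signals at lags 0..max_lag."""
    x = (x - x.mean()) / x.std()
    y = (y - y.mean()) / y.std()
    n = len(x)
    return np.array([np.dot(x[:n - k], y[k:]) / n for k in range(max_lag + 1)])

rng = np.random.default_rng(5)
# 200 healthy-state measurements from one sensor pair (synthetic noise).
healthy = np.stack([cc_feature(*rng.normal(size=(2, 4096))) for _ in range(200)])

# Compress the DSF vectors; keep enough PCs for, e.g., 95% of the variance.
pca = PCA(n_components=0.95).fit(healthy)
scores = pca.transform(healthy)

# Control chart: flag a new measurement whose squared Mahalanobis distance
# in PC space exceeds a threshold set from the healthy training scores.
cov_inv = np.linalg.inv(np.cov(scores, rowvar=False))
d2 = np.einsum("ij,jk,ik->i", scores, cov_inv, scores)
threshold = np.quantile(d2, 0.99)
```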
76 FR 27563 - Margin and Capital Requirements for Covered Swap Entities
Federal Register 2010, 2011, 2012, 2013, 2014
2011-05-11
Contacts listed include Sean D. Campbell, Deputy Associate Director, and Michael Gibson, Senior Associate Director, Division of Research and Statistics, Federal Reserve Board; and Robert Collender, Principal Policy Analyst, Office of Policy Analysis and Research, FHFA.
Guo, Hui; Zhang, Zhen; Yao, Yuan; Liu, Jialin; Chang, Ruirui; Liu, Zhao; Hao, Hongyuan; Huang, Taohong; Wen, Jun; Zhou, Tingting
2018-08-30
Semen sojae praeparatum, with homology of medicine and food, is a famous traditional Chinese medicine. A simple and effective quality fingerprint analysis, coupled with chemometrics methods, was developed for quality assessment of Semen sojae praeparatum. First, similarity analysis (SA) and hierarchical clustering analysis (HCA) were applied to select the qualitative markers that most influence the quality of Semen sojae praeparatum; 21 chemicals were selected and characterized by high-resolution liquid chromatography-ion trap/time-of-flight mass spectrometry (LC-IT-TOF-MS). Subsequently, principal components analysis (PCA) and orthogonal partial least squares discriminant analysis (OPLS-DA) were conducted to select the quantitative markers of Semen sojae praeparatum samples from different origins. Moreover, 11 compounds with statistical significance were determined quantitatively, which provided accurate and informative data for quality evaluation. This study proposes a new strategy of "statistical analysis-based fingerprint establishment", which should be a valuable reference for further study. Copyright © 2018 Elsevier Ltd. All rights reserved.
Zhi, Ruicong; Zhao, Lei; Xie, Nan; Wang, Houyin; Shi, Bolin; Shi, Jingye
2016-01-13
A framework for establishing a standard reference scale for texture is proposed, based on multivariate statistical analysis of instrumental measurements and sensory evaluation. Multivariate statistical analysis is conducted to rapidly select typical reference samples with the characteristics of universality, representativeness, stability, substitutability, and traceability. The reasonableness of the framework is verified by establishing a standard reference scale for the texture attribute hardness with well-known Chinese foods. More than 100 food products in 16 categories were tested using instrumental measurement (the TPA test), and the results were analyzed with clustering analysis, principal component analysis, relative standard deviation, and analysis of variance. As a result, nine kinds of foods were selected to construct the hardness standard reference scale. The results indicate that the regression between the estimated sensory value and the instrumentally measured value is significant (R(2) = 0.9765), which fits well with Stevens's theory. The research provides a reliable theoretical basis and practical guide for establishing quantitative standard reference scales for food texture characteristics.
Cole, Jacqueline M.; Cheng, Xie; Payne, Michael C.
2016-10-18
The use of principal component analysis (PCA) to statistically infer features of local structure from experimental pair distribution function (PDF) data is assessed in a case study of rare-earth phosphate glasses (REPGs). Such glasses, co-doped with two rare-earth ions (R and R') of different sizes and optical properties, are of interest to the laser industry. The determination of structure-property relationships in these materials is an important aspect of their technological development. Yet, realizing the local structure of co-doped REPGs presents significant challenges relative to their singly-doped counterparts; specifically, R and R' are difficult to distinguish in terms of establishing relative material compositions, identifying atomic pairwise correlation profiles in a PDF that are associated with each ion, and resolving peak overlap of such profiles in PDFs. This study demonstrates that PCA can be employed to help overcome these structural complications, by statistically inferring trends in PDFs that exist for a restricted set of experimental data on REPGs, and using these as training data to predict material compositions and PDF profiles in unknown co-doped REPGs. The application of these PCA methods to resolve individual atomic pairwise correlations in t(r) signatures is also presented. The training methods developed for these structural predictions are pre-validated by testing their ability to reproduce known physical phenomena, such as the lanthanide contraction, on PDF signatures of the structurally simpler singly-doped REPGs. The intrinsic limitations of applying PCA to analyze PDFs, relative to the quality control of source data, data processing, and sample definition, are also considered. Furthermore, while this case study is limited to lanthanide-doped REPGs, this type of statistical inference may easily be extended to other inorganic solid-state materials, and be exploited in large-scale data-mining efforts that probe many t(r) functions.
Early forest fire detection using principal component analysis of infrared video
NASA Astrophysics Data System (ADS)
Saghri, John A.; Radjabi, Ryan; Jacobs, John T.
2011-09-01
A land-based early forest fire detection scheme that exploits the infrared (IR) temporal signature of a fire plume is described. Unlike common land-based and/or satellite-based techniques, which rely on measurement and discrimination of a fire plume directly from its infrared and/or visible reflectance imagery, this scheme is based on exploitation of the fire plume's temporal signature, i.e., temperature fluctuations over the observation period. The method is simple and relatively inexpensive to implement, and the false alarm rate is expected to be lower than that of existing methods. Land-based infrared (IR) cameras are installed in a step-stare-mode configuration in potential fire-prone areas. The sequence of IR video frames from each camera is digitally processed to determine if there is a fire within the camera's field of view (FOV). The process involves applying a principal component transformation (PCT) to each nonoverlapping sequence of video frames from the camera to produce a corresponding sequence of temporally-uncorrelated principal component (PC) images. Since pixels that form a fire plume exhibit statistically similar temporal variation (i.e., have a unique temporal signature), PCT conveniently renders the footprint/trace of the fire plume in low-order PC images. The PC image which best reveals the trace of the fire plume is then selected and spatially filtered via simple threshold and median filter operations to remove background clutter, such as traces of moving tree branches due to wind.
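The PCT step amounts to PCA with frames as variables and pixels as observations, followed by thresholding and median filtering of a low-order PC image. A toy sketch on synthetic frames (the plume patch, amplitudes, and thresholds are invented):

```python
import numpy as np
from scipy.ndimage import median_filter
from sklearn.decomposition import PCA

rng = np.random.default_rng(6)
# Hypothetical IR sequence: 64 frames of 128x128 pixels of sensor noise,
# with a small patch of temporally flickering "plume" pixels added.
frames = rng.normal(size=(64, 128, 128))
frames[:, 40:44, 60:66] += 6 * np.sin(np.arange(64))[:, None, None]

# Principal component transform over time: each pixel is one observation
# whose variables are its values in the 64 frames.
X = frames.reshape(64, -1).T                  # (n_pixels, n_frames)
pcs = PCA(n_components=8).fit_transform(X)    # temporally uncorrelated PC images

# A low-order PC image concentrates the coherent temporal fluctuation;
# median filtering plus a threshold suppresses clutter such as moving branches.
pc_img = np.abs(pcs[:, 0]).reshape(128, 128)
filtered = median_filter(pc_img, size=3)
mask = filtered > np.quantile(filtered, 0.999)
print("candidate fire pixels:", int(mask.sum()))
```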
NASA Astrophysics Data System (ADS)
Song, Bowen; Zhang, Guopeng; Wang, Huafeng; Zhu, Wei; Liang, Zhengrong
2013-02-01
Various types of features, e.g., geometric features, texture features, projection features, etc., have been introduced for polyp detection and differentiation tasks via computer aided detection and diagnosis (CAD) for computed tomography colonography (CTC). Although these features together cover more information in the data, some of them are statistically highly related to others, which makes the feature set redundant and burdens the computation task of CAD. In this paper, we propose a new dimension reduction method which combines hierarchical clustering and principal component analysis (PCA) for the false-positive (FP) reduction task. First, we group all the features based on their similarity using hierarchical clustering, and then PCA is employed within each group. Different numbers of principal components are selected from each group to form the final feature set. A support vector machine is used to perform the classification. The results show that when three principal components were chosen from each group we achieved an area under the receiver operating characteristic curve of 0.905, as high as with the original feature set. Meanwhile, the computation time is reduced by 70% and the feature set size is reduced by 77%. It can be concluded that the proposed method captures the most important information in the feature set and the classification accuracy is not affected by the dimension reduction. The result is promising, and further investigation, such as automatic threshold setting, is worthwhile and in progress.
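A minimal sketch of the two-stage reduction — hierarchical clustering of features on a correlation-based distance, then PCA within each group — might read as follows, with simulated features standing in for the CTC feature set:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.decomposition import PCA

rng = np.random.default_rng(7)
X = rng.normal(size=(500, 60))               # 500 polyp candidates x 60 features
X[:, 20:40] = X[:, :20] + 0.1 * rng.normal(size=(500, 20))  # redundant block

# 1) Group features by similarity: hierarchical clustering on 1 - |corr|.
corr = np.corrcoef(X, rowvar=False)
dist = 1.0 - np.abs(corr)
Z = linkage(dist[np.triu_indices(60, k=1)], method="average")
groups = fcluster(Z, t=8, criterion="maxclust")

# 2) PCA within each group; keep a few leading components per group.
reduced = np.hstack([
    PCA(n_components=min(3, int((groups == g).sum()))).fit_transform(X[:, groups == g])
    for g in np.unique(groups)
])
print("reduced feature set:", reduced.shape)  # feed this to the SVM
```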
Statistical analysis and machine learning algorithms for optical biopsy
NASA Astrophysics Data System (ADS)
Wu, Binlin; Liu, Cheng-hui; Boydston-White, Susie; Beckman, Hugh; Sriramoju, Vidyasagar; Sordillo, Laura; Zhang, Chunyuan; Zhang, Lin; Shi, Lingyan; Smith, Jason; Bailin, Jacob; Alfano, Robert R.
2018-02-01
Analyzing spectral or imaging data collected with various optical biopsy methods is often difficult due to the complexity of the biological basis. Robust methods that can utilize the spectral or imaging data and detect the characteristic spectral or spatial signatures of different types of tissue are challenging to develop but highly desired. In this study, we used various machine learning algorithms to analyze a spectral dataset acquired from normal and cancerous human skin tissue samples using resonance Raman spectroscopy with 532 nm excitation. Algorithms including principal component analysis, nonnegative matrix factorization, and an autoencoder artificial neural network are used to reduce the dimension of the dataset and detect features. A support vector machine with a linear kernel is used to classify the normal tissue and cancerous tissue samples. The efficacies of the methods are compared.
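A skeleton of such a comparison, using PCA and NMF as the dimension reducers feeding a linear SVM (the autoencoder variant is omitted for brevity, and the spectra and labels are simulated), could be:

```python
import numpy as np
from sklearn.decomposition import PCA, NMF
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

rng = np.random.default_rng(8)
# Hypothetical resonance Raman spectra: 120 samples x 800 wavenumber bins
# (nonnegative, as NMF requires), labels 0 = normal, 1 = cancerous.
spectra = np.abs(rng.normal(size=(120, 800)))
labels = rng.integers(0, 2, size=120)

for reducer in (PCA(n_components=10), NMF(n_components=10, max_iter=500)):
    clf = make_pipeline(reducer, SVC(kernel="linear"))
    acc = cross_val_score(clf, spectra, labels, cv=5).mean()
    print(type(reducer).__name__, f"CV accuracy: {acc:.2f}")
```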
Study on 1H-NMR fingerprinting of Rhodiolae Crenulatae Radix et Rhizoma.
Wen, Shi-yuan; Zhou, Jiang-tao; Chen, Yan-yan; Ding, Li-qin; Jiang, Miao-miao
2015-07-01
A nuclear magnetic resonance (1H-NMR) fingerprint of Rhodiola rosea medicinal materials was established and used to distinguish the quality of raw materials from different sources. A pulse sequence with water-peak suppression was employed to acquire 1H-NMR spectra at a temperature of 298 K and a spectrometer frequency of 400.13 MHz. Through a subsection integration method, the obtained NMR data were subjected to similarity analysis and principal component analysis (PCA). Ten batches of raw materials of Rhodiola rosea from different origins were successfully distinguished by PCA. The statistical results indicated that rhodiola glucoside, butyl alcohol, maleic acid and alanine were the main differential ingredients. This method provides an auxiliary approach for evaluating the quality of Rhodiola crenulata without using natural reference substances.
Principal Selection: A National Study of Selection Criteria and Procedures
ERIC Educational Resources Information Center
Palmer, Brandon
2017-01-01
Despite empirical evidence correlating the role of the principal with student achievement, researchers have seldom scrutinized principal selection methods over the past 60 years. This mixed methods study investigated the processes by which school principals are selected. A national sample of top-level school district administrators was used to…
Using Bayes' theorem for free energy calculations
NASA Astrophysics Data System (ADS)
Rogers, David M.
Statistical mechanics is fundamentally based on calculating the probabilities of molecular-scale events. Although Bayes' theorem has generally been recognized as providing key guiding principles for the setup and analysis of statistical experiments [83], classical frequentist models still predominate in the world of computational experimentation. As a starting point for widespread application of Bayesian methods in statistical mechanics, we investigate the central quantity, free energy, from this perspective. This dissertation thus reviews the basics of Bayes' view of probability theory and the maximum entropy formulation of statistical mechanics before providing examples of its application to several advanced research areas. We first apply Bayes' theorem to a multinomial counting problem in order to determine inner-shell and hard-sphere solvation free energy components of quasi-chemical theory [140]. We proceed to consider the general problem of free energy calculations from samples of interaction energy distributions. From there, we turn to spline-based estimation of the potential of mean force [142], and empirical modeling of observed dynamics using integrator matching. The results of this research are expected to advance the state of the art in coarse-graining methods, as they allow a systematic connection from high-resolution (atomic) to low-resolution (coarse) structure and dynamics. In total, our work on these problems constitutes a critical starting point for further application of Bayes' theorem in all areas of statistical mechanics. It is hoped that the understanding so gained will allow for improvements in comparisons between theory and experiment.
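To make "free energy calculations from samples of interaction energy distributions" concrete, here is a minimal Bayesian-bootstrap treatment of the standard exponential-average estimator. This illustrates the Bayesian viewpoint on a familiar estimator, not the dissertation's specific methods:

```python
import numpy as np
from scipy.special import logsumexp

rng = np.random.default_rng(9)
beta = 1.0                                   # 1/kT in reduced units
dU = rng.normal(2.0, 1.0, size=2000)         # sampled interaction energies

# Point estimate: free energy perturbation, Delta F = -kT ln <exp(-beta dU)>.
dF_hat = -(logsumexp(-beta * dU) - np.log(len(dU))) / beta

# Bayesian bootstrap: Dirichlet weights over the samples induce a posterior
# over Delta F instead of a single frequentist point estimate.
w = rng.dirichlet(np.ones(len(dU)), size=4000)
dF_post = -logsumexp(-beta * dU, b=w, axis=1) / beta
print(f"Delta F = {dF_hat:.3f}; 95% credible interval "
      f"[{np.quantile(dF_post, 0.025):.3f}, {np.quantile(dF_post, 0.975):.3f}]")
```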
Milyo, Jeffrey; Mellor, Jennifer M
2003-01-01
Objective To illustrate the potential sensitivity of ecological associations between mortality and certain socioeconomic factors to different methods of age-adjustment. Data Sources Secondary analysis employing state-level data from several publicly available sources. Crude and age-adjusted mortality rates for 1990 are obtained from the U.S. Centers for Disease Control. The Gini coefficient for family income and percent of persons below the federal poverty line are from the U.S. Bureau of Labor Statistics. Putnam's (2000) Social Capital Index was downloaded from ; the Social Mistrust Index was calculated from responses to the General Social Survey, following the method described in Kawachi et al. (1997). All other covariates are obtained from the U.S. Census Bureau. Study Design We use least squares regression to estimate the effect of several state-level socioeconomic factors on mortality rates. We examine whether these statistical associations are sensitive to the use of alternative methods of accounting for the different age composition of state populations. Following several previous studies, we present results for the case when only mortality rates are age-adjusted. We contrast these results with those obtained from regressions of crude mortality on age variables. Principal Findings Different age-adjustment methods can cause a change in the sign or statistical significance of the association between mortality and various socioeconomic factors. When age variables are included as regressors, we find no significant association between mortality and either income inequality, minority racial concentration, or social capital. Conclusions Ecological associations between certain socioeconomic factors and mortality may be extremely sensitive to different age-adjustment methods. PMID:14727797
Rowlands, G J; Musoke, A J; Morzaria, S P; Nagda, S M; Ballingall, K T; McKeever, D J
2000-04-01
A statistically derived disease reaction index based on parasitological, clinical and haematological measurements observed in 309 Boran cattle aged 5 to 8 months following laboratory challenge with Theileria parva is described. Principal component analysis was applied to 13 measures, including first appearance of schizonts, first appearance of piroplasms and first occurrence of pyrexia, together with the duration and severity of these symptoms, and white blood cell count. The first principal component, which was based on approximately equal contributions of the 13 variables, provided the definition of the disease reaction index, defined on a scale of 0-10. As well as providing a more objective measure of the severity of the reaction, the continuous nature of the index score enables more powerful statistical analysis of the data than was previously possible with clinically derived categories of non-, mild, moderate and severe reactions.
Fernee, Christianne; Browne, Martin; Zakrzewski, Sonia
2017-01-01
This paper introduces statistical shape modelling (SSM) for use in osteoarchaeology research. SSM is a full-field, multi-material analytical technique, and is presented as a supplementary geometric morphometric (GM) tool. Lower mandibular canines from two archaeological populations and one modern population were sampled, digitised using micro-CT, aligned, registered to a baseline and statistically modelled using principal component analysis (PCA). Sample material properties were incorporated as a binary enamel/dentin parameter. Results were assessed qualitatively and quantitatively using anatomical landmarks. Finally, the technique's application to inter-sample comparison was demonstrated through analysis of the principal component (PC) weights. It was found that SSM could provide highly detailed qualitative and quantitative insight into archaeological inter- and intra-sample variability. This technique has value for archaeological, biomechanical and forensic applications, including identification, finite element analysis (FEA) and reconstruction from partial datasets. PMID:29216199
La estadistica en el planeamiento educativo (Statistics in Educational Planning).
ERIC Educational Resources Information Center
Leon Pacheco, Tomas
1971-01-01
This document is an English-language abstract (approximately 1500 words) summarizing the author's definitions of the principal physical and human characteristics of elementary and secondary education as presently constituted in Mexico so that school personnel may comply with Mexican regulations that force them to supply educational statistics. For…
NASA Astrophysics Data System (ADS)
Chattopadhyay, Goutami; Chattopadhyay, Surajit; Chakraborthy, Parthasarathi
2012-07-01
The present study deals with daily total ozone concentration time series over four metro cities of India, namely Kolkata, Mumbai, Chennai, and New Delhi, in a multivariate setting. Using the Kaiser-Meyer-Olkin measure, it is established that the data set under consideration is suitable for principal component analysis. Subsequently, by introducing the rotated component matrix for the principal components, the predictors suitable for generating an artificial neural network (ANN) for daily total ozone prediction are identified; multicollinearity is removed in this way. ANN models in the form of multilayer perceptrons trained through backpropagation learning are generated for all of the study zones, and the model outcomes are assessed statistically. Measuring various statistics such as Pearson correlation coefficients, Willmott's indices, percentage errors of prediction, and mean absolute errors, it is observed that for Mumbai and Kolkata the proposed ANN model generates very good predictions. The results are supported by the linearly distributed coordinates in the scatterplots.
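A compact sketch of the PCA-then-MLP idea follows; the predictors are simulated, and the architecture and component count are illustrative rather than the study's configuration:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(10)
# Hypothetical predictors for daily total ozone (temperature, humidity,
# wind components, ...); PCA decorrelates them before the MLP.
X = rng.normal(size=(1000, 12))
y = X[:, 0] - 0.5 * X[:, 3] + 0.1 * rng.normal(size=1000)

model = make_pipeline(StandardScaler(), PCA(n_components=6),
                      MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000,
                                   random_state=0))
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model.fit(X_tr, y_tr)
pred = model.predict(X_te)
print("Pearson r:", np.corrcoef(pred, y_te)[0, 1])
print("mean absolute error:", np.abs(pred - y_te).mean())
```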
NASA Astrophysics Data System (ADS)
Hoell, Simon; Omenzetter, Piotr
2017-04-01
The increasing demand for carbon-neutral energy in a challenging economic environment is a driving factor for erecting ever larger wind turbines in harsh environments, using novel wind turbine blade (WTB) designs characterized by high flexibilities and lower buckling capacities. To counteract the resulting increase in operation and maintenance costs, efficient structural health monitoring systems can be employed to prevent dramatic failures and to schedule maintenance actions according to the true structural state. This paper presents a novel methodology for classifying structural damage using vibrational responses from a single sensor. The method is based on statistical classification using Bayes' theorem and an advanced statistic, which allows the performance to be controlled by varying the number of samples that represent the current state. This is done for multivariate damage-sensitive features (DSFs) defined as partial autocorrelation coefficients (PACCs) estimated from vibrational responses, and principal component analysis scores from PACCs. Additionally, optimal DSFs are composed not only for damage classification but also for damage detection based on binary statistical hypothesis testing, where feature selections are found with a fast forward procedure. The method is applied to laboratory experiments with a small-scale WTB with wind-like excitation and non-destructive damage scenarios. The obtained results demonstrate the advantages of the proposed procedure and are promising for future applications of vibration-based structural health monitoring of WTBs.
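The PACC-based features can be sketched directly with statsmodels' partial autocorrelation estimator; the AR(2) process below is an invented stand-in for wind-excited blade responses:

```python
import numpy as np
from statsmodels.tsa.stattools import pacf
from sklearn.decomposition import PCA

rng = np.random.default_rng(11)

def dsf(signal, n_coeffs=30):
    """Damage-sensitive feature: partial autocorrelation coefficients
    (lags 1..n_coeffs) of one vibration response."""
    return pacf(signal, nlags=n_coeffs)[1:]   # drop lag 0, which is always 1

def ar2(n=4096):
    # Synthetic single-sensor response: a stationary AR(2) process.
    x = np.zeros(n)
    e = rng.normal(size=n)
    for t in range(2, n):
        x[t] = 1.5 * x[t - 1] - 0.7 * x[t - 2] + e[t]
    return x

# Healthy-state feature matrix and its PCA compression.
features = np.stack([dsf(ar2()) for _ in range(100)])
scores = PCA(n_components=5).fit_transform(features)  # compressed PACC features
```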
ERIC Educational Resources Information Center
Ireland, Lakisha Nicole
2017-01-01
This study attempted to determine if there were statistically significant relationships between leadership traits and personality traits of female elementary school principals who serve in school districts located within the Hampton Roads area of Virginia. This study examined randomly selected participants from three school divisions. These…
"A Compliment Is All I Need"--Teachers Telling Principals How to Promote Their Staff's Self-Efficacy
ERIC Educational Resources Information Center
Kass, Efrat
2013-01-01
The purpose of the present study is to compare the perceptions of teachers representing opposite ends of the self-efficacy spectrum regarding the effects of the principal's behavior on their professional self-efficacy. In the first quantitative stage, a statistical procedure was conducted to identify the two groups of teachers: a group of 16…
Anatomical curve identification
Bowman, Adrian W.; Katina, Stanislav; Smith, Joanna; Brown, Denise
2015-01-01
Methods for capturing images in three dimensions are now widely available, with stereo-photogrammetry and laser scanning being two common approaches. In anatomical studies, a number of landmarks are usually identified manually from each of these images and these form the basis of subsequent statistical analysis. However, landmarks express only a very small proportion of the information available from the images. Anatomically defined curves have the advantage of providing a much richer expression of shape. This is explored in the context of identifying the boundary of breasts from an image of the female torso and the boundary of the lips from a facial image. The curves of interest are characterised by ridges or valleys. Key issues in estimation are the ability to navigate across the anatomical surface in three dimensions, the ability to recognise the relevant boundary and the need to assess the evidence for the presence of the surface feature of interest. The first issue is addressed by the use of principal curves, as an extension of principal components, the second by suitable assessment of curvature and the third by change-point detection. P-spline smoothing is used as an integral part of the methods but adaptations are made to the specific anatomical features of interest. After estimation of the boundary curves, the intermediate surfaces of the anatomical feature of interest can be characterised by surface interpolation. This allows shape variation to be explored using standard methods such as principal components. These tools are applied to a collection of images of women where one breast has been reconstructed after mastectomy and where interest lies in shape differences between the reconstructed and unreconstructed breasts. They are also applied to a collection of lip images where possible differences in shape between males and females are of interest. PMID:26041943
Voukantsis, Dimitris; Karatzas, Kostas; Kukkonen, Jaakko; Räsänen, Teemu; Karppinen, Ari; Kolehmainen, Mikko
2011-03-01
In this paper we propose a methodology consisting of specific computational intelligence methods, i.e. principal component analysis and artificial neural networks, in order to inter-compare air quality and meteorological data, and to forecast the concentration levels for environmental parameters of interest (air pollutants). We demonstrate these methods on data monitored in the urban areas of Thessaloniki and Helsinki, in Greece and Finland, respectively. For this purpose, we applied the principal component analysis method in order to inter-compare the patterns of air pollution in the two selected cities. Then, we proceeded with the development of air quality forecasting models for both studied areas. On this basis, we formulated and employed a novel hybrid scheme in the selection process of input variables for the forecasting models, involving a combination of linear regression and artificial neural network (multi-layer perceptron) models. The latter were used for forecasting the daily mean concentrations of PM₁₀ and PM₂.₅ for the next day. Results demonstrated an index of agreement between measured and modelled daily averaged PM₁₀ concentrations between 0.80 and 0.85, while the kappa index for the forecasting of the daily averaged PM₁₀ concentrations reached 60% for both cities. Compared with previous corresponding studies, these statistical parameters indicate an improved performance of air quality parameter forecasting. It was also found that the performance of the models for forecasting the daily mean concentrations of PM₁₀ was not substantially different for the two cities, despite the major differences between the two urban environments under consideration. Copyright © 2011 Elsevier B.V. All rights reserved.
Long-term strength of metals in complex stress state (a survey)
NASA Astrophysics Data System (ADS)
Lokoshchenko, A. M.
2012-05-01
An analytic survey of experimental data and theoretical approaches characterizing the long-term strength of metals in complex stress state is given. In Sections 2 and 3, the results of plane stress tests (with opposite and equal signs of the nonzero principal stresses, respectively) are analyzed. In Section 4, the results of inhomogeneous stress tests (thick-walled tubes under the action of internal pressures and tensile forces) are considered. All known experimental data (35 test series) are analyzed by a criterion approach. An equivalent stress σ_e is introduced as a characteristic of the stress state. Attention is mainly paid to the dependence of σ_e on the principal stresses. Statistical methods are used to obtain an expression for σ_e, which can be used to study various types of the complex stress state. It is shown that for the long-term strength criterion one can use the power or power-fractional dependence of the time to rupture on the equivalent stress. The methods proposed to describe the test results give a good correspondence between the experimental and theoretical values of the time to rupture. In Section 5, the possibilities of complicating the expressions for σ_e by using additional material constants are considered.
A SAR and QSAR study of new artemisinin compounds with antimalarial activity.
Santos, Cleydson Breno R; Vieira, Josinete B; Lobato, Cleison C; Hage-Melim, Lorane I S; Souto, Raimundo N P; Lima, Clarissa S; Costa, Elizabeth V M; Brasil, Davi S B; Macêdo, Williams Jorge C; Carvalho, José Carlos T
2013-12-30
The Hartree-Fock method and the 6-31G** basis set were employed to calculate the molecular properties of artemisinin and 20 derivatives with antimalarial activity. Maps of molecular electrostatic potential (MEPs) and molecular docking were used to investigate the interaction between ligands and the receptor (heme). Principal component analysis and hierarchical cluster analysis were employed to select the most important descriptors related to activity. The correlation between biological activity and molecular properties was obtained using the partial least squares and principal component regression methods. The regression PLS and PCR models built in this study were also used to predict the antimalarial activity of 30 new artemisinin compounds with unknown activity. The models obtained showed not only statistical significance but also predictive ability. The significant molecular descriptors related to the compounds with antimalarial activity were the hydration energy (HE), the charge on the O11 oxygen atom (QO11), the torsion angle O1-O2-Fe-N2 (D2) and the maximum rate of R/Sanderson Electronegativity (RTe+). These variables led to a physical and structural explanation of the molecular properties that should be selected for when designing new ligands to be used as antimalarial agents.
Gooya, Ali; Lekadir, Karim; Alba, Xenia; Swift, Andrew J; Wild, Jim M; Frangi, Alejandro F
2015-01-01
Construction of Statistical Shape Models (SSMs) from arbitrary point sets is a challenging problem due to significant shape variation and lack of explicit point correspondence across the training data set. In medical imaging, point sets can generally represent different shape classes that span healthy and pathological exemplars. In such cases, the constructed SSM may not generalize well, largely because the probability density function (pdf) of the point sets deviates from the underlying assumption of Gaussian statistics. To this end, we propose a generative model for unsupervised learning of the pdf of point sets as a mixture of distinctive classes. A Variational Bayesian (VB) method is proposed for making joint inferences on the labels of the point sets and the principal modes of variation in each cluster. The method provides a flexible framework for handling point sets with no explicit point-to-point correspondences. We also show that by maximizing the marginalized likelihood of the model, the optimal number of clusters of point sets can be determined. We illustrate this work in the context of understanding the anatomical phenotype of the left and right ventricles of the heart. To this end, we use a database containing hearts of healthy subjects, patients with Pulmonary Hypertension (PH), and patients with Hypertrophic Cardiomyopathy (HCM). We demonstrate that our method can outperform traditional PCA in both generalization and specificity measures.
NASA Astrophysics Data System (ADS)
Darvishzadeh, R.; Skidmore, A. K.; Mirzaie, M.; Atzberger, C.; Schlerf, M.
2014-12-01
Accurate estimation of grassland biomass at peak productivity can provide crucial information regarding the functioning and productivity of rangelands. Hyperspectral remote sensing has proved valuable for estimating vegetation biophysical parameters such as biomass using different statistical techniques. However, in statistical analysis of hyperspectral data, multicollinearity is a common problem due to the large number of correlated hyperspectral reflectance measurements. The aim of this study was to examine the prospects for above-ground biomass estimation in a heterogeneous Mediterranean rangeland employing multivariate calibration methods. Canopy spectral measurements were made in the field using a GER 3700 spectroradiometer, along with concomitant in situ measurements of above-ground biomass for 170 sample plots. Multivariate calibration methods including partial least squares regression (PLSR), principal component regression (PCR), and least-squares support vector machine (LS-SVM) regression were used to estimate the above-ground biomass. The prediction accuracy of the multivariate calibration methods was assessed using cross-validated R2 and RMSE. The best model performance was obtained using LS-SVM and then PLSR, both calibrated with the first-derivative reflectance dataset (R2cv = 0.88 and 0.86, RMSEcv = 1.15 and 1.07, respectively). The weakest prediction accuracy appeared when PCR was used (R2cv = 0.31 and RMSEcv = 2.48). The obtained results highlight the importance of multivariate calibration methods for biomass estimation when hyperspectral data are used.
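The PLSR-versus-PCR comparison translates directly into scikit-learn; the simulated "spectra" below, with biomass driven by a few latent directions, are stand-ins for the GER 3700 measurements:

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(12)
# Hypothetical: 170 plots x 600 reflectance bands; biomass depends on a
# few latent spectral directions, as is typical of real hyperspectra.
latent = rng.normal(size=(170, 3))
reflectance = latent @ rng.normal(size=(3, 600)) + 0.05 * rng.normal(size=(170, 600))
biomass = latent[:, 0] + 0.3 * latent[:, 1] + 0.1 * rng.normal(size=170)

# First-derivative spectra, as used for the best models in the study.
d1 = np.gradient(reflectance, axis=1)

plsr = PLSRegression(n_components=5)
pcr = make_pipeline(PCA(n_components=5), LinearRegression())
for name, model in [("PLSR", plsr), ("PCR", pcr)]:
    r2 = cross_val_score(model, d1, biomass, cv=10, scoring="r2").mean()
    print(f"{name}: cross-validated R2 = {r2:.2f}")
```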
ERIC Educational Resources Information Center
Foster, Emily M.
1942-01-01
The U.S. Office of Education is required by law to collect statistics to show the condition and progress of education. Statistics can be made available, on a national scale, to the extent that school administrators, principals, and college officials cooperate on a voluntary basis with the Office of Education in making the facts available. This…
Multivariate frequency domain analysis of protein dynamics
NASA Astrophysics Data System (ADS)
Matsunaga, Yasuhiro; Fuchigami, Sotaro; Kidera, Akinori
2009-03-01
Multivariate frequency domain analysis (MFDA) is proposed to characterize the collective vibrational dynamics of a protein obtained by a molecular dynamics (MD) simulation. MFDA performs principal component analysis (PCA) on a bandpass-filtered multivariate time series using the multitaper method of spectral estimation. By applying MFDA to MD trajectories of bovine pancreatic trypsin inhibitor, we determined the collective vibrational modes in the frequency domain, which were identified by their vibrational frequencies and eigenvectors. At near zero temperature, the vibrational modes determined by MFDA agreed well with those calculated by normal mode analysis. At 300 K, the vibrational modes exhibited characteristic features that were considerably different from the principal modes of the static distribution given by standard PCA. The influences of aqueous environments were discussed based on two different sets of vibrational modes, one derived from an MD simulation in water and the other from a simulation in vacuum. Using the varimax rotation, an algorithm from multivariate statistical analysis, the representative orthogonal set of eigenmodes was determined at each vibrational frequency.
NASA Astrophysics Data System (ADS)
Soares dos Santos, T.; Mendes, D.; Rodrigues Torres, R.
2016-01-01
Several studies have been devoted to dynamic and statistical downscaling for analysis of both climate variability and climate change. This paper introduces an application of artificial neural networks (ANNs) and multiple linear regression (MLR) by principal components to estimate rainfall in South America. This method is proposed for downscaling monthly precipitation time series over South America for three regions: the Amazon; northeastern Brazil; and the La Plata Basin, which is one of the regions of the planet that will be most affected by the climate change projected for the end of the 21st century. The downscaling models were developed and validated using CMIP5 model output and observed monthly precipitation. We used general circulation model (GCM) experiments for the 20th century (RCP historical; 1970-1999) and two scenarios (RCP 2.6 and 8.5; 2070-2100). The model test results indicate that the ANNs significantly outperform the MLR downscaling of monthly precipitation variability.
Discrimination of serum Raman spectroscopy between normal and colorectal cancer
NASA Astrophysics Data System (ADS)
Li, Xiaozhou; Yang, Tianyue; Yu, Ting; Li, Siqi
2011-07-01
Raman spectroscopy of tissues has been widely studied for the diagnosis of various cancers, but biofluids have seldom been used as the analyte because of the low concentrations involved. Here, Raman spectra of serum from 30 normal people, 46 colon cancer patients, and 44 rectum cancer patients were measured and analyzed. Information from the Raman peaks (intensity and width) and from the fluorescence background (baseline function coefficients) was selected as parameters for statistical analysis. Principal component regression (PCR) and partial least squares regression (PLSR) were used on the selected parameters separately to assess the performance of the parameters; PCR performed better than PLSR on our spectral data. Linear discriminant analysis (LDA) was then applied to the principal components (PCs) from the two regression methods on the selected parameters, and diagnostic accuracies of 88% and 83% were obtained. The conclusion is that the selected features retain the information in the original spectra well, and that Raman spectroscopy of serum has potential for the diagnosis of colorectal cancer.
Foong, Shaohui; Sun, Zhenglong
2016-08-12
In this paper, a novel magnetic field-based sensing system employing statistically optimized concurrent multiple sensor outputs for precise field-position association and localization is presented. This method capitalizes on the independence between simultaneous spatial field measurements at multiple locations to induce unique correspondences between field and position. This single-source, multi-sensor configuration is able to achieve accurate and precise localization and tracking of translational motion without contact over large travel distances for feedback control. Principal component analysis (PCA) is used as a pseudo-linear filter to optimally reduce the dimensions of the multi-sensor output space for computationally efficient field-position mapping with artificial neural networks (ANNs). Numerical simulations are employed to investigate the effects of geometric parameters and Gaussian noise corruption on the performance of PCA-assisted ANN mapping. Using a 9-sensor network, the sensing accuracy and closed-loop tracking performance of the proposed optimal field-based sensing system are experimentally evaluated on a linear actuator, with a significantly more expensive optical encoder used for comparison.
NASA Astrophysics Data System (ADS)
Babanova, Sofia; Artyushkova, Kateryna; Ulyanova, Yevgenia; Singhal, Sameer; Atanassov, Plamen
2014-01-01
Two statistical methods, design of experiments (DOE) and principal component analysis (PCA), are employed to investigate and improve the performance of air-breathing gas-diffusional enzymatic electrodes. DOE is utilized as a tool for systematic organization and evaluation of the various factors affecting the performance of the composite system. Based on the results from the DOE, an improved cathode is constructed. The current density generated using the improved cathode (755 ± 39 μA cm-2 at 0.3 V vs. Ag/AgCl) is 2-5 times higher than the highest current density previously achieved. Three major factors contributing to the cathode performance are identified: the amount of enzyme, the volume of phosphate buffer used to immobilize the enzyme, and the thickness of the gas-diffusion layer (GDL). PCA is applied as an independent confirmation tool to support the conclusions drawn from the DOE and to visualize the contribution of each factor in individual cathode configurations.
Parallel auto-correlative statistics with VTK.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pebay, Philippe Pierre; Bennett, Janine Camille
2013-08-01
This report summarizes the existing statistical engines in VTK and presents both the serial and parallel auto-correlative statistics engines. It is a sequel to [PT08, BPRT09b, PT09, BPT09, PT10], which studied the parallel descriptive, correlative, multi-correlative, principal component analysis, contingency, k-means, and order statistics engines. The ease of use of the new parallel auto-correlative statistics engine is illustrated by means of C++ code snippets, and algorithm verification is provided. This report justifies the design of the statistics engines with parallel scalability in mind, and provides scalability and speed-up analysis results for the auto-correlative statistics engine.
NASA Technical Reports Server (NTRS)
Park, Steve
1990-01-01
A large and diverse number of computational techniques are routinely used to process and analyze remotely sensed data. These techniques include: univariate statistics; multivariate statistics; principal component analysis; pattern recognition and classification; other multivariate techniques; geometric correction; registration and resampling; radiometric correction; enhancement; restoration; Fourier analysis; and filtering. Each of these techniques will be considered, in order.
Forest statistics for Southeast Georgia, 1996
Michael T. Thompson; Raymond M. Sheffield
1997-01-01
This report highlights the principal findings of the seventh forest survey of Southeast Georgia. Field work began in November 1995 and was completed in November 1996. Six previous surveys, completed in 1934, 1952, 1960, 1971, 1981, and 1988 provide statistics for measuring changes and trends over the past 62 years. This report primarily emphasizes the changes and...
NASA Astrophysics Data System (ADS)
Kozoderov, V. V.; Kondranin, T. V.; Dmitriev, E. V.
2017-12-01
The basic model for the recognition of natural and anthropogenic objects from their spectral and textural features is described for the problem of processing hyperspectral airborne and spaceborne imagery. The model is based on improvements to the Bayesian classifier, a computational procedure for statistical decision making in machine-learning methods of pattern recognition. The principal component method is implemented to decompose the hyperspectral measurements on a basis of empirical orthogonal functions. Application examples of various modifications of the Bayesian classifier and the support vector machine method are shown. Examples are provided of comparing these classifiers with a metrical classifier that operates by finding the minimal Euclidean distance between different points and sets in the multidimensional feature space. A comparison is also carried out with the "K-weighted neighbors" method, which is close to the nonparametric Bayesian classifier.
ERIC Educational Resources Information Center
Hill, Jason; Ottem, Randolph; DeRoche, John
2016-01-01
Using data from seven administrations of the Schools and Staffing Survey (SASS), this Statistics in Brief examines trends in public and private school principal demographics, experience, and compensation over 25 years, from 1987-88 through 2011-12. Data are drawn from the 1987-88, 1990-91, 1993-94, 1999-2000, 2003-04, 2007-08, and 2011-12 survey…
ERIC Educational Resources Information Center
St. Martin, Kimberly
2010-01-01
This mixed methods study attempted to determine how a system of regular feedback from teachers on the principal's work influenced the way in which two principals incorporated the 21 Leadership Responsibilities into their daily practice. Since principals are the front line of school reform, even when a school commits to an improvement effort that is…
Likelihood-based methods for evaluating principal surrogacy in augmented vaccine trials.
Liu, Wei; Zhang, Bo; Zhang, Hui; Zhang, Zhiwei
2017-04-01
There is growing interest in assessing immune biomarkers, which are quick to measure and potentially predictive of long-term efficacy, as surrogate endpoints in randomized, placebo-controlled vaccine trials. This can be done under a principal stratification approach, with principal strata defined using a subject's potential immune responses to vaccine and placebo (the latter may be assumed to be zero). In this context, principal surrogacy refers to the extent to which vaccine efficacy varies across principal strata. Because a placebo recipient's potential immune response to vaccine is unobserved in a standard vaccine trial, augmented vaccine trials have been proposed to produce the information needed to evaluate principal surrogacy. This article reviews existing methods based on an estimated likelihood and a pseudo-score (PS) and proposes two new methods based on a semiparametric likelihood (SL) and a pseudo-likelihood (PL), for analyzing augmented vaccine trials. Unlike the PS method, the SL method does not require a model for missingness, which can be advantageous when immune response data are missing by happenstance. The SL method is shown to be asymptotically efficient, and it performs similarly to the PS and PL methods in simulation experiments. The PL method appears to have a computational advantage over the PS and SL methods.
What Effective Principals Do to Improve Instruction and Increase Student Achievement
ERIC Educational Resources Information Center
Turner, Elizabeth Anne
2013-01-01
The purposes of this mixed method study were to (a) examine the relationships among principal effectiveness, principal instructional leadership, and student achievement; (b) examine the differences among principal effectiveness, principal instructional leadership and student achievement; and (c) investigate what effective principals do to improve…
Comparing Networks from a Data Analysis Perspective
NASA Astrophysics Data System (ADS)
Li, Wei; Yang, Jing-Yu
To probe network characteristics, the two predominant approaches to network comparison are global property statistics and subgraph enumeration. However, they suffer from limited information and exhaustive computation. Here, we present an approach to comparing networks from a data analysis perspective. The approach first projects each node of the original network as a high-dimensional data point, so that the network is seen as a cloud of data points. The dispersion information in the principal component analysis (PCA) projection of the generated data cloud can then be used to distinguish networks. We applied this node projection method to yeast protein-protein interaction networks and Internet Autonomous System networks, two types of networks that share several similar higher-order properties. The method efficiently distinguishes one from the other. Identical results on datasets from independent sources also indicate that the method is a robust and general framework.
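A minimal sketch of the node-projection idea follows: each node becomes a point whose coordinates are its row of the adjacency matrix, and the PCA dispersion of the resulting point cloud serves as a network signature. The networkx generators below are illustrative stand-ins for the PPI and AS data.

    # Sketch: project nodes as adjacency-matrix rows, compare PCA dispersion.
    import numpy as np
    import networkx as nx
    from sklearn.decomposition import PCA

    def pca_dispersion(G, k=5):
        A = nx.to_numpy_array(G)             # node i -> point A[i, :]
        return PCA(n_components=k).fit(A).explained_variance_ratio_

    G1 = nx.barabasi_albert_graph(200, 3, seed=1)   # scale-free
    G2 = nx.erdos_renyi_graph(200, 0.03, seed=1)    # random
    print("BA:", np.round(pca_dispersion(G1), 3))
    print("ER:", np.round(pca_dispersion(G2), 3))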
Compressive strength of human openwedges: a selection method
NASA Astrophysics Data System (ADS)
Follet, H.; Gotteland, M.; Bardonnet, R.; Sfarghiu, A. M.; Peyrot, J.; Rumelhart, C.
2004-02-01
A series of 44 samples of bone wedges of human origin, intended for allograft openwedge osteotomy and obtained without particular precautions during hip arthroplasty, were re-examined. After viral inactivation chemical treatment, lyophilisation and radio-sterilisation (intended to ensure optimal health safety), the compressive strength, independent of age, sex and the height of the sample (or angle of cut), proved to be too widely dispersed [10-158 MPa] in the first study. We propose a method for selecting samples which takes into account their geometry (width, length, thicknesses, cortical surface area). Statistical methods (Principal Components Analysis (PCA), Hierarchical Cluster Analysis, Multilinear regression) allowed final selection of 29 samples having a mean compressive strength σmax = 103 ± 26 MPa and a variation of [61-158 MPa]. These results are equivalent to or greater than those of materials currently used in openwedge osteotomy.
Population Analysis of Disabled Children by Departments in France
NASA Astrophysics Data System (ADS)
Meidatuzzahra, Diah; Kuswanto, Heri; Pech, Nicolas; Etchegaray, Amélie
2017-06-01
In this study, a statistical analysis is performed by modelling the variation of the population of disabled children aged 0-19 years among French departments. The aim is to classify the departments according to their profile determinants (socioeconomic and behavioural profiles). The analysis focuses on two methods, principal component analysis (PCA) and multiple correspondence analysis (MCA), to determine which better captures the correlations between the determinants of disability (the independent variables). The PCA proves the better method for this purpose: it reduces the 14 determinants of disability to 4 axes, keeps 80% of the total information, and classifies the departments into 7 classes. The MCA reduces the determinants to 3 axes, retains only 30% of the information, and yields 4 classes.
Lancaster, Cady; Espinoza, Edgard
2012-05-15
International trade of several Dalbergia wood species is regulated by The Convention on International Trade in Endangered Species of Wild Fauna and Flora (CITES). In order to supplement morphological identification of these species, a rapid chemical method of analysis was developed. Using Direct Analysis in Real Time (DART) ionization coupled with Time-of-Flight (TOF) Mass Spectrometry (MS), selected Dalbergia and common trade species were analyzed. Each of the 13 wood species was classified using principal component analysis and linear discriminant analysis (LDA). These statistical data clusters served as reliable anchors for species identification of unknowns. Analysis of 20 or more samples from the 13 species studied in this research indicates that the DART-TOFMS results are reproducible. Statistical analysis of the most abundant ions gave good classifications that were useful for identifying unknown wood samples. DART-TOFMS and LDA analysis of 13 species of selected timber samples and the statistical classification allowed for the correct assignment of unknown wood samples. This method is rapid and can be useful when anatomical identification is difficult but needed in order to support CITES enforcement. Published 2012. This article is a US Government work and is in the public domain in the USA.
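The classification step lends itself to a short hedged sketch: PCA followed by LDA on ion-abundance vectors, with random data standing in for the DART-TOFMS spectra and 13 hypothetical species labels; none of the settings reflect the study's actual calibration.

    # Sketch: PCA + LDA classification of (synthetic) mass-spectral vectors.
    import numpy as np
    from sklearn.pipeline import make_pipeline
    from sklearn.decomposition import PCA
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

    rng = np.random.default_rng(1)
    n_species, n_per, n_ions = 13, 20, 400
    X = np.vstack([rng.normal(mu, 1.0, (n_per, n_ions))
                   for mu in np.linspace(0, 3, n_species)])
    y = np.repeat(np.arange(n_species), n_per)

    model = make_pipeline(PCA(n_components=20), LinearDiscriminantAnalysis())
    model.fit(X, y)
    unknown = rng.normal(1.5, 1.0, (1, n_ions))   # a questioned wood sample
    print("predicted species:", model.predict(unknown))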
Chandrasekaran, A; Ravisankar, R; Harikrishnan, N; Satapathy, K K; Prasad, M V R; Kanagasabapathy, K V
2015-02-25
Anthropogenic activities increase the accumulation of heavy metals in the soil environment. Soil pollution significantly reduces environmental quality and affects human health. In the present study, soil samples were collected at different locations in Yelagiri Hills, Tamilnadu, India, for heavy metal analysis. The samples were analyzed for twelve selected heavy metals (Mg, Al, K, Ca, Ti, Fe, V, Cr, Mn, Co, Ni and Zn) using energy dispersive X-ray fluorescence (EDXRF) spectroscopy. Heavy metal concentrations in soil were investigated using the enrichment factor (EF), geo-accumulation index (Igeo), contamination factor (CF) and pollution load index (PLI) to determine metal accumulation, distribution and pollution status. Heavy metal toxicity risk was assessed using the soil quality guidelines (SQGs) given by the target and intervention values of the Dutch soil standards. The concentrations of Ni, Co, Zn, Cr, Mn, Fe, Ti, K, Al and Mg were mainly controlled by natural sources. Multivariate statistical methods such as the correlation matrix, principal component analysis and cluster analysis were applied to identify heavy metal sources (anthropogenic/natural origin). Geo-statistical methods such as kriging identified hot spots of metal contamination in road areas, influenced mainly by the presence of natural rocks. Copyright © 2014 Elsevier B.V. All rights reserved.
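The four indices named above have standard textbook forms; a minimal sketch with hypothetical concentrations and background values (mg/kg) follows, with Fe assumed as the reference element for the enrichment factor.

    # Sketch: standard pollution indices on made-up concentrations.
    import numpy as np

    def enrichment_factor(c, ref, c_bg, ref_bg):
        return (c / ref) / (c_bg / ref_bg)          # EF, normalized to Fe

    def igeo(c, bg):
        return np.log2(c / (1.5 * bg))              # geo-accumulation index

    def contamination_factor(c, bg):
        return c / bg                               # CF

    def pollution_load_index(cfs):
        return np.prod(cfs) ** (1.0 / len(cfs))     # geometric mean of CFs

    c  = np.array([45.0, 120.0, 30.0])              # e.g. Cr, Zn, Ni in a sample
    bg = np.array([90.0,  95.0, 68.0])              # assumed background values
    fe, fe_bg = 35000.0, 47200.0                    # Fe in sample / background
    cf = contamination_factor(c, bg)
    print("EF:", np.round(enrichment_factor(c, fe, bg, fe_bg), 2))
    print("Igeo:", np.round(igeo(c, bg), 2),
          "PLI:", round(pollution_load_index(cf), 2))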
NASA Astrophysics Data System (ADS)
Moustafa, Azza Aziz; Salem, Hesham; Hegazy, Maha; Ali, Omnia
2015-02-01
Simple, accurate, and selective methods have been developed and validated for the simultaneous determination of a ternary mixture of Chlorpheniramine maleate (CPM), Pseudoephedrine HCl (PSE) and Ibuprofen (IBF) in tablet dosage form. Four univariate methods manipulating ratio spectra were applied: method A is the double divisor-ratio difference spectrophotometric method (DD-RD); method B is the double divisor-derivative ratio spectrophotometric method (DD-DR); method C is the derivative ratio spectrum-zero crossing method (DRZC); and method D is mean centering of ratio spectra (MCR). Two multivariate methods were also developed and validated: methods E and F are Principal Component Regression (PCR) and Partial Least Squares (PLS). The proposed methods have the advantage of determining the mentioned drugs simultaneously without prior separation steps. They were successfully applied to laboratory-prepared mixtures and to a commercial pharmaceutical preparation without any interference from additives. The proposed methods were validated according to the ICH guidelines. The obtained results were statistically compared with the official methods, and no significant difference was observed regarding either accuracy or precision.
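The two multivariate calibrations (PCR and PLS) can be sketched on simulated three-component mixture spectra built by Beer-Lambert-style linear mixing; the synthetic arrays are assumptions standing in for the measured UV spectra and the CPM/PSE/IBF concentrations.

    # Sketch: PCR and PLS calibration on simulated ternary-mixture spectra.
    import numpy as np
    from sklearn.pipeline import make_pipeline
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LinearRegression
    from sklearn.cross_decomposition import PLSRegression

    rng = np.random.default_rng(2)
    S = rng.random((3, 200))                        # pure-component "spectra"
    C = rng.uniform(0.1, 1.0, (40, 3))              # concentrations (3 drugs)
    X = C @ S + rng.normal(0, 0.01, (40, 200))      # noisy mixture spectra

    pcr = make_pipeline(PCA(n_components=3), LinearRegression()).fit(X, C)
    pls = PLSRegression(n_components=3).fit(X, C)
    print("PCR R2:", round(pcr.score(X, C), 3),
          "PLS R2:", round(pls.score(X, C), 3))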
2012-01-01
Background: Gene Set Analysis (GSA) has proven to be a useful approach to microarray analysis. However, most of the method development for GSA has focused on the statistical tests to be used rather than on the generation of sets that will be tested. Existing methods of set generation are often overly simplistic. The creation of sets from individual pathways (in isolation) is a poor reflection of the complexity of the underlying metabolic network. We have developed a novel approach to set generation via the use of Principal Component Analysis of the Laplacian matrix of a metabolic network. We have analysed a relatively simple data set to show the difference in results between our method and the current state-of-the-art pathway-based sets. Results: The sets generated with this method are semi-exhaustive and capture much of the topological complexity of the metabolic network. The semi-exhaustive nature of this method has also allowed us to design a hypergeometric enrichment test to determine which genes are likely responsible for set significance. We show that our method finds significant aspects of biology that would be missed (i.e. false negatives) and addresses the false positive rates found with the use of simple pathway-based sets. Conclusions: The set generation step for GSA is often neglected but is a crucial part of the analysis as it defines the full context for the analysis. As such, set generation methods should be robust and yield as complete a representation of the extant biological knowledge as possible. The method reported here achieves this goal and is demonstrably superior to previous set analysis methods. PMID:22876834
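A hedged sketch of the set-generation idea follows: eigendecompose the graph Laplacian of a toy network and form one candidate set per leading component from the strongly loading nodes. The generator and loading threshold are illustrative choices, not the paper's exact procedure.

    # Sketch: candidate gene sets from Laplacian eigenvectors of a toy graph.
    import numpy as np
    import networkx as nx

    G = nx.watts_strogatz_graph(60, 4, 0.1, seed=3)   # toy metabolic network
    L = nx.laplacian_matrix(G).toarray().astype(float)
    vals, vecs = np.linalg.eigh(L)

    sets = []
    for k in range(1, 6):                 # skip the trivial constant vector
        load = np.abs(vecs[:, k])
        members = np.flatnonzero(load > load.mean() + load.std())
        sets.append(set(members))         # one candidate set per component
    print("set sizes:", [len(s) for s in sets])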
Maranesi, E; Merlo, A; Fioretti, S; Zemp, D D; Campanini, I; Quadri, P
2016-02-01
Identification of future non-fallers, infrequent fallers and frequent fallers among older people would permit focusing the delivery of prevention programs on selected individuals. Posturographic parameters have been proven to differentiate between non-fallers and frequent fallers, but not between the first group and infrequent fallers. In this study, postural stability with eyes open and closed, on both a firm and a compliant surface, and while performing a cognitive task was assessed in a consecutive sample of 130 cognitively able elderly, mean age 77 (SD 7) years, categorized as non-fallers (N=67), infrequent fallers (one/two falls, N=45) and frequent fallers (more than two falls, N=18) according to their fall history over the previous year. Principal Component Analysis was used to select the most significant features from a set of 17 posturographic parameters. Next, variables derived from the principal component analysis were used to test group differences in each task. One parameter, based on a combination of Centre of Pressure anterior-posterior variables obtained from the eyes-open, compliant-surface task, was statistically different among all groups, thus distinguishing infrequent fallers from both non-fallers (P<0.05) and frequent fallers (P<0.05). For the first time, a method based on posturographic data to retrospectively discriminate infrequent fallers was obtained. The joint use of the eyes-open, compliant-surface condition and this new parameter could be used, in a future study, to improve the performance of protocols and to verify the ability of this method to identify new fallers in elderly people without cognitive impairment. Copyright © 2015 Elsevier Ltd. All rights reserved.
Stupák, Ivan; Pavloková, Sylvie; Vysloužil, Jakub; Dohnal, Jiří; Čulen, Martin
2017-11-23
Biorelevant dissolution instruments represent an important tool for pharmaceutical research and development. These instruments are designed to simulate the dissolution of drug formulations in conditions most closely mimicking the gastrointestinal tract. In this work, we focused on the optimization of dissolution compartments/vessels for an updated version of the biorelevant dissolution apparatus-Golem v2. We designed eight compartments of uniform size but different inner geometry. The dissolution performance of the compartments was tested using immediate release caffeine tablets and evaluated by standard statistical methods and principal component analysis. Based on two phases of dissolution testing (using 250 and 100 mL of dissolution medium), we selected two compartment types yielding the highest measurement reproducibility. We also confirmed a statistically significant effect of agitation rate and dissolution volume on the extent of drug dissolved and on measurement reproducibility.
Provenance establishment of coffee using solution ICP-MS and ICP-AES.
Valentin, Jenna L; Watling, R John
2013-11-01
Statistical interpretation of the concentrations of 59 elements, determined using solution based inductively coupled plasma mass spectrometry (ICP-MS) and inductively coupled plasma emission spectroscopy (ICP-AES), was used to establish the provenance of coffee samples from 15 countries across five continents. Data confirmed that the harvest year, degree of ripeness and whether the coffees were green or roasted had little effect on the elemental composition of the coffees. The application of linear discriminant analysis and principal component analysis of the elemental concentrations permitted up to 96.9% correct classification of the coffee samples according to their continent of origin. When samples from each continent were considered separately, up to 100% correct classification of coffee samples into their countries and plantations of origin was achieved. This research demonstrates the potential of using elemental composition, in combination with statistical classification methods, for accurate provenance establishment of coffee. Copyright © 2013 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Baddari, Kamel; Bellalem, Fouzi; Baddari, Ibtihel; Makdeche, Said
2016-10-01
Statistical tests have been used to adjust the Zemmouri seismic data using a distribution function. The Pareto law has been used, and the probabilities of various expected earthquakes were computed. A mathematical expression giving the quantiles was established. The limiting law of extreme values confirmed the accuracy of the adjustment method. Using the moment magnitude scale, a probabilistic model was built to predict the occurrence of strong earthquakes. The seismic structure has been characterized by the slope of the recurrence plot γ, the fractal dimension D, the concentration parameter K_sr, and the Hurst exponents H_r and H_t. The values of D, γ, K_sr, H_r, and H_t diminished many months before the principal seismic shock (M = 6.9) of the studied seismoactive zone occurred.
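The Pareto-tail step can be sketched as a maximum-likelihood (Hill) fit; the synthetic sample below stands in for the Zemmouri catalogue, and the lower cutoff x_min is an assumed value.

    # Sketch: maximum-likelihood Pareto fit and exceedance probability.
    import numpy as np

    rng = np.random.default_rng(4)
    x_min, alpha_true = 1.0, 1.8
    x = x_min * (1.0 - rng.random(500)) ** (-1.0 / alpha_true)  # Pareto draws

    alpha_hat = len(x) / np.sum(np.log(x / x_min))              # Hill MLE
    x0 = 10.0                                                   # a large event
    print(f"alpha ~ {alpha_hat:.2f}, "
          f"P(X > {x0}) ~ {(x0 / x_min) ** -alpha_hat:.4f}")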
Multi-element fingerprinting as a tool in origin authentication of four east China marine species.
Guo, Lipan; Gong, Like; Yu, Yanlei; Zhang, Hong
2013-12-01
The contents of 25 elements in 4 types of commercial marine species from the East China Sea were determined by inductively coupled plasma mass spectrometry and atomic absorption spectrometry. The elemental composition was used to differentiate marine species according to geographical origin by multivariate statistical analysis. The results showed that principal component analysis could distinguish samples from different areas and reveal the elements which played the most important role in origin diversity. The established models by partial least squares discriminant analysis (PLS-DA) and by probabilistic neural network (PNN) can both precisely predict the origin of the marine species. Further study indicated that PLS-DA and PNN were efficacious in regional discrimination. The models from these 2 statistical methods, with an accuracy of 97.92% and 100%, respectively, could both distinguish samples from different areas without the need for species differentiation. © 2013 Institute of Food Technologists®
NASA Technical Reports Server (NTRS)
Stefanick, M.; Jurdy, D. M.
1984-01-01
Statistical analyses are compared for two published hot spot data sets, one minimal set of 42 and another larger set of 117, using three different approaches. First, the earth's surface is divided into 16 equal-area fractions and the observed distribution of hot spots among them is analyzed using chi-square tests. Second, cumulative distributions about the principal axes of the hot spot inertia tensor are used to describe the hot spot distribution. Finally, a hot spot density function is constructed for each of the two hot spot data sets. The methods all indicate that hot spots have a nonuniform distribution, even when statistical fluctuations are considered. To first order, hot spots are concentrated on one half of the earth's surface area; within that portion, the distribution is consistent with a uniform distribution. For neither data set are the observed hot spot densities explained solely by plate speed.
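The first approach translates directly into a chi-square uniformity test over 16 equal-area cells; the counts below are made up for illustration (they sum to 42, the size of the minimal set).

    # Sketch: chi-square test of hot spot uniformity over equal-area cells.
    import numpy as np
    from scipy.stats import chisquare

    counts = np.array([7, 6, 5, 5, 4, 4, 3, 3, 2, 1, 1, 0, 0, 0, 0, 1])
    expected = np.full(16, counts.sum() / 16.0)     # uniform expectation
    stat, p = chisquare(counts, expected)
    print(f"chi2 = {stat:.1f}, p = {p:.4f}")        # small p -> nonuniform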
Sayago, Ana; González-Domínguez, Raúl; Beltrán, Rafael; Fernández-Recamales, Ángeles
2018-09-30
This work explores the potential of multi-element fingerprinting in combination with advanced data mining strategies to assess the geographical origin of extra virgin olive oil samples. For this purpose, the concentrations of 55 elements were determined in 125 oil samples from multiple Spanish geographic areas. Several unsupervised and supervised multivariate statistical techniques were used to build classification models and investigate the relationship between mineral composition of olive oils and their provenance. Results showed that Spanish extra virgin olive oils exhibit characteristic element profiles, which can be differentiated on the basis of their origin in accordance with three geographical areas: Atlantic coast (Huelva province), Mediterranean coast and inland regions. Furthermore, statistical modelling yielded high sensitivity and specificity, principally when random forest and support vector machines were employed, thus demonstrating the utility of these techniques in food traceability and authenticity research. Copyright © 2018 Elsevier Ltd. All rights reserved.
NASA Technical Reports Server (NTRS)
Talpe, Matthieu J.; Nerem, R. Steven; Forootan, Ehsan; Schmidt, Michael; Lemoine, Frank G.; Enderlin, Ellyn M.; Landerer, Felix W.
2017-01-01
We construct long-term time series of Greenland and Antarctic ice sheet mass change from satellite gravity measurements. A statistical reconstruction approach is developed based on a principal component analysis (PCA) to combine high-resolution spatial modes from the Gravity Recovery and Climate Experiment (GRACE) mission with the gravity information from conventional satellite tracking data. Uncertainties of this reconstruction are rigorously assessed; they include temporal limitations for short GRACE measurements, spatial limitations for the low-resolution conventional tracking data measurements, and limitations of the estimated statistical relationships between low- and high-degree potential coefficients reflected in the PCA modes. Trends of mass variations in Greenland and Antarctica are assessed against a number of previous studies. The resulting time series for Greenland show a higher rate of mass loss than other methods before 2000, while the Antarctic ice sheet appears heavily influenced by interannual variations.
3D Texture Analysis in Renal Cell Carcinoma Tissue Image Grading
Cho, Nam-Hoon; Choi, Heung-Kook
2014-01-01
One of the most significant processes in cancer cell and tissue image analysis is the efficient extraction of features for grading purposes. This research applied two types of three-dimensional texture analysis methods to the extraction of feature values from renal cell carcinoma tissue images, and then evaluated the validity of the methods statistically through grade classification. First, we used a confocal laser scanning microscope to obtain image slices of four grades of renal cell carcinoma, which were then reconstructed into 3D volumes. Next, we extracted quantitative values using a 3D gray level co-occurrence matrix (GLCM) and a 3D wavelet based on two types of basis functions. To evaluate their validity, we predefined 6 different statistical classifiers and applied these to the extracted feature sets. In the grade classification results, 3D Haar wavelet texture features combined with principal component analysis showed the best discrimination results. Classification using 3D wavelet texture features was significantly better than 3D GLCM, suggesting that the former has potential for use in a computer-based grading system. PMID:25371701
NASA Astrophysics Data System (ADS)
Luo, Shuwen; Chen, Changshui; Mao, Hua; Jin, Shaoqin
2013-06-01
The feasibility of early detection of gastric cancer using near-infrared (NIR) Raman spectroscopy (RS) by distinguishing premalignant lesions (adenomatous polyp, n=27) and cancer tissues (adenocarcinoma, n=33) from normal gastric tissues (n=45) is evaluated. Significant differences in Raman spectra are observed among the normal, adenomatous polyp, and adenocarcinoma gastric tissues at 936, 1003, 1032, 1174, 1208, 1323, 1335, 1450, and 1655 cm-1. Diverse statistical methods are employed to develop effective diagnostic algorithms for classifying the Raman spectra of different types of ex vivo gastric tissues, including principal component analysis (PCA), linear discriminant analysis (LDA), and naive Bayesian classifier (NBC) techniques. Compared with PCA-LDA algorithms, PCA-NBC techniques together with the leave-one-out, cross-validation method provide better discrimination of normal, adenomatous polyp, and adenocarcinoma gastric tissues, yielding superior sensitivities of 96.3%, 96.9%, and 96.9%, and specificities of 93%, 100%, and 95.2%, respectively. Therefore, NIR RS associated with multivariate statistical algorithms has the potential for early diagnosis of gastric premalignant lesions and cancer tissues at the molecular level.
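A hedged sketch of the PCA plus naive Bayes pipeline with leave-one-out cross-validation follows; random spectra stand in for the measured Raman data, and the component count is an assumption.

    # Sketch: PCA + Gaussian naive Bayes with leave-one-out cross-validation.
    import numpy as np
    from sklearn.pipeline import make_pipeline
    from sklearn.decomposition import PCA
    from sklearn.naive_bayes import GaussianNB
    from sklearn.model_selection import LeaveOneOut, cross_val_score

    rng = np.random.default_rng(5)
    X = np.vstack([rng.normal(m, 1.0, (35, 600)) for m in (0.0, 0.6, 1.2)])
    y = np.repeat([0, 1, 2], 35)                 # normal / polyp / carcinoma

    model = make_pipeline(PCA(n_components=8), GaussianNB())
    acc = cross_val_score(model, X, y, cv=LeaveOneOut()).mean()
    print(f"LOOCV accuracy: {acc:.3f}")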
Han, Sheng-Nan
2014-07-01
Chemometrics is a young branch of chemistry that is widely applied across analytical chemistry. It uses theories and methods from mathematics, statistics, computer science and related disciplines to optimize the chemical measurement process and to extract the maximum chemical and other information about material systems from chemical measurement data. In recent years, traditional Chinese medicine has attracted widespread attention. In traditional Chinese medicine research, a key problem has been how to interpret the relationship between the various chemical components and their efficacy, which seriously restricts the modernization of Chinese medicine. Because chemometrics brings multivariate analysis methods into chemical research, it has been applied as an effective research tool in the composition-activity relationship research of Chinese medicine. This article reviews the applications of chemometrics methods in composition-activity relationship research in recent years. The applications of multivariate statistical analysis methods (such as regression analysis, correlation analysis, and principal component analysis) and artificial neural networks (such as the back propagation artificial neural network, the radial basis function neural network, and the support vector machine) are summarized, including their basic principles, research content, and advantages and disadvantages. Finally, the main outstanding problems and prospects for future research are discussed.
Schiff, Steven J.; Kiwanuka, Julius; Riggio, Gina; Nguyen, Lan; Mu, Kevin; Sproul, Emily; Bazira, Joel; Mwanga-Amumpaire, Juliet; Tumusiime, Dickson; Nyesigire, Eunice; Lwanga, Nkangi; Bogale, Kaleb T.; Kapur, Vivek; Broach, James R.; Morton, Sarah U.; Warf, Benjamin C.; Poss, Mary
2016-01-01
Neonatal sepsis (NS) is responsible for over 1 million yearly deaths worldwide. In the developing world, NS is often treated without an identified microbial pathogen. Amplicon sequencing of the bacterial 16S rRNA gene can be used to identify organisms that are difficult to detect by routine microbiological methods. However, contaminating bacteria are ubiquitous in both hospital settings and research reagents and must be accounted for to make effective use of these data. In this study, we sequenced the bacterial 16S rRNA gene obtained from blood and cerebrospinal fluid (CSF) of 80 neonates presenting with NS to the Mbarara Regional Hospital in Uganda. Assuming that patterns of background contamination would be independent of pathogenic microorganism DNA, we applied a novel quantitative approach using principal orthogonal decomposition to separate background contamination from potential pathogens in sequencing data. We designed our quantitative approach contrasting blood, CSF, and control specimens and employed a variety of statistical random matrix bootstrap hypotheses to estimate statistical significance. These analyses demonstrate that Leptospira appears present in some infants presenting within 48 h of birth, indicative of infection in utero, and up to 28 days of age, suggesting environmental exposure. This organism cannot be cultured in routine bacteriological settings and is enzootic in the cattle that often live in close proximity to the rural peoples of western Uganda. Our findings demonstrate that statistical approaches to remove background organisms common in 16S sequence data can reveal putative pathogens in small volume biological samples from newborns. This computational analysis thus reveals an important medical finding that has the potential to alter therapy and prevention efforts in a critically ill population. PMID:27379237
Bioclimatic Classification of Northeast Asia for climate change response
NASA Astrophysics Data System (ADS)
Choi, Y.; Jeon, S. W.; Lim, C. H.
2016-12-01
As climate change intensifies, we need to monitor changes in biodiversity and the distribution of species in order to manage the associated risks and opportunities. The development of a bioclimatic map, which classifies land into homogeneous zones with similar environmental properties, is the first step in establishing such a strategy. Statistically derived classifications of land provide useful spatial frameworks to support ecosystem research, monitoring and policy decisions. Many countries are building this kind of map and actively using it for ecosystem conservation and management. However, Northeast Asia, including North Korea, lacks detailed environmental information and has no environmental classification map. Therefore, this study presents a bioclimatic map of Northeast Asia based on statistical clustering of bioclimate data. Bioclim data ver. 1.4 provided by WorldClim were considered for inclusion in the model. Eight of the most relevant climate variables were selected by correlation analysis, based on previous studies. Principal Components Analysis (PCA) explained 86% of the variation in three independent dimensions, which were subsequently clustered using ISODATA clustering. The bioclimatic zones of Northeast Asia could be divided into 29, 35, or 50 zones, at a 30' resolution. To assess accuracy, the correlation coefficient was calculated between the first principal component of the classification variables and a vegetation index, Gross Primary Production (GPP); the Pearson correlation coefficient was about 0.5. This study constructed a high-resolution bioclimatic map of Northeast Asia by statistical methods, but to better reflect reality a wider variety of climate variables should be considered, and further studies should perform more quantitative and qualitative validation in various ways. The map could then be used more effectively to support decision making on climate change adaptation.
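A minimal sketch of the classification chain follows, with k-means standing in for ISODATA (which additionally splits and merges clusters) and a synthetic grid standing in for the WorldClim variables.

    # Sketch: PCA on gridded bioclim variables, then clustering into zones.
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(6)
    cells = rng.normal(size=(5000, 8))           # 8 bioclim variables per cell
    pcs = PCA(n_components=3).fit(cells)
    scores = pcs.transform(cells)
    print("variance explained:", pcs.explained_variance_ratio_.sum().round(3))

    zones = KMeans(n_clusters=29, n_init=10, random_state=0).fit_predict(scores)
    print("cells in zone 0:", int(np.sum(zones == 0)))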
Pouch, Alison M; Vergnat, Mathieu; McGarvey, Jeremy R; Ferrari, Giovanni; Jackson, Benjamin M; Sehgal, Chandra M; Yushkevich, Paul A; Gorman, Robert C; Gorman, Joseph H
2014-01-01
The basis of mitral annuloplasty ring design has progressed from qualitative surgical intuition to experimental and theoretical analysis of annular geometry with quantitative imaging techniques. In this work, we present an automated three-dimensional (3D) echocardiographic image analysis method that can be used to statistically assess variability in normal mitral annular geometry to support advancement in annuloplasty ring design. Three-dimensional patient-specific models of the mitral annulus were automatically generated from 3D echocardiographic images acquired from subjects with normal mitral valve structure and function. Geometric annular measurements including annular circumference, annular height, septolateral diameter, intercommissural width, and the annular height to intercommissural width ratio were automatically calculated. A mean 3D annular contour was computed, and principal component analysis was used to evaluate variability in normal annular shape. The following mean ± standard deviations were obtained from 3D echocardiographic image analysis: annular circumference, 107.0 ± 14.6 mm; annular height, 7.6 ± 2.8 mm; septolateral diameter, 28.5 ± 3.7 mm; intercommissural width, 33.0 ± 5.3 mm; and annular height to intercommissural width ratio, 22.7% ± 6.9%. Principal component analysis indicated that shape variability was primarily related to overall annular size, with more subtle variation in the skewness and height of the anterior annular peak, independent of annular diameter. Patient-specific 3D echocardiographic-based modeling of the human mitral valve enables statistical analysis of physiologically normal mitral annular geometry. The tool can potentially lead to the development of a new generation of annuloplasty rings that restore the diseased mitral valve annulus back to a truly normal geometry. Copyright © 2014 The Society of Thoracic Surgeons. Published by Elsevier Inc. All rights reserved.
Feizbakhsh, Masood; Kadkhodaei, Mahmoud; Zandian, Dana; Hosseinpour, Zahra
2017-01-01
Background: One of the most effective ways to move molars distally in treating Class II malocclusion is applying extraoral force through a headgear device. The purpose of this study was to compare stress distribution in the maxillary first molar periodontium when using straight pull headgear in vertical and horizontal tubes, through the finite element method. Materials and Methods: Based on a real geometry model, a basic model of the first molar and maxillary bone was obtained using three-dimensional imaging of the skull. After geometric modeling of the periodontium components in CATIA software and definition of mechanical properties and element classification, a force of 150 g for each headgear was defined in ABAQUS software. Von Mises and principal stresses were then evaluated. The statistical analysis was performed using paired t-tests and Wilcoxon nonparametric tests. Results: The extent of areas with Von Mises and principal stresses using straight pull headgear with a vertical tube did not differ from that with a horizontal tube, but the numerical value of the Von Mises stress in the vertical tube was significantly reduced (P < 0.05). On the other hand, the difference in principal stress between the two tubes was not significant (P > 0.05). Conclusion: Based on the results, when force was applied to the straight pull headgear with a vertical tube, Von Mises stress was reduced significantly in comparison with the horizontal tube. Therefore, to correct mesiolingual movement of the maxillary first molar, a vertical headgear tube is recommended. PMID:28584535
Reflexion on linear regression trip production modelling method for ensuring good model quality
NASA Astrophysics Data System (ADS)
Suprayitno, Hitapriya; Ratnasari, Vita
2017-11-01
Transport modelling is important. For certain cases the conventional model still has to be used, for which a good trip production model is essential. A good model can only be obtained from a good sample. Two basic principles of good sampling are that the sample must be capable of representing the population characteristics and of producing an acceptable error at a given confidence level. These principles do not yet seem well understood or applied in trip production modelling. It is therefore necessary to investigate trip production modelling practice in Indonesia and to formulate a better modelling method that ensures model quality. The results are as follows. Statistics provides a method for calculating the span of a predicted value at a given confidence level for linear regression, called the confidence interval of the predicted value. Common modelling practice uses R2 as the principal quality measure, while sampling practice varies and does not always conform to sampling principles. An experiment indicates that a small sample can already give an excellent R2 value and that the sample composition can significantly change the model. Hence a good R2 value does not always mean good model quality. These observations lead to three basic ideas for ensuring good model quality: reformulating the quality measure, the calculation procedure, and the sampling method. The quality measure is redefined as having both a good R2 value and a good confidence interval of the predicted value. The calculation procedure must incorporate statistical calculation methods and the appropriate statistical tests. A good sampling method must use random, well-distributed, stratified sampling with a certain minimum number of samples. These three ideas need further development and testing.
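The confidence interval of a predicted value mentioned above has a closed form for simple linear regression under the usual normal-error assumptions; the sketch below uses simulated trip production data.

    # Sketch: 95% interval for a predicted value in simple linear regression.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(7)
    x = rng.uniform(1, 10, 30)                   # e.g. households per zone
    y = 5 + 2.0 * x + rng.normal(0, 2, 30)       # trips produced

    n = len(x)
    b1, b0 = np.polyfit(x, y, 1)                 # slope, intercept
    resid = y - (b0 + b1 * x)
    s = np.sqrt(np.sum(resid**2) / (n - 2))      # residual standard error
    sxx = np.sum((x - x.mean())**2)

    x0 = 6.0                                     # zone to predict for
    y0 = b0 + b1 * x0
    half = stats.t.ppf(0.975, n - 2) * s * np.sqrt(
        1 + 1/n + (x0 - x.mean())**2 / sxx)
    print(f"prediction at x0={x0}: {y0:.1f} +/- {half:.1f} (95%)")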
Forest statistics for Central Georgia, 1982
Raymond M. Sheffield; John B. Tansey
1982-01-01
This report highlights the principal findings of the fifth forest survey of Central Georgia. Fieldwork began in October 1981 and was completed in June 1982. Four previous surveys, completed in 1936, 1952, 1961, and 1972, provide statistics for measuring changes and trends over the past 46 years. The primary emphasis in this report is on the changes and trends since...
Forest statistics for North Central Georgia, 1998
Michael T. Thompson
1998-01-01
This report highlights the principal findings of the seventh forest survey of North Central Georgia. Field work began in June 1997 and was completed in November 1997. Six previous surveys, completed in 1936, 1953, 1961, 1972, 1983, and 1989, provide statistics for measuring changes and trends over the past 61 years. This report primarily emphasizes the changes and...
Forest statistics for South Florida, 1995
Michael T. Thompson
1996-01-01
This report highlights the principal findings of the seventh forest survey of South Florida. Field work began in September 1994 and was completed in November 1994. Six previous surveys, completed in 1936, 1949, 1959, 1970, 1980, and 1988 provide statistics for measuring changes and trends over the past 59 years. This report primarily emphasizes the changes and trends...
Forest statistics for Central Georgia, 1997
Michael T. Thompson
1998-01-01
This report highlights the principal findings of the seventh forest survey of Central Georgia. Field work began in November 1996 and was completed in August 1997. Six previous surveys, completed in 1936, 1952, 1961, 1972, 1982, and 1989 provide statistics for measuring changes and trends over the past 61 years. This report primarily emphasizes the changes and trends...
Forest statistics for Central Florida - 1995
Mark J. Brown
1996-01-01
This report highlights the principal findings of the seventh forest survey of Central Florida. Field work began in February 1995 and was completed in May 1995. Six previous surveys, completed in 1936, 1949, 1959, 1970, 1980, and 1988, provide statistics for measuring changes and trends over the past 59 years. This report primarily emphasizes the changes and trends since...
Forest statistics for Southwest Georgia, 1996
Raymond M. Sheffield; Michael T. Thompson
1997-01-01
This report highlights the principal findings of the seventh forest survey of Southwest Georgia. Field work began in June 1995 and was completed in November 1995. Six previous surveys, completed in 1934, 1951, 1960, 1971, 1981, and 1988 provide statistics for measuring changes and trends over the past 62 years. This report primarily emphasizes the changes and trends...
Forest statistics for Northeast Florida, 1980
Raymond M. Sheffield
1981-01-01
This report highlights the principal findings of the fifth forest survey of Northeast Florida. Fieldwork began in June 1979 and was completed in December 1979. Four previous surveys, completed in 1934, 1949, 1959, and 1970, provide statistics for measuring changes and trends over the past 46 years. The primary emphasis in this report is on the changes and trends since...
Indicators of School Crime and Safety: 2017. NCES 2018-036/NCJ 251413
ERIC Educational Resources Information Center
Zhang, Anlan; Wang, Ke; Zhang, Jizhi; Kemp, Jana; Diliberti, Melissa; Oudekerk, Barbara A.
2018-01-01
A joint effort by the National Center for Education Statistics and the Bureau of Justice Statistics, this annual report examines crime occurring in schools and colleges. This report presents data on crime at school from the perspectives of students, teachers, principals, and the general population from an array of sources--the National Crime…
Forest statistics for Virginia, 1992
Tony G. Johnson
1992-01-01
This report highlights the principal findings of the sixth forest survey of Virginia. Field work began in October 1990 and was completed in January 1992. Five previous surveys, completed in 1940, 1957, 1966, 1977, and 1986, provide statistics for measuring changes and trends over the past 52 years. The primary emphasis in this report is on the changes and trends since...
Forest statistics for the Northern Piedmont of Virginia 1976
Raymond M. Sheffield
1976-01-01
This report highlights the principal findings of the fourth inventory of the timber resource in the Northern Piedmont of Virginia. The inventory was started in March 1976 and completed in August 1976. Three previous inventories, completed in 1940, 1957, and 1965, provide statistics for measuring changes and trends over the past 36 years. In this report, the primary...
Forest statistics for the Coastal Plain of Virginia, 1976
Noel D. Cost
1976-01-01
This report highlights the principal findings of the fourth inventory of the timber resource in the coastal Plain of Virginia. The inventory was started in February 1975 and completed in November 1975. Three previous inventories, completed in 1940, 1956, and 1966, provide statistics for measuring changes and trends over the past 36 years. In this report, the primary...
Forest statistics for Northeast Florida, 1987
Mark J. Brown
1987-01-01
This report highlights the principal findings of the sixth forest survey of Northeast Florida. Field work began in January 1987 and was completed in July 1987. Five previous surveys, completed in 1934, 1949, 1959, 1970, and 1980, provide statistics for measuring changes and trends over the past 53 years. The primary emphasis in this report is on the changes and trends...
Forest statistics for Northwest Florida, 1979
Raymond M. Sheffield
1979-01-01
This report highlights the principal findings of the fifth forest survey of Northwest Florida. Fieldwork began in September 1978 and was completed in June 1979. Four previous surveys, completed in 1934, 1949, 1959, and 1969, provide statistics for measuring changes and trends over the past 45 years. The primary emphasis in this report is on the changes and trends since...
Forest statistics for Northeast Florida, 1995
Raymond M. Sheffield
1995-01-01
This report highlights the principal findings of the seventh forest survey of Northeast Florida. Field work began in April 1994 and was completed in May 1995. Six previous surveys, completed in 1934, 1949, 1959, 1970, 1980, and 1987, provide statistics for measuring changes and trends over the past 61 years. The primary emphasis in this report is on the changes and...
Forest statistics for North Georgia, 1983
John B. Tansey
1983-01-01
This report highlights the principal findings of the fifth forest survey of North Georgia. Fieldwork began in September 1982 and was completed in January 1983. Four previous surveys, completed in 1936, 1953, 1961, and 1972, provide statistics for measuring changes and trends over the past 47 years. The primary emphasis in this report is on the changes and trends since...
Forest statistics for North Carolina, 1984
William A. Bechtold
1984-01-01
This report highlights the principal findings of the fifth forest survey of North Carolina. Fieldwork began in November 1982 and was completed in September 1984. Four previous surveys, completed in 1938, 1956, 1964, and 1974, provide statistics for measuring changes and trends over the past 46 years. The primary emphasis in this report is on the changes and trends...
Forest statistics for North Central Georgia, 1983
John B. Tansey
1983-01-01
This report highlights the principal findings of the fifth forest survey of North Central Georgia. Fieldwork began in May 1982 and was completed in September 1982. Four previous surveys, completed in 1936, 1953, 1961, and 1972, provide statistics for measuring changes and trends over the past 47 years. The primary emphasis in this report is on the changes and trends...
Forest statistics for South Carolina, 1978
Raymond M. Sheffield
1978-01-01
This report highlights the principal findings of the fifth inventory of South Carolina's forests. Fieldwork began in April 1977 and was completed in August 1978. Four previous statewide inventories, completed in 1936, 1947, 1958, and 1968, provide statistics for measuring changes and trends over the past 42 years. The primary emphasis in this report is on the...
Forest statistics for Southeast Georgia, 1981
Raymond M. Sheffield
1982-01-01
This report highlights the principal findings of the fifth forest survey of Southeast Georgia. Fieldwork began in November 1980 and was completed in October 1981. Four previous surveys, completed in 1934, 1952, 1960, and 1971, provide statistics for measuring changes and trends over the past 47 years. The primary emphasis in this report is on the changes and trends...
Forest statistics for the Southern Piedmont of Virginia 1976
Raymond M. Sheffield
1976-01-01
This report highlights the principal findings of the fourth inventory of the timber resource in the Southern Piedmont of Virginia. The inventory was started in February 1975 and completed in November 1975. Three previous inventories, completed in 1940, 1956, and 1966, provide statistics for measuring changes and trends over the past 36 years. In this report, the...
Data Sharing and the Development of the Cleveland Clinic Statistical Education Dataset Repository
ERIC Educational Resources Information Center
Nowacki, Amy S.
2013-01-01
Examples are highly sought by both students and teachers. This is particularly true as many statistical instructors aim to engage their students and increase active participation. While simulated datasets are functional, they lack real perspective and the intricacies of actual data. In order to obtain real datasets, the principal investigator of a…
Forest statistics for Virginia, 1986
Mark J. Brown
1986-01-01
This report highlights the principal findings of the fifth forest survey of Virginia. Fieldwork began in September 1984 and was completed in November 1985. Four previous surveys, completed in 1940, 1957, 1966, and 1977, provide statistics for measuring changes and trends over the past 46 years. The primary emphasis in this report is on the changes and trends since 1977...
Forest statistics for Southwest Georgia, 1981
Raymond M. Sheffield
1981-01-01
This report highlights the principal findings of the fifth forest survey of southwest Georgia. Fieldwork began in May 1980 and was completed in November 1980. Four previous surveys, completed in 1934, 1951, 1960, and 1971, provide statistics for measuring changes and trends over the past 47 years. The primary emphasis in this report is on the changes and trends since 1971...
Forest statistics for the Northern Mountain region of Virginia 1977
Raymond M. Sheffield
1977-01-01
This report highlights the principal findings of the fourth inventory of timber resources in the Northern Mountain Region of Virginia. The inventory was started in August 1976 and completed in December 1976. Three previous inventories, completed in 1940, 1957 and 1966, provide statistics for measuring changes and trends over the past 37 years. In this report, the...
Forest statistics for Central Florida - 1980
Raymond M. Sheffield
1981-01-01
This report highlights the principal findings of the fifth forest survey of Central Florida. Fieldwork began in December 1979 and was completed in March 1980. Four previous surveys, completed in 1936, 1949, 1959, and 1970, provide statistics for measuring changes and trends over the past 44 years. The primary emphasis in this report is on the changes and trends since...
Forest statistics for South Carolina, 1986
John B. Tansey
1986-01-01
This report highlights the principal findings of the sixth forest survey in South Carolina. Fieldwork began in November 1985 and was completed in September 1986. Five previous surveys, completed in 1936, 1947, 1958, 1968, and 1978, provide statistics for measuring changes and trends over the past 50 years. The primary emphasis in this report is on the changes and...
Forest statistics for Southwest Georgia, 1988
Michael T. Thompson
1988-01-01
This report highlights the principal findings of the sixth forest survey in southwest Georgia. Field work began in October 1987 and was completed in January 1988. Five previous surveys, completed in 1934, 1951, 1960, 1971, and 1981, provide statistics for measuring changes and trends over the past 54 years. The primary emphasis in this report is on the changes and...
Forest statistics for Central Florida - 1988
Mark J. Brown
1988-01-01
This report highlights the principal findings of the sixth forest survey of Central Florida. Field work began in July 1987 and was completed in September 1987. Five previous surveys, completed in 1936, 1949, 1959, 1970, and 1980, provide statistics for measuring changes and trends over the past 52 years. The primary emphasis in this report is on the changes and trends...
Chavez, P.S.; Sides, S.C.; Anderson, J.A.
1991-01-01
The merging of multisensor image data is becoming a widely used procedure because of the complementary nature of various data sets. Ideally, the method used to merge data sets with high-spatial and high-spectral resolution should not distort the spectral characteristics of the high-spectral resolution data. This paper compares the results of three different methods used to merge the information contents of the Landsat Thematic Mapper (TM) and Satellite Pour l'Observation de la Terre (SPOT) panchromatic data. The comparison is based on spectral characteristics and is made using statistical, visual, and graphical analyses of the results. The three methods used to merge the information contents of the Landsat TM and SPOT panchromatic data were the Hue-Intensity-Saturation (HIS), Principal Component Analysis (PCA), and High-Pass Filter (HPF) procedures. The HIS method distorted the spectral characteristics of the data the most. The HPF method distorted the spectral characteristics the least; the distortions were minimal and difficult to detect. -Authors
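A minimal sketch of the HPF merge follows: the high-frequency content of a high-resolution panchromatic band is added to each upsampled multispectral band. The arrays are synthetic stand-ins for the TM and SPOT data, and the filter size is an assumption.

    # Sketch: high-pass filter (HPF) merge of pan and multispectral bands.
    import numpy as np
    from scipy import ndimage

    rng = np.random.default_rng(8)
    pan = rng.random((128, 128))                    # high-res panchromatic
    ms = rng.random((4, 32, 32))                    # low-res multispectral

    hp = pan - ndimage.uniform_filter(pan, size=5)  # high-pass component
    merged = np.stack([ndimage.zoom(band, 4, order=1) + hp for band in ms])
    print(merged.shape)                             # (4, 128, 128)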
Intermittency of principal stress directions within Arctic sea ice.
Weiss, Jérôme
2008-05-01
The brittle deformation of Arctic sea ice is not only characterized by strong spatial heterogeneity as well as intermittency of stress and strain-rate amplitudes, but also by an intermittency of principal stress directions, with power law statistics of angular fluctuations, long-range correlations in time, and multifractal scaling. This intermittency is much more pronounced than that of wind directions, i.e., is not a direct inheritance of the turbulent forcing.
ERIC Educational Resources Information Center
Jacob, Robin Tepper; Jacob, Brian
2012-01-01
Teacher and principal surveys are among the most common data collection techniques employed in education research. Yet there is remarkably little research on survey methods in education, or about the most cost-effective way to raise response rates among teachers and principals. In an effort to explore various methods for increasing survey response…
Schlairet, Maura C; Schlairet, Timothy James; Sauls, Denise H; Bellflowers, Lois
2015-03-01
Establishing the impact of the high-fidelity simulation environment on student performance, as well as identifying factors that could predict learning, would refine simulation outcome expectations among educators. The purpose of this quasi-experimental pilot study was to explore the impact of simulation on emotion and cognitive load among beginning nursing students. Forty baccalaureate nursing students participated in teaching simulations, rated their emotional state and cognitive load, and completed evaluation simulations. Two principal components of emotion were identified, representing the pleasant-activation and pleasant-deactivation components of affect. The mean rating of cognitive load following simulation was high. Linear regression identified slight but statistically nonsignificant positive associations between the principal components of emotion and cognitive load. Logistic regression identified a negative but statistically nonsignificant effect of cognitive load on assessment performance. Among lower ability students, a more pronounced effect of cognitive load on assessment performance was observed; this also was statistically nonsignificant. Copyright 2015, SLACK Incorporated.
The best motivator priorities parents choose via analytical hierarchy process
NASA Astrophysics Data System (ADS)
Farah, R. N.; Latha, P.
2015-05-01
Motivation is probably the most important factor that educators can target in order to improve learning. Numerous cross-disciplinary theories have been postulated to explain motivation. While each of these theories has some truth, no single theory seems to adequately explain all human motivation. The fact is that human beings in general, and pupils in particular, are complex creatures with complex needs and desires. In this paper, the Analytic Hierarchy Process (AHP) is proposed as an emerging solution to large, dynamic and complex real-world multi-criteria decision-making problems, applied here to selecting the most suitable motivator when parents choose a school for their children. Data were analyzed using SPSS 17.0 ("Statistical Package for the Social Sciences") software. Both descriptive and inferential statistics were used: descriptive statistics to characterize the demographic factors of the pupil and parent respondents, and inferential statistics to determine the pupils' and parents' highest motivator priorities. AHP was used to rank the criteria considered by parents, namely school principals, teachers, pupils and parents. The moderating factor was the set of schools in Ampang selected on the basis of "Standard Kualiti Pendidikan Malaysia" (SKPM). One-way ANOVA was used to test significance, and the data were used to calculate the AHP weights. School principals were found to be the best motivator for parents in choosing a school for their children, followed by teachers, parents and pupils.
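The AHP priority computation can be sketched as the principal eigenvector of a pairwise-comparison matrix; the matrix below (comparing principals, teachers, parents and pupils) is hypothetical, not the study's elicited judgments.

    # Sketch: AHP weights from the principal eigenvector, with consistency ratio.
    import numpy as np

    A = np.array([[1,   3,   5,   7],
                  [1/3, 1,   3,   5],
                  [1/5, 1/3, 1,   3],
                  [1/7, 1/5, 1/3, 1]], dtype=float)

    vals, vecs = np.linalg.eig(A)
    k = np.argmax(vals.real)
    w = np.abs(vecs[:, k].real)
    w /= w.sum()                                 # priority weights

    ci = (vals.real[k] - len(A)) / (len(A) - 1)  # consistency index
    cr = ci / 0.90                               # random index RI = 0.90, n = 4
    print("weights:", np.round(w, 3), "CR:", round(cr, 3))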
Wolfson, Julian; Henn, Lisa
2014-01-01
In many areas of clinical investigation there is great interest in identifying and validating surrogate endpoints, biomarkers that can be measured a relatively short time after a treatment has been administered and that can reliably predict the effect of treatment on the clinical outcome of interest. However, despite dramatic advances in the ability to measure biomarkers, the recent history of clinical research is littered with failed surrogates. In this paper, we present a statistical perspective on why identifying surrogate endpoints is so difficult. We view the problem from the framework of causal inference, with a particular focus on the technique of principal stratification (PS), an approach which is appealing because the resulting estimands are not biased by unmeasured confounding. In many settings, PS estimands are not statistically identifiable and their degree of non-identifiability can be thought of as representing the statistical difficulty of assessing the surrogate value of a biomarker. In this work, we examine the identifiability issue and present key simplifying assumptions and enhanced study designs that enable the partial or full identification of PS estimands. We also present example situations where these assumptions and designs may or may not be feasible, providing insight into the problem characteristics which make the statistical evaluation of surrogate endpoints so challenging.
Principal Score Methods: Assumptions, Extensions, and Practical Considerations
ERIC Educational Resources Information Center
Feller, Avi; Mealli, Fabrizia; Miratrix, Luke
2017-01-01
Researchers addressing posttreatment complications in randomized trials often turn to principal stratification to define relevant assumptions and quantities of interest. One approach for the subsequent estimation of causal effects in this framework is to use methods based on the "principal score," the conditional probability of belonging…
The accelerations of the earth and moon from early astronomical observations
NASA Technical Reports Server (NTRS)
Muller, P. M.; Stephenson, F. R.
1975-01-01
An investigation has compiled a very large amount of data on central or near central solar eclipses as recorded in four principal ancient sources (Greek and Roman classics, medieval European chronicles, Chinese annals and astronomical treatises, and Late Babylonian astronomical texts) and applied careful data selectivity criteria and statistical methods to obtain reliable dates, magnitudes, and places of observation of the events, and thereby made estimates of the earth acceleration and lunar acceleration. The basic conclusion is that the lunar acceleration and both tidal and nontidal earth accelerations have been essentially constant during the period from 1375 B.C. to the present.
Differential use of fresh water environments by wintering waterfowl of coastal Texas
White, D.H.; James, D.
1978-01-01
A comparative study of the environmental relationships among 14 species of wintering waterfowl was conducted at the Welder Wildlife Foundation, San Patricio County, near Sinton, Texas, during the fall and early winter of 1973. Measurements of 20 environmental factors (social, vegetational, physical, and chemical) were subjected to multivariate statistical methods to determine certain niche characteristics and environmental relationships of waterfowl wintering in the aquatic community. Each waterfowl species occupied a unique realized niche by responding to distinct combinations of environmental factors identified by principal component analysis. One percent confidence ellipses circumscribing the mean scores plotted for the first and second principal components gave an indication of relative niche width for each species. The waterfowl environments were significantly different interspecifically, and water depth at the feeding site and percent emergent vegetation were most important in the separation, as shown by subjecting the transformed data to multivariate analysis of variance with an associated step-down procedure. The species were distributed along a community cline extending from shallow water with abundant emergent vegetation to open deep water with little emergent vegetation of any kind. Four waterfowl subgroups were significantly separated along the cline, as indicated by one-way analysis of variance with Duncan's multiple range test. Clumping of the bird species toward the middle of the available habitat hyperspace was shown in a plot of the principal component scores for the random samples and individual species. Naturally occurring relationships among waterfowl were thus clarified using principal component analysis and related multivariate procedures. These techniques may prove useful in wetland management for particular groups of waterfowl based on habitat preferences.
Big-Data RHEED analysis for understanding epitaxial film growth processes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vasudevan, Rama K; Tselev, Alexander; Baddorf, Arthur P
Reflection high energy electron diffraction (RHEED) has by now become a standard tool for in-situ monitoring of film growth by pulsed laser deposition and molecular beam epitaxy. Yet despite the widespread adoption and the wealth of information in RHEED images, most applications are limited to observing intensity oscillations of the specular spot, and much additional information on growth is discarded. With ease of data acquisition and increased computation speeds, statistical methods to rapidly mine the dataset are now feasible. Here, we develop such an approach to the analysis of the fundamental growth processes through multivariate statistical analysis of RHEED image sequences. This approach is illustrated for growth of LaxCa1-xMnO3 films grown on etched (001) SrTiO3 substrates, but is universal. The multivariate methods, including principal component analysis and k-means clustering, provide insight into the relevant behaviors, the timing and nature of a disordered-to-ordered growth change, and highlight statistically significant patterns. Fourier analysis yields the harmonic components of the signal and allows separation of the relevant components and baselines, isolating the asymmetric nature of the step density function and the transmission spots from the imperfect layer-by-layer (LBL) growth. These studies show the promise of big data approaches to obtaining more insight into film properties during and after epitaxial film growth. Furthermore, these studies open the pathway to use forward prediction methods to potentially allow significantly more control over the growth process and hence final film quality.
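The core pipeline described above (unfold the image sequence into a frame-by-pixel matrix, reduce it with PCA, cluster the component scores) maps onto a few lines of scikit-learn. This is a schematic stand-in with random frames, not the authors' code:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
n_frames, h, w = 500, 64, 64
frames = rng.normal(size=(n_frames, h, w))       # stand-in for a RHEED sequence

X = frames.reshape(n_frames, -1)                 # one row per frame

pca = PCA(n_components=5)
scores = pca.fit_transform(X)                    # temporal behaviour per component

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(scores)
print(np.bincount(labels))                       # frames per growth regime
```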
A Methodology for the Parametric Reconstruction of Non-Steady and Noisy Meteorological Time Series
NASA Astrophysics Data System (ADS)
Rovira, F.; Palau, J. L.; Millán, M.
2009-09-01
Climatic and meteorological time series often show some persistence (in time) in the variability of certain features. One could regard annual, seasonal and diurnal variability as trivial persistence in the variability of some meteorological magnitudes (e.g., global radiation, air temperature above the surface, etc.). In these cases, the traditional Fourier transform into frequency space will show the principal harmonics as the components with the largest amplitude. Nevertheless, meteorological measurements often show other, non-steady (in time) variability: some fluctuations in measurements (at different time scales) are driven by processes that prevail on some days (or months) of the year but disappear on others. By decomposing a time series into time-frequency space through the continuous wavelet transformation, one is able to determine both the dominant modes of variability and how those modes vary in time. This study is based on a numerical methodology to analyse non-steady principal harmonics in noisy meteorological time series. This methodology combines the continuous wavelet transform with the development of a parametric model that includes the time evolution of the principal and most statistically significant harmonics of the original time series. The parameterisation scheme proposed in this study consists of reproducing the original time series by means of a statistically significant finite sum of sinusoidal signals (waves), each defined by the three usual parameters: amplitude, frequency and phase. To ensure the statistical significance of the parametric reconstruction of the original signal, we propose a standard Student's t analysis of the confidence level of the amplitude in the parametric spectrum for the different wave components. Once we have ensured the level of significance of the different waves composing the parametric model, we can obtain the statistically significant principal harmonics (in time) of the original time series by using the Fourier transform of the modelled signal. Acknowledgements: The CEAM Foundation is supported by the Generalitat Valenciana and BANCAIXA (València, Spain). This study has been partially funded by the European Commission (FP VI, Integrated Project CIRCE, No. 036961) and by the Ministerio de Ciencia e Innovación, research projects "TRANSREG" (CGL2007-65359/CLI) and "GRACCIE" (CSD2007-00067, Program CONSOLIDER-INGENIO 2010).
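The final step of the methodology, reconstructing the signal from only the statistically significant harmonics, can be illustrated with a plain Fourier filter. In this sketch an ad hoc amplitude threshold stands in for the Student's t test on the parametric spectrum described above:

```python
import numpy as np

rng = np.random.default_rng(2)
t = np.arange(0.0, 365.0)                        # daily samples over a year
signal = (10 * np.sin(2 * np.pi * t / 365)       # annual cycle
          + 3 * np.sin(2 * np.pi * t / 30)       # monthly cycle
          + rng.normal(0, 2, t.size))            # noise

spec = np.fft.rfft(signal)
amp = 2 * np.abs(spec) / t.size                  # one-sided amplitude spectrum

threshold = 5 * np.median(amp)                   # proxy for the t-test criterion
keep = amp >= threshold
reconstructed = np.fft.irfft(np.where(keep, spec, 0), n=t.size)
print(f"kept {keep.sum()} of {keep.size} harmonics")
```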
A Program Evaluation of a Leadership Academy for School Principals
ERIC Educational Resources Information Center
Wagner, Kristi E.
2014-01-01
This program evaluation focused on mid-range outcomes of a leadership academy for school principals. The mixed-methods evaluation included interviews, principals' instructional observation database, and teacher surveys. The Principal Academy program was designed to build principals' knowledge of high-yield instructional strategies (Hattie, 2009),…
Blind deconvolution with principal components analysis for wide-field and small-aperture telescopes
NASA Astrophysics Data System (ADS)
Jia, Peng; Sun, Rongyu; Wang, Weinan; Cai, Dongmei; Liu, Huigen
2017-09-01
Telescopes with a wide field of view (greater than 1°) and small apertures (less than 2 m) are workhorses for observations such as sky surveys and fast-moving object detection, and play an important role in time-domain astronomy. However, images captured by these telescopes are contaminated by optical system aberrations, atmospheric turbulence, tracking errors and wind shear. To increase the quality of images and maximize their scientific output, we propose a new blind deconvolution algorithm based on statistical properties of the point spread functions (PSFs) of these telescopes. In this new algorithm, we first construct the PSF feature space through principal component analysis, and then classify PSFs from a different position and time using a self-organizing map. According to the classification results, we divide images of the same PSF types and select these PSFs to construct a prior PSF. The prior PSF is then used to restore these images. To investigate the improvement that this algorithm provides for data reduction, we process images of space debris captured by our small-aperture wide-field telescopes. Comparing the reduced results of the original images and the images processed with the standard Richardson-Lucy method, our method shows a promising improvement in astrometry accuracy.
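A rough sketch of the two end stages, assuming a recent scikit-image: a prior PSF is taken from the mean of a PCA-decomposed PSF stack (the self-organizing-map classification stage is omitted) and then used for Richardson-Lucy restoration. The Gaussian PSF stamps and the point-source image are synthetic:

```python
import numpy as np
from scipy.signal import fftconvolve
from sklearn.decomposition import PCA
from skimage.restoration import richardson_lucy

rng = np.random.default_rng(3)
n_psf, k = 200, 15                               # PSF stamps of size k x k
yy, xx = np.mgrid[:k, :k] - k // 2
psfs = np.array([np.exp(-(xx**2 + yy**2) / (2 * rng.uniform(1.5, 3.0)**2))
                 for _ in range(n_psf)])         # stand-in for measured PSFs

# PSF feature space: here the prior PSF is simply the PCA mean
# (adding leading modes back in is a straightforward extension).
pca = PCA(n_components=3).fit(psfs.reshape(n_psf, -1))
prior_psf = pca.mean_.reshape(k, k)
prior_psf /= prior_psf.sum()                     # normalize to unit flux

image = np.zeros((64, 64))
image[20, 20] = image[40, 45] = 1.0              # two synthetic point sources
blurred = fftconvolve(image, prior_psf, mode="same")
blurred = np.clip(blurred + rng.normal(0, 1e-4, blurred.shape), 0, None)

restored = richardson_lucy(blurred, prior_psf, num_iter=30, clip=False)
```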
Forest statistics for the Southern Piedmont of Virginia, 1991
Tony G. Johnson
1991-01-01
This report highlights the principal findings of the sixth forest survey of the Southern Piedmont of Virginia. Field work began in March 1991 and was completed in June 1991. Five previous surveys, completed in 1940, 1957, 1965, 1976, and 1985, provide statistics for measuring changes and trends over the past 51 years. The primary emphasis in this report is on the...
Forest statistics for the Northern Mountains of Virginia, 1992
Tony G. Johnson
1992-01-01
This report highlights the principal findings of the sixth forest survey of the Northern Mountains of Virginia. Field work began in September 1991 and was completed in November 1991. Five previous surveys, completed in 1940, 1957, 1966, 1977, and 1986, provide statistics for measuring changes and trends over the past 52 years. The primary emphasis in this report is on...
Forest statistics for the Southern Coastal Plain of North Carolina, 1990
Tony G. Johnson
1990-01-01
This report highlights the principal findings of the sixth forest survey of the Southern Coastal Plain of North Carolina. Field work began in April 1989 and was completed in September 1989. Five previous surveys, completed in 1937, 1952, 1962, 1973, and 1983, provide statistics for measuring changes and trends over the past 53 years. The primary emphasis in this report...
Forest statistics for the mountains of North Carolina, 1984
Gerald C. Craver
1985-01-01
This report highlights the principal findings of the fifth forest survey in the Mountains of North Carolina. Fieldwork began in April 1984 and was completed in September 1984. Four previous surveys, completed in 1938, 1955, 1964, and 1974, provide statistics for measuring changes and trends over the past 46 years. The primary emphasis in this report is on the changes...
Forest statistics for North Central Georgia, 1989
Tony G. Johnson
1989-01-01
This report highlights the principal findings of the sixth forest survey in North Central Georgia. Field work began in February 1989 and was completed in April 1989. Five previous surveys, completed in 1936, 1953, 1961, 1972, and 1983, provide statistics for measuring changes and trends over the past 53 years. The primary emphasis in this report is on the changes and...
Forest statistics for the Piedmont of North Carolina 1975
Richard L. Welch
1975-01-01
This report highlights the principal findings of the fourth inventory of the timber resource in the Piedmont of North Carolina. The inventory was started in May 1974 and completed in January 1975. Three previous inventories, completed in 1937, 1956, and 1964, provide statistics for measuring changes and trends over the past 38 years. In this report, the primary...
Forest statistics for the Northern Coastal Plain of North Carolina, 1984
Edgar L. Davenport
1984-01-01
This report highlights the principal findings of the fifth forest inventory in the Northern Coastal Plain of North Carolina. Fieldwork began in June 1983 and was completed in December 1983. Four previous surveys, completed in 1937, 1955, 1963, and 1974, provide statistics for measuring changes and trends over the past 46 years. The primary emphasis in this report is on...
Forest statistics for the Southern Coastal Plain of North Carolina, 1983
John B. Tansey
1984-01-01
This report highlights the principal findings of the fifth forest survey in the southern Coastal Plain of North Carolina. Fieldwork began in November 1982 and was completed in June 1983. Four previous surveys, completed in 1938, 1952, 1962, and 1973, provide statistics for measuring changes and trends over the past 46 years. The primary emphasis in this report is on...
Forest statistics for Florida, 1980
William A. Bechtold; Raymond M. Sheffield
1981-01-01
This report highlights the principal findings of the fifth inventory of Florida's forests. Fieldwork began in September 1978 and was completed in May 1980. Four previous surveys, completed in 1936, 1949, 1959, and 1970, provide statistics for measuring changes and trends over the past 44 years. The primary emphasis in this report is on the changes and trends since 1970...
Forest statistics for the Piedmont of South Carolina 1977
Nolan L. Snyder
1977-01-01
This report highlights the principal findings of the fifth inventory of the timber resource in the Piedmont of South Carolina. The inventory was started in April 1977 and completed in September 1977. Four previous inventories, completed in 1936, 1947, 1958, and 1967, provide statistics for measuring changes and trends over the past 41 years. In this report, the primary...
Forest statistics for the Southern Coastal Plain of South Carolina 1978
Raymond M. Sheffield; Joanne Hutchison
1978-01-01
This report highlights the principal findings of the fifth forest inventory of the Southern Coastal Plain of South Carolina. Fieldwork began in April 1978 and was completed in August 1978. Four previous inventories, completed in 1934, 1947, 1958, and 1968, provide statistics for measuring changes and trends over the past 44 years. The primary emphasis in this report is...
Forest statistics for the Northern Piedmont of Virginia, 1986
Mark J. Brown
1986-01-01
This report highlights the principal findings of the fifth forest survey in the Northern Piedmont of Virginia. Fieldwork began in July 1985 and was completed in September 1985. Four previous surveys, completed in 1940, 1957, 1965, and 1976, provide statistics for measuring changes and trends over the past 46 years. The primary emphasis in this report is on the changes...
Forest statistics for the Piedmont of North Carolina, 1984
Cecil C. Hutchins
1984-01-01
This report highlights the principal findings of the fifth forest survey in the Piedmont of North Carolina. Fieldwork began in December 1983 and was completed in August 1984. Four previous surveys, completed in 1937, 1956, 1964, and 1975, provide statistics for measuring changes and trends over the past 47 years. The primary emphasis in this report is on the changes...
Forest statistics for the Northern Piedmont of Virginia, 1992
Michael T. Thompson
1992-01-01
This report highlights the principal findings of the sixth forest survey of the Northern Piedmont of Virginia. Field work began in June 1991 and was completed in September 1991. Five previous surveys, completed in 1940, 1957, 1965, 1976, and 1986, provide statistics for measuring changes and trends over the past 52 years. The primary emphasis in this report is on the...
Forest statistics for the Southern Coastal Plain of South Carolina, 1987
John B. Tansey
1987-01-01
This report highlights the principal findings of the sixth forest survey in the Southern Coastal Plain of South Carolina. Fieldwork began in June 1986 and was completed in September 1986. Five previous surveys, completed in 1934, 1947, 1958, 1968, and 1978, provide statistics for measuring changes and trends over the past 53 years. The primary emphasis in this report...
Forest statistics for South Florida, 1980
Raymond M. Sheffield; William A. Bechtold
1981-01-01
This report highlights the principal findings of the fifth inventory of Florida's forests. Fieldwork began in September 1978 and was completed in May 1980. Four previous surveys, completed in 1936, 1949, 1959, and 1970, provide statistics for measuring changes and trends over the past 44 years. The primary emphasis in this report is on the changes and trends since 1970...
Forest statistics for the Northern Coastal plain of North Carolina 1974
Richard L. Welch; Herbert A. Knight
1974-01-01
This report highlights the principal findings of the fourth inventory of the timber resource in the Northern Coastal Plain of North Carolina. The inventory was started in July 1973 and completed in May 1974. Three previous inventories, completed in 1937, 1955, and 1963, provide statistics for measuring changes and trends over the past 37 years. In this report, the...
Forest statistics for the Coastal Plain of Virginia, 1991
Michael T. Thompson
1991-01-01
This report highlights the principal findings of the sixth forest survey of the Coastal Plain of Virginia. Field work began in October 1990 and was completed in March 1991. Five previous surveys, completed in 1940, 1956, 1966, 1976, and 1985, provide statistics for measuring changes and trends over the past 51 years. The primary emphasis in this report is on the...
Forest statistics for the mountain region of North Carolina 1974
Noel D. Cost
1974-01-01
This report highlights the principal findings of the fourth inventory of the timber resource in the Mountain Region of North Carolina. The inventory was started in May 1974 and completed in September 1974. Three previous inventories, completed in 1938, 1955, and 1964, provide statistics for measuring changes and trends over the past 36 years. In this report, the primary...
Forest statistics for the mountains of North Carolina, 1990
Tony G. Johnson
1991-01-01
This report highlights the principal findings of the sixth forest survey of the Mountains of North Carolina. Field work began in August 1990 and was completed in November 1990. Five previous surveys, completed in 1938, 1955, 1964, 1974, and 1984, provide statistics for measuring changes and trends over the past 52 years. The primary emphasis in this report is on the...
Forest statistics for the Southern Mountain region of Virginia, 1977
Raymond M. Sheffield
1977-01-01
This report highlights the principal findings of the fourth inventory of the timber resource in the Southern Mountain Region of Virginia. The inventory was started in December 1976 and completed in March 1977. Three previous inventories, completed in 1940, 1957, and 1966, provide statistics for measuring changes and trends over the past 37 years. In this report, the...
Forest statistics for the Coastal Plain of Virginia, 1985
Mark J. Brown; Gerald C. Craver
1985-01-01
This report highlights the principal findings of the fifth forest survey in the Coastal Plain of Virginia. Fieldwork began in September 1984 and was completed in February 1985. Four previous surveys, completed in 1940, 1956, 1966, and 1976, provide statistics for measuring changes and trends over the past 45 years. The primary emphasis in this report is on the changes...
Semi-supervised vibration-based classification and condition monitoring of compressors
NASA Astrophysics Data System (ADS)
Potočnik, Primož; Govekar, Edvard
2017-09-01
Semi-supervised vibration-based classification and condition monitoring of the reciprocating compressors installed in refrigeration appliances is proposed in this paper. The method addresses the problem of industrial condition monitoring where prior class definitions are often not available or difficult to obtain from local experts. The proposed method combines feature extraction, principal component analysis, and statistical analysis for the extraction of initial class representatives, and compares the capability of various classification methods, including discriminant analysis (DA), neural networks (NN), support vector machines (SVM), and extreme learning machines (ELM). The use of the method is demonstrated on a case study which was based on industrially acquired vibration measurements of reciprocating compressors during the production of refrigeration appliances. The paper presents a comparative qualitative analysis of the applied classifiers, confirming the good performance of several nonlinear classifiers. If the model parameters are properly selected, then very good classification performance can be obtained from NN trained by Bayesian regularization, SVM and ELM classifiers. The method can be effectively applied for the industrial condition monitoring of compressors.
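The backbone of such a monitoring scheme (standardize extracted features, compress with PCA, classify with an SVM) corresponds directly to a scikit-learn pipeline. A sketch with invented stand-in features and fault labels:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(4)
n, n_feat = 300, 40
X = rng.normal(size=(n, n_feat))                 # stand-in vibration features
y = (X[:, :5].sum(axis=1) + rng.normal(0, 1, n) > 0).astype(int)  # ok / faulty

clf = make_pipeline(StandardScaler(), PCA(n_components=10), SVC(kernel="rbf"))
print(f"cross-validated accuracy: {cross_val_score(clf, X, y, cv=5).mean():.2f}")
```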
Angeler, David G; Viedma, Olga; Moreno, José M
2009-11-01
Time lag analysis (TLA) is a distance-based approach used to study temporal dynamics of ecological communities by measuring community dissimilarity over increasing time lags. Despite its increased use in recent years, its performance in comparison with other more direct methods (i.e., canonical ordination) has not been evaluated. This study fills this gap using extensive simulations and real data sets from experimental temporary ponds (true zooplankton communities) and landscape studies (landscape categories as pseudo-communities) that differ in community structure and anthropogenic stress history. Modeling time with a principal coordinates of neighbour matrices (PCNM) approach, the canonical ordination technique (redundancy analysis; RDA) consistently outperformed the other statistical tests (i.e., TLAs, Mantel test, and RDA based on linear time trends) using all real data. In addition, the RDA-PCNM revealed different patterns of temporal change, and the strength of each individual time pattern, in terms of adjusted variance explained, could be evaluated. It also identified species contributions to these patterns of temporal change. This additional information is not provided by distance-based methods. The simulation study revealed better Type I error properties of the canonical ordination techniques compared with the distance-based approaches when no deterministic component of change was imposed on the communities. The simulation also revealed that strong emphasis on uniform deterministic change and low variability at other temporal scales is needed to reduce the statistical power of the RDA-PCNM approach relative to the other methods. Based on the statistical performance of, and information content provided by, RDA-PCNM models, this technique serves ecologists as a powerful tool for modeling temporal change of ecological (pseudo-)communities.
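For readers unfamiliar with TLA itself: it regresses pairwise community dissimilarity on the square root of the time lag separating the samples, with a positive slope indicating directional change. A compact sketch on a synthetic community matrix:

```python
import numpy as np
from scipy.spatial.distance import braycurtis
from scipy.stats import linregress

rng = np.random.default_rng(5)
T, S = 40, 12                                    # sampling dates x species
comm = rng.poisson(10, size=(T, S)).astype(float)  # stand-in community matrix

lags, dissim = [], []
for lag in range(1, T):
    for t in range(T - lag):
        lags.append(np.sqrt(lag))                # TLA uses the square root of the lag
        dissim.append(braycurtis(comm[t], comm[t + lag]))

fit = linregress(lags, dissim)
print(f"slope={fit.slope:.3f}, p={fit.pvalue:.3f}")  # positive slope: directional change
```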
NASA Astrophysics Data System (ADS)
Zabolotna, Natalia I.; Dovhaliuk, Rostyslav Y.
2013-09-01
We present a novel method for measuring the optic-axis orientation distribution that uses a relatively simple measurement setup. The principal difference between our method and other well-known methods lies in its direct approach to measuring the orientation of the optical axes of the polycrystalline networks of biological crystals. Our test polarimetry setup consists of a HeNe laser, a quarter-wave plate, two linear polarizers and a CCD camera. We also propose a methodology for processing the measured optic-axis orientation distribution, which consists of evaluating statistical, correlational and spectral moments. Such processing of the obtained data can be used to classify a particular tissue sample as "healthy" or "pathological". For our experiment we used thin histological sections of normal and muscular dystrophy tissue. It is shown that the differences between the mentioned moments' values for normal and pathological samples can be quite noticeable, with relative differences of up to 6.26.
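In its simplest form, the moment-based processing step reduces to computing the first four statistical moments of the measured orientation map. A sketch with a synthetic von Mises orientation field standing in for real polarimetry data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(15)
angles = rng.vonmises(0.0, 4.0, size=(256, 256))  # stand-in optic-axis map (radians)

sample = angles.ravel()
print({"mean": round(float(sample.mean()), 3),
       "std": round(float(sample.std()), 3),
       "skewness": round(float(stats.skew(sample)), 3),
       "kurtosis": round(float(stats.kurtosis(sample)), 3)})
```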
Barratt, Dean C; Chan, Carolyn S K; Edwards, Philip J; Penney, Graeme P; Slomczykowski, Mike; Carter, Timothy J; Hawkes, David J
2008-06-01
Statistical shape modelling potentially provides a powerful tool for generating patient-specific, 3D representations of bony anatomy for computer-aided orthopaedic surgery (CAOS) without the need for a preoperative CT scan. Furthermore, freehand 3D ultrasound (US) provides a non-invasive method for digitising bone surfaces in the operating theatre that enables a much greater region to be sampled compared with conventional direct-contact (i.e., pointer-based) digitisation techniques. In this paper, we describe how these approaches can be combined to simultaneously generate and register a patient-specific model of the femur and pelvis to the patient during surgery. In our implementation, a statistical deformation model (SDM) was constructed for the femur and pelvis by performing a principal component analysis on the B-spline control points that parameterise the freeform deformations required to non-rigidly register a training set of CT scans to a carefully segmented template CT scan. The segmented template bone surface, represented by a triangulated surface mesh, is instantiated and registered to a cloud of US-derived surface points using an iterative scheme in which the weights corresponding to the first five principal modes of variation of the SDM are optimised in addition to the rigid-body parameters. The accuracy of the method was evaluated using clinically realistic data obtained on three intact human cadavers (three whole pelves and six femurs). For each bone, a high-resolution CT scan and rigid-body registration transformation, calculated using bone-implanted fiducial markers, served as the gold standard bone geometry and registration transformation, respectively. After aligning the final instantiated model and CT-derived surfaces using the iterative closest point (ICP) algorithm, the average root-mean-square distance between the surfaces was 3.5 mm over the whole bone and 3.7 mm in the region of surgical interest. The corresponding distances after aligning the surfaces using the marker-based registration transformation were 4.6 and 4.5 mm, respectively. We conclude that despite limitations on the regions of bone accessible using US imaging, this technique has potential as a cost-effective and non-invasive method to enable surgical navigation during CAOS procedures, without the additional radiation dose associated with performing a preoperative CT scan or intraoperative fluoroscopic imaging. However, further development is required to investigate errors using error measures relevant to specific surgical procedures.
Q-mode versus R-mode principal component analysis for linear discriminant analysis (LDA)
NASA Astrophysics Data System (ADS)
Lee, Loong Chuen; Liong, Choong-Yeun; Jemain, Abdul Aziz
2017-05-01
Much of the literature applies Principal Component Analysis (PCA) as a preliminary visualization method, as a variable-construction method, or both. The focus of PCA can be on the samples (R-mode PCA) or on the variables (Q-mode PCA). Traditionally, R-mode PCA has been the usual approach for reducing high-dimensional data before the application of Linear Discriminant Analysis (LDA) to solve classification problems. The output from PCA is composed of two new matrices, known as the loadings and scores matrices. Each matrix can then be used to produce a plot: the loadings plot aids identification of important variables, whereas the scores plot presents the spatial distribution of samples on new axes, also known as Principal Components (PCs). Fundamentally, the scores matrix is always the input for building the classification model. A recent paper used Q-mode PCA although the focus of the analysis was not on the variables but on the samples. As a result, the authors exchanged the use of the loadings and scores plots: clustering of samples was studied using the loadings plot, whereas the scores plot was used to identify important manifest variables. The aim of this study is therefore to statistically validate the proposed practice. Evaluation is based on the external error of LDA models as a function of the number of PCs. In addition, bootstrapping was conducted to evaluate the external error of each of the LDA models. Results show that LDA models built from R-mode PCA scores perform sensibly and their external errors are unbiased, whereas those built with Q-mode PCA show the opposite. We therefore conclude that PCs produced by Q-mode PCA are not statistically stable and should not be applied to classifying samples, only variables. We hope this paper provides some insight into these disputed issues.
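The validation design described above (project test samples onto PCs fitted on training samples, classify with LDA, estimate the external error repeatedly) can be sketched as follows; repeated random splits stand in for the paper's bootstrap:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(6)
n, p = 120, 200                                  # more variables than samples
X = rng.normal(size=(n, p))
y = rng.integers(0, 2, n)                        # two classes, random here

errors = []
for b in range(100):                             # repeated splits ~ external error
    Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, random_state=b)
    pca = PCA(n_components=10).fit(Xtr)          # scores describe the samples
    lda = LinearDiscriminantAnalysis().fit(pca.transform(Xtr), ytr)
    errors.append(1 - lda.score(pca.transform(Xte), yte))

print(f"external error: {np.mean(errors):.2f} +/- {np.std(errors):.2f}")
```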
77 FR 59004 - Membership of the Senior Executive Service Standing Performance Review Boards
Federal Register 2010, 2011, 2012, 2013, 2014
2012-09-25
(Truncated membership listing; legible entries include a Principal Deputy Administrator, Office of Juvenile Justice and Delinquency Prevention; William Sabol, Deputy Director, Bureau of Justice Statistics; and Greg Ridgeway, Deputy Director, National...)
McKinney, Tim S.; Anning, David W.
2012-01-01
This product "Digital spatial data for predicted nitrate and arsenic concentrations in basin-fill aquifers of the Southwest Principal Aquifers study area" is a 1:250,000-scale vector spatial dataset developed as part of a regional Southwest Principal Aquifers (SWPA) study (Anning and others, 2012). The study examined the vulnerability of basin-fill aquifers in the southwestern United States to nitrate contamination and arsenic enrichment. Statistical models were developed by using the random forest classifier algorithm to predict concentrations of nitrate and arsenic across a model grid that represents local- and basin-scale measures of source, aquifer susceptibility, and geochemical conditions.
Hrdlickova Kuckova, Stepanka; Rambouskova, Gabriela; Hynek, Radovan; Cejnar, Pavel; Oltrogge, Doris; Fuchs, Robert
2015-11-01
Matrix-assisted laser desorption/ionisation-time of flight (MALDI-TOF) mass spectrometry is commonly used for the identification of proteinaceous binders and their mixtures in artworks. The determination of protein binders is based on a comparison between the m/z values of tryptic peptides in an unknown sample and a reference one (egg, casein, animal glues, etc.), but the method has greater potential for studying changes due to ageing and the influence of organic/inorganic components on protein identification. However, it is then necessary to carry out a statistical evaluation of the obtained data. Until now, it has been complicated to routinely convert mass spectrometric data into a statistical programme and to extract and match the appropriate peaks; only a few 'homemade' computer programmes without user-friendly interfaces are available for these purposes. In this paper, we present our completely new, publicly available, non-commercial software, ms-alone and multiMS-toolbox, for principal component analyses of MALDI-TOF MS data in R, and their application to the study of the influence of heterogeneous matrices (organic lakes) on protein identification. Using this new software, we determined the main factors that influence the protein analyses of artificially aged model mixtures of organic lakes and fish glue, prepared according to historical recipes that were used for book illumination, using MALDI-TOF peptide mass mapping. Copyright © 2015 John Wiley & Sons, Ltd.
Finding Planets in K2: A New Method of Cleaning the Data
NASA Astrophysics Data System (ADS)
Currie, Miles; Mullally, Fergal; Thompson, Susan E.
2017-01-01
We present a new method of removing systematic flux variations from K2 light curves by employing a pixel-level principal component analysis (PCA). This method decomposes the light curves into their principal components (eigenvectors), each with an associated eigenvalue whose magnitude reflects how much influence the basis vector has on the shape of the light curve. The method assumes that the most influential basis vectors correspond to the unwanted systematic variations in the light curve produced by K2's constant motion. We correct the raw light curve by automatically fitting and removing the strongest principal components, which generally correspond to the flux variations that result from the motion of the star in the field of view. Our primary method of calculating the strongest principal components to correct for estimates the noise by measuring the scatter in the light curve after Savitzky-Golay detrending, which yields the combined photometric precision value (SG-CDPP value) used in classic Kepler. We calculate this value after correcting the raw light curve for each element in a list of cumulative sums of principal components, so that we have as many noise estimates as there are principal components. We then take the derivative of the list of SG-CDPP values and choose the number of principal components corresponding to the point at which the derivative effectively goes to zero; this is the optimal number of principal components to exclude from the refitting of the light curve. We find that a pixel-level PCA is sufficient for cleaning unwanted systematic and natural noise from K2's light curves. We present preliminary results and a basic comparison to other methods of reducing the noise from the flux variations.
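A stripped-down version of the pixel-level idea: treat the leading singular vectors of the mean-subtracted pixel matrix as the systematic basis and fit them out of the aperture-summed light curve. Here the number of removed components is fixed by hand rather than by the SG-CDPP derivative criterion the authors describe:

```python
import numpy as np

rng = np.random.default_rng(8)
n_cad, n_pix = 3000, 50
systematic = np.sin(np.linspace(0, 60, n_cad))        # shared motion-driven trend
pixels = (1 + 0.01 * systematic[:, None] * rng.uniform(0.5, 2, n_pix)
          + 1e-4 * rng.normal(size=(n_cad, n_pix)))   # pixel-level time series

raw = pixels.sum(axis=1)                              # raw aperture light curve

# PCA via SVD of the mean-subtracted pixel matrix: the strongest
# components trace the common systematic signal.
U, s, Vt = np.linalg.svd(pixels - pixels.mean(axis=0), full_matrices=False)
n_remove = 2                                          # fixed by hand in this sketch
basis = U[:, :n_remove]
coeffs, *_ = np.linalg.lstsq(basis, raw - raw.mean(), rcond=None)
cleaned = raw - basis @ coeffs

print(f"scatter before: {raw.std():.4f}, after: {cleaned.std():.4f}")
```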
A PCA-Based method for determining craniofacial relationship and sexual dimorphism of facial shapes.
Shui, Wuyang; Zhou, Mingquan; Maddock, Steve; He, Taiping; Wang, Xingce; Deng, Qingqiong
2017-11-01
Previous studies have used principal component analysis (PCA) to investigate the craniofacial relationship, as well as sex determination using facial factors. However, few studies have investigated the extent to which the choice of principal components (PCs) affects the analysis of craniofacial relationship and sexual dimorphism. In this paper, we propose a PCA-based method for visual and quantitative analysis, using 140 samples of 3D heads (70 male and 70 female), produced from computed tomography (CT) images. There are two parts to the method. First, skull and facial landmarks are manually marked to guide the model's registration so that dense corresponding vertices occupy the same relative position in every sample. Statistical shape spaces of the skull and face in dense corresponding vertices are constructed using PCA. Variations in these vertices, captured in every principal component (PC), are visualized to observe shape variability. The correlations of skull- and face-based PC scores are analysed, and linear regression is used to fit the craniofacial relationship. We compute the PC coefficients of a face based on this craniofacial relationship and the PC scores of a skull, and apply the coefficients to estimate a 3D face for the skull. To evaluate the accuracy of the computed craniofacial relationship, the mean and standard deviation of every vertex between the two models are computed, where these models are reconstructed using real PC scores and coefficients. Second, each PC in facial space is analysed for sex determination, for which support vector machines (SVMs) are used. We examined the correlation between PCs and sex, and explored the extent to which the choice of PCs affects the expression of sexual dimorphism. Our results suggest that skull- and face-based PCs can be used to describe the craniofacial relationship and that the accuracy of the method can be improved by using an increased number of face-based PCs. The results show that the accuracy of the sex classification is related to the choice of PCs. The highest sex classification rate is 91.43% using our method. Copyright © 2017 Elsevier Ltd. All rights reserved.
Building Leadership Talent through Performance Evaluation
ERIC Educational Resources Information Center
Clifford, Matthew
2015-01-01
Most states and districts scramble to provide professional development to support principals, but "principal evaluation" is often lost amid competing priorities. Evaluation is an important method for supporting principal growth, communicating performance expectations to principals, and improving leadership practice. It provides leaders…
Removal of BCG artefact from concurrent fMRI-EEG recordings based on EMD and PCA.
Javed, Ehtasham; Faye, Ibrahima; Malik, Aamir Saeed; Abdullah, Jafri Malin
2017-11-01
Simultaneous electroencephalography (EEG) and functional magnetic resonance image (fMRI) acquisitions provide better insight into brain dynamics. Some artefacts due to simultaneous acquisition pose a threat to the quality of the data. One such problematic artefact is the ballistocardiogram (BCG) artefact. We developed a hybrid algorithm that combines features of empirical mode decomposition (EMD) with principal component analysis (PCA) to reduce the BCG artefact. The algorithm does not require extra electrocardiogram (ECG) or electrooculogram (EOG) recordings to extract the BCG artefact. The method was tested with both simulated and real EEG data of 11 participants. From the simulated data, the similarity index between the extracted BCG and the simulated BCG showed the effectiveness of the proposed method in BCG removal. On the other hand, real data were recorded with two conditions, i.e. resting state (eyes closed dataset) and task influenced (event-related potentials (ERPs) dataset). Using qualitative (visual inspection) and quantitative (similarity index, improved normalized power spectrum (INPS) ratio, power spectrum, sample entropy (SE)) evaluation parameters, the assessment results showed that the proposed method can efficiently reduce the BCG artefact while preserving the neuronal signals. Compared with conventional methods, namely, average artefact subtraction (AAS), optimal basis set (OBS) and combined independent component analysis and principal component analysis (ICA-PCA), the statistical analyses of the results showed that the proposed method has better performance, and the differences were significant for all quantitative parameters except for the power and sample entropy. The proposed method does not require any reference signal, prior information or assumption to extract the BCG artefact. It will be very useful in circumstances where the reference signal is not available. Copyright © 2017 Elsevier B.V. All rights reserved.
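A toy version of the EMD-plus-PCA idea, assuming the third-party PyEMD package (distributed as EMD-signal): each channel is decomposed into intrinsic mode functions, the slow IMFs are pooled, and their first principal component serves as a BCG template that is regressed out of each channel. The sub-3 Hz band and the synthetic signals are illustrative choices, not the paper's exact criteria:

```python
import numpy as np
from PyEMD import EMD                      # assumes the third-party EMD-signal package
from sklearn.decomposition import PCA

rng = np.random.default_rng(9)
fs = 250
t = np.arange(0, 10, 1 / fs)
bcg = 3 * np.sin(2 * np.pi * 1.2 * t)      # stand-in cardiac-driven artefact
channels = np.array([np.sin(2 * np.pi * f * t) + g * bcg
                     + 0.1 * rng.normal(size=t.size)
                     for f, g in zip([8, 10, 12, 14], [0.8, 1.2, 1.0, 0.9])])

# Per channel: decompose with EMD and pool the IMFs whose dominant
# frequency falls below 3 Hz (an illustrative band for the BCG).
freqs = np.fft.rfftfreq(t.size, 1 / fs)
slow_imfs = [imf for ch in channels for imf in EMD().emd(ch)
             if freqs[np.argmax(np.abs(np.fft.rfft(imf)))] < 3]

# First principal component of the pooled slow IMFs = BCG template.
template = PCA(n_components=1).fit_transform(np.array(slow_imfs).T)[:, 0]

# Regress the template out of every channel.
gains = channels @ template / (template @ template)
cleaned = channels - np.outer(gains, template)
```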
Multivariate relationships between groundwater chemistry and toxicity in an urban aquifer.
Dewhurst, Rachel E; Wells, N Claire; Crane, Mark; Callaghan, Amanda; Connon, Richard; Mather, John D
2003-11-01
Multivariate statistical methods were used to investigate the causes of toxicity and controls on groundwater chemistry from 274 boreholes in an urban area (London) of the United Kingdom. The groundwater was alkaline to neutral, and chemistry was dominated by calcium, sodium, and sulfate. Contaminants included fuels, solvents, and organic compounds derived from landfill material. The presence of organic material in the aquifer caused decreases in dissolved oxygen, sulfate and nitrate concentrations, and increases in ferrous iron and ammoniacal nitrogen concentrations. Pearson correlations between toxicity results and the concentration of individual analytes indicated that concentrations of ammoniacal nitrogen, dissolved oxygen, ferrous iron, and hydrocarbons were important where present. However, principal component and regression analysis suggested no significant correlation between toxicity and chemistry over the whole area. Multidimensional scaling was used to investigate differences in sites caused by historical use, landfill gas status, or position within the sample area. Significant differences were observed between sites with different historical land use and those with different gas status. Examination of the principal component matrix revealed that these differences are related to changes in the importance of reduced chemical species.
Physician performance assessment using a composite quality index.
Liu, Kaibo; Jain, Shabnam; Shi, Jianjun
2013-07-10
Assessing physician performance is important for the purposes of measuring and improving quality of service and reducing healthcare delivery costs. In recent years, physician performance scorecards have been used to provide feedback on individual measures; however, one key challenge is how to develop a composite quality index that combines multiple measures for overall physician performance evaluation. A controversy arises over how to establish appropriate weights for combining indicators across multiple dimensions, and it cannot be easily resolved. In this study, we proposed a generic unsupervised learning approach to develop a single composite index for physician performance assessment by using non-negative principal component analysis. We developed a new algorithm named iterative quadratic programming to solve the numerical issue in the non-negative principal component analysis approach. We conducted real case studies to demonstrate the performance of the proposed method. We provided interpretations from both statistical and clinical perspectives to evaluate the developed composite ranking score in practice. In addition, we implemented root cause assessment techniques to explain physician performance for improvement purposes. Copyright © 2012 John Wiley & Sons, Ltd.
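The authors' iterative quadratic programming algorithm is not reproduced here, but the idea of a variance-maximizing component with non-negative weights can be conveyed by a simple projected power iteration on synthetic performance measures:

```python
import numpy as np

rng = np.random.default_rng(10)
X = rng.normal(size=(200, 8))                    # stand-in physician measures
X[:, :4] += rng.normal(size=(200, 1))            # shared quality signal
S = np.cov(X, rowvar=False)

# Projected power iteration: repeatedly apply the covariance matrix,
# clip negative weights, renormalize.
w = np.full(S.shape[0], 1 / np.sqrt(S.shape[0]))
for _ in range(200):
    w = np.clip(S @ w, 0, None)
    w /= np.linalg.norm(w)

composite = X @ w                                # single composite quality index
print(np.round(w, 3))                            # non-negative weights
```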
Szabo, J.K.; Fedriani, E.M.; Segovia-Gonzalez, M. M.; Astheimer, L.B.; Hooper, M.J.
2010-01-01
This paper introduces a new technique in ecology to analyze spatial and temporal variability in environmental variables. By using simple statistics, we explore the relations between abiotic and biotic variables that influence animal distributions. However, spatial and temporal variability in rainfall, a key variable in ecological studies, can cause difficulties for any basic model that includes time evolution. The study was conducted at a landscape scale (three million square kilometers in eastern Australia), mainly over the period 1998-2004. We simultaneously considered qualitative spatial (soil and habitat types) and quantitative temporal (rainfall) variables in a Geographical Information System environment. In addition to some techniques commonly used in ecology, we applied a new method, Functional Principal Component Analysis, which proved very suitable for this case, as it explained more than 97% of the total variance of the rainfall data, providing us with substitute variables that are easier to manage and are even able to explain rainfall patterns. The main variable came from a habitat classification that showed strong correlations with rainfall values and soil types. © 2010 World Scientific Publishing Company.
Sandoval, S; Torres, A; Pawlowsky-Reusing, E; Riechel, M; Caradot, N
2013-01-01
The present study aims to explore the relationship between rainfall variables and water quality/quantity characteristics of combined sewer overflows (CSOs), by the use of multivariate statistical methods and online measurements at a principal CSO outlet in Berlin (Germany). Canonical correlation results showed that the maximum and average rainfall intensities are the most influential variables to describe CSO water quantity and pollutant loads whereas the duration of the rainfall event and the rain depth seem to be the most influential variables to describe CSO pollutant concentrations. The analysis of partial least squares (PLS) regression models confirms the findings of the canonical correlation and highlights three main influences of rainfall on CSO characteristics: (i) CSO water quantity characteristics are mainly influenced by the maximal rainfall intensities, (ii) CSO pollutant concentrations were found to be mostly associated with duration of the rainfall and (iii) pollutant loads seemed to be principally influenced by dry weather duration before the rainfall event. The prediction quality of PLS models is rather low (R² < 0.6) but results can be useful to explore qualitatively the influence of rainfall on CSO characteristics.
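A PLS regression of CSO characteristics on rainfall descriptors takes only a few lines in scikit-learn. The sketch below uses invented predictors and responses purely to show the mechanics:

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(11)
n = 80                                           # rainfall events
# Invented predictors: max/mean intensity, duration, depth, dry-weather period.
R = rng.lognormal(size=(n, 5))
# Invented responses: overflow volume, TSS load, COD concentration.
Y = np.column_stack([2 * R[:, 0] + R[:, 1], R[:, 4] + R[:, 3], R[:, 2]])
Y += 0.5 * rng.normal(size=Y.shape)

pls = PLSRegression(n_components=2).fit(R, Y)
print(f"R^2 = {pls.score(R, Y):.2f}")
print(np.round(pls.x_weights_, 2))               # rainfall variables driving Y
```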
Analysis of satellite data on energetic particles of ionospheric origin
NASA Technical Reports Server (NTRS)
Sharp, R. D.; Johnson, R. G.; Shelley, E. G.
1976-01-01
The principal result of this program has been the completion of a detailed statistical study of the properties of precipitating O(+) and H(+) ions during two principal magnetic storms. The results of the analysis of selected data from the ion mass spectrometer experiment on the satellites are given, with emphasis on the morphology of the O(+) ions of ionospheric origin, with energies in the range 0.7 ≤ E ≤ 12 keV, that were discovered with this experiment.
Brooker, Simon J; Nikolay, Birgit; Balabanova, Dina; Pullan, Rachel L
2015-08-01
Emphasis is being given to the control of neglected tropical diseases, including the possibility of interrupting the transmission of soil-transmitted helminths (STH). We evaluated the feasibility by country of achieving interruption of the transmission of STH. Based on a conceptual framework for the identification of the characteristics of a successful STH control programme, we assembled spatial data for a range of epidemiological, institutional, economic, and political factors. Using four different statistical methods, we developed a composite score of the feasibility of interrupting STH transmission and undertook a sensitivity analysis of the data and methods. The most important determining factors in the analysis were underlying intensity of STH transmission, current implementation of control programmes for neglected tropical diseases, and whether countries receive large-scale external funding and have strong health systems. The composite scores suggested that interrupting STH transmission is most feasible in countries in the Americas and parts of Asia (eg, Argentina [range of composite feasibility scores, depending on scoring method, 9·4-10·0], Brazil [8·7-9·7], Chile [8·84-10·0], and Thailand [9·1-10·0]; there was perfect agreement between the four methods), and least feasible in countries in sub-Saharan Africa (eg, Congo [0·4-2·7] and Guinea [2·0-5·6]; there was full agreement between methods), but there were important exceptions to these trends (eg, Ghana [7·4-10·0]; there was agreement between three methods). Agreement was highest between the scores derived with the expert opinion and principal component analysis weighting schemes (Pearson correlation coefficient, r=0·98). The largest disagreement was between benefit-of-the-doubt-derived and principal-component-analysis-derived weighting schemes (r=0·74). The interruption of STH transmission is feasible, especially in countries with low intensity of transmission, supportive household environments, strong health systems, and the availability of suitable delivery platforms and in-country funds, but to achieve local elimination of STH an intersectoral approach to STH control will be needed. Bill & Melinda Gates Foundation and Wellcome Trust. Copyright © 2015 Brooker et al. Open Access article distributed under the terms of CC BY-NC-ND. Published by Elsevier Ltd. All rights reserved.
Model based approach to UXO imaging using the time domain electromagnetic method
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lavely, E.M.
1999-04-01
Time domain electromagnetic (TDEM) sensors have emerged as a field-worthy technology for UXO detection in a variety of geological and environmental settings. This success has been achieved with commercial equipment that was not optimized for UXO detection and discrimination. The TDEM response displays a rich spatial and temporal behavior which is not currently utilized. Therefore, in this paper the author describes a research program for enhancing the effectiveness of the TDEM method for UXO detection and imaging. Fundamental research is required in at least three major areas: (a) model-based imaging capability, i.e., the forward and inverse problem; (b) detector modeling and instrument design; and (c) target recognition and discrimination algorithms. These research problems are coupled and demand a unified treatment. For example: (1) the inverse solution depends on solution of the forward problem and knowledge of the instrument response; (2) instrument design with improved diagnostic power requires forward and inverse modeling capability; and (3) improved target recognition algorithms (such as neural nets) must be trained with data collected from the new instrument and with synthetic data computed using the forward model. Further, the design of the appropriate input and output layers of the net will be informed by the results of the forward and inverse modeling. A more fully developed model of the TDEM response would enable the joint inversion of data collected from multiple sensors (e.g., TDEM sensors and magnetometers). Finally, the author suggests that a complementary approach to joint inversions is the statistical recombination of data using principal component analysis. The decomposition into principal components is useful since the first principal component contains those features that are most strongly correlated from image to image.
Data preparation techniques for a perinatal psychiatric study based on linked data.
Xu, Fenglian; Hilder, Lisa; Austin, Marie-Paule; Sullivan, Elizabeth A
2012-06-08
In recent years there has been an increase in the use of population-based linked data. However, there is little literature that describes methods of linked-data preparation. This paper describes methods for merging data, calculating a statistical variable (SV), recoding psychiatric diagnoses, and summarizing hospital admissions for a perinatal psychiatric study. The data preparation techniques described in this paper are based on linked birth data from the New South Wales (NSW) Midwives Data Collection (MDC), the Register of Congenital Conditions (RCC), the Admitted Patient Data Collection (APDC) and the Pharmaceutical Drugs of Addiction System (PHDAS). The master dataset is the meaningfully linked dataset that includes all, or the major, study data collections. It can be used to improve data quality and to calculate SVs, and it can be tailored for different analyses. To identify hospital admissions in the periods before pregnancy, during pregnancy and after birth, a statistical variable of time interval (SVTI) needs to be calculated. The methods and SPSS syntax for building a master dataset, calculating the SVTI, recoding the principal diagnoses of mental illness and summarizing hospital admissions are described. Linked-data preparation, including building the master dataset and calculating the SV, can improve data quality and enhance data function.
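In pandas rather than SPSS, the SVTI computation amounts to merging admissions onto births, differencing the dates, and binning the result into the three periods. All column names and records below are hypothetical:

```python
import pandas as pd

# Hypothetical linked records; all column names are illustrative.
births = pd.DataFrame({"mother_id": [1, 2],
                       "birth_date": pd.to_datetime(["2005-03-10", "2005-07-22"]),
                       "gestation_weeks": [40, 38]})
adm = pd.DataFrame({"mother_id": [1, 1, 2],
                    "admission_date": pd.to_datetime(
                        ["2004-01-05", "2005-05-01", "2005-01-15"])})

df = adm.merge(births, on="mother_id", how="left")
df["svti_days"] = (df["admission_date"] - df["birth_date"]).dt.days
conception = df["birth_date"] - pd.to_timedelta(df["gestation_weeks"] * 7, unit="D")

df["period"] = "during pregnancy"
df.loc[df["admission_date"] < conception, "period"] = "before pregnancy"
df.loc[df["admission_date"] > df["birth_date"], "period"] = "after birth"
print(df[["mother_id", "admission_date", "svti_days", "period"]])
```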
FGWAS: Functional genome wide association analysis.
Huang, Chao; Thompson, Paul; Wang, Yalin; Yu, Yang; Zhang, Jingwen; Kong, Dehan; Colen, Rivka R; Knickmeyer, Rebecca C; Zhu, Hongtu
2017-10-01
Functional phenotypes (e.g., subcortical surface representation), which commonly arise in imaging genetic studies, have been used to detect putative genes for complexly inherited neuropsychiatric and neurodegenerative disorders. However, existing statistical methods largely ignore the functional features (e.g., functional smoothness and correlation). The aim of this paper is to develop a functional genome-wide association analysis (FGWAS) framework to efficiently carry out whole-genome analyses of functional phenotypes. FGWAS consists of three components: a multivariate varying coefficient model, a global sure independence screening procedure, and a test procedure. Compared with the standard multivariate regression model, the multivariate varying coefficient model explicitly models the functional features of functional phenotypes through the integration of smooth coefficient functions and functional principal component analysis. Statistically, compared with existing methods for genome-wide association studies (GWAS), FGWAS can substantially boost the detection power for discovering important genetic variants influencing brain structure and function. Simulation studies show that FGWAS outperforms existing GWAS methods for searching sparse signals in an extremely large search space, while controlling for the family-wise error rate. We have successfully applied FGWAS to large-scale analysis of data from the Alzheimer's Disease Neuroimaging Initiative for 708 subjects, 30,000 vertices on the left and right hippocampal surfaces, and 501,584 SNPs. Copyright © 2017 Elsevier Inc. All rights reserved.
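A drastically simplified stand-in for the FGWAS idea: summarize each subject's functional phenotype with a few functional principal component scores, then screen every SNP against the leading score. This omits the varying coefficient model and uses random data throughout:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(12)
n_subj, n_vert, n_snp = 200, 500, 1000
pheno = rng.normal(size=(n_subj, n_vert))        # functional phenotype per subject
snps = rng.integers(0, 3, size=(n_subj, n_snp))  # genotypes coded 0/1/2

# Functional PCA via SVD: a few scores summarize each phenotype curve.
U, s, Vt = np.linalg.svd(pheno - pheno.mean(axis=0), full_matrices=False)
scores = U[:, :3] * s[:3]

# Screen every SNP against the leading score, Bonferroni-corrected.
pvals = np.array([stats.linregress(snps[:, j], scores[:, 0]).pvalue
                  for j in range(n_snp)])
print((pvals < 0.05 / n_snp).sum(), "SNPs pass Bonferroni")
```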
Forest statistics for the Southern Coastal Plain of North Carolina 1973
Noel D. Cost
1973-01-01
This report highlights the principal findings of the fourth inventory of the timber resource in the Southern Coastal Plain of North Carolina. The inventory was started in November 1972 and completed in August 1973. Three previous inventories, completed in 1937, 1952, and 1962, provide statistics for measuring changes and trends over the past 36 years. In this...
Forest statistics for the Northern Coastal Plain of South Carolina, 1986
John B. Tansey
1987-01-01
This report highlights the principal findings of the sixth forest survey in the Northern Coastal Plain of South Carolina. Fieldwork began in April 1986 and was completed in July 1986. Five previous surveys, completed in 1936, 1947, 1958, 1968, and 1978, provide statistics for measuring changes and trends over the past 50 years. The primary emphasis in this report is on...
75 FR 61136 - Notice of Proposed Information Collection Requests
Federal Register 2010, 2011, 2012, 2013, 2014
2010-10-04
... EdFacts data as well as data from surveys of school principals and special education designees about their school improvement practices. The study will use descriptive statistics and regression analysis to...
Grosse Holtforth, Martin; Altenstein, David; Krieger, Tobias; Flückiger, Christoph; Wright, Aidan G C; Caspar, Franz
2014-01-01
We examined interpersonal problems in psychotherapy outpatients with a principal diagnosis of a depressive disorder in routine care (n=361). These patients were compared to a normative non-clinical sample and to outpatients with other principal diagnoses (n=959). Furthermore, these patients were statistically assigned to interpersonally defined subgroups that were compared regarding symptoms and the quality of the early alliance. The sample of depressive patients reported higher levels of interpersonal problems than the normative sample and the sample of outpatients without a principal diagnosis of depression. Latent Class Analysis identified eight distinct interpersonal subgroups, which differed regarding self-reported symptom load and the quality of the early alliance. However, therapists' alliance ratings did not differentiate between the groups. This interpersonal differentiation within the group of patients with a principal diagnosis of depression may add to a personalized psychotherapy based on interpersonal profiles.
Lin, Ying-he; Man, Yi; Qu, Yi-li; Guan, Dong-hua; Lu, Xuan; Wei, Na
2006-01-01
To study the movement of the long axis and the distribution of principal stress in the abutment teeth of a removable partial denture retained by conical telescopes. An ideal three-dimensional finite element model was constructed by using SCT image-reconstruction techniques, self-developed programs, and the ANSYS software. Static loads were applied, and the displacement along the long axis and the distribution of the principal stress in the abutment teeth were analyzed. There was no statistically significant difference in displacement or stress distribution among the different three-dimensional finite element models. Generally, the abutment teeth moved along their own long axes. A similar stress distribution was observed in each three-dimensional finite element model, and the maximal principal compressive stress was observed at the distal cervix of the second premolar. The abutment teeth can be well protected by the use of conical telescopes.
NASA Astrophysics Data System (ADS)
Vile, Douglas J.
In radiation therapy, interfraction organ motion introduces a level of geometric uncertainty into the planning process. Plans, which are typically based upon a single instance of anatomy, must be robust against daily anatomical variations. For this problem, a model of the magnitude, direction, and likelihood of deformation is useful. In this thesis, principal component analysis (PCA) is used to statistically model the 3D organ motion for 19 prostate cancer patients, each with 8-13 fractional computed tomography (CT) images. Deformable image registration and the resultant displacement vector fields (DVFs) are used to quantify the interfraction systematic and random motion. By applying the PCA technique to the random DVFs, principal modes of random tissue deformation were determined for each patient, and a method for sampling synthetic random DVFs was developed. The PCA model was then extended to describe the principal modes of systematic and random organ motion for the population of patients. A leave-one-out study tested both the systematic and random motion models' ability to represent PCA training set DVFs. The random and systematic DVF PCA models allowed the reconstruction of these data with absolute mean errors of 0.5-0.9 mm and 1-2 mm, respectively. To the best of the author's knowledge, this study is the first successful effort to build a fully 3D statistical PCA model of systematic tissue deformation in a population of patients. By sampling synthetic systematic and random errors, organ occupancy maps were created for bony and prostate-centroid patient setup processes. By thresholding these maps, a PCA-based planning target volume (PTV) was created and tested against conventional margin recipes (a van Herk margin for bony alignment and a fixed 5 mm [3 mm posterior] margin for centroid alignment) in a virtual clinical trial for low-risk prostate cancer. Deformably accumulated delivered dose served as a surrogate for clinical outcome. For the bony landmark setup subtrial, the PCA PTV significantly (p<0.05) reduced D30, D20, and D5 to the bladder and D50 to the rectum, while increasing rectal D20 and D5. For the centroid-aligned setup, the PCA PTV significantly reduced all bladder DVH metrics and trended toward lower rectal toxicity metrics. All PTVs covered the prostate with the prescription dose.
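Sampling a synthetic random DVF from such a PCA motion model reduces to drawing Gaussian coefficients on the leading eigenmodes. A numpy sketch with a random stand-in training matrix (the dimensions are illustrative):

```python
import numpy as np

rng = np.random.default_rng(14)
n_frac, n_dof = 10, 3 * 5000                 # fractions x (3 components per voxel)
dvfs = rng.normal(size=(n_frac, n_dof))      # stand-in random-motion training DVFs

# PCA of the training DVFs via SVD of the mean-centred matrix.
mean = dvfs.mean(axis=0)
U, s, Vt = np.linalg.svd(dvfs - mean, full_matrices=False)
var = s**2 / (n_frac - 1)                    # variance captured by each eigenmode

# Draw one synthetic random DVF: Gaussian coefficients on the leading modes.
k = 3
coeff = rng.normal(0.0, np.sqrt(var[:k]))
synthetic_dvf = mean + coeff @ Vt[:k]

print(synthetic_dvf.reshape(-1, 3).shape)    # (n_voxels, 3) displacement field
```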
McGuire, Thomas G; Ayanian, John Z; Ford, Daniel E; Henke, Rachel E M; Rost, Kathryn M; Zaslavsky, Alan M
2008-01-01
Objective: To test for discrimination by race/ethnicity arising from clinical uncertainty in treatment for depression, also known as "statistical discrimination." Data Sources: We used survey data from 1,321 African-American, Hispanic, and white adults identified with depression in primary care. Surveys were administered every six months for two years in the Quality Improvement for Depression (QID) studies. Study Design: To examine whether and how change in depression severity affects change in treatment intensity by race/ethnicity, we used multivariate cross-sectional and change models that difference out unobserved time-invariant patient characteristics potentially correlated with race/ethnicity. Data Collection/Extraction Methods: Treatment intensity was operationalized as expenditures on drugs, primary care, and specialty services, weighted by national prices from the Medical Expenditure Panel Survey. Patient race/ethnicity was collected at baseline by self-report. Principal Findings: Change in depression severity is less associated with change in treatment intensity in minority patients than in whites, consistent with the hypothesis of statistical discrimination. The differential effect by racial/ethnic group was accounted for by use of mental health specialists. Conclusions: Enhanced physician–patient communication and use of standardized depression instruments may reduce statistical discrimination arising from clinical uncertainty and be useful in reducing racial/ethnic inequities in depression treatment. PMID:18370966
NASA Astrophysics Data System (ADS)
Khan, Uzma Zafar
The aim of this quantitative study was to investigate elementary principals' beliefs about reformed science teaching and learning, science subject matter knowledge, and how these factors relate to fourth grade students' superior science outcomes. Online survey methodology was used for data collection and included a demographic questionnaire and two survey instruments: the K-4 Physical Science Misconceptions Oriented Science Assessment Resources for Teachers (MOSART) and the Beliefs About Reformed Science Teaching and Learning (BARSTL). Hierarchical multiple regression analysis was used to assess the separate and collective contributions of background variables such as principals' personal and school characteristics, principals' science teaching and learning beliefs, and principals' science knowledge on students' superior science outcomes. Mediation analysis was also used to explore whether principals' science knowledge mediated the relationship between their beliefs about science teaching and learning and students' science outcomes. Findings indicated that principals' science beliefs and knowledge do not contribute to predicting students' superior science scores. Fifty-two percent of the variance in percentage of students with superior science scores was explained by school characteristics with free or reduced price lunch and school type as the only significant individual predictors. Furthermore, principals' science knowledge did not mediate the relationship between their science beliefs and students' science outcomes. There was no statistically significant variation among the variables. The data failed to support the proposed mediation model of the study. Implications for future research are discussed.
Functional data analysis of sleeping energy expenditure.
Lee, Jong Soo; Zakeri, Issa F; Butte, Nancy F
2017-01-01
Adequate sleep is crucial during childhood for metabolic health, and physical and cognitive development. Inadequate sleep can disrupt metabolic homeostasis and alter sleeping energy expenditure (SEE). Functional data analysis methods were applied to SEE data to elucidate the population structure of SEE and to discriminate SEE between obese and non-obese children. Minute-by-minute SEE in 109 children, ages 5-18, was measured in room respiration calorimeters. A smoothing spline method was applied to the calorimetric data to extract the true smooth function for each subject. Functional principal component analysis was used to capture the important modes of variation of the functional data and to identify differences in SEE patterns. Combinations of functional principal component analysis and classifier algorithms were used to classify SEE. Smoothing effectively removed instrumentation noise inherent in the room calorimeter data, providing more accurate data for analysis of the dynamics of SEE. SEE exhibited declining but subtly undulating patterns throughout the night. Mean SEE was markedly higher in obese than non-obese children, as expected due to their greater body mass. SEE was higher among the obese than non-obese children (p<0.01); however, the weight-adjusted mean SEE was not statistically different (p>0.1, after post hoc testing). Functional principal component scores for the first two components explained 77.8% of the variance in SEE and also differed between groups (p = 0.037). Logistic regression, support vector machine, and random forest classification methods were able to distinguish weight-adjusted SEE between obese and non-obese participants with good classification rates (62-64%). Our results implicate other factors, yet to be uncovered, that affect the weight-adjusted SEE of obese and non-obese children. Functional data analysis revealed differences in the structure of SEE between obese and non-obese children that may contribute to disruption of metabolic homeostasis.
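As a rough illustration of this smoothing-plus-FPCA pipeline, here is a self-contained sketch on simulated curves. The spline smoothing parameter, the evaluation grid, and the group-effect shape are invented stand-ins, not the study's data or settings.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
minutes = np.arange(480)                 # stand-in for one night of SEE data
grid = np.linspace(0, 479, 200)

def smooth_curve(y, smooth=50.0):
    """Smoothing spline removes calorimeter instrumentation noise."""
    return UnivariateSpline(np.arange(len(y)), y, s=smooth)(grid)

# Simulated subjects: a declining overnight trend plus noise; group 1
# gets a slightly different shape purely for illustration.
labels = rng.integers(0, 2, size=60)
curves = np.array([
    smooth_curve(1.2 - 0.0005 * minutes + 0.05 * lab * np.sin(minutes / 80)
                 + rng.normal(0, 0.05, minutes.size))
    for lab in labels
])

# Functional PCA = ordinary PCA on the discretized, smoothed curves.
mean_curve = curves.mean(axis=0)
_, sing, vt = np.linalg.svd(curves - mean_curve, full_matrices=False)
explained = sing**2 / np.sum(sing**2)
scores = (curves - mean_curve) @ vt[:2].T     # first two FPC scores

clf = LogisticRegression().fit(scores, labels)
print(f"first two FPCs explain {explained[:2].sum():.1%} of variance,"
      f" training accuracy {clf.score(scores, labels):.2f}")
```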
Characterization of palmprints by wavelet signatures via directional context modeling.
Zhang, Lei; Zhang, David
2004-06-01
The palmprint is one of the most reliable physiological characteristics that can be used to distinguish between individuals. Current palmprint-based systems are more user friendly, more cost effective, and require fewer data signatures than traditional fingerprint-based identification systems. The principal lines and wrinkles captured in a low-resolution palmprint image provide more than enough information to uniquely identify an individual. This paper presents a palmprint identification scheme that characterizes a palmprint using a set of statistical signatures. The palmprint is first transformed into the wavelet domain, and the directional context of each wavelet subband is defined and computed in order to collect the predominant coefficients of its principal lines and wrinkles. A set of statistical signatures, which includes gravity center, density, spatial dispersivity, and energy, is then defined to characterize the palmprint with the selected directional context values. A classification and identification scheme based on these signatures is subsequently developed. This scheme fully exploits the features of the principal lines and prominent wrinkles and achieves satisfactory results. Compared with palmprint verification schemes based on line-segment matching or interest-point matching, the proposed scheme uses a much smaller amount of data signatures. It also provides a convenient classification strategy and more accurate identification.
Sparse PCA with Oracle Property
Gu, Quanquan; Wang, Zhaoran; Liu, Han
2014-01-01
In this paper, we study the estimation of the k-dimensional sparse principal subspace of covariance matrix Σ in the high-dimensional setting. We aim to recover the oracle principal subspace solution, i.e., the principal subspace estimator obtained assuming the true support is known a priori. To this end, we propose a family of estimators based on the semidefinite relaxation of sparse PCA with novel regularizations. In particular, under a weak assumption on the magnitude of the population projection matrix, one estimator within this family exactly recovers the true support with high probability, has exact rank-k, and attains a √(s/n) statistical rate of convergence with s being the subspace sparsity level and n the sample size. Compared to existing support recovery results for sparse PCA, our approach does not hinge on the spiked covariance model or the limited correlation condition. As a complement to the first estimator that enjoys the oracle property, we prove that another estimator within the family achieves a sharper statistical rate of convergence than the standard semidefinite relaxation of sparse PCA, even when the previous assumption on the magnitude of the projection matrix is violated. We validate the theoretical results by numerical experiments on synthetic datasets. PMID:25684971
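The paper's estimators use novel regularizations; the sketch below instead shows the standard semidefinite (Fantope) relaxation of sparse PCA with an elementwise l1 penalty, in the style of Vu et al., as a baseline illustration of what is being relaxed. It assumes cvxpy with its default SDP-capable solver; the penalty weight and synthetic spike are arbitrary.

```python
import cvxpy as cp
import numpy as np

def sdp_sparse_pca(S, k, lam):
    """Standard semidefinite relaxation of sparse PCA: maximize
    <S, H> - lam * ||H||_1 over the Fantope {0 <= H <= I, tr(H) = k}.
    The estimated principal subspace is spanned by the top-k
    eigenvectors of the solution H."""
    p = S.shape[0]
    H = cp.Variable((p, p), symmetric=True)
    objective = cp.Maximize(cp.trace(S @ H) - lam * cp.sum(cp.abs(H)))
    constraints = [H >> 0, np.eye(p) - H >> 0, cp.trace(H) == k]
    cp.Problem(objective, constraints).solve()
    eigvals, eigvecs = np.linalg.eigh(H.value)
    return eigvecs[:, -k:]                    # top-k eigenvectors

# Synthetic example: a sparse rank-1 spike plus identity noise.
rng = np.random.default_rng(2)
p, n, k = 20, 100, 1
v = np.zeros(p); v[:4] = 0.5                  # sparse leading eigenvector
X = rng.multivariate_normal(np.zeros(p), np.eye(p) + 4 * np.outer(v, v), size=n)
S = np.cov(X, rowvar=False)
subspace = sdp_sparse_pca(S, k, lam=0.2)
print(np.round(subspace[:, 0], 2))            # support concentrates on v
```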
Face-iris multimodal biometric scheme based on feature level fusion
NASA Astrophysics Data System (ADS)
Huo, Guang; Liu, Yuanning; Zhu, Xiaodong; Dong, Hongxing; He, Fei
2015-11-01
Unlike score level fusion, feature level fusion demands that the features extracted from the unimodal traits have high distinguishability as well as homogeneity and compatibility, which is difficult to achieve. Therefore, most multimodal biometric research focuses on score level fusion, whereas few studies investigate feature level fusion. We propose a face-iris recognition method based on feature level fusion. We build a special two-dimensional Gabor filter bank to extract local texture features from face and iris images, and then transform them by histogram statistics into an energy-orientation variance histogram feature with lower dimensionality and higher distinguishability. Finally, through a fusion-recognition strategy based on principal components analysis and support vector machine (FRSPS), feature level fusion and one-to-n identification are accomplished. The experimental results demonstrate that this method can not only effectively extract face and iris features but also provide higher recognition accuracy. Compared with some state-of-the-art fusion methods, the proposed method has a significant performance advantage.
Comparison of Machine Learning Methods for the Arterial Hypertension Diagnostics
Belo, David; Gamboa, Hugo
2017-01-01
The paper presents an analysis of the accuracy of machine learning approaches applied to cardiac activity data. The study evaluates the possibility of diagnosing arterial hypertension by means of short-term heart rate variability signals. Two groups were studied: 30 relatively healthy volunteers and 40 patients suffering from arterial hypertension of degree II-III. The following machine learning approaches were studied: linear and quadratic discriminant analysis, k-nearest neighbors, support vector machine with radial basis, decision trees, and naive Bayes classifier. Moreover, different methods of feature extraction were analyzed: statistical, spectral, wavelet, and multifractal. In all, 53 features were investigated. The results show that discriminant analysis achieves the highest classification accuracy. The suggested approach of searching for a non-correlated feature set achieved better results than a feature set based on the principal components. PMID:28831239
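A comparison of the six named classifiers is straightforward to reproduce with scikit-learn; the sketch below runs them under cross-validation on synthetic stand-in features (the real 53 HRV features and group sizes are only mimicked, not the study's data).

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import (LinearDiscriminantAnalysis,
                                           QuadraticDiscriminantAnalysis)
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Stand-in for 70 subjects (30 healthy, 40 hypertensive) x 53 HRV features.
X, y = make_classification(n_samples=70, n_features=53, n_informative=8,
                           weights=[30 / 70], random_state=0)

classifiers = {
    "LDA": LinearDiscriminantAnalysis(),
    "QDA": QuadraticDiscriminantAnalysis(),
    "kNN": KNeighborsClassifier(n_neighbors=5),
    "SVM (RBF)": SVC(kernel="rbf"),
    "Decision tree": DecisionTreeClassifier(random_state=0),
    "Naive Bayes": GaussianNB(),
}
for name, clf in classifiers.items():
    scores = cross_val_score(make_pipeline(StandardScaler(), clf), X, y, cv=5)
    print(f"{name:14s} accuracy = {scores.mean():.2f} +/- {scores.std():.2f}")
```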
Deblauwe, Vincent; Kennel, Pol; Couteron, Pierre
2012-01-01
Background Independence between observations is a standard prerequisite of traditional statistical tests of association. This condition is, however, violated when autocorrelation is present within the data. In the case of variables that are regularly sampled in space (i.e. lattice data or images), such as those provided by remote-sensing or geographical databases, this problem is particularly acute. Because analytic derivation of the null probability distribution of the test statistic (e.g. Pearson's r) is not always possible when autocorrelation is present, we propose instead the use of a Monte Carlo simulation with surrogate data. Methodology/Principal Findings The null hypothesis that two observed mapped variables are the result of independent pattern generating processes is tested here by generating sets of random image data while preserving the autocorrelation function of the original images. Surrogates are generated by matching the dual-tree complex wavelet spectra (and hence the autocorrelation functions) of white noise images with the spectra of the original images. The generated images can then be used to build the probability distribution function of any statistic of association under the null hypothesis. We demonstrate the validity of a statistical test of association based on these surrogates with both actual and synthetic data and compare it with a corrected parametric test and three existing methods that generate surrogates (randomization, random rotations and shifts, and iterative amplitude adjusted Fourier transform). Type I error control was excellent, even with strong and long-range autocorrelation, which is not the case for alternative methods. Conclusions/Significance The wavelet-based surrogates are particularly appropriate in cases where autocorrelation appears at all scales or is direction-dependent (anisotropy). We explore the potential of the method for association tests involving a lattice of binary data and discuss its potential for validation of species distribution models. An implementation of the method in Java for the generation of wavelet-based surrogates is available online as supporting material. PMID:23144961
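The paper matches dual-tree complex wavelet spectra; the sketch below uses the simpler Fourier analogue (imposing the original amplitude spectrum, and hence the autocorrelation function, on white noise) purely to show the Monte Carlo logic of the association test. It is a stand-in, not the authors' wavelet-based generator.

```python
import numpy as np

def fourier_surrogate(img, rng):
    """White-noise image forced to share the amplitude spectrum (hence
    the autocorrelation function) of the original image, a Fourier
    analogue of the paper's wavelet-spectrum matching."""
    noise_spec = np.fft.fft2(rng.normal(size=img.shape))
    noise_spec /= np.abs(noise_spec)              # keep only random phases
    return np.fft.ifft2(np.abs(np.fft.fft2(img)) * noise_spec).real

def correlation_test(img_a, img_b, n_surrogates=999, seed=0):
    """Monte Carlo null distribution of Pearson's r under the hypothesis
    that the two maps come from independent pattern-generating processes."""
    rng = np.random.default_rng(seed)
    r_obs = np.corrcoef(img_a.ravel(), img_b.ravel())[0, 1]
    r_null = np.array([
        np.corrcoef(fourier_surrogate(img_a, rng).ravel(),
                    fourier_surrogate(img_b, rng).ravel())[0, 1]
        for _ in range(n_surrogates)
    ])
    p = (1 + np.sum(np.abs(r_null) >= abs(r_obs))) / (n_surrogates + 1)
    return r_obs, p

rng = np.random.default_rng(3)
a = rng.normal(size=(64, 64))
b = rng.normal(size=(64, 64))
print(correlation_test(a, b, n_surrogates=199))
```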
ERIC Educational Resources Information Center
Ritzman, Mitzi J.; Sanger, Dixie
2007-01-01
Purpose: The purpose of this study was to survey the opinions of principals concerning the role of speech-language pathologists (SLPs) serving students with communication disorders who have been involved in violence. Method: A mixed-methods design was used, with 678 questionnaires mailed to elementary, middle, and high school principals in a…
Setting Instructional Expectations: Patterns of Principal Leadership for Middle School Mathematics
ERIC Educational Resources Information Center
Katterfeld, Karin
2013-01-01
Principal instructional leadership has been found to support improved instruction. However, the methods through which principal leadership influences classroom instruction are less clear. This study investigates how principals' leadership may predict the expectations that mathematics teachers perceive for classroom practice. Results from a…
Crashes & Fatalities Related To Driver Drowsiness/Fatigue
DOT National Transportation Integrated Search
1994-11-01
This report summarizes recent national statistics on the incidence and characteristics of crashes involving driver fatigue, drowsiness, or "asleep-at-the-wheel." For the purposes of this report, these terms are considered synonymous. Principal data...
Guide to reporting highway statistics
DOT National Transportation Integrated Search
1983-11-01
Previous analyses conducted by the Federal Highway Administration (FHWA) are used to project year-by-year economic impacts of changes in highway performance out to 1995. In the principal scenario examined, highway performance is allowed to deterior...
NASA Astrophysics Data System (ADS)
Jin, Seung-Seop; Jung, Hyung-Jo
2014-03-01
It is well known that the dynamic properties of a structure, such as its natural frequencies, depend not only on damage but also on environmental conditions (e.g., temperature). The variation in the dynamic characteristics of a structure due to environmental conditions may mask damage to the structure. Without taking the change of environmental conditions into account, false-positive or false-negative damage diagnoses may occur, making structural health monitoring unreliable. To address this problem, many researchers have constructed regression models that relate structural responses to environmental factors. The key to the success of this approach is formulating the relationship between the input and output variables of the regression model so as to account for the environmental variations. However, it is quite challenging to determine the proper environmental variables and measurement locations in advance to fully represent the relationship between the structural responses and the environmental variations. One alternative (i.e., novelty detection) is to remove the variations caused by environmental factors from the structural responses by using multivariate statistical analysis (e.g., principal component analysis (PCA), factor analysis, etc.). The success of this method depends heavily on the accuracy of the description of the normal condition. Generally, there is no prior information on the normal condition during data acquisition, so the normal condition is determined subjectively, with human intervention. The proposed method is a novel adaptive multivariate statistical analysis for structural damage detection under environmental change. One advantage of this method is the ability of generative learning to capture the intrinsic characteristics of the normal condition. The proposed method is tested on numerically simulated data for a range of measurement noise levels under environmental variation. A comparative study with conventional methods (i.e., a fixed reference scheme) demonstrates the superior performance of the proposed method for structural damage detection.
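For orientation, here is a minimal sketch of the PCA novelty-detection baseline that this work builds on (not the adaptive method itself): fit a PCA model of the normal condition so that the retained components absorb the environmentally driven variation, then flag samples whose residual (the Q statistic, or squared prediction error) exceeds a baseline quantile. The simulated temperature-frequency data are invented for illustration.

```python
import numpy as np

def fit_normal_condition(features, n_components=2):
    """PCA model of the normal condition: the retained components absorb
    the dominant (environmentally driven) variation, so damage shows up
    in the residual subspace."""
    mean = features.mean(axis=0)
    _, _, vt = np.linalg.svd(features - mean, full_matrices=False)
    return mean, vt[:n_components]

def spe(features, mean, components):
    """Squared prediction error (Q statistic) in the residual subspace."""
    centered = features - mean
    residual = centered - (centered @ components.T) @ components
    return np.sum(residual**2, axis=1)

rng = np.random.default_rng(4)
temperature = rng.uniform(-5, 30, size=(200, 1))
# Four natural frequencies drifting together with temperature, plus noise.
baseline = 10.0 - 0.01 * temperature + rng.normal(0, 0.02, size=(200, 4))
mean, comps = fit_normal_condition(baseline)
threshold = np.quantile(spe(baseline, mean, comps), 0.99)

damaged = baseline[:20].copy()
damaged[:, 2] -= 0.15          # a stiffness loss shifts one frequency only
alarms = spe(damaged, mean, comps) > threshold
print(f"{alarms.mean():.0%} of damaged samples flagged")
```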
Analysis of the principal component algorithm in phase-shifting interferometry.
Vargas, J; Quiroga, J Antonio; Belenguer, T
2011-06-15
We recently presented a new asynchronous demodulation method for phase-sampling interferometry. The method is based on the principal component analysis (PCA) technique. In the former work, the PCA method was derived heuristically. In this work, we present an in-depth analysis of the PCA demodulation method.
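A sketch of the core idea, as commonly described for PCA demodulation: once the background (temporal mean) is removed, each phase-shifted frame is, to first order, a linear combination of the two quadrature fringe patterns cos(φ) and sin(φ), so the first two principal components recover those quadratures and the phase follows from their arctangent. The synthetic phase map and shift distribution below are illustrative only.

```python
import numpy as np

def pca_demodulate(frames):
    """Estimate the modulating phase from a stack of randomly phase-shifted
    interferograms (n_frames, ny, nx) via PCA: after removing the temporal
    mean, the first two spatial eigenvectors are quadrature fringe
    patterns."""
    n, ny, nx = frames.shape
    X = frames.reshape(n, -1)
    X = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    c, s = vt[0].reshape(ny, nx), vt[1].reshape(ny, nx)
    return np.arctan2(s, c)       # wrapped phase, up to sign and offset

# Synthetic test: a smooth phase map sampled with unknown phase shifts.
ny = nx = 128
yy, xx = np.mgrid[0:ny, 0:nx] / 64.0
phi = 2 * np.pi * (xx**2 + yy)                 # ground-truth phase
rng = np.random.default_rng(5)
shifts = rng.uniform(0, 2 * np.pi, size=8)     # asynchronous (unknown) steps
frames = np.array([1 + 0.8 * np.cos(phi + d) for d in shifts])
estimated = pca_demodulate(frames)
```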
Ghosh, Tonmoy; Fattah, Shaikh Anowarul; Wahid, Khan A
2018-01-01
Wireless capsule endoscopy (WCE) is the most advanced technology for visualizing the whole gastrointestinal (GI) tract in a non-invasive way. Its major disadvantage, however, is the long review time, which is laborious because continuous manual intervention is necessary. In order to reduce the burden on the clinician, in this paper, an automatic bleeding detection method for WCE video is proposed based on the color histogram of block statistics, namely CHOBS. A single pixel in a WCE image may be distorted due to the capsule motion in the GI tract. Instead of considering individual pixel values, a block surrounding each pixel is chosen for extracting local statistical features. By combining local block features of the three color planes of the RGB color space, an index value is defined. A color histogram, which is extracted from those index values, provides a distinguishable color texture feature. A feature reduction technique utilizing the color histogram pattern and principal component analysis is proposed, which can drastically reduce the feature dimension. For bleeding zone detection, blocks are classified using the extracted local features, which incurs no additional computational burden for feature extraction. Extensive experimentation on several WCE videos and 2300 images collected from a publicly available database shows very satisfactory bleeding frame and zone detection performance in comparison with some existing methods. In the case of bleeding frame detection, the accuracy, sensitivity, and specificity obtained by the proposed method are 97.85%, 99.47%, and 99.15%, respectively, and in the case of bleeding zone detection, a precision of 95.75% is achieved. The proposed method offers not only a low feature dimension but also highly satisfactory bleeding detection performance, and can effectively detect bleeding frames and zones in continuous WCE video data.
Soudek, Petr; Katrusáková, Adéla; Sedlácek, Lukás; Petrová, Sárka; Kocí, Vladimír; Marsík, Petr; Griga, Miroslav; Vanek, Tomás
2010-08-01
The effect of toxic metals on seed germination was studied in 23 cultivars of flax (Linum usitatissimum L.). The toxicity of cadmium, cobalt, copper, zinc, nickel, lead, chromium, and arsenic at five different concentrations (0.01-1 mM) was tested by a standard ecotoxicity test. Root length was measured after 72 h of incubation. Elongation inhibition, EC50 value, slope, and NOEC values were calculated. Results were evaluated by principal component analysis, a multidimensional statistical method. The results showed that heavy-metal toxicity decreased in the following order: As3+>or=As5+>Cu2+>Cd2+>Co2+>Cr6+>Ni2+>Pb2+>Cr3+>Zn2+.
Methods to control for unmeasured confounding in pharmacoepidemiology: an overview.
Uddin, Md Jamal; Groenwold, Rolf H H; Ali, Mohammed Sanni; de Boer, Anthonius; Roes, Kit C B; Chowdhury, Muhammad A B; Klungel, Olaf H
2016-06-01
Background Unmeasured confounding is one of the principal problems in pharmacoepidemiologic studies. Several methods have been proposed to detect or control for unmeasured confounding, either at the study design phase or the data analysis phase. Aim of the Review To provide an overview of commonly used methods to detect or control for unmeasured confounding and to provide recommendations for proper application in pharmacoepidemiology. Methods/Results Methods to control for unmeasured confounding in the design phase of a study are case-only designs (e.g., case-crossover, case-time control, self-controlled case series) and the prior event rate ratio adjustment method. Methods that can be applied in the data analysis phase include the negative control method, the perturbation variable method, instrumental variable methods, sensitivity analysis, and ecological analysis. A separate group of methods are those in which additional information on confounders is collected from a substudy. The latter group includes external adjustment, propensity score calibration, two-stage sampling, and multiple imputation. Conclusion As the performance and application of the methods to handle unmeasured confounding may differ across studies and across databases, we stress the importance of using both statistical evidence and substantial clinical knowledge for the interpretation of study results.
Shear, principal, and equivalent strains in equal-channel angular deformation
NASA Astrophysics Data System (ADS)
Xia, K.; Wang, J.
2001-10-01
The shear and principal strains involved in equal channel angular deformation (ECAD) were analyzed using a variety of methods. A general expression for the total shear strain calculated by integrating infinitesimal strain increments gave the same result as that from simple geometric considerations. The magnitude and direction of the accumulated principal strains were calculated based on a geometric and a matrix algebra method, respectively. For an intersecting angle of π/2, the maximum normal strain is 0.881 in the direction at π/8 (22.5 deg) from the longitudinal direction of the material in the exit channel. The direction of the maximum principal strain should be used as the direction of grain elongation. Since the principal direction of strain rotates during ECAD, the total shear strain and principal strains so calculated do not have the same meaning as those in a strain tensor. Consequently, the “equivalent” strain based on the second invariant of a strain tensor is no longer an invariant. Indeed, the equivalent strains calculated using the total shear strain and that using the total principal strains differed as the intensity of deformation increased. The method based on matrix algebra is potentially useful in mathematical analysis and computer calculation of ECAD.
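The quoted numbers (a maximum principal strain of 0.881 at 22.5 deg for an intersecting angle of π/2) can be reproduced with the matrix-algebra route the abstract describes. Here is a short numpy check using the simple-shear deformation gradient for one pass, with γ = 2cot(Φ/2) = 2 for Φ = π/2; this is a verification sketch, not the authors' derivation.

```python
import numpy as np

gamma = 2.0 / np.tan(np.pi / 4)         # shear strain for a PI/2 channel angle
F = np.array([[1.0, gamma],
              [0.0, 1.0]])              # deformation gradient of simple shear
B = F @ F.T                             # left Cauchy-Green tensor
eigvals, eigvecs = np.linalg.eigh(B)    # ascending eigenvalues

stretch = np.sqrt(eigvals[-1])          # major principal stretch
strain = np.log(stretch)                # logarithmic (true) principal strain
angle = np.degrees(np.arctan(eigvecs[1, -1] / eigvecs[0, -1]))
print(f"max principal strain {strain:.3f} at {angle:.1f} deg")
# -> 0.881 at 22.5 deg from the exit-channel longitudinal direction
```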
Vasudevan, Rama K; Tselev, Alexander; Baddorf, Arthur P; Kalinin, Sergei V
2014-10-28
Reflection high energy electron diffraction (RHEED) has by now become a standard tool for in situ monitoring of film growth by pulsed laser deposition and molecular beam epitaxy. Yet despite the widespread adoption and wealth of information in RHEED images, most applications are limited to observing intensity oscillations of the specular spot, and much additional information on growth is discarded. With ease of data acquisition and increased computation speeds, statistical methods to rapidly mine the data set are now feasible. Here, we develop such an approach to the analysis of the fundamental growth processes through multivariate statistical analysis of a RHEED image sequence. This approach is illustrated for growth of La(x)Ca(1-x)MnO(3) films grown on etched (001) SrTiO(3) substrates, but is universal. The multivariate methods including principal component analysis and k-means clustering provide insight into the relevant behaviors, the timing and nature of a disordered to ordered growth change, and highlight statistically significant patterns. Fourier analysis yields the harmonic components of the signal and allows separation of the relevant components and baselines, isolating the asymmetric nature of the step density function and the transmission spots from the imperfect layer-by-layer (LBL) growth. These studies show the promise of big data approaches to obtaining more insight into film properties during and after epitaxial film growth. Furthermore, these studies open the pathway to use forward prediction methods to potentially allow significantly more control over growth process and hence final film quality.
Statistical properties of nonlinear one-dimensional wave fields
NASA Astrophysics Data System (ADS)
Chalikov, D.
2005-06-01
A numerical model for long-term simulation of gravity surface waves is described. The model is designed as a component of a coupled Wave Boundary Layer/Sea Waves model for investigation of small-scale dynamic and thermodynamic interactions between the ocean and atmosphere. Statistical properties of nonlinear wave fields are investigated on the basis of direct hydrodynamical modeling of 1-D potential periodic surface waves. The method is based on a nonstationary conformal surface-following coordinate transformation; this approach reduces the principal equations of potential waves to two simple evolutionary equations for the elevation and the velocity potential on the surface. The numerical scheme is based on a Fourier transform method. High accuracy was confirmed by validation of the nonstationary model against known solutions and by comparison between the results obtained with different horizontal resolutions. The scheme allows reproduction of the propagation of steep Stokes waves for thousands of periods with very high accuracy. The method developed here is applied to simulation of the evolution of wave fields with a large number of modes for many periods of the dominant waves. The statistical characteristics of nonlinear wave fields for waves of different steepness were investigated: spectra, kurtosis and skewness, dispersion relation, and lifetime. The prime result is that the representation of a wave field as a superposition of linear waves is valid only for small amplitudes. It is shown as well that nonlinear wave fields are better described as a superposition of Stokes waves than of linear waves. Keywords: potential flow, free surface, conformal mapping, numerical modeling of waves, gravity waves, Stokes waves, breaking waves, freak waves, wind-wave interaction.
Principal Turnover: Upheaval and Uncertainty in Charter Schools?
ERIC Educational Resources Information Center
Ni, Yongmei; Sun, Min; Rorrer, Andrea
2015-01-01
Purpose: Informed by literature on labor market and school choice, this study aims to examine the dynamics of principal career movements in charter schools by comparing principal turnover rates and patterns between charter schools and traditional public schools. Research Methods/Approach: This study uses longitudinal data on Utah principals and…
Mezzenga, Emilio; D'Errico, Vincenzo; Sarnelli, Anna; Strigari, Lidia; Menghi, Enrico; Marcocci, Francesco; Bianchini, David; Benassi, Marcello
2016-01-01
The purpose of this study was to retrospectively evaluate the results from a Helical TomoTherapy Hi-Art treatment system relating to quality controls based on daily static and dynamic output checks, using statistical process control methods. Individual value X-charts, exponentially weighted moving average charts, and process capability and acceptability indices were used to monitor the treatment system performance. Daily output values measured from January 2014 to January 2015 were considered. The results obtained showed that, although the process was in control, there was an out-of-control situation coinciding with the principal maintenance intervention on the treatment system. In particular, process capability indices showed a decreasing percentage of points in control, which was, however, acceptable according to AAPM TG148 guidelines. Our findings underline the importance of restricting the acceptable range of daily output checks and suggest a future line of investigation for a detailed process control of daily output checks for the Helical TomoTherapy Hi-Art treatment system.
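Of the charts named above, the exponentially weighted moving average (EWMA) chart is the least standard; a minimal numpy sketch follows, with the textbook time-varying control limits. The smoothing constant, control-limit width, reference period, and simulated output drift are all assumptions for illustration, not the study's parameters.

```python
import numpy as np

def ewma_chart(x, mu, sigma, lam=0.2, L=3.0):
    """EWMA control chart: z_t = lam*x_t + (1-lam)*z_{t-1}, with the
    standard time-varying limits mu +/- L*sigma*sqrt(lam/(2-lam)*
    (1-(1-lam)^(2t))). mu and sigma come from an in-control reference
    period."""
    z = np.empty_like(x)
    z[0] = lam * x[0] + (1 - lam) * mu
    for t in range(1, len(x)):
        z[t] = lam * x[t] + (1 - lam) * z[t - 1]
    t = np.arange(1, len(x) + 1)
    half = L * sigma * np.sqrt(lam / (2 - lam) * (1 - (1 - lam) ** (2 * t)))
    return z, mu - half, mu + half

# Stand-in for ~13 months of daily output checks with a small shift after
# a hypothetical maintenance intervention on day 250.
rng = np.random.default_rng(6)
output = rng.normal(100.0, 0.5, size=380)
output[250:] += 0.6
mu, sigma = output[:200].mean(), output[:200].std(ddof=1)  # reference period
z, lcl, ucl = ewma_chart(output, mu, sigma)
print("first out-of-control day:", int(np.argmax((z < lcl) | (z > ucl))))
```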
Hu, Yipeng; Morgan, Dominic; Ahmed, Hashim Uddin; Pendsé, Doug; Sahu, Mahua; Allen, Clare; Emberton, Mark; Hawkes, David; Barratt, Dean
2008-01-01
A method is described for generating a patient-specific, statistical motion model (SMM) of the prostate gland. Finite element analysis (FEA) is used to simulate the motion of the gland using an ultrasound-based 3D FE model over a range of plausible boundary conditions and soft-tissue properties. By applying principal component analysis to the displacements of the FE mesh node points inside the gland, the simulated deformations are then used as training data to construct the SMM. The SMM is used to both predict the displacement field over the whole gland and constrain a deformable surface registration algorithm, given only a small number of target points on the surface of the deformed gland. Using 3D transrectal ultrasound images of the prostates of five patients, acquired before and after imposing a physical deformation, the accuracy of the predicted landmark displacements was evaluated; the mean target registration error was found to be less than 1.9 mm.
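A minimal numpy sketch of the prediction step, under simplifying assumptions (a linear PCA model, noise-free observations, a mild ridge for numerical stability, hypothetical array sizes): fit the mode weights to the displacements observed at a few surface points, then reconstruct the full field. This is an illustration of the concept, not the authors' registration pipeline.

```python
import numpy as np

def fit_modes_to_sparse_targets(mean, modes, observed_idx, observed_disp,
                                reg=1e-6):
    """Estimate SMM mode weights from displacements observed at a small
    set of surface points (rows of the flattened field), then predict
    the full displacement field via least squares."""
    A = modes[:, observed_idx].T                  # (n_obs, n_modes)
    b = observed_disp - mean[observed_idx]
    w = np.linalg.solve(A.T @ A + reg * np.eye(A.shape[1]), A.T @ b)
    return mean + w @ modes

# Toy model: 3 motion modes over a 3000-entry displacement field.
rng = np.random.default_rng(7)
mean = rng.normal(0, 0.1, 3000)
modes = np.linalg.qr(rng.normal(size=(3000, 3)))[0].T   # orthonormal modes
true_w = np.array([1.0, -0.5, 0.2])
field = mean + true_w @ modes
observed_idx = rng.choice(3000, size=20, replace=False) # sparse surface points
predicted = fit_modes_to_sparse_targets(mean, modes, observed_idx,
                                        field[observed_idx])
# Near-zero: 20 observed points suffice to pin down 3 mode weights.
print(np.max(np.abs(predicted - field)))
```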
Alignment of the Stanford Linear Collider Arcs: Concepts and results
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pitthan, R.; Bell, B.; Friedsam, H.
1987-02-01
The alignment of the Arcs for the Stanford Linear Collider at SLAC has posed problems in accelerator survey and alignment not encountered before. These problems come less from the tight tolerances of 0.1 mm, although reaching such a tight statistically defined accuracy in a controlled manner is difficult enough, than from the absence of a common reference plane for the Arcs. Traditional circular accelerators, including HERA and LEP, have been designed in one plane referenced to local gravity. For the SLC Arcs no such single plane exists. Methods and concepts developed to solve these and other problems connected with the unique design of SLC range from the first use of satellites for accelerator alignment, use of electronic laser theodolites for placement of components, computer control of the manual adjustment process, complete automation of the data flow incorporating the most advanced concepts of geodesy, and strict separation of survey and alignment, to linear principal component analysis for the final statistical smoothing of the mechanical components.
Effective Thermal Inactivation of the Spores of Bacillus cereus Biofilms Using Microwave.
Park, Hyong Seok; Yang, Jungwoo; Choi, Hee Jung; Kim, Kyoung Heon
2017-07-28
Microwave sterilization was performed to inactivate the spores of biofilms of Bacillus cereus involved in foodborne illness. The sterilization conditions, such as the amount of water, the operating temperature, and the treatment time, were optimized using statistical analysis based on 15 experimental runs designed by the Box-Behnken method. Statistical analysis showed that the optimal conditions for the inactivation of B. cereus biofilms were 14 ml of water, a temperature of 108°C, and a treatment time of 15 min. Interestingly, response surface plots showed that the amount of water is the most important factor for microwave sterilization under the present conditions. Complete inactivation by microwaves was achieved in 5 min, and the inactivation efficiency of the microwave treatment was markedly higher than that of a conventional steam autoclave. Finally, confocal laser scanning microscopy images showed that the principal effect of microwave treatment was cell membrane disruption. Thus, this study can contribute to the development of a process to control food-associated pathogens.
[A study of Boletus bicolor from different areas using Fourier transform infrared spectrometry].
Zhou, Zai-Jin; Liu, Gang; Ren, Xian-Pei
2010-04-01
It is hard to differentiate the same species of wild-growing mushrooms from different areas by macromorphological features. In this paper, Fourier transform infrared (FTIR) spectroscopy combined with principal component analysis was used to identify 58 samples of Boletus bicolor from five different areas. Based on the fingerprint infrared spectra of the Boletus bicolor samples, principal component analysis was conducted on the 58 spectra in the range of 1350-750 cm(-1) using the statistical software SPSS 13.0. The cumulative contribution of the first three principal components accounts for 88.87% of the variance, so they include almost all the information in the samples. The two-dimensional projection plot of the first and second principal components shows a satisfactory clustering effect for the classification and discrimination of Boletus bicolor. All Boletus bicolor samples were divided into five groups with a classification accuracy of 98.3%. The study demonstrated that wild-growing Boletus bicolor from different areas can be identified at the species level by FTIR spectra combined with principal component analysis.
ERIC Educational Resources Information Center
Moriarty, Margaret E.
2012-01-01
This mixed-methods study was designed to determine how principals perceived the ethicality of sanctions for students engaged in sexting behavior relative to the race/ethnicity and gender of the student. Personality traits of the principals were surveyed to determine if Openness and/or Conscientiousness would predict principal response. Sexting is…
Water quality analysis of the Rapur area, Andhra Pradesh, South India using multivariate techniques
NASA Astrophysics Data System (ADS)
Nagaraju, A.; Sreedhar, Y.; Thejaswi, A.; Sayadi, Mohammad Hossein
2017-10-01
The groundwater samples from the Rapur area were collected from different sites to evaluate the major ion chemistry. The large volume of data can lead to difficulties in the integration, interpretation, and representation of the results. Two multivariate statistical methods, hierarchical cluster analysis (HCA) and factor analysis (FA), were applied to evaluate their usefulness for classifying and identifying the geochemical processes controlling groundwater geochemistry. Four statistically significant clusters were obtained from 30 sampling stations. This resulted in two important clusters, cluster 1 (pH, Si, CO3, Mg, SO4, Ca, K, HCO3, alkalinity, Na, Na + K, Cl, and hardness) and cluster 2 (EC and TDS), whose constituents are released to the study area from different sources. The application of different multivariate statistical techniques, such as principal component analysis (PCA), assists in the interpretation of complex data matrices for a better understanding of the water quality of a study area. From the PCA, it is clear that the first factor (factor 1), which accounted for 36.2% of the total variance, had high positive loadings on EC, Mg, Cl, TDS, and hardness. Based on the PCA scores, four significant cluster groups of sampling locations were detected on the basis of the similarity of their water quality.
Benson, Nsikak U.; Asuquo, Francis E.; Williams, Akan B.; Essien, Joseph P.; Ekong, Cyril I.; Akpabio, Otobong; Olajire, Abaas A.
2016-01-01
Trace metal (Cd, Cr, Cu, Ni, and Pb) concentrations in benthic sediments were analyzed through a multi-step fractionation scheme to assess the levels and sources of contamination in estuarine, riverine, and freshwater ecosystems in the Niger Delta (Nigeria). The degree of contamination was assessed using individual contamination factors (ICF) and the global contamination factor (GCF). Multivariate statistical approaches including principal component analysis (PCA), cluster analysis, and correlation tests were employed to evaluate the interrelationships and associated sources of contamination. The spatial distribution of metal concentrations followed the pattern Pb>Cu>Cr>Cd>Ni. The ecological risk index by ICF showed significant potential mobility and bioavailability for Cu, Cr, and Ni. The ICF contamination trend in the benthic sediments at all studied sites was Cu>Cr>Ni>Cd>Pb. The principal component and agglomerative clustering analyses indicate that trace metal contamination in these ecosystems was influenced by multiple pollution sources. PMID:27257934
Climate drivers on malaria transmission in Arunachal Pradesh, India.
Upadhyayula, Suryanaryana Murty; Mutheneni, Srinivasa Rao; Chenna, Sumana; Parasaram, Vaideesh; Kadiri, Madhusudhan Rao
2015-01-01
The present study was conducted during the years 2006 to 2012 and provides information on the prevalence of malaria and its regulation by various climatic factors in the East Siang district of Arunachal Pradesh, India. Correlation analysis, principal component analysis, and Hotelling's T² statistic models were adopted to understand the effect of weather variables on malaria transmission. The epidemiological study shows that the prevalence of malaria is mostly caused by the parasite Plasmodium vivax, followed by Plasmodium falciparum. It is noted that the intensity of malaria cases declined gradually from 2006 to 2012. Malaria transmission was higher during the rainy season than in the summer and winter seasons. Further, the data analysis with principal component analysis and Hotelling's T² statistic revealed that climatic variables such as temperature and rainfall are the most influential factors for the high rate of malaria transmission in the East Siang district of Arunachal Pradesh.
NASA Astrophysics Data System (ADS)
Mendez, F. J.; Rueda, A.; Barnard, P.; Mori, N.; Nakajo, S.; Espejo, A.; del Jesus, M.; Diez Sierra, J.; Cofino, A. S.; Camus, P.
2016-02-01
Hurricanes hitting California have a very low occurrence probability due to the typically cool ocean temperature and westward tracks. However, the damage associated with these improbable events would be dramatic in Southern California, and understanding the oceanographic and atmospheric drivers is of paramount importance for coastal risk management under present and future climates. A statistical analysis of the historical events is very difficult due to the limited resolution of the available atmospheric and oceanographic forcing data. In this work, we propose a combination of: (a) statistical downscaling methods (Espejo et al, 2015); and (b) a synthetic stochastic tropical cyclone (TC) model (Nakajo et al, 2014). To build the statistical downscaling model, Y=f(X), we apply a combination of principal component analysis and the k-means classification algorithm to find representative patterns in a potential TC index derived from large-scale SST fields in the Eastern Central Pacific (predictor X) and the associated tropical cyclone occurrence (predictand Y). SST data come from NOAA Extended Reconstructed SST V3b, providing information from 1854 to 2013 on a 2.0 degree x 2.0 degree global grid. As data for the historical occurrence and paths of tropical cyclones are scarce, we apply a stochastic TC model based on a Monte Carlo simulation of the joint distribution of track, minimum sea level pressure, and translation speed of the historical events in the Eastern Central Pacific Ocean. Results will show the ability of the approach to explain the seasonal-to-interannual variability of the predictor X, which is clearly related to the El Niño Southern Oscillation. References Espejo, A., Méndez, F.J., Diez, J., Medina, R., Al-Yahyai, S. (2015) Seasonal probabilistic forecasting of tropical cyclone activity in the North Indian Ocean, Journal of Flood Risk Management, DOI: 10.1111/jfr3.12197 Nakajo, S., N. Mori, T. Yasuda, and H. Mase (2014) Global Stochastic Tropical Cyclone Model Based on Principal Component Analysis and Cluster Analysis, Journal of Applied Meteorology and Climatology, DOI: 10.1175/JAMC-D-13-08.1
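The PCA-plus-k-means weather-typing step is easy to sketch; the code below shows the pattern classification and the conditional occurrence rates that form the downscaling link. The field dimensions, cluster count, and Poisson-simulated TC record are hypothetical stand-ins; real inputs would be the ERSST V3b fields and the historical TC archive.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

# Stand-in for monthly SST anomaly fields over the Eastern Central Pacific:
# shape (n_months, n_gridpoints).
rng = np.random.default_rng(8)
sst = rng.normal(size=(1900, 500))

# Principal components compress the fields; k-means on the leading PCs
# yields representative large-scale SST patterns ("weather types").
pcs = PCA(n_components=10).fit_transform(sst)
types = KMeans(n_clusters=9, n_init=10, random_state=0).fit_predict(pcs)

# The downscaling model is then a conditional occurrence rate per type,
# i.e. an estimate of P(TC activity | SST pattern) from the record.
tc_counts = rng.poisson(0.05, size=1900)        # hypothetical TC occurrences
for k in range(9):
    rate = tc_counts[types == k].mean()
    print(f"pattern {k}: mean TC occurrence rate {rate:.3f}")
```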
A Novel Weighted Kernel PCA-Based Method for Optimization and Uncertainty Quantification
NASA Astrophysics Data System (ADS)
Thimmisetty, C.; Talbot, C.; Chen, X.; Tong, C. H.
2016-12-01
It has been demonstrated that machine learning methods can be successfully applied to uncertainty quantification for geophysical systems through the use of the adjoint method coupled with kernel PCA-based optimization. In addition, it has been shown through weighted linear PCA how optimization with respect to both observation weights and feature space control variables can accelerate convergence of such methods. Linear machine learning methods, however, are inherently limited in their ability to represent features of non-Gaussian stochastic random fields, as they are based on only the first two statistical moments of the original data. Nonlinear spatial relationships and multipoint statistics leading to the tortuosity characteristic of channelized media, for example, are captured only to a limited extent by linear PCA. With the aim of coupling the kernel-based and weighted methods discussed, we present a novel mathematical formulation of kernel PCA, Weighted Kernel Principal Component Analysis (WKPCA), that both captures nonlinear relationships and incorporates the attribution of significance levels to different realizations of the stochastic random field of interest. We also demonstrate how new instantiations retaining defining characteristics of the random field can be generated using Bayesian methods. In particular, we present a novel WKPCA-based optimization method that minimizes a given objective function with respect to both feature space random variables and observation weights through which optimal snapshot significance levels and optimal features are learned. We showcase how WKPCA can be applied to nonlinear optimal control problems involving channelized media, and in particular demonstrate an application of the method to learning the spatial distribution of material parameter values in the context of linear elasticity, and discuss further extensions of the method to stochastic inversion.
Leadership Behaviors and Its Relation with Principals' Management Experience
ERIC Educational Resources Information Center
Mehdinezhad, Vali; Sardarzahi, Zaid
2016-01-01
This paper aims at studying the leadership behaviors reported by principals and observed by teachers and its relationship with management experience of principals. A quantitative method was used in this study. The target population included all principals and teachers of guidance schools and high schools in the Dashtiari District, Iran. A sample…
Portraits of Principal Practice: Time Allocation and School Principal Work
ERIC Educational Resources Information Center
Sebastian, James; Camburn, Eric M.; Spillane, James P.
2018-01-01
Purpose: The purpose of this study was to examine how school principals in urban settings distributed their time working on critical school functions. We also examined who principals worked with and how their time allocation patterns varied by school contextual characteristics. Research Method/Approach: The study was conducted in an urban school…
Principals' Perceived Supervisory Behaviors Regarding Marginal Teachers in Two States
ERIC Educational Resources Information Center
Range, Bret; Hewitt, Paul; Young, Suzie
2014-01-01
This descriptive study used an online survey to determine how principals in two states viewed the supervision of marginal teachers. Principals ranked their own evaluation of the teacher as the most important factor when identifying marginal teachers and relied on informal methods to diagnose marginal teaching. Female principals rated a majority of…
NASA Astrophysics Data System (ADS)
Díaz-Ayil, G.; Amouroux, M.; Blondel, W. C. P. M.; Bourg-Heckly, G.; Leroux, A.; Guillemin, F.; Granjon, Y.
2009-07-01
This paper deals with the development and application of in vivo spatially resolved bimodal spectroscopy (AutoFluorescence AF and Diffuse Reflectance DR) to discriminate various stages of skin precancer in a preclinical model (UV-irradiated mouse): Compensatory Hyperplasia CH, Atypical Hyperplasia AH, and Dysplasia D. A programmable instrumentation was developed for acquiring AF emission spectra using 7 excitation wavelengths (360, 368, 390, 400, 410, 420, and 430 nm) and DR spectra in the 390-720 nm wavelength range. After several steps of intensity spectra preprocessing (filtering, spectral correction, and intensity normalization), several sets of spectral characteristics were extracted and selected based on their discrimination power, statistically tested for every pair-wise comparison of histological classes. Data reduction with Principal Component Analysis (PCA) was performed, and 3 classification methods were implemented (k-NN, LDA, and SVM) in order to compare the diagnostic performance of each method. Diagnostic performance was assessed in terms of sensitivity (Se) and specificity (Sp) as a function of the selected features, of the combinations of 3 different inter-fiber distances, and of the number of principal components, such that: Se and Sp ≈ 100% when discriminating CH vs. others; Sp ≈ 100% and Se > 95% when discriminating Healthy vs. AH or D; and Sp ≈ 74% and Se ≈ 63% for AH vs. D.
Risk prediction for myocardial infarction via generalized functional regression models.
Ieva, Francesca; Paganoni, Anna M
2016-08-01
In this paper, we propose a generalized functional linear regression model for a binary outcome indicating the presence/absence of a cardiac disease with multivariate functional data among the relevant predictors. In particular, the motivating aim is the analysis of electrocardiographic traces of patients whose pre-hospital electrocardiogram (ECG) has been sent to the 118 Dispatch Center of Milan (the Italian toll-free number for emergencies) by life support personnel of the basic rescue units. The statistical analysis starts with a preprocessing of the ECGs, treated as multivariate functional data. The signals are reconstructed from noisy observations. The biological variability is then removed by a nonlinear registration procedure based on landmarks. Thus, in order to perform a data-driven dimension reduction, a multivariate functional principal component analysis is carried out on the variance-covariance matrix of the reconstructed and registered ECGs and their first derivatives. We use the scores of the principal component decomposition as covariates in a generalized linear model to predict the presence of the disease in a new patient. Hence, a new semi-automatic diagnostic procedure is proposed to estimate the risk of infarction (in the case of interest, the probability of being affected by Left Bundle Branch Block). The performance of this classification method is evaluated and compared with other methods proposed in the literature. Finally, the robustness of the procedure is checked via leave-j-out techniques. © The Author(s) 2013.
Holmes, Sean T; Iuliucci, Robbie J; Mueller, Karl T; Dybowski, Cecil
2015-11-10
Calculations of the principal components of magnetic-shielding tensors in crystalline solids require the inclusion of the effects of lattice structure on the local electronic environment to obtain significant agreement with experimental NMR measurements. We assess periodic (GIPAW) and GIAO/symmetry-adapted cluster (SAC) models for computing magnetic-shielding tensors by calculations on a test set containing 72 insulating molecular solids, with a total of 393 principal components of chemical-shift tensors from 13C, 15N, 19F, and 31P sites. When clusters are carefully designed to represent the local solid-state environment and when periodic calculations include sufficient variability, both methods predict magnetic-shielding tensors that agree well with experimental chemical-shift values, demonstrating the correspondence of the two computational techniques. At the basis-set limit, we find that the small differences in the computed values have no statistical significance for three of the four nuclides considered. Subsequently, we explore the effects of additional DFT methods available only with the GIAO/cluster approach, particularly the use of hybrid-GGA functionals, meta-GGA functionals, and hybrid meta-GGA functionals that demonstrate improved agreement in calculations on symmetry-adapted clusters. We demonstrate that meta-GGA functionals improve computed NMR parameters over those obtained by GGA functionals in all cases, and that hybrid functionals improve computed results over the respective pure DFT functional for all nuclides except 15N.
Lee, Seung Ho; Lee, Sang Hwa; Shin, Jae-Ho; Choi, Samjin
2018-06-01
Although the confirmation of inflammatory changes within tissues at the onset of various diseases is critical for the early detection of disease and selection of appropriate treatment, most therapies are based on complex and time-consuming diagnostic procedures. Raman spectroscopy has the ability to provide non-invasive, real-time, chemical bonding analysis through the inelastic scattering of photons. In this study, we evaluate the feasibility of Raman spectroscopy as a new, easy, fast, and accurate diagnostic method to support diagnostic decisions. The molecular changes in carrageenan-induced acute inflammation rat tissues were assessed by Raman spectroscopy. Volumes of 0 (control), 100, 150, and 200 µL of 1% carrageenan were administered to rat hind paws to control the degree of inflammation. The prominent peaks at [1,062, 1,131] cm(-1) and [2,847, 2,881] cm(-1) were selected as characteristic measurements corresponding to the C-C stretching vibrational modes and the symmetric and asymmetric C-H (CH2) stretching vibrational modes, respectively. Principal component analysis of the inflammatory Raman spectra enabled graphical representation of the degree of inflammation through principal component loading profiles of inflammatory tissues on a two-dimensional plot. Therefore, Raman spectroscopy with multivariate statistical analysis represents a promising method for detecting biomolecular responses based on different types of inflammatory tissues. © 2018 Wiley Periodicals, Inc.
Multivariate methods for indoor PM10 and PM2.5 modelling in naturally ventilated schools buildings
NASA Astrophysics Data System (ADS)
Elbayoumi, Maher; Ramli, Nor Azam; Md Yusof, Noor Faizah Fitri; Yahaya, Ahmad Shukri Bin; Al Madhoun, Wesam; Ul-Saufie, Ahmed Zia
2014-09-01
In this study, the concentrations of PM10, PM2.5, CO, and CO2 and meteorological variables (wind speed, air temperature, and relative humidity) were employed to predict the annual and seasonal indoor concentrations of PM10 and PM2.5 using multivariate statistical methods. The data were collected in twelve naturally ventilated schools in the Gaza Strip (Palestine) from October 2011 to May 2012 (one academic year). The bivariate correlation analysis showed that indoor PM10 and PM2.5 were highly positively correlated with the outdoor concentrations of PM10 and PM2.5. Further, multiple linear regression (MLR) was used for modelling, and R2 values were determined as 0.62 and 0.84 for the PM10 and PM2.5 models, respectively. The performance indicators of the MLR models showed that the annual models predicted PM10 and PM2.5 better than the seasonal models. In order to reduce the number of input variables, principal component analysis (PCA) and principal component regression (PCR) were applied using the annual data. The predicted R2 values were 0.40 and 0.73 for PM10 and PM2.5, respectively. The PM10 models (MLR and PCR) show a tendency to underestimate indoor PM10 concentrations, as they do not take into account the occupants' activities, which strongly affect indoor concentrations during class hours.
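Principal component regression, as used above, is simply PCA followed by ordinary regression on the scores. A minimal scikit-learn sketch follows; the six stand-in predictors and the synthetic response only mimic the study's setup.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in predictors: outdoor PM, CO, CO2, wind speed, temperature, RH.
rng = np.random.default_rng(9)
X = rng.normal(size=(240, 6))
y = 25 + 12 * X[:, 0] + 3 * X[:, 3] + rng.normal(0, 4, 240)  # indoor PM10

# Principal component regression: standardize, keep the first few PCs to
# remove collinearity among inputs, then regress indoor PM on the scores.
pcr = make_pipeline(StandardScaler(), PCA(n_components=3), LinearRegression())
r2 = cross_val_score(pcr, X, y, cv=5, scoring="r2")
print(f"PCR cross-validated R2 = {r2.mean():.2f}")
```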
Comparing biomarkers as principal surrogate endpoints.
Huang, Ying; Gilbert, Peter B
2011-12-01
Recently a new definition of surrogate endpoint, the "principal surrogate," was proposed based on causal associations between treatment effects on the biomarker and on the clinical endpoint. Despite its appealing interpretation, limited research has been conducted to evaluate principal surrogates, and existing methods focus on risk models that consider a single biomarker. How to compare principal surrogate value of biomarkers or general risk models that consider multiple biomarkers remains an open research question. We propose to characterize a marker or risk model's principal surrogate value based on the distribution of risk difference between interventions. In addition, we propose a novel summary measure (the standardized total gain) that can be used to compare markers and to assess the incremental value of a new marker. We develop a semiparametric estimated-likelihood method to estimate the joint surrogate value of multiple biomarkers. This method accommodates two-phase sampling of biomarkers and is more widely applicable than existing nonparametric methods by incorporating continuous baseline covariates to predict the biomarker(s), and is more robust than existing parametric methods by leaving the error distribution of markers unspecified. The methodology is illustrated using a simulated example set and a real data set in the context of HIV vaccine trials. © 2011, The International Biometric Society.
VanTrump, G.; Miesch, A.T.
1977-01-01
RASS is an acronym for Rock Analysis Storage System and STATPAC, for Statistical Package. The RASS and STATPAC computer programs are integrated into the RASS-STATPAC system for the management and statistical reduction of geochemical data. The system, in its present form, has been in use for more than 9 yr by scores of U.S. Geological Survey geologists, geochemists, and other scientists engaged in a broad range of geologic and geochemical investigations. The principal advantage of the system is the flexibility afforded the user both in data searches and retrievals and in the manner of statistical treatment of data. The statistical programs provide for most types of statistical reduction normally used in geochemistry and petrology, but also contain bridges to other program systems for statistical processing and automatic plotting. © 1977.
Spiral-bevel geometry and gear train precision
NASA Technical Reports Server (NTRS)
Litvin, F. L.; Coy, J. J.
1983-01-01
A new approach to the determination of surface principal curvatures and directions is proposed. Direct relationships between the principal curvatures and directions of the tool surface and the principal curvatures and directions of the generated gear surface are obtained. The principal curvatures and directions of the gear-tooth surface are obtained without using the complicated equations of these surfaces. A general theory of the train kinematical errors caused by manufacturing and assembly errors is discussed. Two methods for the determination of the train kinematical errors are worked out: (1) with the aid of a computer, and (2) with an approximate method. Results from noise and vibration measurements conducted on a helicopter transmission are used to illustrate the principles contained in the theory of kinematical errors.
Fast, Exact Bootstrap Principal Component Analysis for p > 1 million
Fisher, Aaron; Caffo, Brian; Schwartz, Brian; Zipunnikov, Vadim
2015-01-01
Many have suggested a bootstrap procedure for estimating the sampling variability of principal component analysis (PCA) results. However, when the number of measurements per subject (p) is much larger than the number of subjects (n), calculating and storing the leading principal components from each bootstrap sample can be computationally infeasible. To address this, we outline methods for fast, exact calculation of bootstrap principal components, eigenvalues, and scores. Our methods leverage the fact that all bootstrap samples occupy the same n-dimensional subspace as the original sample. As a result, all bootstrap principal components are limited to the same n-dimensional subspace and can be efficiently represented by their low dimensional coordinates in that subspace. Several uncertainty metrics can be computed solely based on the bootstrap distribution of these low dimensional coordinates, without calculating or storing the p-dimensional bootstrap components. Fast bootstrap PCA is applied to a dataset of sleep electroencephalogram recordings (p = 900, n = 392), and to a dataset of brain magnetic resonance images (MRIs) (p ≈ 3 million, n = 352). For the MRI dataset, our method allows for standard errors for the first 3 principal components based on 1000 bootstrap samples to be calculated on a standard laptop in 47 minutes, as opposed to approximately 4 days with standard methods. PMID:27616801
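The key observation, that all bootstrap samples live in the original n-dimensional subspace, translates directly into code: resampling and re-decomposition happen on the small (n x n) coordinate matrix, and components are mapped back through the fixed basis U. The sketch below is a minimal numpy rendering of that idea with invented dimensions; the sign-alignment step is a common practical addition, not necessarily the authors' exact procedure.

```python
import numpy as np

def fast_bootstrap_pca(X, n_boot=200, n_components=3, seed=0):
    """Exact bootstrap PCA without forming p-dimensional components per
    resample. X has shape (p, n): p measurements per subject, n subjects.
    Returns elementwise standard errors of the leading components."""
    mean = X.mean(axis=1, keepdims=True)
    U, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    Y = s[:, None] * Vt                      # low-dimensional scores (n x n)
    rng = np.random.default_rng(seed)
    comps = np.empty((n_boot, X.shape[0], n_components))
    for b in range(n_boot):
        Yb = Y[:, rng.integers(0, Y.shape[1], Y.shape[1])]  # resample subjects
        Yb = Yb - Yb.mean(axis=1, keepdims=True)
        Ub, _, _ = np.linalg.svd(Yb, full_matrices=False)
        Cb = U @ Ub[:, :n_components]        # exact bootstrap components
        # Align signs with the original components to avoid sign flips.
        Cb *= np.sign(np.sum(Cb * U[:, :n_components], axis=0))
        comps[b] = Cb
    return comps.std(axis=0)

X = np.random.default_rng(10).normal(size=(900, 392))  # EEG-sized example
se = fast_bootstrap_pca(X)
print(se.shape)                                         # (900, 3)
```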
Measuring socioeconomic status in multicountry studies: results from the eight-country MAL-ED study
2014-01-01
Background: There is no standardized approach to comparing socioeconomic status (SES) across multiple sites in epidemiological studies. This is particularly problematic when cross-country comparisons are of interest. We sought to develop a simple measure of SES that would perform well across diverse, resource-limited settings. Methods: A cross-sectional study was conducted with 800 children aged 24 to 60 months across eight resource-limited settings. Parents were asked to respond to a household SES questionnaire, and the height of each child was measured. A statistical analysis was done in two phases. First, the best approach for selecting and weighting household assets as a proxy for wealth was identified. We compared four approaches to measuring wealth: maternal education, principal components analysis, the Multidimensional Poverty Index, and a novel variable selection approach based on the use of random forests. Second, the selected wealth measure was combined with other relevant variables to form a more complete measure of household SES. We used child height-for-age Z-score (HAZ) as the outcome of interest. Results: Mean age of study children was 41 months, 52% were boys, and 42% were stunted. Using cross-validation, we found that random forests yielded the lowest prediction error when selecting assets as a measure of household wealth. The final SES index included access to improved water and sanitation, eight selected assets, maternal education, and household income (the WAMI index). A 25% difference in the WAMI index was positively associated with a difference of 0.38 standard deviations in HAZ (95% CI 0.22 to 0.55). Conclusions: Statistical learning methods such as random forests provide an alternative to principal components analysis in the development of SES scores. Results from this multicountry study demonstrate the validity of a simplified SES index. With further validation, this simplified index may provide a standard approach for SES adjustment across resource-limited settings. PMID:24656134
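As an illustration of the asset-selection idea described above, the following sketch ranks hypothetical household assets by random-forest importance for predicting HAZ. The variable names and data are invented for the example and are not the MAL-ED questionnaire items.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(13)
n = 800
assets = pd.DataFrame(
    rng.integers(0, 2, (n, 10)),
    columns=[f"asset_{i}" for i in range(10)],   # hypothetical asset indicators
)
# Synthetic outcome: only the first four assets carry signal
haz = assets.iloc[:, :4].sum(axis=1) * 0.2 + rng.normal(0, 1, n)

rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(assets, haz)
importance = pd.Series(rf.feature_importances_, index=assets.columns)
selected = importance.nlargest(8).index.tolist()
print("selected assets:", selected)

# Cross-validated prediction error can then be compared against a
# PCA-based wealth index to choose between the two approaches.
print("CV R^2:", cross_val_score(rf, assets, haz, cv=5).mean().round(2))
```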
Vasilaki, V; Volcke, E I P; Nandi, A K; van Loosdrecht, M C M; Katsou, E
2018-04-26
Multivariate statistical analysis was applied to investigate the dependencies and underlying patterns between N2O emissions and online operational variables (dissolved oxygen and nitrogen component concentrations, temperature and influent flow-rate) during biological nitrogen removal from wastewater. The system under study was a full-scale reactor, for which hourly sensor data were available. The 15-month long monitoring campaign was divided into 10 sub-periods based on the profile of N2O emissions, using Binary Segmentation. The dependencies between operating variables and N2O emissions fluctuated according to Spearman's rank correlation. The correlation between N2O emissions and nitrite concentrations ranged between 0.51 and 0.78. Correlation >0.7 between N2O emissions and nitrate concentrations was observed at sub-periods with average temperature lower than 12 °C. Hierarchical k-means clustering and principal component analysis linked N2O emission peaks with precipitation events and ammonium concentrations higher than 2 mg/L, especially in sub-periods characterized by low N2O fluxes. Additionally, the highest ranges of measured N2O fluxes belonged to clusters corresponding with NO3-N concentration less than 1 mg/L in the upstream plug-flow reactor (middle of oxic zone), indicating slow nitrification rates. The results showed that the range of N2O emissions partially depends on the prior behavior of the system. The principal component analysis validated the findings from the clustering analysis and showed that ammonium, nitrate, nitrite and temperature explained a considerable percentage of the variance in the system for the majority of the sub-periods. The applied statistical methods linked the different ranges of emissions with the system variables, provided insights on the effect of operating conditions on N2O emissions in each sub-period, and can be integrated into N2O emissions data processing at wastewater treatment plants. Copyright © 2018. Published by Elsevier Ltd.
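A sketch of this style of analysis on synthetic hourly sensor data follows; the column names are assumptions made for the example, not the plant's actual tag names.

```python
import numpy as np
import pandas as pd
from scipy.stats import spearmanr
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
df = pd.DataFrame(
    rng.standard_normal((24 * 30, 5)),
    columns=["n2o", "nh4", "no2", "no3", "temperature"],  # assumed variables
)

# Rank correlation between N2O flux and each operating variable
for var in ["nh4", "no2", "no3", "temperature"]:
    rho, pval = spearmanr(df["n2o"], df[var])
    print(f"Spearman rho(n2o, {var}) = {rho:.2f} (p = {pval:.3f})")

# Cluster operating conditions, then inspect N2O levels per cluster
Z = StandardScaler().fit_transform(df)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(Z)
print(df.groupby(labels)["n2o"].describe())

# PCA to see how much variance the operating variables explain
pca = PCA().fit(Z)
print("explained variance ratios:", pca.explained_variance_ratio_.round(2))
```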
Proceedings of the NASA Symposium on Mathematical Pattern Recognition and Image Analysis
NASA Technical Reports Server (NTRS)
Guseman, L. F., Jr.
1983-01-01
The application of mathematical and statistical analysis techniques to imagery obtained by remote sensors is described by Principal Investigators. Scene-to-map registration, geometric rectification, and image matching are among the pattern recognition aspects discussed.
42 CFR 476.74 - General requirements for the assumption of review.
Code of Federal Regulations, 2010 CFR
2010-10-01
... inspection at its principal business office— (1) A copy of each agreement with Medicare fiscal intermediaries... by CMS, a QIO is responsible for compiling statistics based on the criteria contained in § 405.332 of...
The Relation between Factor Score Estimates, Image Scores, and Principal Component Scores
ERIC Educational Resources Information Center
Velicer, Wayne F.
1976-01-01
Investigates the relation between factor score estimates, principal component scores, and image scores. The three methods compared are maximum likelihood factor analysis, principal component analysis, and a variant of rescaled image analysis. (RC)
Feizbakhsh, Masood; Kadkhodaei, Mahmoud; Zandian, Dana; Hosseinpour, Zahra
2017-01-01
One of the most effective ways to move molars distally to treat Class II malocclusion is applying extraoral force through a headgear device. The purpose of this study was to compare stress distribution in the maxillary first molar periodontium using straight-pull headgear with vertical versus horizontal tubes, through the finite element method. Based on the real geometry model, a basic model of the first molar and maxillary bone was obtained using three-dimensional imaging of the skull. After geometric modeling of the periodontium components in CATIA software and the definition of mechanical properties and element classification, a force of 150 g for each headgear was defined in ABAQUS software. Consequently, Von Mises and principal stresses were evaluated. The statistical analysis was performed using paired t-tests and Wilcoxon nonparametric tests. The extension of areas with Von Mises and principal stresses using straight-pull headgear with a vertical tube was not different from that with a horizontal tube, but the numerical value of the Von Mises stress in the vertical tube was significantly reduced (P < 0.05). On the other hand, the difference in principal stress between the two tubes was not significant (P > 0.05). Based on the results, when force was applied to the straight-pull headgear with a vertical tube, Von Mises stress was reduced significantly in comparison with the horizontal tube. Therefore, to correct mesiolingual movement of the maxillary first molar, a vertical headgear tube is recommended.
Information extraction from multivariate images
NASA Technical Reports Server (NTRS)
Park, S. K.; Kegley, K. A.; Schiess, J. R.
1986-01-01
An overview of several multivariate image processing techniques is presented, with emphasis on techniques based upon the principal component transformation (PCT). A multiimage associates a multivariate pixel value with each pixel location, scaled and quantized into a gray-level vector; the bivariate correlation between component images measures the extent to which two images are correlated. The PCT of a multiimage decorrelates the components, reducing dimensionality and revealing intercomponent dependencies whenever some off-diagonal covariance elements are not small; for display purposes, the principal component images must be postprocessed back into multiimage format. The principal component analysis of a multiimage is a statistical analysis based upon the PCT whose primary application is to determine the intrinsic component dimensionality of the multiimage. Computational considerations are also discussed.
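A minimal sketch of a PCT along these lines, on a synthetic multi-band image, is shown below; it illustrates the transform itself, not any particular system described in the proceedings.

```python
import numpy as np

rng = np.random.default_rng(2)
rows, cols, bands = 128, 128, 6
img = rng.random((rows, cols, bands))          # stand-in multiimage

pixels = img.reshape(-1, bands)                # one gray-level vector per pixel
pixels = pixels - pixels.mean(axis=0)          # center each band
cov = np.cov(pixels, rowvar=False)             # band-to-band covariance
evals, evecs = np.linalg.eigh(cov)             # symmetric eigendecomposition
order = np.argsort(evals)[::-1]                # sort by decreasing variance
pct = (pixels @ evecs[:, order]).reshape(rows, cols, bands)

# For display, each principal component image must be postprocessed
# (rescaled) back to a gray-level range, e.g. 0..255.
pc1 = pct[..., 0]
pc1_display = np.uint8(255 * (pc1 - pc1.min()) / (np.ptp(pc1) + 1e-12))
```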
Soleimani, Mohammad Ali; Yaghoobzadeh, Ameneh; Bahrami, Nasim; Sharif, Saeed Pahlevan; Sharif Nia, Hamid
2016-10-01
In this study, 398 Iranian cancer patients completed the 15-item Templer's Death Anxiety Scale (TDAS). Tests of internal consistency, principal components analysis, and confirmatory factor analysis were conducted to assess the internal consistency and factorial validity of the Persian TDAS. The construct reliability statistic and average variance extracted were also calculated to measure construct reliability, convergent validity, and discriminant validity. Principal components analysis indicated a 3-component solution, which was generally supported in the confirmatory analysis. However, acceptable cutoffs for construct reliability, convergent validity, and discriminant validity were not fulfilled for the three subscales that were derived from the principal component analysis. This study demonstrated both the advantages and potential limitations of using the TDAS with Persian-speaking cancer patients.
Minnesota School Principals' Perceptions of Minnesota School Counselors' Role and Functions
ERIC Educational Resources Information Center
Karch, Lisa Irene Hanson
2010-01-01
The purpose of the concurrent mixed methods study was to explore Minnesota principals' perceptions regarding the role and functions of Minnesota school counselors. A convenience sample of K-12 school principals was used for this study. The participation criterion was that each individual be a school principal in the state of Minnesota. School…
ERIC Educational Resources Information Center
McDaniel, Luther
2017-01-01
The purpose of this mixed methods study was to assess school principals' perspectives of the extent to which they apply the principles of andragogy to the professional development of assistant principals in their schools. This study was conducted in school districts that constitute a RESA area in a southeastern state. The schools in these…
What Aspects of Principal Leadership Are Most Highly Correlated with School Outcomes in China?
ERIC Educational Resources Information Center
Zheng, Qiao; Li, Lingyan; Chen, Huijuan; Loeb, Susanna
2017-01-01
Purpose: The purpose of this study is to build a broader framework for Chinese principal leadership and to determine what aspects of principal leadership correlate most highly with school outcomes from the perspectives of both principals and teachers. Method: The data come from a 2013 national student achievement assessment in China comprising…
An Exploration of How Elementary School Principals Approach the Student Retention Decision Process
ERIC Educational Resources Information Center
Martinez-Hicks, Laura M.
2012-01-01
This is a constructivist grounded theory study investigating how elementary principals approach the student retention decision process in their schools. Twenty-two elementary principals participated in the study using a selective or snowball sampling method. Principals worked in one of three districts in a mid-Atlantic state and had experience as…
School food policies and practices: a state-wide survey of secondary school principals.
French, Simone A; Story, Mary; Fulkerson, Jayne A
2002-12-01
To describe food-related policies and practices in secondary schools in Minnesota. Mailed anonymous survey including questions about the secondary school food environment and food-related practices and policies. Members of a statewide professional organization for secondary school principals (n = 610; response rate: 463/610 = 75%). Of the 463 surveys returned, 336 met the eligibility criteria (current position was either principal or assistant principal and school included at least one of the grades of 9 through 12). Descriptive statistics examined the prevalence of specific policies and practices. Chi-square analysis examined associations between policies and practices and school variables. Among principals, 65% believed it was important to have a nutrition policy for the high school; however, only 32% reported a policy at their school. Principals reported positive attitudes about providing a healthful school food environment, but 98% of the schools had soft drink vending machines and 77% had contracts with soft drink companies. Food sold at school fundraisers was most often candy, fruit, and cookies. Dietetics professionals who work in secondary school settings should collaborate with other key school staff members and parents to develop and implement a comprehensive school nutrition policy. Such a policy could foster a school food environment that is supportive of healthful food choices among youth.
Anomaly detection in hyperspectral imagery: statistics vs. graph-based algorithms
NASA Astrophysics Data System (ADS)
Berkson, Emily E.; Messinger, David W.
2016-05-01
Anomaly detection (AD) algorithms are frequently applied to hyperspectral imagery, but different algorithms produce different outlier results depending on the image scene content and the assumed background model. This work provides the first comparison of anomaly score distributions between common statistics-based anomaly detection algorithms (RX and subspace-RX) and the graph-based Topological Anomaly Detector (TAD). Anomaly scores in statistical AD algorithms should theoretically approximate a chi-squared distribution; however, this is rarely the case with real hyperspectral imagery. The expected distribution of scores found with graph-based methods remains unclear. We also look for general trends in algorithm performance with varied scene content. Three separate scenes were extracted from the hyperspectral MegaScene image taken over downtown Rochester, NY with the VIS-NIR-SWIR ProSpecTIR instrument. In order of most to least cluttered, we study an urban, suburban, and rural scene. The three AD algorithms were applied to each scene, and the distributions of the most anomalous 5% of pixels were compared. We find that subspace-RX performs better than RX, because the data becomes more normal when the highest variance principal components are removed. We also see that compared to statistical detectors, anomalies detected by TAD are easier to separate from the background. Due to their different underlying assumptions, the statistical and graph-based algorithms highlighted different anomalies within the urban scene. These results will lead to a deeper understanding of these algorithms and their applicability across different types of imagery.
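For reference, the global RX detector mentioned above reduces to a Mahalanobis distance computed per pixel against the scene background statistics. The sketch below assumes a synthetic data cube and is an illustration of the standard algorithm, not the authors' exact processing chain.

```python
import numpy as np

rng = np.random.default_rng(3)
rows, cols, bands = 100, 100, 30
cube = rng.standard_normal((rows, cols, bands))   # stand-in hyperspectral cube

X = cube.reshape(-1, bands)
mu = X.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(X, rowvar=False))
d = X - mu
# Squared Mahalanobis distance of every pixel spectrum from the scene mean
rx_scores = np.einsum("ij,jk,ik->i", d, cov_inv, d).reshape(rows, cols)

# Under a multivariate-normal background the scores would follow a
# chi-squared distribution with `bands` degrees of freedom; here we
# simply flag the most anomalous 5% of pixels, as in the study above.
threshold = np.percentile(rx_scores, 95)
anomalies = rx_scores > threshold
```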
Tahir, Lokman Mohd; Khan, Aqeel; Musah, Mohammed Borhandden; Ahmad, Roslee; Daud, Khadijah; Al-Hudawi, Shafeeq Hussain Vazhathodi; Musta'Amal, Aede Hatib; Talib, Rohaya
2017-11-18
Principals are school leaders who experience stress while leading their schools towards excellence. However, principals' stress experiences are often ignored and little studied. This mixed-methods study investigates primary principals' stress experiences and the Islamic coping strategies they use to manage them. A total of 216 Muslim primary principals of different genders, school types, and years of experience as school leaders responded to the administrative stress and Islamic coping strategy items. In addition, seven primary principals were purposefully selected and interviewed to explore their reasons for using Islamic coping strategies for stress relief. Results showed that primary principals experienced a moderate level of stress and perceived managing students' academic achievement as the greatest stressor, followed by managing teachers' capabilities. Although findings revealed no significant differences across primary principals' demographics, male principals with between 6 and 10 years of experience who were positioned in schools in the least-students (SLS) category had slightly higher stress levels. In terms of the Islamic coping strategies used by primary principals, saying dua to Allah, performing dhikir and reciting the Yassen were the coping approaches most often employed in handling stress. In the interviews, primary principals also revealed that they used Islamic religious approaches not just to overcome stress but also as meaningful activities for remembering Allah and reflecting on past mistakes as part of the Muhasabah process. We therefore believe that religious approaches should be taken into consideration in principals' training, as they provide peace and treatment in managing principals' stress.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Steenbergen, K. G., E-mail: kgsteen@gmail.com; Gaston, N.
2014-02-14
Inspired by methods of remote sensing image analysis, we analyze structural variation in cluster molecular dynamics (MD) simulations through a unique application of the principal component analysis (PCA) and Pearson Correlation Coefficient (PCC). The PCA analysis characterizes the geometric shape of the cluster structure at each time step, yielding a detailed and quantitative measure of structural stability and variation at finite temperature. Our PCC analysis captures bond structure variation in MD, which can be used to both supplement the PCA analysis as well as compare bond patterns between different cluster sizes. Relying only on atomic position data, without requirement for a priori structural input, PCA and PCC can be used to analyze both classical and ab initio MD simulations for any cluster composition or electronic configuration. Taken together, these statistical tools represent powerful new techniques for quantitative structural characterization and isomer identification in cluster MD.
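One way to read the geometric-shape PCA described above is as an eigenanalysis of the centered atomic coordinates at each frame. The sketch below illustrates that reading on a synthetic trajectory; it is an interpretation of the idea, not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(4)
n_frames, n_atoms = 500, 20
traj = rng.standard_normal((n_frames, n_atoms, 3))   # stand-in MD trajectory

shape_evals = np.empty((n_frames, 3))
for t, frame in enumerate(traj):
    centered = frame - frame.mean(axis=0)
    gyration = centered.T @ centered / n_atoms       # 3x3 covariance of positions
    shape_evals[t] = np.linalg.eigvalsh(gyration)[::-1]  # descending order

# A structurally stable cluster keeps nearly constant eigenvalues over
# time; isomerization events show up as jumps in this time series.
print(shape_evals.mean(axis=0), shape_evals.std(axis=0))
```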
Fast classification of hazelnut cultivars through portable infrared spectroscopy and chemometrics
NASA Astrophysics Data System (ADS)
Manfredi, Marcello; Robotti, Elisa; Quasso, Fabio; Mazzucco, Eleonora; Calabrese, Giorgio; Marengo, Emilio
2018-01-01
The authentication and traceability of hazelnuts is very important for both the consumer and the food industry, to safeguard the protected varieties and the food quality. This study investigates the use of a portable FTIR spectrometer coupled to multivariate statistical analysis for the classification of raw hazelnuts. The method discriminates hazelnuts from different origins/cultivars based on differences of the signal intensities of their IR spectra. The multivariate classification methods, namely principal component analysis (PCA) followed by linear discriminant analysis (LDA) and partial least square discriminant analysis (PLS-DA), with or without variable selection, allowed a very good discrimination among the groups, with PLS-DA coupled to variable selection providing the best results. Due to the fast analysis, high sensitivity, simplicity and no sample preparation, the proposed analytical methodology could be successfully used to verify the cultivar of hazelnuts, and the analysis can be performed quickly and directly on site.
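An illustrative PCA-then-LDA pipeline of the kind used above, with cross-validated accuracy, follows; the spectra, dimensions and labels are synthetic stand-ins for the hazelnut data.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(5)
n_samples, n_wavenumbers, n_cultivars = 90, 600, 3
spectra = rng.standard_normal((n_samples, n_wavenumbers))  # stand-in IR spectra
labels = rng.integers(0, n_cultivars, n_samples)

# PCA compresses the collinear spectral variables before the LDA step,
# which would otherwise be ill-posed with p >> n.
model = make_pipeline(PCA(n_components=10), LinearDiscriminantAnalysis())
scores = cross_val_score(model, spectra, labels, cv=5)
print("cross-validated accuracy:", scores.mean().round(2))
```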
An Updated Review of Meat Authenticity Methods and Applications.
Vlachos, Antonios; Arvanitoyannis, Ioannis S; Tserkezou, Persefoni
2016-05-18
Adulteration of foods is a serious economic problem concerning most foodstuffs, and in particular meat products. Since high-priced meats command premium prices, producers of meat-based products might be tempted to blend these products with lower-cost meat. Moreover, the labeled meat contents may not be met. Both types of adulteration are difficult to detect and lead to deterioration of product quality. For the consumer, it is of utmost importance to guarantee both authenticity and compliance with product labeling. The purpose of this article is to review the state of the art of meat authenticity testing with analytical and immunochemical methods, with a focus on geographic origin and sensory characteristics. This review is also intended to provide an overview of the various currently applied statistical analyses (multivariate analysis (MVA), such as principal component analysis, discriminant analysis, cluster analysis, etc.) and their effectiveness for meat authenticity.
Multi-Centrality Graph Spectral Decompositions and Their Application to Cyber Intrusion Detection
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, Pin-Yu; Choudhury, Sutanay; Hero, Alfred
Many modern datasets can be represented as graphs and hence spectral decompositions such as graph principal component analysis (PCA) can be useful. Distinct from previous graph decomposition approaches based on subspace projection of a single topological feature, e.g., the centered graph adjacency matrix (graph Laplacian), we propose spectral decomposition approaches to graph PCA and graph dictionary learning that integrate multiple features, including graph walk statistics, centrality measures and graph distances to reference nodes. In this paper we propose a new PCA method for single graph analysis, called multi-centrality graph PCA (MC-GPCA), and a new dictionary learning method for ensembles of graphs, called multi-centrality graph dictionary learning (MC-GDL), both based on spectral decomposition of multi-centrality matrices. As an application to cyber intrusion detection, MC-GPCA can be an effective indicator of anomalous connectivity patterns and MC-GDL can provide a discriminative basis for attack classification.
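A hedged sketch of the multi-centrality ingredient follows: stack several per-node centrality measures into a feature matrix and apply PCA to it. This illustrates the building blocks on a random graph, not the authors' exact MC-GPCA algorithm.

```python
import networkx as nx
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

G = nx.erdos_renyi_graph(200, 0.05, seed=0)     # stand-in connectivity graph

# One row per node, one column per centrality measure
features = np.column_stack([
    list(nx.degree_centrality(G).values()),
    list(nx.closeness_centrality(G).values()),
    list(nx.betweenness_centrality(G).values()),
    list(nx.pagerank(G).values()),
])

Z = StandardScaler().fit_transform(features)
scores = PCA(n_components=2).fit_transform(Z)

# Nodes whose scores sit far from the bulk are candidates for anomalous
# connectivity patterns (e.g., possible intrusion activity).
dist = np.linalg.norm(scores - scores.mean(axis=0), axis=1)
suspects = np.argsort(dist)[-5:]
print("most atypical nodes:", suspects)
```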
NASA Astrophysics Data System (ADS)
Lieu, Richard
2018-01-01
A hierarchy of statistics of increasing sophistication and accuracy is proposed, to exploit an interesting and fundamental arithmetic structure in the photon bunching noise of incoherent light of large photon occupation number, with the purpose of suppressing the noise and rendering a more reliable and unbiased measurement of the light intensity. The method does not require any new hardware, rather it operates at the software level, with the help of high precision computers, to reprocess the intensity time series of the incident light to create a new series with smaller bunching noise coherence length. The ultimate accuracy improvement of this method of flux measurement is limited by the timing resolution of the detector and the photon occupation number of the beam (the higher the photon number the better the performance). The principal application is accuracy improvement in the bolometric flux measurement of a radio source.
Classification of human pathogen bacteria for early screening using electronic nose
NASA Astrophysics Data System (ADS)
Zulkifli, Syahida Amani; Mohamad, Che Wan Syarifah Robiah; Abdullah, Abu Hassan
2017-10-01
This paper presents early screening of human pathogen bacteria using an electronic nose. An electronic nose (E-nose), also known as a gas sensor array, is a device that analyzes odor measurements, giving fast responses and reducing the time required for clinical diagnosis. Many bacterial pathogens can lead to life-threatening infections, and accurate and rapid diagnosis is crucial for the successful management of these infectious diseases. Conventional methods need considerable time to detect bacterial growth. Alternatively, bacteria such as Pseudomonas aeruginosa and Shigella cultured on different agar media can be detected and classified according to their volatile compounds in a shorter time using an E-nose. The E-nose data are then processed using a statistical method, principal component analysis (PCA). The study shows the capability of the E-nose for early screening of bacterial infection in the human stomach.
NASA Astrophysics Data System (ADS)
Lipovsky, B.; Funning, G. J.
2009-12-01
We compare several techniques for the analysis of geodetic time series with the ultimate aim of characterizing the physical processes represented therein. We compare three methods for the analysis of these data: Principal Component Analysis (PCA), Non-Linear PCA (NLPCA), and Rotated PCA (RPCA). We evaluate each method by its ability to isolate signals which may be any combination of low amplitude (near noise level), temporally transient, unaccompanied by seismic emissions, and small scale with respect to the spatial domain. PCA is a powerful tool for extracting structure from large datasets, traditionally realized through either the solution of an eigenvalue problem or through iterative methods. PCA is a transformation of the coordinate system of our data such that the new "principal" data axes retain maximal variance and minimal reconstruction error (Pearson, 1901; Hotelling, 1933). RPCA is achieved by an orthogonal transformation of the principal axes determined in PCA. In the analysis of meteorological data sets, RPCA has been seen to overcome domain shape dependencies, correct for sampling errors, and determine principal axes which more closely represent physical processes (e.g., Richman, 1986). NLPCA generalizes PCA such that principal axes are replaced by principal curves (e.g., Hsieh 2004). We achieve NLPCA through an auto-associative feed-forward neural network (Scholz, 2005). We show the geophysical relevance of these techniques by applying each to a synthetic data set. Results are compared by inverting principal axes to determine deformation source parameters. Temporal variability in source parameters, estimated by each method, is also compared.
NASA Astrophysics Data System (ADS)
Newman, Brent D.; Havenor, Kay C.; Longmire, Patrick
2016-06-01
Analysis of groundwater chemistry can yield important insights about subsurface conditions, and provide an alternative and complementary method for characterizing basin hydrogeology, especially in areas where hydraulic data are limited. More specifically, hydrochemical facies have been used for decades to help understand basin flow and transport, and a set of facies were developed for the Roswell Artesian Basin (RAB) in a semi-arid part of New Mexico, USA. The RAB is an important agricultural water source, and is an excellent example of a rechargeable artesian system. However, substantial uncertainties about the RAB hydrogeology and groundwater chemistry exist. The RAB therefore presented a good opportunity to explore hydrochemical facies definition. A set of facies, derived from fingerprint diagrams (a graphical approach), existed as a basis for testing and for comparison to principal components, factor analysis, and cluster analyses (statistical approaches). Geochemical data from over 300 RAB wells in the central basin were examined. The statistical testing of fingerprint-diagram-based facies was useful in terms of quantitatively evaluating differences between facies, and for understanding potential controls on basin groundwater chemistry. This study suggests the presence of three hydrochemical facies in the shallower part of the RAB (mostly unconfined conditions) and three in the deeper artesian system of the RAB. These facies reflect significant spatial differences in chemistry in the basin that are associated with specific stratigraphic intervals as well as structural features. Substantial chemical variability across faults and within fault blocks was also observed.
Attia, Khalid A M; El-Abasawi, Nasr M; El-Olemy, Ahmed; Abdelazim, Ahmed H
2018-03-01
Three UV spectrophotometric methods have been developed for the simultaneous determination of two new Food and Drug Administration-approved drugs, elbasvir (EBV) and grazoprevir (GRV), in their combined pharmaceutical dosage form. These methods include dual wavelength (DW), classical least-squares (CLS), and principal component regression (PCR). To achieve the DW method, two wavelengths were chosen for each drug in a way that ensured the difference in absorbance was zero for the other drug. GRV revealed equal absorbance at 351 and 315 nm, so the differences in absorbance between these wavelengths were measured for the determination of EBV. In the same way, differences in absorbance at 375 and 334.5 nm were measured for the determination of GRV. Alternatively, the CLS and PCR models were applied to the spectra because the simultaneous inclusion of many wavelengths, rather than a single wavelength, greatly increased the precision and predictive ability of the methods. The proposed methods were successfully applied to the assay of these drugs in their pharmaceutical formulation. The obtained results were statistically compared with the manufacturing method. The results conclude that there was no significant difference between the proposed methods and the manufacturing method with respect to accuracy and precision.
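The CLS step can be illustrated in a few lines: estimate pure-component spectra from calibration mixtures of known concentration, then recover the concentrations of an unknown by least squares. The sketch below uses made-up spectra and a two-component mixture standing in for EBV and GRV.

```python
import numpy as np

rng = np.random.default_rng(6)
wavelengths = 200
k_true = rng.random((2, wavelengths))           # made-up pure spectra (EBV, GRV)

C_cal = rng.random((15, 2))                     # known calibration concentrations
A_cal = C_cal @ k_true + 0.01 * rng.standard_normal((15, wavelengths))

# Step 1: estimate pure-component spectra K from the calibration set
# by solving C_cal @ K = A_cal in the least-squares sense.
K_hat, *_ = np.linalg.lstsq(C_cal, A_cal, rcond=None)

# Step 2: predict concentrations of an unknown mixture from its spectrum
# by solving K_hat.T @ c = a.
c_unknown = np.array([0.3, 0.7])
a_unknown = c_unknown @ k_true
c_hat, *_ = np.linalg.lstsq(K_hat.T, a_unknown, rcond=None)
print("estimated concentrations:", c_hat.round(3))
```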
Yang, Jun-Ho; Yoh, Jack J
2018-01-01
A novel technique is reported for separating overlapping latent fingerprints using chemometric approaches that combine laser-induced breakdown spectroscopy (LIBS) and multivariate analysis. The LIBS technique provides the capability of real time analysis and high frequency scanning as well as the data regarding the chemical composition of overlapping latent fingerprints. These spectra offer valuable information for the classification and reconstruction of overlapping latent fingerprints by implementing appropriate statistical multivariate analysis. The current study employs principal component analysis and partial least square methods for the classification of latent fingerprints from the LIBS spectra. This technique was successfully demonstrated through a classification study of four distinct latent fingerprints using classification methods such as soft independent modeling of class analogy (SIMCA) and partial least squares discriminant analysis (PLS-DA). The novel method yielded an accuracy of more than 85% and was proven to be sufficiently robust. Furthermore, through laser scanning analysis at a spatial interval of 125 µm, the overlapping fingerprints were reconstructed as separate two-dimensional forms.
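A minimal PLS-DA sketch of the kind used above follows: PLS regression onto one-hot class indicators, with the largest predicted column taken as the label. The spectra and classes are synthetic stand-ins for the LIBS data, and the SIMCA comparison is not reproduced here.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(9)
n_train, n_test, n_features, n_classes = 120, 40, 300, 4
X_train = rng.standard_normal((n_train, n_features))   # stand-in LIBS spectra
y_train = rng.integers(0, n_classes, n_train)
X_test = rng.standard_normal((n_test, n_features))

Y_onehot = np.eye(n_classes)[y_train]                  # indicator responses
pls = PLSRegression(n_components=8).fit(X_train, Y_onehot)
y_pred = pls.predict(X_test).argmax(axis=1)            # discriminant rule
print("predicted classes:", y_pred[:10])
```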
Statistical error model for a solar electric propulsion thrust subsystem
NASA Technical Reports Server (NTRS)
Bantell, M. H.
1973-01-01
The solar electric propulsion thrust subsystem statistical error model was developed as a tool for investigating the effects of thrust subsystem parameter uncertainties on navigation accuracy. The model is currently being used to evaluate the impact of electric engine parameter uncertainties on navigation system performance for a baseline mission to Encke's Comet in the 1980s. The data given represent the next generation in statistical error modeling for low-thrust applications. Principal improvements include the representation of thrust uncertainties and random process modeling in terms of random parametric variations in the thrust vector process for a multi-engine configuration.
Tessem, May-Britt; Bathen, Tone F; Cejková, Jitka; Midelfart, Anna
2005-03-01
This study was conducted to investigate metabolic changes in aqueous humor from rabbit eyes exposed to either UV-A or -B radiation, by using (1)H nuclear magnetic resonance (NMR) spectroscopy and unsupervised pattern recognition methods. Both eyes of adult albino rabbits were irradiated with UV-A (366 nm, 0.589 J/cm(2)) or UV-B (312 nm, 1.667 J/cm(2)) radiation for 8 minutes, once a day for 5 days. Three days after the last irradiation, samples of aqueous humor were aspirated, and the metabolic profiles analyzed with (1)H NMR spectroscopy. The metabolic concentrations in the exposed and control materials were statistically analyzed and compared, with multivariate methods and one-way ANOVA. UV-B radiation caused statistically significant alterations of betaine, glucose, ascorbate, valine, isoleucine, and formate in the rabbit aqueous humor. By using principal component analysis, the UV-B-irradiated samples were clearly separated from the UV-A-irradiated samples and the control group. No significant metabolic changes were detected in UV-A-irradiated samples. This study demonstrates the potential of using unsupervised pattern recognition methods to extract valuable metabolic information from complex (1)H NMR spectra. UV-B irradiation of rabbit eyes led to significant metabolic changes in the aqueous humor detected 3 days after the last exposure.
Incorporating principal component analysis into air quality model evaluation
The efficacy of standard air quality model evaluation techniques is becoming compromised as the simulation periods continue to lengthen in response to ever increasing computing capacity. Accordingly, the purpose of this paper is to demonstrate a statistical approach called Princi...
Intelligent Transportation Systems (ITS) logical architecture : volume 3 : data dictionary
DOT National Transportation Integrated Search
1982-01-01
A Guide to Reporting Highway Statistics is a principal part of Federal Highway Administration's comprehensive highway information collection effort. This Guide has two objectives: 1) To serve as a reference to the reporting system that the Federal Hi...
Dimensionality of Community Satisfaction.
ERIC Educational Resources Information Center
Williams, R. Gary; Knop, Edward
Using factor analysis (both principal factor solutions and rotated factor solutions) to search for underlying statistical commonality among 12 indicators of community satisfaction in 6 Colorado communities, the research explores various dimensions of community satisfaction that may be important across different communities. The communities under…
Spectral methods in machine learning and new strategies for very large datasets
Belabbas, Mohamed-Ali; Wolfe, Patrick J.
2009-01-01
Spectral methods are of fundamental importance in statistics and machine learning, because they underlie algorithms from classical principal components analysis to more recent approaches that exploit manifold structure. In most cases, the core technical problem can be reduced to computing a low-rank approximation to a positive-definite kernel. For the growing number of applications dealing with very large or high-dimensional datasets, however, the optimal approximation afforded by an exact spectral decomposition is too costly, because its complexity scales as the cube of either the number of training examples or their dimensionality. Motivated by such applications, we present here 2 new algorithms for the approximation of positive-semidefinite kernels, together with error bounds that improve on results in the literature. We approach this problem by seeking to determine, in an efficient manner, the most informative subset of our data relative to the kernel approximation task at hand. This leads to two new strategies based on the Nyström method that are directly applicable to massive datasets. The first of these—based on sampling—leads to a randomized algorithm whereupon the kernel induces a probability distribution on its set of partitions, whereas the latter approach—based on sorting—provides for the selection of a partition in a deterministic way. We detail their numerical implementation and provide simulation results for a variety of representative problems in statistical data analysis, each of which demonstrates the improved performance of our approach relative to existing methods. PMID:19129490
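The sampling-based strategy can be illustrated in a few lines, assuming a Gaussian kernel and uniform landmark sampling; the paper's own algorithms select more informative subsets, so treat this as the baseline Nyström construction they build on.

```python
import numpy as np

rng = np.random.default_rng(7)
n, m = 1000, 100                       # dataset size, landmark subset size
X = rng.standard_normal((n, 5))

def rbf(A, B, gamma=0.5):
    """Gaussian (RBF) kernel matrix between row sets A and B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

idx = rng.choice(n, m, replace=False)  # uniform sampling of landmarks
C = rbf(X, X[idx])                     # n x m cross-kernel
W = C[idx]                             # m x m kernel among landmarks

# Nystrom approximation: K ~ C W^+ C^T, costing O(n m^2) instead of O(n^3)
K_approx = C @ np.linalg.pinv(W) @ C.T

K_exact = rbf(X, X)
err = np.linalg.norm(K_exact - K_approx) / np.linalg.norm(K_exact)
print(f"relative Frobenius error: {err:.3e}")
```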
Li, Pengxiang; Kim, Michelle M; Doshi, Jalpa A
2010-08-20
The Centers for Medicare and Medicaid Services (CMS) has implemented the CMS-Hierarchical Condition Category (CMS-HCC) model to risk adjust Medicare capitation payments. This study intends to assess the performance of the CMS-HCC risk adjustment method and to compare it to the Charlson and Elixhauser comorbidity measures in predicting in-hospital and six-month mortality in Medicare beneficiaries. The study used the 2005-2006 Chronic Condition Data Warehouse (CCW) 5% Medicare files. The primary study sample included all community-dwelling fee-for-service Medicare beneficiaries with a hospital admission between January 1st, 2006 and June 30th, 2006. Additionally, four disease-specific samples consisting of subgroups of patients with principal diagnoses of congestive heart failure (CHF), stroke, diabetes mellitus (DM), and acute myocardial infarction (AMI) were also selected. Four analytic files were generated for each sample by extracting inpatient and/or outpatient claims for each patient. Logistic regressions were used to compare the methods. Model performance was assessed using the c-statistic, the Akaike's information criterion (AIC), the Bayesian information criterion (BIC) and their 95% confidence intervals estimated using bootstrapping. The CMS-HCC had statistically significant higher c-statistic and lower AIC and BIC values than the Charlson and Elixhauser methods in predicting in-hospital and six-month mortality across all samples in analytic files that included claims from the index hospitalization. Exclusion of claims for the index hospitalization generally led to drops in model performance across all methods with the highest drops for the CMS-HCC method. However, the CMS-HCC still performed as well or better than the other two methods. The CMS-HCC method demonstrated better performance relative to the Charlson and Elixhauser methods in predicting in-hospital and six-month mortality. The CMS-HCC model is preferred over the Charlson and Elixhauser methods if information about the patient's diagnoses prior to the index hospitalization is available and used to code the risk adjusters. However, caution should be exercised in studies evaluating inpatient processes of care and where data on pre-index admission diagnoses are unavailable.
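As an illustration of the comparison methodology only (not the CMS-HCC, Charlson, or Elixhauser coding logic, which requires claims data), the following sketch fits two logistic models on synthetic data and reports the c-statistic, AIC and BIC for each.

```python
import numpy as np
import statsmodels.api as sm
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(8)
n = 5000
age = rng.normal(75, 7, n)
score_a = rng.normal(0, 1, n)                 # e.g., a Charlson-style score
score_b = score_a + rng.normal(0, 0.5, n)     # e.g., an HCC-style score
logit = -3 + 0.03 * (age - 75) + 1.0 * score_b
died = rng.random(n) < 1 / (1 + np.exp(-logit))

for name, cols in [("model A", [age, score_a]), ("model B", [age, score_b])]:
    Xd = sm.add_constant(np.column_stack(cols))
    fit = sm.Logit(died.astype(float), Xd).fit(disp=0)
    auc = roc_auc_score(died, fit.predict(Xd))   # the c-statistic
    print(f"{name}: c-statistic={auc:.3f}  AIC={fit.aic:.1f}  BIC={fit.bic:.1f}")
```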
Imamura, Ryota; Murata, Naoki; Shimanouchi, Toshinori; Yamashita, Kaoru; Fukuzawa, Masayuki; Noda, Minoru
2017-01-01
A new fluorescent arrayed biosensor has been developed to discriminate species and concentrations of target proteins by using several different phospholipid liposome species encapsulating fluorescent molecules, utilizing differences in permeation of the fluorescent molecules through the membrane to modulate liposome-target protein interactions. This approach offers an essentially new label-free fluorescent sensor, in contrast to the common technique of fluorescent array sensors that require labeling. We confirmed a high fluorescence emission intensity, related to concentration-dependent characteristics of the fluorescent molecules, when they leak from inside the liposomes through the perturbed lipid membrane. After taking an array image of the fluorescence emission from the sensor using a CMOS imager, the output intensities of the fluorescence were analyzed by the principal component analysis (PCA) statistical method. The PCA plots show that different protein species at several concentrations were successfully discriminated by using the different lipid membranes, with a high cumulative contribution ratio. We also confirmed that the accuracy of discrimination by the array sensor with a single shot is higher than that of a single sensor with multiple shots. PMID:28714873
A first application of independent component analysis to extracting structure from stock returns.
Back, A D; Weigend, A S
1997-08-01
This paper explores the application of a signal processing technique known as independent component analysis (ICA) or blind source separation to multivariate financial time series such as a portfolio of stocks. The key idea of ICA is to linearly map the observed multivariate time series into a new space of statistically independent components (ICs). We apply ICA to three years of daily returns of the 28 largest Japanese stocks and compare the results with those obtained using principal component analysis. The results indicate that the estimated ICs fall into two categories, (i) infrequent large shocks (responsible for the major changes in the stock prices), and (ii) frequent smaller fluctuations (contributing little to the overall level of the stocks). We show that the overall stock price can be reconstructed surprisingly well by using a small number of thresholded weighted ICs. In contrast, when using shocks derived from principal components instead of independent components, the reconstructed price is less similar to the original one. ICA is shown to be a potentially powerful method of analyzing and understanding driving mechanisms in financial time series. The application to portfolio optimization is described in Chin and Weigend (1998).
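A hedged sketch of this kind of decomposition follows, with synthetic heavy-tailed returns standing in for the 28 Japanese stocks; excess kurtosis is used to separate infrequent-large-shock components from frequent small fluctuations.

```python
import numpy as np
from sklearn.decomposition import FastICA, PCA

rng = np.random.default_rng(14)
n_days, n_stocks = 750, 28                       # ~3 years of daily returns
sources = rng.laplace(size=(n_days, n_stocks))   # heavy-tailed shocks
mixing = rng.standard_normal((n_stocks, n_stocks))
returns = sources @ mixing.T

ics = FastICA(n_components=n_stocks, max_iter=1000,
              random_state=0).fit_transform(returns)
pcs = PCA(n_components=n_stocks).fit_transform(returns)

def excess_kurtosis(z):
    """Per-column excess kurtosis; large values flag shock-like components."""
    z = (z - z.mean(0)) / z.std(0)
    return (z ** 4).mean(0) - 3.0

print("ICs kurtosis range:",
      excess_kurtosis(ics).min().round(1), excess_kurtosis(ics).max().round(1))
print("PCs kurtosis range:",
      excess_kurtosis(pcs).min().round(1), excess_kurtosis(pcs).max().round(1))
```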
NASA Astrophysics Data System (ADS)
Ogruc Ildiz, G.; Arslan, M.; Unsalan, O.; Araujo-Andrade, C.; Kurt, E.; Karatepe, H. T.; Yilmaz, A.; Yalcinkaya, O. B.; Herken, H.
2016-01-01
In this study, a methodology based on Fourier-transform infrared spectroscopy combined with principal component analysis and partial least squares methods is proposed for the analysis of blood plasma samples, in order to identify spectral changes correlated with biomarkers associated with schizophrenia and bipolarity. Our main goal was to use the spectral information to calibrate statistical models that discriminate and classify blood plasma samples belonging to bipolar and schizophrenic patients. IR spectra were collected from 30 blood plasma samples from each group: bipolar patients, schizophrenic patients, and a healthy control group. The results obtained from principal component analysis (PCA) show a clear discrimination between the bipolar (BP), schizophrenic (SZ) and control group (CG) blood samples, and also make it possible to identify three main spectral regions that show the major differences correlated with both mental disorders (biomarkers). Furthermore, a model for the classification of the blood samples was calibrated using partial least squares discriminant analysis (PLS-DA), allowing the correct classification of BP, SZ and CG samples. The results obtained with this methodology suggest that it can be used as a complementary diagnostic tool for the detection and discrimination of these mental diseases.
The fine-scale genetic structure and evolution of the Japanese population.
Takeuchi, Fumihiko; Katsuya, Tomohiro; Kimura, Ryosuke; Nabika, Toru; Isomura, Minoru; Ohkubo, Takayoshi; Tabara, Yasuharu; Yamamoto, Ken; Yokota, Mitsuhiro; Liu, Xuanyao; Saw, Woei-Yuh; Mamatyusupu, Dolikun; Yang, Wenjun; Xu, Shuhua; Teo, Yik-Ying; Kato, Norihiro
2017-01-01
The contemporary Japanese populations largely consist of three genetically distinct groups: Hondo, Ryukyu and Ainu. By principal-component analysis, while the three groups can be clearly separated, the Hondo people, comprising 99% of the Japanese, form one almost indistinguishable cluster. To understand fine-scale genetic structure, we applied powerful haplotype-based statistical methods to genome-wide single nucleotide polymorphism data from 1600 Japanese individuals, sampled from eight distinct regions in Japan. We then combined the Japanese data with 26 other Asian populations data to analyze the shared ancestry and genetic differentiation. We found that the Japanese could be separated into nine genetic clusters in our dataset, showing a marked concordance with geography; and that major components of ancestry profile of Japanese were from the Korean and Han Chinese clusters. We also detected and dated admixture in the Japanese. While genetic differentiation between Ryukyu and Hondo was suggested to be caused in part by positive selection, genetic differentiation among the Hondo clusters appeared to result principally from genetic drift. Notably, in Asians, we found the possibility that positive selection accentuated genetic differentiation among distant populations but attenuated genetic differentiation among close populations. These findings are significant for studies of human evolution and medical genetics.
How Many Separable Sources? Model Selection In Independent Components Analysis
Woods, Roger P.; Hansen, Lars Kai; Strother, Stephen
2015-01-01
Unlike mixtures consisting solely of non-Gaussian sources, mixtures including two or more Gaussian components cannot be separated using standard independent components analysis methods that are based on higher order statistics and independent observations. The mixed Independent Components Analysis/Principal Components Analysis (mixed ICA/PCA) model described here accommodates one or more Gaussian components in the independent components analysis model and uses principal components analysis to characterize contributions from this inseparable Gaussian subspace. Information theory can then be used to select from among potential model categories with differing numbers of Gaussian components. Based on simulation studies, the assumptions and approximations underlying the Akaike Information Criterion do not hold in this setting, even with a very large number of observations. Cross-validation is a suitable, though computationally intensive alternative for model selection. Application of the algorithm is illustrated using Fisher's iris data set and Howells' craniometric data set. Mixed ICA/PCA is of potential interest in any field of scientific investigation where the authenticity of blindly separated non-Gaussian sources might otherwise be questionable. Failure of the Akaike Information Criterion in model selection also has relevance in traditional independent components analysis where all sources are assumed non-Gaussian. PMID:25811988
Macro policy responses to oil booms and busts in the United Arab Emirates
DOE Office of Scientific and Technical Information (OSTI.GOV)
Al-Mutawa, A.K.
1991-01-01
The effects of oil shocks and macro policy changes in the United Arab Emirates are analyzed. A theoretical model is developed within the framework of the Dutch Disease literature. It contains four unique features that are applicable to the United Arab Emirates' economy. These are: (1) the presence of a large foreign labor force; (2) OPEC's oil export quotas; (3) the division of oil profits; and (4) the important role of government expenditures. The model is then used to examine the welfare effects of the above-mentioned shocks. An econometric model is then specified that conforms to the analytical model. In the econometric model the method of principal components is applied owing to the undersized sample data. The principal components methodology is used in both the identification testing and the estimation of the structural equations. The oil and macro policy shocks are then simulated. The simulation results show that an oil-quantity boom leads to a higher welfare gain than an oil-price boom. Under certain circumstances, this finding is also confirmed by the comparative statics that follow from the analytical model.
2014-01-01
Background: The occurrence of response shift (RS) in longitudinal health-related quality of life (HRQoL) studies, reflecting patient adaptation to disease, has already been demonstrated. Several methods have been developed to detect the three different types of response shift (RS), i.e. 1) recalibration RS, 2) reprioritization RS, and 3) reconceptualization RS. We investigated two complementary methods that characterize the occurrence of RS: factor analysis, comprising Principal Component Analysis (PCA) and Multiple Correspondence Analysis (MCA), and a method of Item Response Theory (IRT). Methods: Breast cancer patients (n = 381) completed the EORTC QLQ-C30 and EORTC QLQ-BR23 questionnaires at baseline, immediately following surgery, and three and six months after surgery, according to the “then-test/post-test” design. Recalibration was explored using MCA and an IRT model, the Linear Logistic Model with Relaxed Assumptions (LLRA), together with the then-test method. Principal Component Analysis (PCA) was used to explore reconceptualization and reprioritization. Results: MCA highlighted the main profiles of recalibration: patients with a high HRQoL level report a slightly worse HRQoL level retrospectively, and vice versa. The LLRA model indicated a downward or upward recalibration for each dimension. At six months, the recalibration effect was statistically significant for 11/22 dimensions of the QLQ-C30 and BR23 according to the LLRA model (p ≤ 0.001). Regarding the QLQ-C30, PCA indicated a reprioritization of symptom scales and reconceptualization via an increased correlation between functional scales. Conclusions: Our findings demonstrate the usefulness of these analyses in characterizing the occurrence of RS. MCA and the IRT model gave results convergent with the then-test method in characterizing the recalibration component of RS. PCA is an indirect method for investigating the reprioritization and reconceptualization components of RS. PMID:24606836
ERIC Educational Resources Information Center
Hsiao, Hsi-Chi; Lee, Ming-Chao; Tu, Ya-Ling
2013-01-01
Deregulation has formed the primary core of education reform in Taiwan in the past decade. The principal selection system was one of the specific recommendations in the deregulation of education. The method of designation of senior high school principals has changed from being "appointed" to being "selected." The issue as to…
ERIC Educational Resources Information Center
Taylor, Rosemarye T.; Pelletier, Kelly; Trimble, Todd; Ruiz, Eddie
2014-01-01
The purpose of these three parallel mixed method studies was to measure the effectiveness of an urban school district's 2011 Preparing New Principals Program (PNPP). Results supported the premise that preparing principals for school leadership in 2013 must develop them as instructional leaders who can improve teacher performance and student…
ERIC Educational Resources Information Center
Oplatka, Izhar
2010-01-01
Purpose: To fill the gap in theoretical and empirical knowledge on late career in principalship, the aim of this study was to explore the career experiences, needs, and behaviors of principals at this stage. Research method: Life history and semistructured interviews were conducted with 20 late-career principals, 20 schoolteachers, and 10…
ERIC Educational Resources Information Center
Catano, Nancy; Stronge, James H.
2007-01-01
This study used both quantitative and qualitative methods of content analysis to examine principal evaluation instruments and state and professional standards for principals in school districts located in a mid-Atlantic state in the USA. The purposes of this study were to (a) determine the degrees of emphasis that are placed upon leadership and…
Hegde, Satisha; Hegde, Harsha Vasudev; Jalalpure, Sunil Satyappa; Peram, Malleswara Rao; Pai, Sandeep Ramachandra; Roy, Subarna
2017-01-01
Saraca asoca (Roxb.) De Wilde (Ashoka) is a highly valued endangered medicinal tree species from the Western Ghats of India. Besides treating cardiac and circulatory problems, S. asoca provides immense relief in gynecological disorders. High price and demand, in contrast to the smaller population size of the plant, have motivated adulteration with other plants such as Polyalthia longifolia (Sonnerat) Thwaites. The fundamental concerns in quality control of S. asoca arise from the plant part used medicinally (the bark) and its chemical composition. Phytochemical fingerprinting with proper selection of analytical markers is a promising method for addressing quality control issues. In the present study, high-performance liquid chromatography of phenolic compounds (gallic acid, catechin, and epicatechin) coupled to multivariate analysis was used. Five samples each of S. asoca and P. longifolia from two localities, alongside five commercial market samples, showed evidence of adulteration. Subsequently, multivariate hierarchical cluster analysis and principal component analysis were used to discriminate the adulterants of S. asoca. The proposed method ascertains identification of S. asoca from its putative adulterant P. longifolia and commercial market samples. The data generated may also serve as baseline data to form a quality standard for pharmacopoeias. SUMMARY: Simultaneous quantification of gallic acid, catechin, and epicatechin from Saraca asoca by high-performance liquid chromatography; detection of S. asoca from adulterant and commercial samples; use of an analytical method along with a statistical tool for addressing quality issues. Abbreviations used: HPLC: High Performance Liquid Chromatography; RP-HPLC: Reverse Phase High Performance Liquid Chromatography; CAT: Catechin; EPI: Epicatechin; GA: Gallic acid; PCA: Principal Component Analysis. PMID:28808391
An approach for quantitative image quality analysis for CT
NASA Astrophysics Data System (ADS)
Rahimi, Amir; Cochran, Joe; Mooney, Doug; Regensburger, Joe
2016-03-01
An objective and standardized approach to assess image quality of Computed Tomography (CT) systems is required in a wide variety of imaging processes to identify CT systems appropriate for a given application. We present an overview of the framework we have developed to help standardize and objectively assess CT image quality for different models of CT scanners used for security applications. Within this framework, we have developed methods to quantitatively measure metrics that should correlate with feature identification, detection accuracy and precision, and image registration capabilities of CT machines, and to identify strengths and weaknesses in different CT imaging technologies in transportation security. To that end we have designed, developed and constructed phantoms that allow for systematic and repeatable measurements of roughly 88 image quality metrics, representing modulation transfer function, noise equivalent quanta, noise power spectra, slice sensitivity profiles, streak artifacts, CT number uniformity, CT number consistency, object length accuracy, CT number path length consistency, and object registration. Furthermore, we have developed a sophisticated MATLAB-based image analysis tool kit to analyze CT-generated images of phantoms and report these metrics in a format that is standardized across the considered models of CT scanners, allowing for comparative image quality analysis within a CT model or between different CT models. In addition, we have developed a modified sparse principal component analysis (SPCA) method that generates, compared with standard principal component analysis (PCA), a modified set of components with sparse loadings, used in conjunction with the Hotelling T2 statistical analysis method to compare, qualify, and detect faults in the tested systems.
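A hedged sketch of the PCA plus Hotelling T2 fault-detection idea follows; standard PCA is used here, whereas the paper uses a modified sparse PCA, and the reference data are synthetic stand-ins for the 88 image quality metrics.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(10)
n_ref, n_metrics = 200, 88
reference = rng.standard_normal((n_ref, n_metrics))   # baseline IQ metrics

k = 5
pca = PCA(n_components=k).fit(reference)
scores = pca.transform(reference)
var = scores.var(axis=0, ddof=1)

def hotelling_t2(x):
    """T^2 statistic of a new metric vector against the reference model."""
    t = pca.transform(x.reshape(1, -1)).ravel()
    return float((t ** 2 / var).sum())

# A scanner whose metrics give a T^2 far above the reference distribution
# is flagged as deviating from the baseline image quality model.
t2_ref = np.array([hotelling_t2(row) for row in reference])
threshold = np.percentile(t2_ref, 99)
print("99th percentile T^2 threshold:", round(threshold, 1))
```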
THESEUS: maximum likelihood superpositioning and analysis of macromolecular structures
Theobald, Douglas L.; Wuttke, Deborah S.
2008-01-01
THESEUS is a command line program for performing maximum likelihood (ML) superpositions and analysis of macromolecular structures. While conventional superpositioning methods use ordinary least-squares (LS) as the optimization criterion, ML superpositions provide substantially improved accuracy by down-weighting variable structural regions and by correcting for correlations among atoms. ML superpositioning is robust and insensitive to the specific atoms included in the analysis, and thus it does not require subjective pruning of selected variable atomic coordinates. Output includes both likelihood-based and frequentist statistics for accurate evaluation of the adequacy of a superposition and for reliable analysis of structural similarities and differences. THESEUS performs principal components analysis for analyzing the complex correlations found among atoms within a structural ensemble. PMID:16777907
Regional flow simulation in fractured aquifers using stress-dependent parameters.
Preisig, Giona; Joel Cornaton, Fabien; Perrochet, Pierre
2012-01-01
A model function relating effective stress to fracture permeability is developed from Hooke's law, implemented in the tensorial form of Darcy's law, and used to evaluate discharge rates and pressure distributions at regional scales. The model takes into account elastic and statistical fracture parameters, and is able to simulate real stress-dependent permeabilities measured in laboratory and field studies. This modeling approach is more phenomenological than classical ones because the permeability tensors may vary in both strength and principal directions according to effective stresses. Moreover, this method allows evaluation of fracture porosity changes, which are then translated into consolidation of the medium. © 2011, The Author(s). Ground Water © 2011, National Ground Water Association.
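The paper's exact model function is not reproduced here; as an illustration of how a stress-dependent permeability tensor could be evaluated, the sketch below uses a common exponential stress-permeability law with hypothetical parameter values.

```python
# Illustrative sketch only: a common exponential law k(sigma') stands in for
# the paper's Hooke's-law-derived model, to show a permeability tensor whose
# strength varies with effective stress in each principal direction.
import numpy as np

def permeability(sigma_eff, k0=1e-12, alpha=5e-8):
    """Fracture permeability (m^2) decaying with effective normal stress (Pa)."""
    return k0 * np.exp(-alpha * sigma_eff)

# principal effective stresses acting on three orthogonal fracture sets
sigma = np.array([2e6, 5e6, 8e6])          # Pa
K = np.diag(permeability(sigma))           # diagonal permeability tensor
print(K)                                   # each direction weakens differently
```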
Modulated Hebb-Oja learning rule--a method for principal subspace analysis.
Jankovic, Marko V; Ogawa, Hidemitsu
2006-03-01
This paper presents an analysis of the recently proposed modulated Hebb-Oja (MHO) method, which performs a linear mapping to a lower-dimensional principal component subspace. Compared to some other well-known methods for extracting the principal component subspace (e.g., Oja's Subspace Learning Algorithm), the proposed method has one feature that could be seen as desirable from the biological point of view: the synaptic efficacy learning rule does not need explicit information about the values of the other efficacies to make an individual efficacy modification. In addition, the simplicity of the "neural circuits" that perform global computations, and the fact that their number does not depend on the number of input and output neurons, can be seen as good features of the proposed method.
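For context, here is a sketch of Oja's Subspace Learning Algorithm, the baseline the MHO rule is compared against (the MHO update itself is not reproduced here); the input distribution is synthetic.

```python
# Hedged sketch: Oja's subspace rule. W converges to an (approximately
# orthonormal) basis of the principal subspace of the input distribution.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 6)) @ np.diag([3.0, 2.0, 1.0, .3, .2, .1])

W = rng.normal(scale=0.1, size=(6, 2))     # 6 inputs -> 2 outputs
eta = 1e-3
for x in X:
    y = W.T @ x                            # output activations
    W += eta * np.outer(x - W @ y, y)      # Oja subspace update
print(W.T @ W)                             # close to the identity matrix
```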
Salvatore, Stefania; Bramness, Jørgen Gustav; Reid, Malcolm J; Thomas, Kevin Victor; Harman, Christopher; Røislien, Jo
2015-01-01
Wastewater-based epidemiology (WBE) is a new methodology for estimating the drug load in a population. Simple summary statistics and specification tests have typically been used to analyze WBE data, comparing differences between weekday and weekend loads. Such standard statistical methods may, however, overlook important nuanced information in the data. In this study, we apply functional data analysis (FDA) to WBE data and compare the results to those obtained from more traditional summary measures. We analysed temporal WBE data from 42 European cities, using sewage samples collected daily for one week in March 2013. For each city, the main temporal features of two selected drugs were extracted using functional principal component (FPC) analysis, along with simpler measures such as the area under the curve (AUC). The individual cities' scores on each of the temporal FPCs were then used as outcome variables in multiple linear regression analysis with various city and country characteristics as predictors. The results were compared to those of functional analysis of variance (FANOVA). The first three FPCs explained more than 99% of the temporal variation. The first component (FPC1) represented the level of the drug load, while the second and third temporal components represented the level and the timing of a weekend peak. AUC was highly correlated with FPC1, but other temporal characteristics were not captured by the simple summary measures. FANOVA was less flexible than the FPCA-based regression, though it showed concordant results. Geographical location was the main predictor for the general level of the drug load. FDA of WBE data extracts more detailed information about drug load patterns during the week that is not identified by more traditional statistical methods. The results also suggest that regression based on FPC results is a valuable addition to FANOVA for estimating associations between temporal patterns and covariate information.
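A minimal sketch of the FPCA step, assuming one 7-day curve per city and taking FPCs from the eigendecomposition of the sample covariance; all data are synthetic stand-ins for the 42-city measurements.

```python
# Minimal sketch of functional PCA on weekly drug-load curves.
import numpy as np

rng = np.random.default_rng(2)
days = 7
base = 10 + np.sin(np.linspace(0, np.pi, days))          # weekly shape
curves = base + rng.normal(scale=1.0, size=(42, days))   # 42 cities x 7 days

mean_curve = curves.mean(axis=0)
C = np.cov(curves - mean_curve, rowvar=False)            # 7 x 7 covariance
evals, evecs = np.linalg.eigh(C)
order = np.argsort(evals)[::-1]
fpc = evecs[:, order]                                    # FPC loading curves
scores = (curves - mean_curve) @ fpc                     # city scores per FPC
print((evals[order] / evals.sum())[:3].round(3))         # variance of FPC1-3
```

The city scores produced this way are what would feed the multiple linear regression on city and country characteristics.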
McGrory, Ellen R; Brown, Colin; Bargary, Norma; Williams, Natalya Hunter; Mannix, Anthony; Zhang, Chaosheng; Henry, Tiernan; Daly, Eve; Nicholas, Sarah; Petrunic, Barbara M; Lee, Monica; Morrison, Liam
2017-02-01
The presence of arsenic in groundwater has become a global concern due to the health risks of drinking water with elevated concentrations. The Water Framework Directive (WFD) of the European Union calls for drinking water risk assessment by member states. The present study amalgamates readily available national and sub-national scale datasets on arsenic in groundwater in the Republic of Ireland. However, due to the high levels of left censoring (i.e. arsenic values below an analytical detection limit) and changes in detection limits over time, conventional statistical methods would not yield meaningful results. To handle these issues, several arsenic databases were integrated and the data modelled using statistical methods appropriate for non-detect data. In addition, geostatistical methods were used to assess principal risk components of elevated arsenic related to lithology, aquifer type and groundwater vulnerability. Geographic statistical methods were used to overcome some of the geographical limitations of the Irish Environmental Protection Agency (EPA) sample database; nearest-neighbour inverse distance weighting (IDW) and local indicator of spatial association (LISA) methods were used to estimate risk in non-sampled areas. Significant differences were noted between aquifer lithologies, indicating that Rhyolite, Sandstone and Shale (Greywackes), and Impure Limestone potentially present a greater risk of elevated arsenic in groundwaters. Significant differences also occurred among aquifer types, with poorly productive aquifers, locally important fractured bedrock aquifers and regionally important fissured bedrock aquifers presenting the highest potential risk of elevated arsenic. No significant differences were detected among the groundwater vulnerability groups defined by the Geological Survey of Ireland. This research will assist management and future policy directions for groundwater resources at the EU level, and will guide future research on arsenic mobilisation processes as well as the development, testing and treatment requirements of groundwater resources. Copyright © 2016 Elsevier B.V. All rights reserved.
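A sketch of the nearest-neighbour IDW step mentioned above; well coordinates and concentrations are hypothetical, and this is only a generic implementation, not the study's code.

```python
# Hedged sketch: nearest-neighbour inverse distance weighting (IDW) to
# estimate arsenic concentration at an unsampled location.
import numpy as np

def idw(xy_obs, z_obs, xy_new, power=2, k=5):
    """Estimate from the k closest observations, weighted by 1/distance^power."""
    d = np.linalg.norm(xy_obs - xy_new, axis=1)
    nearest = np.argsort(d)[:k]
    w = 1.0 / np.maximum(d[nearest], 1e-12) ** power
    return np.sum(w * z_obs[nearest]) / np.sum(w)

xy = np.array([[0., 0.], [1., 0.], [0., 1.], [2., 2.], [1., 2.]])
As = np.array([2.0, 7.5, 4.0, 12.0, 9.0])      # ug/L at sampled wells
print(idw(xy, As, np.array([0.8, 0.9])))       # estimate at a new point
```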
Applications of "Integrated Data Viewer'' (IDV) in the classroom
NASA Astrophysics Data System (ADS)
Nogueira, R.; Cutrim, E. M.
2006-06-01
Conventionally, weather products utilized in synoptic meteorology reduce phenomena occurring in four dimensions to a 2-dimensional form. This constitutes a roadblock for non-atmospheric-science majors who need to take meteorology as a non-mathematical course complementary to their major programs. This research examines the use of the Integrated Data Viewer (IDV) as a teaching tool, as it allows a 4-dimensional representation of weather products. IDV was tested in the teaching of synoptic meteorology, weather analysis, and weather map interpretation to non-science students in the laboratory sessions of an introductory meteorology class at Western Michigan University. Student exam scores were compared between the two laboratory teaching techniques (traditional lab manual versus IDV) for both short- and long-term learning. Results of the statistical analysis show that the Fall 2004 students in the IDV-based lab session retained learning. In Spring 2005, however, the exam scores did not reflect retention of learning when IDV-based and manual-based lab scores were compared (short-term learning, i.e., an exam taken one week after the lab exercise). Testing of long-term learning, with seven weeks between the two exams in Spring 2005, showed no statistically significant difference between the IDV-based and manual-based group scores, although the IDV group obtained a slightly higher exam score average. Statistical testing of the principal hypothesis in this study leads to the conclusion that the IDV-based method did not prove to be a better teaching tool than the traditional paper-based method. Future studies could potentially find significant differences in the effectiveness of the two methods under more controlled conditions; in particular, students in the control group should not be exposed to weather analysis using IDV during lecture.
Stumpe, B; Engel, T; Steinweg, B; Marschner, B
2012-04-03
In the past, different slag materials were often used for landscaping and construction purposes or simply dumped. Nowadays German environmental laws strictly control the use of slags, but a remaining 35% is still dumped in landfills without control. Since some slags have high heavy metal contents, and different slag types have characteristic chemical and physical properties that influence the risk potential and other characteristics of the deposits, identification of the slag types is needed. We developed an FT-IR-based statistical method to identify different slag classes. Slag samples were collected at different sites throughout various cities within the industrial Ruhr area. Spectra of 35 samples from four different slag classes, ladle furnace (LF), blast furnace (BF), oxygen furnace steel (OF), and zinc furnace slags (ZF), were determined in the mid-infrared region (4000-400 cm(-1)). The spectral data sets were subjected to statistical classification methods to separate the spectra of the different slag classes. Principal component analysis (PCA) models for each slag class were developed and further used for soft independent modeling of class analogy (SIMCA). Precise classification of slag samples into the four slag classes was achieved by using two different SIMCA models stepwise. First, SIMCA 1 was used for classification of ZF and OF slags over the total spectral range. If no correct classification was found, the spectrum was analyzed with SIMCA 2 at reduced wavenumbers for the classification of LF and BF spectra. As a result, we provide a time- and cost-efficient method based on FT-IR spectroscopy for processing and identifying large numbers of environmental slag samples.
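A minimal sketch of SIMCA-style classification as described above: one PCA model per class, with a new spectrum assigned to the class whose model reconstructs it with the smallest residual. The spectra here are random stand-ins, and only two classes are shown.

```python
# Hedged sketch of SIMCA: per-class PCA models plus residual-based assignment.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(3)
classes = {"LF": rng.normal(0, 1, (9, 300)),
           "BF": rng.normal(2, 1, (9, 300))}   # 9 training spectra per class

models = {}
for name, spectra in classes.items():
    mean = spectra.mean(axis=0)
    pca = PCA(n_components=3).fit(spectra - mean)
    models[name] = (mean, pca)

def simca_assign(spectrum):
    residuals = {}
    for name, (mean, pca) in models.items():
        t = pca.transform((spectrum - mean)[None])
        recon = pca.inverse_transform(t)[0] + mean
        residuals[name] = np.sum((spectrum - recon) ** 2)
    return min(residuals, key=residuals.get)

print(simca_assign(rng.normal(2, 1, 300)))     # expected: "BF"
```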
NASA Astrophysics Data System (ADS)
Li, Xiaohui; Yang, Sibo; Fan, Rongwei; Yu, Xin; Chen, Deying
2018-06-01
In this paper, discrimination of soft tissues using laser-induced breakdown spectroscopy (LIBS) in combination with multivariate statistical methods is presented. Fresh pork fat, skin, ham, loin and tenderloin muscle tissues are manually cut into slices and ablated using a 1064 nm pulsed Nd:YAG laser. Discrimination analyses between fat, skin and muscle tissues, and further between highly similar ham, loin and tenderloin muscle tissues, are performed based on the LIBS spectra in combination with multivariate statistical methods, including principal component analysis (PCA), k nearest neighbors (kNN) classification, and support vector machine (SVM) classification. Performances of the discrimination models, including accuracy, sensitivity and specificity, are evaluated using 10-fold cross validation. The classification models are optimized to achieve the best discrimination performances. The fat, skin and muscle tissues can be definitely discriminated using both kNN and SVM classifiers, with accuracy of over 99.83%, sensitivity of over 0.995 and specificity of over 0.998. The highly similar ham, loin and tenderloin muscle tissues can also be discriminated with acceptable performances. The best performances are achieved with an SVM classifier using a Gaussian kernel function, with accuracy of 76.84%, sensitivity of over 0.742 and specificity of over 0.869. The results show that the LIBS technique combined with multivariate statistical methods could be a powerful tool for online discrimination of soft tissues, even for tissues of high similarity, such as muscles from different parts of the animal body. This technique could be used for discrimination of tissues with minor clinical changes, and may thus advance the diagnosis of early lesions and abnormalities.
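A minimal sketch of this kind of pipeline (PCA features, Gaussian-kernel SVM, 10-fold cross-validation); the spectra and labels below are synthetic stand-ins for LIBS measurements, not the authors' data.

```python
# Minimal sketch: PCA + RBF-kernel SVM with 10-fold cross-validation.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(4)
X = np.vstack([rng.normal(m, 1, (30, 200)) for m in (0, 1, 2)])  # 3 tissues
y = np.repeat(["fat", "skin", "muscle"], 30)

pipe = make_pipeline(StandardScaler(), PCA(n_components=10),
                     SVC(kernel="rbf", C=10.0, gamma="scale"))
scores = cross_val_score(pipe, X, y, cv=10)       # 10-fold CV accuracy
print(scores.mean().round(3))
```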
An Empirical Cumulus Parameterization Scheme for a Global Spectral Model
NASA Technical Reports Server (NTRS)
Rajendran, K.; Krishnamurti, T. N.; Misra, V.; Tao, W.-K.
2004-01-01
Realistic vertical heating and drying profiles in a cumulus scheme are important for obtaining accurate weather forecasts. A new empirical cumulus parameterization scheme, based on a procedure to improve the vertical distribution of heating and moistening over the tropics, is developed. The empirical cumulus parameterization scheme (ECPS) utilizes profiles of Tropical Rainfall Measuring Mission (TRMM) based heating and moistening derived from the European Centre for Medium-Range Weather Forecasts (ECMWF) analysis. A dimension reduction technique through rotated principal component analysis (RPCA) is performed on the vertical profiles of heating (Q1) and drying (Q2) over the convective regions of the tropics to obtain the dominant modes of variability. Analysis suggests that most of the variance associated with the observed profiles can be explained by retaining the first three modes. The ECPS then applies a statistical approach in which Q1 and Q2 are expressed as linear combinations of the first three dominant principal components, which distinctly explain variance in the troposphere as a function of the prevalent large-scale dynamics. The principal component (PC) score, which quantifies the contribution of each PC to the corresponding loading profile, is estimated through a multiple screening regression method that yields the PC score as a function of the large-scale variables. The profiles of Q1 and Q2 thus obtained match well with the observed profiles. The impact of the ECPS is investigated in a series of short-range (1-3 day) prediction experiments using the Florida State University global spectral model (FSUGSM, T126L14). Comparisons between short-range ECPS forecasts and those with the modified Kuo scheme show a marked improvement in skill for the ECPS forecasts. This improvement emphasizes the importance of incorporating realistic vertical distributions of heating and drying in the model cumulus scheme, and suggests that, in the absence of explicit models for convection, the proposed statistical scheme improves the modeling of the vertical distribution of heating and moistening in areas of deep convection.
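A sketch of the core statistical idea, under simplifying assumptions: Q1 is expressed as a combination of three leading PCs whose scores are predicted from large-scale variables by ordinary linear regression (the rotation step and the screening regression are omitted, and all data are synthetic).

```python
# Hedged sketch of the ECPS idea: reconstruct a heating profile from three
# dominant PCs with regression-predicted scores. Synthetic data throughout.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(5)
Q1 = rng.normal(size=(500, 14))            # 500 profiles x 14 model levels
L = rng.normal(size=(500, 4))              # large-scale predictors

pca = PCA(n_components=3).fit(Q1)
scores = pca.transform(Q1)                 # PC scores of each profile
reg = LinearRegression().fit(L, scores)    # scores as function of large scale

# reconstruct the heating profile implied by a new large-scale state
Q1_hat = pca.inverse_transform(reg.predict(L[:1]))
print(Q1_hat.shape)
```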
Morphological variation of 508 hatchling alligators from three lakes in north central Florida (Lakes Woodruff, Apopka, and Orange) was analyzed using multivariate statistics. Morphological variation was found among clutches as well as among lakes. Principal components analysis wa...
SPATIAL STATISTICS AND ECONOMETRICS FOR MODELS IN FISHERIES ECONOMICS. (R828012)
SOME STATISTICAL TOOLS FOR EVALUATING COMPUTER SIMULATIONS: A DATA ANALYSIS. (R825381)
NASA Astrophysics Data System (ADS)
Chen, Zhe; Qiu, Zurong; Huo, Xinming; Fan, Yuming; Li, Xinghua
2017-03-01
A fiber-capacitive drop analyzer is an instrument that monitors a growing droplet to produce a capacitive opto-tensiotrace (COT). Each COT is an integration of fiber light intensity signals and capacitance signals and can reflect the unique physicochemical properties of a liquid. In this study, we propose a method for solution identification and concentration quantification based on multivariate statistical methods. Eight characteristic values are extracted from each COT. A series of COT characteristic values of training solutions at different concentrations composes a data library for that kind of solution. A two-stage linear discriminant analysis is applied to the solution libraries to establish discriminant functions, by which test solutions can be identified. After the variety of a test solution is determined, a Spearman correlation test and principal components analysis are used to filter the eight characteristic values and reduce their dimensionality, producing a new representative parameter. A cubic spline interpolation function is built between this parameter and concentration, from which the concentration of the test solution is calculated. Methanol, ethanol, n-propanol, and saline solutions are taken as experimental subjects in this paper. For each solution, nine or ten different concentrations are chosen as the standard library, and the other two concentrations compose the test group. Using the methods described above, all eight test solutions are correctly identified, and the average relative error of the quantitative analysis is 1.11%. The proposed method is feasible; it enlarges the applicable scope of liquid recognition based on the COT and improves the precision of concentration quantification.
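A sketch of the two stages under simplifying assumptions (single-stage LDA instead of the paper's two-stage scheme, and hypothetical feature values and concentrations):

```python
# Hedged sketch: LDA to identify the solution, then a cubic spline mapping a
# representative COT parameter to concentration. All values hypothetical.
import numpy as np
from scipy.interpolate import CubicSpline
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(6)
X = np.vstack([rng.normal(m, 0.2, (10, 8)) for m in (0.0, 1.0, 2.0)])
y = np.repeat(["methanol", "ethanol", "saline"], 10)

lda = LinearDiscriminantAnalysis().fit(X, y)     # stage 1: identify solution
print(lda.predict(rng.normal(1.0, 0.2, (1, 8)))) # expected: "ethanol"

# stage 2: concentration from a representative parameter via spline
conc = np.array([5, 10, 15, 20, 25, 30])          # % (training library)
param = np.array([0.9, 1.7, 2.6, 3.3, 4.2, 4.9])  # monotone COT parameter
spline = CubicSpline(param, conc)
print(float(spline(2.0)))                         # concentration estimate
```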
Differential principal component analysis of ChIP-seq.
Ji, Hongkai; Li, Xia; Wang, Qian-fei; Ning, Yang
2013-04-23
We propose differential principal component analysis (dPCA) for analyzing multiple ChIP-sequencing datasets to identify differential protein-DNA interactions between two biological conditions. dPCA integrates unsupervised pattern discovery, dimension reduction, and statistical inference into a single framework. It uses a small number of principal components to summarize concisely the major multiprotein synergistic differential patterns between the two conditions. For each pattern, it detects and prioritizes differential genomic loci by comparing the between-condition differences with the within-condition variation among replicate samples. dPCA provides a unique tool for efficiently analyzing large amounts of ChIP-sequencing data to study dynamic changes of gene regulation across different biological conditions. We demonstrate this approach through analyses of differential chromatin patterns at transcription factor binding sites and promoters as well as allele-specific protein-DNA interactions.
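As a loose analogue of the dPCA idea (the published method's exact estimation procedure is not reproduced here), the sketch below runs PCA on between-condition signal differences at genomic loci and prioritizes loci against replicate variability; all data are synthetic.

```python
# Hedged sketch, in the spirit of dPCA: PCA on between-condition differences
# of multi-protein ChIP-seq signals, with crude replicate-noise scaling.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(7)
loci, proteins = 1000, 5
cond1 = rng.normal(size=(loci, proteins, 2))   # 2 replicates per condition
cond2 = rng.normal(size=(loci, proteins, 2))

D = cond2.mean(axis=2) - cond1.mean(axis=2)    # loci x proteins differences
pca = PCA(n_components=2).fit(D)
scores = pca.transform(D)                      # differential pattern scores

# prioritize loci: scale PC1 scores by within-condition replicate noise
noise = np.concatenate([cond1, cond2], axis=2).std(axis=2).mean(axis=1)
z = scores[:, 0] / (noise + 1e-9)
print(np.argsort(-np.abs(z))[:10])             # top differential loci for PC1
```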
McKinney, Tim S.; Anning, David W.
2012-01-01
This product "Digital spatial data for observed, predicted, and misclassification errors for observations in the training dataset for nitrate and arsenic concentrations in basin-fill aquifers in the Southwest Principal Aquifers study area" is a 1:250,000-scale point spatial dataset developed as part of a regional Southwest Principal Aquifers (SWPA) study (Anning and others, 2012). The study examined the vulnerability of basin-fill aquifers in the southwestern United States to nitrate contamination and arsenic enrichment. Statistical models were developed by using the random forest classifier algorithm to predict concentrations of nitrate and arsenic across a model grid that represents local- and basin-scale measures of source, aquifer susceptibility, and geochemical conditions.
ERIC Educational Resources Information Center
Akkary, Rima Karami
2014-01-01
This study provides empirical data about the role and work context of the school principal in Lebanon. The study applied grounded theory methods in collecting and analysing the data. The data were collected through a series of open-ended interviews with 53 secondary school principals, and focus group interviews with 8 principals from public as…
Penalized nonparametric scalar-on-function regression via principal coordinates
Reiss, Philip T.; Miller, David L.; Wu, Pei-Shien; Hua, Wen-Yu
2016-01-01
A number of classical approaches to nonparametric regression have recently been extended to the case of functional predictors. This paper introduces a new method of this type, which extends intermediate-rank penalized smoothing to scalar-on-function regression. In the proposed method, which we call principal coordinate ridge regression, one regresses the response on leading principal coordinates defined by a relevant distance among the functional predictors, while applying a ridge penalty. Our publicly available implementation, based on generalized additive modeling software, allows for fast optimal tuning parameter selection and for extensions to multiple functional predictors, exponential family-valued responses, and mixed-effects models. In an application to signature verification data, principal coordinate ridge regression, with dynamic time warping distance used to define the principal coordinates, is shown to outperform a functional generalized linear model. PMID:29217963
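A minimal sketch of the method's two ingredients, assuming Euclidean distances between the functional predictors (the paper uses application-specific distances such as dynamic time warping): classical multidimensional scaling yields the principal coordinates, and a ridge penalty regularizes the regression.

```python
# Hedged sketch: principal coordinate ridge regression. Classical MDS from a
# distance matrix, then ridge regression on the leading coordinates.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(9)
curves = rng.normal(size=(60, 100))              # 60 functional predictors
y = curves[:, :10].mean(axis=1) + rng.normal(scale=0.1, size=60)

D = np.linalg.norm(curves[:, None] - curves[None], axis=2)  # pairwise dist
n = D.shape[0]
J = np.eye(n) - np.ones((n, n)) / n
B = -0.5 * J @ (D ** 2) @ J                      # double-centered Gram matrix
evals, evecs = np.linalg.eigh(B)
idx = np.argsort(evals)[::-1][:5]
coords = evecs[:, idx] * np.sqrt(np.maximum(evals[idx], 0))  # principal coords

ridge = Ridge(alpha=1.0).fit(coords, y)          # penalized regression
print(ridge.score(coords, y).round(3))
```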
New machine-learning algorithms for prediction of Parkinson's disease
NASA Astrophysics Data System (ADS)
Mandal, Indrajit; Sairam, N.
2014-03-01
This article presents enhanced prediction accuracy for the diagnosis of Parkinson's disease (PD), aimed at preventing delayed diagnosis and misdiagnosis of patients, using the proposed robust inference system. New machine-learning methods are proposed, and performance comparisons are based on specificity, sensitivity, accuracy, and other measurable parameters. The robust methods for detecting PD include sparse multinomial logistic regression, rotation forest ensembles with support vector machines and principal components analysis, artificial neural networks, and boosting methods. A new ensemble method, comprising a Bayesian network optimised by a Tabu search algorithm as the classifier and Haar wavelets as the projection filter, is used for relevant feature selection and ranking. The highest accuracy, obtained by linear logistic regression and sparse multinomial logistic regression, is 100%, with sensitivity and specificity of 0.983 and 0.996, respectively. All experiments are conducted at 95% and 99% confidence levels, and the results are established with corrected t-tests. This work shows a high degree of advancement in the software reliability and quality of the computer-aided diagnosis system and experimentally shows best results with supportive statistical inference.
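A minimal sketch of one of the named methods, sparse (L1-penalized) multinomial logistic regression with cross-validated accuracy; the voice-measurement features are random stand-ins for a PD dataset.

```python
# Hedged sketch: sparse multinomial logistic regression with 10-fold CV.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(10)
X = rng.normal(size=(195, 22))              # subjects x voice features
y = rng.integers(0, 2, size=195)            # 1 = PD, 0 = healthy

clf = make_pipeline(StandardScaler(),
                    LogisticRegression(penalty="l1", solver="saga",
                                       C=0.5, max_iter=5000))
print(cross_val_score(clf, X, y, cv=10).mean().round(3))
```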