Effects of preprocessing Landsat MSS data on derived features
NASA Technical Reports Server (NTRS)
Parris, T. M.; Cicone, R. C.
1983-01-01
Important to the use of multitemporal Landsat MSS data for earth resources monitoring, such as agricultural inventories, is the ability to minimize the effects of varying atmospheric and satellite viewing conditions while extracting physically meaningful features from the data. In general, approaches to the preprocessing problem have been derived from either physical or statistical models. This paper compares three proposed algorithms: XSTAR haze correction, Color Normalization, and Multiple Acquisition Mean Level Adjustment. These techniques represent physical, statistical, and hybrid physical-statistical models, respectively. The comparisons are made in the context of three feature extraction techniques: the Tasseled Cap, the Cate Color Cube, and the Normalized Difference.
Hancock, Matthew C.; Magnan, Jerry F.
2016-01-01
In the assessment of nodules in CT scans of the lungs, a number of image-derived features are diagnostically relevant. Currently, many of these features are defined only qualitatively, so they are difficult to quantify from first principles. Nevertheless, these features (through their qualitative definitions and interpretations thereof) are often quantified via a variety of mathematical methods for the purpose of computer-aided diagnosis (CAD). To determine the potential usefulness of quantified diagnostic image features as inputs to a CAD system, we investigate the predictive capability of statistical learning methods for classifying nodule malignancy. We utilize the Lung Image Database Consortium dataset and only employ the radiologist-assigned diagnostic feature values for the lung nodules therein, as well as our derived estimates of the diameter and volume of the nodules from the radiologists' annotations. We calculate theoretical upper bounds on the classification accuracy that are achievable by an ideal classifier that only uses the radiologist-assigned feature values, and we obtain an accuracy of 85.74 (±1.14)%, which is, on average, 4.43% below the theoretical maximum of 90.17%. The corresponding area-under-the-curve (AUC) score is 0.932 (±0.012), which increases to 0.949 (±0.007) when diameter and volume features are included and has an accuracy of 88.08 (±1.11)%. Our results are comparable to those in the literature that use algorithmically derived image-based features, which supports our hypothesis that lung nodules can be classified as malignant or benign using only quantified, diagnostic image features, and indicates the competitiveness of this approach. We also analyze how the classification accuracy depends on specific features and feature subsets, and we rank the features according to their predictive power, statistically demonstrating the top four to be spiculation, lobulation, subtlety, and calcification. PMID:27990453
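As an illustration of the upper-bound idea, a minimal sketch follows: an ideal classifier restricted to the radiologist-assigned features can do no better than predicting the majority label within each group of nodules that share an identical feature vector. The data and function name below are hypothetical; this is our reading of the bound, not the authors' code.

```python
import numpy as np
from collections import Counter, defaultdict

def ideal_accuracy_upper_bound(X, y):
    """Best possible accuracy for any classifier that sees only X:
    nodules with identical (discrete) feature vectors but conflicting
    labels are irreducibly ambiguous, so the optimum is to predict the
    majority label within each group."""
    groups = defaultdict(list)
    for features, label in zip(map(tuple, X), y):
        groups[features].append(label)
    correct = sum(max(Counter(labels).values()) for labels in groups.values())
    return correct / len(y)

# Hypothetical data: nodules 0 and 1 share features but disagree on label.
X = np.array([[5, 1, 3], [5, 1, 3], [2, 4, 1], [1, 1, 1]])
y = np.array([1, 0, 1, 0])
print(ideal_accuracy_upper_bound(X, y))  # 0.75 -- one clash is always misclassified
```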
From creation and annihilation operators to statistics
NASA Astrophysics Data System (ADS)
Hoyuelos, M.
2018-01-01
A procedure to derive the partition function of non-interacting particles with exotic or intermediate statistics is presented. The partition function is directly related to the associated creation and annihilation operators that obey some specific commutation or anti-commutation relations. The cases of Gentile statistics, quons, Polychronakos statistics, and ewkons are considered. Ewkons statistics was recently derived from the assumption of free diffusion in energy space (Hoyuelos and Sisterna, 2016); an ideal gas of ewkons has negative pressure, a feature that makes them suitable for the description of dark energy.
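For orientation, a sketch of the kind of result involved, written out for the best-known intermediate case (Gentile statistics with maximum occupation p per state). This standard textbook formula is added here for context and is not quoted from the paper:

```latex
% Single-mode grand partition function for Gentile statistics,
% with x = e^{-\beta(\epsilon - \mu)} and maximum occupation p:
\begin{align}
  Z_p &= \sum_{n=0}^{p} x^{n} = \frac{1 - x^{p+1}}{1 - x}, \\
  \bar{n}_p &= x\,\partial_x \ln Z_p
         = \frac{1}{x^{-1} - 1} - \frac{p+1}{x^{-(p+1)} - 1}.
\end{align}
% p = 1 recovers Fermi-Dirac; p -> \infty recovers Bose-Einstein.
```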
NASA Astrophysics Data System (ADS)
Irvine, John M.; Ghadar, Nastaran; Duncan, Steve; Floyd, David; O'Dowd, David; Lin, Kristie; Chang, Tom
2017-03-01
Quantitative biomarkers for assessing the presence, severity, and progression of age-related macular degeneration (AMD) would benefit research, diagnosis, and treatment. This paper explores development of quantitative biomarkers derived from OCT imagery of the retina. OCT images for approximately 75 patients with Wet AMD, Dry AMD, and no AMD (healthy eyes) were analyzed to identify image features indicative of the patients' conditions. OCT image features provide a statistical characterization of the retina. Healthy eyes exhibit a layered structure, whereas chaotic patterns indicate the deterioration associated with AMD. Our approach uses wavelet and Frangi filtering, combined with statistical features that do not rely on image segmentation, to assess patient conditions. Classification analysis indicates clear separability of Wet AMD from other conditions, including Dry AMD and healthy retinas. The probability of correct classification was 95.7%, as determined from cross-validation. Similar classification analysis predicts the response of Wet AMD patients to treatment, as measured by the Best Corrected Visual Acuity (BCVA). A statistical model predicts BCVA from the imagery features with R² = 0.846. Initial analysis of OCT imagery indicates that imagery-derived features can provide useful biomarkers for characterization and quantification of AMD: accurate assessment of Wet AMD compared to other conditions; image-based prediction of outcome for Wet AMD treatment; and accurate prediction of BCVA. Unlike many methods in the literature, our techniques do not rely on segmentation of the OCT image. Next steps include larger scale testing and validation.
Schizophrenia classification using functional network features
NASA Astrophysics Data System (ADS)
Rish, Irina; Cecchi, Guillermo A.; Heuton, Kyle
2012-03-01
This paper focuses on discovering statistical biomarkers (features) that are predictive of schizophrenia, with a particular focus on topological properties of fMRI functional networks. We consider several network properties, such as node (voxel) strength, clustering coefficients, local efficiency, as well as just a subset of pairwise correlations. While all types of features demonstrate highly significant statistical differences in several brain areas, and close to 80% classification accuracy, the most remarkable results of 93% accuracy are achieved by using a small subset of only a dozen of the most informative (lowest p-value) correlation features. Our results suggest that voxel-level correlations and functional network features derived from them are highly informative about schizophrenia and can be used as statistical biomarkers for the disease.
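A schematic of this pipeline on toy data (not the study's fMRI). Note that feature selection is shown unnested for brevity; in a real analysis it must be nested inside cross-validation to avoid leakage:

```python
import numpy as np
from scipy import stats
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def correlation_features(ts):            # ts: (time, voxels)
    c = np.corrcoef(ts.T)                # functional network: voxel-voxel correlations
    return c[np.triu_indices_from(c, k=1)]

def node_strength(ts):                   # one of the other network properties considered
    c = np.abs(np.corrcoef(ts.T))
    np.fill_diagonal(c, 0.0)
    return c.sum(axis=1)

rng = np.random.default_rng(0)
# Toy cohort: 20 subjects, 10-voxel time series each; labels are synthetic.
X = np.array([correlation_features(rng.normal(size=(50, 10))) for _ in range(20)])
y = np.array([0] * 10 + [1] * 10)
_, p = stats.ttest_ind(X[y == 0], X[y == 1], axis=0)
top = np.argsort(p)[:12]                 # a dozen lowest p-value correlation features
print(cross_val_score(SVC(kernel="linear"), X[:, top], y, cv=5).mean())
```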
Blind image quality assessment based on aesthetic and statistical quality-aware features
NASA Astrophysics Data System (ADS)
Jenadeleh, Mohsen; Masaeli, Mohammad Masood; Moghaddam, Mohsen Ebrahimi
2017-07-01
The main goal of image quality assessment (IQA) methods is the emulation of human perceptual image quality judgments. Therefore, the correlation between objective scores of these methods and human perceptual scores is considered their performance metric. Human judgment of image quality implicitly includes many factors, such as aesthetics, semantics, context, and various types of visual distortions. The main idea of this paper is to use a host of features that are commonly employed in image aesthetics assessment in order to improve the accuracy of blind image quality assessment (BIQA) methods. We propose an approach that enriches the features of BIQA methods by integrating a host of aesthetics image features with the features of natural image statistics derived from multiple domains. The proposed features have been used for augmenting five different state-of-the-art BIQA methods, which use natural scene statistics features. Experiments were performed on seven benchmark image quality databases. The experimental results showed significant improvement of the accuracy of the methods.
1992-12-23
predominance of structural models of recognition, of which a recent example is the Recognition By Components (RBC) theory (Biederman, 1987). Structural... related to recent statistical theory (Huber, 1985; Friedman, 1987) and is derived from a biologically motivated computational theory (Bienenstock et... dimensional object recognition (Intrator and Gold, 1991). The method is related to recent statistical theory (Huber, 1985; Friedman, 1987) and is derived
ERIC Educational Resources Information Center
Mauriello, David
1984-01-01
Reviews an interactive statistical analysis package (designed to run on 8- and 16-bit machines that utilize CP/M 80 and MS-DOS operating systems), considering its features and uses, documentation, operation, and performance. The package consists of 40 general purpose statistical procedures derived from the classic textbook "Statistical…
Nketiah, Gabriel; Elschot, Mattijs; Kim, Eugene; Teruel, Jose R; Scheenen, Tom W; Bathen, Tone F; Selnæs, Kirsten M
2017-07-01
To evaluate the diagnostic relevance of T2-weighted (T2W) MRI-derived textural features relative to quantitative physiological parameters derived from diffusion-weighted (DW) and dynamic contrast-enhanced (DCE) MRI in Gleason score (GS) 3+4 and 4+3 prostate cancers. 3T multiparametric MRI was performed on 23 prostate cancer patients prior to prostatectomy. Textural features [angular second moment (ASM), contrast, correlation, entropy], apparent diffusion coefficient (ADC), and DCE pharmacokinetic parameters (Ktrans and Ve) were calculated from index tumours delineated on the T2W, DW, and DCE images, respectively. The association between the textural features and prostatectomy GS and the MRI-derived parameters, and the utility of the parameters in differentiating between GS 3+4 and 4+3 prostate cancers, were assessed statistically. ASM and entropy correlated significantly (p < 0.05) with both GS and median ADC. Contrast correlated moderately with median ADC. The textural features correlated insignificantly with Ktrans and Ve. GS 4+3 cancers had significantly lower ASM and higher entropy than 3+4 cancers, but insignificant differences in median ADC, Ktrans, and Ve. The combined texture-MRI parameters yielded higher classification accuracy (91%) than the individual parameter sets. T2W MRI-derived textural features could serve as potential diagnostic markers, sensitive to the pathological differences in prostate cancers. • T2W MRI-derived textural features correlate significantly with Gleason score and ADC. • T2W MRI-derived textural features differentiate Gleason score 3+4 from 4+3 cancers. • T2W image textural features could augment tumour characterization.
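A sketch of how the four listed textural features can be computed from a T2W ROI with gray-level co-occurrence matrices, using scikit-image. The quantization level and distance/angle choices are our assumptions, not the paper's exact settings:

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops  # skimage >= 0.19 spelling

def t2w_texture_features(roi, levels=32):
    """ASM, contrast, correlation, and entropy from a quantized 2-D ROI."""
    q = np.digitize(roi, np.linspace(roi.min(), roi.max(), levels)) - 1
    glcm = graycomatrix(q.astype(np.uint8), distances=[1],
                        angles=[0, np.pi/4, np.pi/2, 3*np.pi/4],
                        levels=levels, symmetric=True, normed=True)
    p = glcm.mean(axis=3)[..., 0]                    # average over the four angles
    entropy = -np.sum(p * np.log2(p, where=p > 0, out=np.zeros_like(p)))
    return {"ASM": graycoprops(glcm, "ASM").mean(),
            "contrast": graycoprops(glcm, "contrast").mean(),
            "correlation": graycoprops(glcm, "correlation").mean(),
            "entropy": entropy}

print(t2w_texture_features(np.random.rand(64, 64)))  # toy ROI stand-in
```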
Detection of Tampering Inconsistencies on Mobile Photos
NASA Astrophysics Data System (ADS)
Cao, Hong; Kot, Alex C.
Fast proliferation of mobile cameras and the deteriorating trust in digital images have created a need to determine the integrity of photos captured by mobile devices. As tampering often creates some inconsistencies, we propose in this paper a novel framework to statistically detect image tampering inconsistencies using accurately detected demosaicing weights features. By first cropping four non-overlapping blocks, each from one of the four quadrants in the mobile photo, we extract a set of demosaicing weights features from each block based on a partial derivative correlation model. Through regularizing the eigenspectrum of the within-photo covariance matrix and performing eigenfeature transformation, we further derive a compact set of eigen demosaicing weights features, which are sensitive to image signal mixing from different photo sources. A metric is then proposed to quantify the inconsistency based on the eigen weights features among the blocks cropped from different regions of the mobile photo. Through comparison, we show our eigen weights features perform better than the eigen features extracted from several other conventional sets of statistical forensics features in detecting the presence of tampering. Experimentally, our method shows a good confidence in tampering detection, especially when one of the four cropped blocks is from a different camera model or brand with a different demosaicing process.
A computational visual saliency model based on statistics and machine learning.
Lin, Ru-Je; Lin, Wei-Song
2014-08-01
Identifying the type of stimuli that attracts human visual attention has been an appealing topic for scientists for many years. In particular, marking the salient regions in images is useful for both psychologists and many computer vision applications. In this paper, we propose a computational approach for producing saliency maps using statistics and machine learning methods. Based on four assumptions, three properties (Feature-Prior, Position-Prior, and Feature-Distribution) can be derived and combined by a simple intersection operation to obtain a saliency map. These properties are implemented by a similarity computation, support vector regression (SVR) technique, statistical analysis of training samples, and information theory using low-level features. This technique is able to learn the preferences of human visual behavior while simultaneously considering feature uniqueness. Experimental results show that our approach performs better in predicting human visual attention regions than 12 other models in two test databases.
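A minimal sketch of the combination step, assuming the three property maps are already computed and reading the "simple intersection operation" as an elementwise product (our interpretation, not necessarily the authors' exact operator):

```python
import numpy as np

def combine_saliency(feature_prior, position_prior, feature_distribution):
    """Combine the three property maps by elementwise intersection,
    then normalize the result to [0, 1]."""
    s = feature_prior * position_prior * feature_distribution
    return (s - s.min()) / (s.max() - s.min() + 1e-12)

# Hypothetical maps: a centre-weighted position prior and two random feature maps.
h = w = 64
yy, xx = np.mgrid[0:h, 0:w]
position = np.exp(-((yy - h / 2) ** 2 + (xx - w / 2) ** 2) / (2 * (h / 4) ** 2))
rng = np.random.default_rng(1)
saliency = combine_saliency(rng.random((h, w)), position, rng.random((h, w)))
print(saliency.shape, saliency.max())
```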
AbdelRahman, Samir E; Zhang, Mingyuan; Bray, Bruce E; Kawamoto, Kensaku
2014-05-27
The aim of this study was to propose an analytical approach to develop high-performing predictive models for congestive heart failure (CHF) readmission using an operational dataset with incomplete records and changing data over time. Our analytical approach involves three steps: pre-processing, systematic model development, and risk factor analysis. For pre-processing, variables that were absent in >50% of records were removed. Moreover, the dataset was divided into a validation dataset and derivation datasets, which were separated into three temporal subsets based on changes to the data over time. For systematic model development, using the different temporal datasets and the remaining explanatory variables, the models were developed by combining the use of various (i) statistical analyses to explore the relationships between the validation and the derivation datasets; (ii) adjustment methods for handling missing values; (iii) classifiers; (iv) feature selection methods; and (v) discretization methods. We then selected the best derivation dataset and the models with the highest predictive performance. For risk factor analysis, factors in the highest-performing predictive models were analyzed and ranked using (i) statistical analyses of the best derivation dataset, (ii) feature rankers, and (iii) a newly developed algorithm to categorize risk factors as being strong, regular, or weak. The analysis dataset consisted of 2,787 CHF hospitalizations at University of Utah Health Care from January 2003 to June 2013. In this study, we used the complete-case analysis and mean-based imputation adjustment methods; the wrapper subset feature selection method; and four ranking strategies based on information gain, gain ratio, symmetrical uncertainty, and wrapper subset feature evaluators. The best-performing models resulted from the use of a complete-case analysis derivation dataset combined with the Class-Attribute Contingency Coefficient discretization method and a voting classifier which averaged the results of multinomial logistic regression and voting feature intervals classifiers. Of 42 final model risk factors, discharge disposition, discretized age, and indicators of anemia were the most significant. This model achieved a c-statistic of 86.8%. The proposed three-step analytical approach enhanced predictive model performance for CHF readmissions. It could potentially be leveraged to improve predictive model performance in other areas of clinical medicine.
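A pandas sketch of the pre-processing stage under stated assumptions: the >50%-missing rule and temporal splitting come from the abstract, but the equal-thirds split and all column names are hypothetical (the paper split on observed changes to the data over time):

```python
import pandas as pd

def preprocess(df, time_col="admit_date", missing_thresh=0.5):
    """Drop variables absent in more than half the records, then split the
    remaining records into three temporal derivation subsets."""
    keep = df.columns[df.isna().mean() <= missing_thresh]
    df = df[keep].sort_values(time_col)
    cut = [len(df) // 3, 2 * len(df) // 3]
    return df.iloc[:cut[0]], df.iloc[cut[0]:cut[1]], df.iloc[cut[1]:]

# Toy example: 'ef' is missing in 5 of 9 records, so it is removed.
demo = pd.DataFrame({"admit_date": pd.date_range("2003-01-01", periods=9, freq="YS"),
                     "ef": [35, None, None, None, None, None, 30, 25, 45],
                     "age": range(60, 69)})
d1, d2, d3 = preprocess(demo)
print(len(d1), len(d2), len(d3), list(d1.columns))
```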
Feature maps driven no-reference image quality prediction of authentically distorted images
NASA Astrophysics Data System (ADS)
Ghadiyaram, Deepti; Bovik, Alan C.
2015-03-01
Current blind image quality prediction models rely on benchmark databases comprised of singly and synthetically distorted images, thereby learning image features that are only adequate to predict human perceived visual quality on such inauthentic distortions. However, real world images often contain complex mixtures of multiple distortions. Rather than a) discounting the effect of these mixtures of distortions on an image's perceptual quality and considering only the dominant distortion or b) using features that are only proven to be efficient for singly distorted images, we deeply study the natural scene statistics of authentically distorted images, in different color spaces and transform domains. We propose a feature-maps-driven statistical approach which avoids any latent assumptions about the type of distortion(s) contained in an image, and focuses instead on modeling the remarkable consistencies in the scene statistics of real world images in the absence of distortions. We design a deep belief network that takes model-based statistical image features derived from a very large database of authentically distorted images as input and discovers good feature representations by generalizing over different distortion types, mixtures, and severities, which are later used to learn a regressor for quality prediction. We demonstrate the remarkable competence of our features for improving automatic perceptual quality prediction on a benchmark database and on the newly designed LIVE Authentic Image Quality Challenge Database and show that our approach of combining robust statistical features and the deep belief network dramatically outperforms the state-of-the-art.
No-reference image quality assessment based on statistics of convolution feature maps
NASA Astrophysics Data System (ADS)
Lv, Xiaoxin; Qin, Min; Chen, Xiaohui; Wei, Guo
2018-04-01
We propose a Convolutional Feature Maps (CFM) driven approach to accurately predict image quality. Our motivation is based on the finding that the Natural Scene Statistics (NSS) features of convolution feature maps are significantly sensitive to the distortion degree of an image. In our method, a Convolutional Neural Network (CNN) is trained to obtain kernels for generating CFM. We design a forward NSS layer which operates on CFM to better extract NSS features. The quality-aware features derived from the output of the NSS layer are effective for describing the distortion type and degree an image has suffered. Finally, a Support Vector Regression (SVR) is employed in our No-Reference Image Quality Assessment (NR-IQA) model to predict a subjective quality score for a distorted image. Experiments conducted on two public databases demonstrate that the performance of the proposed method is competitive with state-of-the-art NR-IQA methods.
A bootstrap based Neyman-Pearson test for identifying variable importance.
Ditzler, Gregory; Polikar, Robi; Rosen, Gail
2015-04-01
Selection of the most informative features that lead to a small loss on future data is arguably one of the most important steps in classification, data analysis, and model selection. Several feature selection (FS) algorithms are available; however, due to noise present in any data set, FS algorithms are typically accompanied by an appropriate cross-validation scheme. In this brief, we propose a statistical hypothesis test derived from the Neyman-Pearson lemma for determining if a feature is statistically relevant. The proposed approach can be applied as a wrapper to any FS algorithm, regardless of the FS criteria used by that algorithm, to determine whether a feature belongs in the relevant set. Perhaps more importantly, this procedure efficiently determines the number of relevant features given an initial starting point. We provide freely available software implementations of the proposed methodology.
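One way to read the wrapper test, sketched below: under the null hypothesis that a feature is irrelevant, its selection count across bootstrap runs of any FS algorithm is binomial, and the Neyman-Pearson criterion fixes the critical count for a chosen false-positive rate. All counts and parameters here are hypothetical, and this is our simplified reading rather than the authors' implementation:

```python
import numpy as np
from scipy.stats import binom

def np_relevance_test(selection_counts, n_bootstraps, k, n_features, alpha=0.01):
    """Flag features whose selection frequency exceeds the critical count
    for the null 'an irrelevant feature enters a size-k subset of
    n_features purely by chance, i.e. with probability k / n_features'."""
    p0 = k / n_features
    crit = binom.isf(alpha, n_bootstraps, p0)   # critical count under H0
    return np.asarray(selection_counts) > crit

# Hypothetical: how often each of 5 features was picked in 100 bootstrap FS runs.
print(np_relevance_test([97, 88, 23, 15, 4], n_bootstraps=100, k=2, n_features=5))
```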
A Guideline to Univariate Statistical Analysis for LC/MS-Based Untargeted Metabolomics-Derived Data
Vinaixa, Maria; Samino, Sara; Saez, Isabel; Duran, Jordi; Guinovart, Joan J.; Yanes, Oscar
2012-01-01
Several metabolomic software programs provide methods for peak picking, retention time alignment and quantification of metabolite features in LC/MS-based metabolomics. Statistical analysis, however, is needed in order to discover those features significantly altered between samples. By comparing the retention time and MS/MS data of a model compound to that from the altered feature of interest in the research sample, metabolites can then be unequivocally identified. This paper reports on a comprehensive overview of a workflow for statistical analysis to rank relevant metabolite features that will be selected for further MS/MS experiments. We focus on univariate data analysis applied in parallel on all detected features. Characteristics and challenges of this analysis are discussed and illustrated using four different real LC/MS untargeted metabolomic datasets. We demonstrate the influence of considering or violating the mathematical assumptions on which univariate statistical tests rely, using high-dimensional LC/MS datasets. Issues in data analysis such as determination of sample size, analytical variation, assumption of normality and homoscedasticity, or correction for multiple testing are discussed and illustrated in the context of our four untargeted LC/MS working examples. PMID:24957762
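A compact instantiation of such a univariate workflow (Welch t-tests applied in parallel across features, with Benjamini-Hochberg correction for multiple testing; the paper also covers non-parametric alternatives when normality or homoscedasticity fails, which this sketch omits):

```python
import numpy as np
from scipy import stats
from statsmodels.stats.multitest import multipletests

def univariate_rank(case, control, alpha=0.05):
    """Per-feature Welch t-test, then FDR correction across all features."""
    _, p = stats.ttest_ind(case, control, axis=0, equal_var=False)
    reject, q, _, _ = multipletests(p, alpha=alpha, method="fdr_bh")
    order = np.argsort(q)
    return order, q[order], reject[order]

rng = np.random.default_rng(0)
control = rng.normal(size=(12, 500))      # 12 samples x 500 metabolite features
case = rng.normal(size=(12, 500))
case[:, :10] += 2.0                       # ten truly altered features
order, q, rej = univariate_rank(case, control)
print(order[:10], int(rej.sum()))         # the altered features should rank first
```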
Sepehrband, Farshid; Lynch, Kirsten M; Cabeen, Ryan P; Gonzalez-Zacarias, Clio; Zhao, Lu; D'Arcy, Mike; Kesselman, Carl; Herting, Megan M; Dinov, Ivo D; Toga, Arthur W; Clark, Kristi A
2018-05-15
Exploring neuroanatomical sex differences using a multivariate statistical learning approach can yield insights that cannot be derived with univariate analysis. While gross differences in total brain volume are well-established, uncovering the more subtle, regional sex-related differences in neuroanatomy requires a multivariate approach that can accurately model spatial complexity as well as the interactions between neuroanatomical features. Here, we developed a multivariate statistical learning model using a support vector machine (SVM) classifier to predict sex from MRI-derived regional neuroanatomical features from a single-site study of 967 healthy youth from the Philadelphia Neurodevelopmental Cohort (PNC). Then, we validated the multivariate model on an independent dataset of 682 healthy youth from the multi-site Pediatric Imaging, Neurocognition and Genetics (PING) cohort study. The trained model exhibited an 83% cross-validated prediction accuracy, and correctly predicted the sex of 77% of the subjects from the independent multi-site dataset. Results showed that cortical thickness of the middle occipital lobes and the angular gyri are major predictors of sex. Results also demonstrated the inferential benefits of going beyond classical regression approaches to capture the interactions among brain features in order to better characterize sex differences in male and female youths. We also identified specific cortical morphological measures and parcellation techniques, such as cortical thickness as derived from the Destrieux atlas, that are better able to discriminate between males and females in comparison to other brain atlases (Desikan-Killiany, Brodmann and subcortical atlases).
Predicting axillary lymph node metastasis from kinetic statistics of DCE-MRI breast images
NASA Astrophysics Data System (ADS)
Ashraf, Ahmed B.; Lin, Lilie; Gavenonis, Sara C.; Mies, Carolyn; Xanthopoulos, Eric; Kontos, Despina
2012-03-01
The presence of axillary lymph node metastases is the most important prognostic factor in breast cancer and can influence the selection of adjuvant therapy, both chemotherapy and radiotherapy. In this work we present a set of kinetic statistics derived from DCE-MRI for predicting axillary node status. Breast DCE-MRI images from 69 women with known nodal status were analyzed retrospectively under HIPAA and IRB approval. Axillary lymph nodes were positive in 12 patients, while 57 patients had no axillary lymph node involvement. Kinetic curves for each pixel were computed and a pixel-wise map of time-to-peak (TTP) was obtained. Pixels were first partitioned according to the similarity of their kinetic behavior, based on TTP values. For every kinetic curve, the following pixel-wise features were computed: peak enhancement (PE), wash-in slope (WIS), and wash-out slope (WOS). Partition-wise statistics for every feature map were calculated, resulting in a total of 21 kinetic statistic features. ANOVA was done to select features that differ significantly between node-positive and node-negative women. Using the computed kinetic statistic features, a leave-one-out SVM classifier was learned that achieves an area under the ROC curve of AUC = 0.77, outperforming the conventional kinetic measures, including maximum peak enhancement (MPE) and signal enhancement ratio (SER) (AUCs of 0.61 and 0.57, respectively). These findings suggest that our DCE-MRI kinetic statistic features can be used to improve the prediction of axillary node status in breast cancer patients. Such features could ultimately be used as imaging biomarkers to guide personalized treatment choices for women diagnosed with breast cancer.
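A sketch of the pixel-wise kinetic maps named above (TTP, PE, WIS, WOS). The secant-slope definitions of wash-in and wash-out are our assumption; the abstract does not spell out the exact estimators:

```python
import numpy as np

def kinetic_maps(series, t):
    """Pixel-wise kinetic features from a DCE-MRI time series.

    series: (T, H, W) post-contrast signal; t: (T,) acquisition times."""
    peak_idx = series.argmax(axis=0)
    ttp = t[peak_idx]                                  # time-to-peak map
    pe = series.max(axis=0) - series[0]                # peak enhancement over baseline
    wis = np.where(ttp > t[0], pe / np.maximum(ttp - t[0], 1e-9), 0.0)
    wos = (series[-1] - series.max(axis=0)) / (t[-1] - ttp + 1e-9)
    return ttp, pe, wis, wos

rng = np.random.default_rng(2)
sig = np.cumsum(rng.random((8, 32, 32)), axis=0)       # toy enhancing curves
ttp, pe, wis, wos = kinetic_maps(sig, np.arange(8) * 30.0)
print(ttp.mean(), pe.mean())
```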
Douglas, Pamela K.; Lau, Edward; Anderson, Ariana; Head, Austin; Kerr, Wesley; Wollner, Margalit; Moyer, Daniel; Li, Wei; Durnhofer, Mike; Bramen, Jennifer; Cohen, Mark S.
2013-01-01
The complex task of assessing the veracity of a statement is thought to activate uniquely distributed brain regions based on whether a subject believes or disbelieves a given assertion. In the current work, we present parallel machine learning methods for predicting a subject's decision response to a given propositional statement based on independent component (IC) features derived from EEG and fMRI data. Our results demonstrate that IC features outperformed features derived from event-related spectral perturbations in any single spectral band, yet were similar in accuracy to all spectral bands combined. We compared our diagnostic IC spatial maps with our conventional general linear model (GLM) results, and found that informative ICs had significant spatial overlap with our GLM results, yet also revealed unique regions like the amygdala that were not statistically significant in GLM analyses. Overall, these results suggest that ICs may yield a parsimonious feature set that can be used along with a decision tree structure for interpretation of features used in classifying complex cognitive processes such as belief and disbelief across both fMRI and EEG neuroimaging modalities. PMID:23914164
Quantitative topographic differentiation of the neonatal EEG.
Paul, Karel; Krajca, Vladimír; Roth, Zdenek; Melichar, Jan; Petránek, Svojmil
2006-09-01
To test the discriminatory topographic potential of a new method of automatic EEG analysis in neonates. A quantitative description of the neonatal EEG can contribute to the objective assessment of the functional state of the brain, and may improve the precision of diagnosing cerebral dysfunctions manifested by 'disorganization', 'dysrhythmia' or 'dysmaturity'. 21 healthy, full-term newborns were examined polygraphically during sleep (EEG-8 referential derivations, respiration, ECG, EOG, EMG). From each EEG record, two 5-min samples (one from the middle of quiet sleep, the other from the middle of active sleep) were subjected to automatic analysis and were described by 13 variables: spectral features and features describing the shape and variability of the signal. The data from individual infants were averaged and the number of variables was reduced by factor analysis. All factors identified by factor analysis were statistically significantly influenced by the location of derivation. A large number of statistically significant differences were also established when comparing the effects of individual derivations on each of the 13 measured variables. Both spectral features and features describing the shape and variability of the signal are largely accountable for the topographic differentiation of the neonatal EEG. The presented method of automatic EEG analysis is capable of assessing the topographic characteristics of the neonatal EEG; it is adequately sensitive and describes the neonatal electroencephalogram with sufficient precision. The discriminatory capability of the method is promising for application in clinical practice.
The application of automatic recognition techniques in the Apollo 9 SO-65 experiment
NASA Technical Reports Server (NTRS)
Macdonald, R. B.
1970-01-01
A synoptic feature analysis of Apollo 9 remote earth-surface photographs is reported that uses methods of statistical pattern recognition to classify density points and clusterings in digital conversions of the optical data. A computer-derived geological map of a geological test site indicates that geological features of the range are separable, but that specific rock types are not identifiable.
CT Image Sequence Analysis for Object Recognition - A Rule-Based 3-D Computer Vision System
Dongping Zhu; Richard W. Conners; Daniel L. Schmoldt; Philip A. Araman
1991-01-01
Research is now underway to create a vision system for hardwood log inspection using a knowledge-based approach. In this paper, we present a rule-based, 3-D vision system for locating and identifying wood defects using topological, geometric, and statistical attributes. A number of different features can be derived from the 3-D input scenes. These features and evidence...
From Loss of Memory to Poisson.
ERIC Educational Resources Information Center
Johnson, Bruce R.
1983-01-01
A way of presenting the Poisson process and deriving the Poisson distribution for upper-division courses in probability or mathematical statistics is presented. The main feature of the approach lies in the formulation of Poisson postulates with immediate intuitive appeal. (MNS)
Statistical comparisons of gravity wave features derived from OH airglow and SABER data
NASA Astrophysics Data System (ADS)
Gelinas, L. J.; Hecht, J. H.; Walterscheid, R. L.
2017-12-01
The Aerospace Corporation's near-IR camera (ANI), deployed at Andes Lidar Observatory (ALO), Cerro Pachon, Chile (30° S, 70° W) since 2010, images the bright OH Meinel (4,2) airglow band. The imager provides detailed observations of gravity waves and instability dynamics, as described by Hecht et al. (2014). The camera employs a wide-angle lens that views a 73 by 73 degree region of the sky, approximately 120 km x 120 km at 85 km altitude. An image cadence of 30 s allows for detailed spectral analysis of the horizontal components of wave features, including the evolution and decay of instability features. The SABER instrument on NASA's TIMED spacecraft provides remote soundings of kinetic temperature profiles from the lower stratosphere to the lower thermosphere. Horizontal and vertical filtering techniques allow SABER temperatures to be analyzed for gravity wave variances [Walterscheid and Christensen, 2016]. Here we compare the statistical characteristics of horizontal wave spectra, derived from airglow imagery, with vertical wave variances derived from SABER temperature profiles. The analysis is performed for a period of strong mountain wave activity over the Andes spanning the period between June and September 2012. Hecht, J. H., et al. (2014), The life cycle of instability features measured from the Andes Lidar Observatory over Cerro Pachon on March 24, 2012, J. Geophys. Res. Atmos., 119, 8872-8898, doi:10.1002/2014JD021726. Walterscheid, R. L., and A. B. Christensen (2016), Low-latitude gravity wave variances in the mesosphere and lower thermosphere derived from SABER temperature observation and compared with model simulation of waves generated by deep tropical convection, J. Geophys. Res. Atmos., 121, 11,900-11,912, doi:10.1002/2016JD024843.
Tunali, Ilke; Stringfield, Olya; Guvenis, Albert; Wang, Hua; Liu, Ying; Balagurunathan, Yoganand; Lambin, Philippe; Gillies, Robert J; Schabath, Matthew B
2017-11-10
The goal of this study was to extract features from radial deviation and radial gradient maps which were derived from thoracic CT scans of patients diagnosed with lung adenocarcinoma and assess whether these features are associated with overall survival. We used two independent cohorts from different institutions for training (n = 61) and test (n = 47) and focused our analyses on features that were non-redundant and highly reproducible. To reduce the number of features and covariates into a single parsimonious model, a backward elimination approach was applied. Out of 48 features that were extracted, 31 were eliminated because they were not reproducible or were redundant. We considered 17 features for statistical analysis and identified a final model containing the two most highly informative features that were associated with lung cancer survival. One of the two features, radial deviation outside-border separation standard deviation, was replicated in a test cohort exhibiting a statistically significant association with lung cancer survival (multivariable hazard ratio = 0.40; 95% confidence interval 0.17-0.97). Additionally, we explored the biological underpinnings of these features and found radial gradient and radial deviation image features were significantly associated with semantic radiological features.
Natural image statistics and low-complexity feature selection.
Vasconcelos, Manuela; Vasconcelos, Nuno
2009-02-01
Low-complexity feature selection is analyzed in the context of visual recognition. It is hypothesized that high-order dependences of bandpass features contain little information for discrimination of natural images. This hypothesis is characterized formally by the introduction of the concepts of conjunctive interference and decomposability order of a feature set. Necessary and sufficient conditions for the feasibility of low-complexity feature selection are then derived in terms of these concepts. It is shown that the intrinsic complexity of feature selection is determined by the decomposability order of the feature set and not its dimension. Feature selection algorithms are then derived for all levels of complexity and are shown to be approximated by existing information-theoretic methods, which they consistently outperform. The new algorithms are also used to objectively test the hypothesis of low decomposability order through comparison of classification performance. It is shown that, for image classification, the gain of modeling feature dependencies has strongly diminishing returns: best results are obtained under the assumption of decomposability order 1. This suggests a generic law for bandpass features extracted from natural images: that the effect, on the dependence of any two features, of observing any other feature is constant across image classes.
NASA Astrophysics Data System (ADS)
He, Ting; Fan, Ming; Zhang, Peng; Li, Hui; Zhang, Juan; Shao, Guoliang; Li, Lihua
2018-03-01
Breast cancer can be classified into four molecular subtypes: Luminal A, Luminal B, HER2, and Basal-like, which have significant differences in treatment and survival outcomes. In this study, we aim to predict immunohistochemistry (IHC)-determined molecular subtypes of breast cancer using image features derived from the tumor and peritumoral stroma regions based on diffusion-weighted imaging (DWI). A dataset of 126 breast cancer patients who underwent preoperative breast MRI with a 3T scanner was collected. The apparent diffusion coefficients (ADCs) were recorded from DWI, and the breast image was segmented into regions comprising the tumor and the surrounding stroma. Statistical characteristics in various breast tumor and peritumoral regions were computed, including the mean, minimum, maximum, variance, interquartile range, range, skewness, and kurtosis of ADC values. Additionally, the differences in features between each pair of regions were also calculated. A univariate logistic classifier was used to evaluate the performance of individual features for discriminating subtypes. For multi-class classification, a multivariate logistic regression model was trained and validated. The results showed that features derived from the tumor boundary and proximal peritumoral stroma region have a higher performance in classification compared to those of the other regions. Furthermore, the prediction model using statistical features, difference features, and all the features combined from these regions generated AUC values of 0.774, 0.796, and 0.811, respectively. The results in this study indicate that ADC features in the tumor and peritumoral stromal regions would be valuable for estimating the molecular subtype in breast cancer.
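The eight per-region ADC statistics and the region-difference features listed above can be sketched as follows (the region masks and toy ADC map are hypothetical):

```python
import numpy as np
from scipy import stats

def adc_region_stats(adc, mask):
    """The eight per-region statistics from the abstract, computed over
    an ADC map restricted by a binary region mask."""
    v = adc[mask > 0]
    q1, q3 = np.percentile(v, [25, 75])
    return {"mean": v.mean(), "min": v.min(), "max": v.max(),
            "variance": v.var(), "iqr": q3 - q1, "range": v.max() - v.min(),
            "skewness": stats.skew(v), "kurtosis": stats.kurtosis(v)}

def region_differences(adc, mask_a, mask_b):
    """Difference features between two regions (e.g. tumour vs. stroma)."""
    a, b = adc_region_stats(adc, mask_a), adc_region_stats(adc, mask_b)
    return {k: a[k] - b[k] for k in a}

adc = np.random.rand(64, 64) * 3e-3                 # toy ADC map (mm^2/s)
tumour = np.zeros((64, 64)); tumour[20:40, 20:40] = 1
stroma = np.zeros((64, 64)); stroma[10:20, 10:50] = 1
print(region_differences(adc, tumour, stroma)["mean"])
```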
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ma, C; Yin, Y
2014-06-01
Purpose: The aim of this study was to explore characteristics derived from 18F-fluorodeoxyglucose (18F-FDG) PET images and assess their capacity for staging of esophageal squamous cell carcinoma (ESCC). Methods: 26 patients with newly diagnosed ESCC who underwent an 18F-FDG PET scan were included in this study. Different image-derived indices, including the standardized uptake value (SUV), gross tumor length, texture features, and shape feature, were considered. Taking the histopathologic examination as the gold standard, the capacities of the extracted indices for staging of ESCC were assessed by the Kruskal-Wallis test and Mann-Whitney test. Specificity and sensitivity for each of the studied parameters were derived using receiver-operating characteristic curves. Results: 18F-FDG SUVmax and SUVmean showed statistically significant capability for AJCC and TNM staging. Texture features such as ENT and CORR were significant factors for N stage (p=0.040, p=0.029). Both FDG PET longitudinal length and the shape feature Eccentricity (EC) (p≤0.010) provided more powerful stratification of primary ESCC AJCC and TNM stages than SUV and texture features. Receiver-operating-characteristic curve analysis showed that tumor textural analysis can identify M stage with higher sensitivity than SUV measurement, but with lower sensitivity for T and N stages. Conclusion: The 18F-FDG image-derived characteristics of SUV, textural features, and shape feature allow for good stratification of AJCC and TNM stage in ESCC patients.
Creation of a virtual cutaneous tissue bank
NASA Astrophysics Data System (ADS)
LaFramboise, William A.; Shah, Sujal; Hoy, R. W.; Letbetter, D.; Petrosko, P.; Vennare, R.; Johnson, Peter C.
2000-04-01
Cellular and non-cellular constituents of skin contain fundamental morphometric features and structural patterns that correlate with tissue function. High-resolution digital image acquisitions are performed using an automated system and proprietary software to assemble adjacent images and create a contiguous, lossless digital representation of individual microscope slide specimens. Serial extraction, evaluation, and statistical analysis of cutaneous features are performed utilizing an automated analysis system to derive normal cutaneous parameters comprising essential structural skin components. Automated digital cutaneous analysis allows for fast extraction of microanatomic data with accuracy approximating manual measurement. The process provides rapid assessment of features both within individual specimens and across sample populations. The images, component data, and statistical analysis comprise a bioinformatics database to serve as an architectural blueprint for skin tissue engineering and as a diagnostic standard of comparison for pathologic specimens.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liu, Richen; Guo, Hanqi; Yuan, Xiaoru
Most of the existing approaches to visualizing vector field ensembles reveal the uncertainty of individual variables, for example, statistics and variability. However, a user-defined derived feature like a vortex or an air mass is also quite significant, since these make more sense to domain scientists. In this paper, we present a new framework to extract user-defined derived features from different simulation runs. Specifically, we use a detail-to-overview searching scheme to help extract vortices with a user-defined shape. We further compute geometry information including the size and the geo-spatial location of the extracted vortices. We also design some linked views to compare them between different runs. Finally, temporal information such as the occurrence time of the feature is estimated and compared. Results show that our method is capable of extracting the features across different runs and comparing them spatially and temporally.
Derivative pricing with non-linear Fokker-Planck dynamics
NASA Astrophysics Data System (ADS)
Michael, Fredrick; Johnson, M. D.
2003-06-01
We examine how the Black-Scholes derivative pricing formula is modified when the underlying security obeys non-extensive statistics and Fokker-Planck dynamics. An unusual feature of such securities is that the volatility in the underlying Ito-Langevin equation depends implicitly on the actual market rate of return. This complicates most approaches to valuation. Here we show that progress is possible using variations of the Cox-Ross valuation technique.
Electromagnetic sinc Schell-model beams and their statistical properties.
Mei, Zhangrong; Mao, Yonghua
2014-09-22
A class of electromagnetic sources with sinc Schell-model correlations is introduced. The conditions on source parameters guaranteeing that the source generates a physical beam are derived. The evolution of statistical properties of the electromagnetic stochastic beams generated by this new source, propagating in free space and in atmospheric turbulence, is investigated with the help of the weighted-superposition method and by numerical simulations. It is demonstrated that the intensity distributions of such beams exhibit unique features on propagation in free space and produce a double-layer flat-top profile that is shape-invariant in the far field. This feature makes the new beam particularly suitable for some special laser-processing applications. The influences of atmospheric turbulence with a non-Kolmogorov power spectrum on the statistical properties of the new beams are analyzed in detail.
NASA Astrophysics Data System (ADS)
Nagarajan, Mahesh B.; Checefsky, Walter A.; Abidin, Anas Z.; Tsai, Halley; Wang, Xixi; Hobbs, Susan K.; Bauer, Jan S.; Baum, Thomas; Wismüller, Axel
2015-03-01
While the proximal femur is preferred for measuring bone mineral density (BMD) in fracture risk estimation, the introduction of volumetric quantitative computed tomography has revealed stronger associations between BMD and spinal fracture status. In this study, we propose to capture properties of trabecular bone structure in spinal vertebrae with advanced second-order statistical features for purposes of fracture risk assessment. For this purpose, axial multi-detector CT (MDCT) images were acquired from 28 spinal vertebrae specimens using a whole-body 256-row CT scanner with a dedicated calibration phantom. A semi-automated method was used to annotate the trabecular compartment in the central vertebral slice with a circular region of interest (ROI) to exclude cortical bone; pixels within were converted to values indicative of BMD. Six second-order statistical features derived from gray-level co-occurrence matrices (GLCM) and the mean BMD within the ROI were then extracted and used in conjunction with a generalized radial basis functions (GRBF) neural network to predict the failure load of the specimens; true failure load was measured through biomechanical testing. Prediction performance was evaluated with a root-mean-square error (RMSE) metric. The best prediction performance was observed with the GLCM feature 'correlation' (RMSE = 1.02 ± 0.18), which significantly outperformed all other GLCM features (p < 0.01). GLCM feature correlation also significantly outperformed MDCT-measured mean BMD (RMSE = 1.11 ± 0.17) (p < 10^-4). These results suggest that biomechanical strength prediction in spinal vertebrae can be significantly improved through characterization of trabecular bone structure with GLCM-derived texture features.
NASA Astrophysics Data System (ADS)
Huang, Haiping
2017-05-01
Revealing hidden features in unlabeled data is called unsupervised feature learning, which plays an important role in pretraining a deep neural network. Here we provide a statistical mechanics analysis of the unsupervised learning in a restricted Boltzmann machine with binary synapses. A message passing equation to infer the hidden feature is derived, and furthermore, variants of this equation are analyzed. A statistical analysis by replica theory describes the thermodynamic properties of the model. Our analysis confirms an entropy crisis preceding the non-convergence of the message passing equation, suggesting a discontinuous phase transition as a key characteristic of the restricted Boltzmann machine. Continuous phase transition is also confirmed depending on the embedded feature strength in the data. The mean-field result under the replica symmetric assumption agrees with that obtained by running message passing algorithms on single instances of finite sizes. Interestingly, in an approximate Hopfield model, the entropy crisis is absent, and a continuous phase transition is observed instead. We also develop an iterative equation to infer the hyper-parameter (temperature) hidden in the data, which in physics corresponds to iteratively imposing the Nishimori condition. Our study provides insights towards understanding the thermodynamic properties of restricted Boltzmann machine learning and, moreover, an important theoretical basis for building simplified deep networks.
Soguero-Ruiz, Cristina; Hindberg, Kristian; Rojo-Alvarez, Jose Luis; Skrovseth, Stein Olav; Godtliebsen, Fred; Mortensen, Kim; Revhaug, Arthur; Lindsetmo, Rolv-Ole; Augestad, Knut Magne; Jenssen, Robert
2016-09-01
The free text in electronic health records (EHRs) conveys a huge amount of clinical information about health state and patient history. Despite a rapidly growing literature on the use of machine learning techniques for extracting this information, little effort has been invested toward feature selection and the features' corresponding medical interpretation. In this study, we focus on the task of early detection of anastomosis leakage (AL), a severe complication after elective surgery for colorectal cancer (CRC), using free text extracted from EHRs. We use a bag-of-words model to investigate the potential for feature selection strategies. The purpose is earlier detection of AL and prediction of AL with data generated in the EHR before the actual complication occurs. Due to the high dimensionality of the data, we derive feature selection strategies using the robust support vector machine linear maximum margin classifier, by investigating: 1) a simple statistical criterion (a leave-one-out-based test); 2) a computation-intensive statistical criterion (bootstrap resampling); and 3) an advanced statistical criterion (kernel entropy). Results reveal a discriminatory power for early detection of complications after CRC surgery (sensitivity 100%; specificity 72%). These results can be used to develop prediction models, based on EHR data, that can support surgeons and patients in the preoperative decision-making phase.
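A toy sketch of the bag-of-words front end with a linear maximum-margin classifier. Ranking words by hyperplane weight magnitude stands in for the paper's three selection criteria, and the notes below are invented English stand-ins for EHR free text:

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import LinearSVC

# Hypothetical post-operative notes and AL labels (1 = leakage developed).
notes = ["fever and abdominal pain after surgery",
         "normal recovery no complaints",
         "elevated crp abdominal tenderness",
         "routine check wound healing well"]
labels = [1, 0, 1, 0]

vec = CountVectorizer()
X = vec.fit_transform(notes)               # bag-of-words feature matrix
clf = LinearSVC(C=1.0).fit(X, labels)      # linear maximum-margin classifier

# Simple selection criterion: rank words by |weight| in the separating hyperplane.
w = np.abs(clf.coef_[0])
top = np.argsort(w)[::-1][:5]
print([vec.get_feature_names_out()[i] for i in top])
```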
Nagarajan, Mahesh B; Coan, Paola; Huber, Markus B; Diemoz, Paul C; Glaser, Christian; Wismüller, Axel
2014-02-01
Phase-contrast computed tomography (PCI-CT) has shown tremendous potential as an imaging modality for visualizing human cartilage with high spatial resolution. Previous studies have demonstrated the ability of PCI-CT to visualize (1) structural details of the human patellar cartilage matrix and (2) changes to chondrocyte organization induced by osteoarthritis. This study investigates the use of high-dimensional geometric features in characterizing such chondrocyte patterns in the presence or absence of osteoarthritic damage. Geometrical features derived from the scaling index method (SIM) and statistical features derived from gray-level co-occurrence matrices were extracted from 842 regions of interest (ROI) annotated on PCI-CT images of ex vivo human patellar cartilage specimens. These features were subsequently used in a machine learning task with support vector regression to classify ROIs as healthy or osteoarthritic; classification performance was evaluated using the area under the receiver-operating characteristic curve (AUC). SIM-derived geometrical features exhibited the best classification performance (AUC, 0.95 ± 0.06) and were most robust to changes in ROI size. These results suggest that such geometrical features can provide a detailed characterization of the chondrocyte organization in the cartilage matrix in an automated and non-subjective manner, while also enabling classification of cartilage as healthy or osteoarthritic with high accuracy. Such features could potentially serve as imaging markers for evaluating osteoarthritis progression and its response to different therapeutic intervention strategies.
1992-01-01
entropy, energy, variance, skewness, and kurtosis. These parameters are then used as... It can also be applied to an image of a... statistic. The co-occurrence matrix method is used in this study to derive texture values of entropy, homogeneity, energy (similar to the GLDV angular...) ...from working with the co-occurrence matrix method. Seven convolution sizes were chosen to derive the texture values of entropy, local homogeneity, and
SoFoCles: feature filtering for microarray classification based on gene ontology.
Papachristoudis, Georgios; Diplaris, Sotiris; Mitkas, Pericles A
2010-02-01
Marker gene selection has been an important research topic in the classification analysis of gene expression data. Current methods try to reduce the "curse of dimensionality" by using statistical intra-feature set calculations, or classifiers that are based on the given dataset. In this paper, we present SoFoCles, an interactive tool that enables semantic feature filtering in microarray classification problems with the use of external, well-defined knowledge retrieved from the Gene Ontology. The notion of semantic similarity is used to derive genes that are involved in the same biological path during the microarray experiment, by enriching a feature set that has been initially produced with legacy methods. Among its other functionalities, SoFoCles offers a large repository of semantic similarity methods that are used in order to derive feature sets and marker genes. The structure and functionality of the tool are discussed in detail, as well as its ability to improve classification accuracy. Through experimental evaluation, SoFoCles is shown to outperform other classification schemes in terms of classification accuracy in two real datasets using different semantic similarity computation approaches.
Joint resonant CMB power spectrum and bispectrum estimation
NASA Astrophysics Data System (ADS)
Meerburg, P. Daniel; Münchmeyer, Moritz; Wandelt, Benjamin
2016-02-01
We develop the tools necessary to assess the statistical significance of resonant features in the CMB correlation functions, combining power spectrum and bispectrum measurements. This significance is typically addressed by running a large number of simulations to derive the probability density function (PDF) of the feature-amplitude in the Gaussian case. Although these simulations are tractable for the power spectrum, for the bispectrum they require significant computational resources. We show that, by assuming that the PDF is given by a multivariate Gaussian where the covariance is determined by the Fisher matrix of the sine and cosine terms, we can efficiently produce spectra that are statistically close to those derived from full simulations. By drawing a large number of spectra from this PDF, both for the power spectrum and the bispectrum, we can quickly determine the statistical significance of candidate signatures in the CMB, considering both single frequency and multifrequency estimators. We show that for resonance models, cosmology and foreground parameters have little influence on the estimated amplitude, which allows us to simplify the analysis considerably. A more precise likelihood treatment can then be applied to candidate signatures only. We also discuss a modal expansion approach for the power spectrum, aimed at quickly scanning through large families of oscillating models.
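The computational shortcut described above, replacing expensive simulations with draws from a multivariate Gaussian whose covariance comes from the Fisher matrix of the sine and cosine terms, can be sketched in a few lines. The Fisher matrix and observed amplitude below are made-up stand-ins, not values from the paper.

    # Sketch: significance of a resonant-feature amplitude from Gaussian draws.
    # The Fisher matrix F (for one sine/cosine amplitude pair) is a stand-in.
    import numpy as np

    rng = np.random.default_rng(1)
    F = np.array([[40.0, 5.0], [5.0, 30.0]])   # hypothetical Fisher matrix
    cov = np.linalg.inv(F)                      # Gaussian covariance of (a_sin, a_cos)

    draws = rng.multivariate_normal(mean=np.zeros(2), cov=cov, size=100_000)
    amp = np.hypot(draws[:, 0], draws[:, 1])    # total oscillation amplitude per draw

    a_obs = 0.45                                # hypothetical measured amplitude
    p_value = np.mean(amp >= a_obs)             # fraction of Gaussian draws exceeding it
    print(f"p = {p_value:.4f}")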
Ekins, Sean; Olechno, Joe; Williams, Antony J.
2013-01-01
Dispensing and dilution processes may profoundly influence estimates of biological activity of compounds. Published data show Ephrin type-B receptor 4 IC50 values obtained via tip-based serial dilution and dispensing versus acoustic dispensing with direct dilution differ by orders of magnitude with no correlation or ranking of datasets. We generated computational 3D pharmacophores based on data derived by both acoustic and tip-based transfer. The computed pharmacophores differ significantly depending upon dispensing and dilution methods. The acoustic dispensing-derived pharmacophore correctly identified active compounds in a subsequent test set where the tip-based method failed. Data from acoustic dispensing generates a pharmacophore containing two hydrophobic features, one hydrogen bond donor and one hydrogen bond acceptor. This is consistent with X-ray crystallography studies of ligand-protein interactions and automatically generated pharmacophores derived from this structural data. In contrast, the tip-based data suggest a pharmacophore with two hydrogen bond acceptors, one hydrogen bond donor and no hydrophobic features. This pharmacophore is inconsistent with the X-ray crystallographic studies and automatically generated pharmacophores. In short, traditional dispensing processes are another important source of error in high-throughput screening that impacts computational and statistical analyses. These findings have far-reaching implications in biological research. PMID:23658723
Ding, Liya; Martinez, Aleix M
2010-11-01
The appearance-based approach to face detection has seen great advances in the last several years. In this approach, we learn the image statistics describing the texture pattern (appearance) of the object class we want to detect, e.g., the face. However, this approach has had limited success in providing an accurate and detailed description of the internal facial features, i.e., eyes, brows, nose, and mouth. In general, this is due to the limited information carried by the learned statistical model. While the face template is relatively rich in texture, facial features (e.g., eyes, nose, and mouth) do not carry enough discriminative information to tell them apart from all possible background images. We resolve this problem by adding the context information of each facial feature in the design of the statistical model. In the proposed approach, the context information defines the image statistics most correlated with the surroundings of each facial component. This means that when we search for a face or facial feature, we look for those locations which most resemble the feature yet are most dissimilar to its context. This dissimilarity with the context features forces the detector to gravitate toward an accurate estimate of the position of the facial feature. Learning to discriminate between feature and context templates is difficult, however, because the context and the texture of the facial features vary widely under changing expression, pose, and illumination, and may even resemble one another. We address this problem with the use of subclass divisions. We derive two algorithms to automatically divide the training samples of each facial feature into a set of subclasses, each representing a distinct construction of the same facial component (e.g., closed versus open eyes) or its context (e.g., different hairstyles). The first algorithm is based on a discriminant analysis formulation. The second algorithm is an extension of the AdaBoost approach. We provide extensive experimental results using still images and video sequences for a total of 3,930 images. We show that the results are almost as good as those obtained with manual detection.
Empirical Reference Distributions for Networks of Different Size
Smith, Anna; Calder, Catherine A.; Browning, Christopher R.
2016-01-01
Network analysis has become an increasingly prevalent research tool across a vast range of scientific fields. Here, we focus on the particular issue of comparing network statistics, i.e. graph-level measures of network structural features, across multiple networks that differ in size. Although “normalized” versions of some network statistics exist, we demonstrate via simulation why direct comparison is often inappropriate. We consider normalizing network statistics relative to a simple fully parameterized reference distribution and demonstrate via simulation how this is an improvement over direct comparison, but still sometimes problematic. We propose a new adjustment method based on a reference distribution constructed as a mixture model of random graphs which reflect the dependence structure exhibited in the observed networks. We show that using simple Bernoulli models as mixture components in this reference distribution can provide adjusted network statistics that are relatively comparable across different network sizes but still describe interesting features of networks, and that this can be accomplished at relatively low computational expense. Finally, we apply this methodology to a collection of ecological networks derived from the Los Angeles Family and Neighborhood Survey activity location data. PMID:27721556
Style consistent classification of isogenous patterns.
Sarkar, Prateek; Nagy, George
2005-01-01
In many applications of pattern recognition, patterns appear together in groups (fields) that have a common origin. For example, a printed word is usually a field of character patterns printed in the same font. A common origin induces consistency of style in features measured on patterns. The features of patterns co-occurring in a field are statistically dependent because they share the same, albeit unknown, style. Style constrained classifiers achieve higher classification accuracy by modeling such dependence among patterns in a field. Effects of style consistency on the distributions of field-features (concatenation of pattern features) can be modeled by hierarchical mixtures. Each field derives from a mixture of styles, while, within a field, a pattern derives from a class-style conditional mixture of Gaussians. Based on this model, an optimal style constrained classifier processes entire fields of patterns rendered in a consistent but unknown style. In a laboratory experiment, style constrained classification reduced errors on fields of printed digits by nearly 25 percent over singlet classifiers. Longer fields favor our classification method because they furnish more information about the underlying style.
NASA Astrophysics Data System (ADS)
Ghavami, Raouf; Sadeghi, Faridoon; Rasouli, Zolikha; Djannati, Farhad
2012-12-01
Experimental values for the 13C NMR chemical shifts (ppm, TMS = 0) at 300 K, ranging from 96.28 ppm (C4' of indole derivative 17) to 159.93 ppm (C4' of indole derivative 23) relative to deuterated chloroform (CDCl3, 77.0 ppm) or dimethylsulfoxide (DMSO, 39.50 ppm) as internal reference in CDCl3 or DMSO-d6 solutions, have been collected from the literature for thirty 2-functionalized 5-(methylsulfonyl)-1-phenyl-1H-indole derivatives bearing different substituent groups. Effective quantitative structure-property relationship (QSPR) models were built using a hybrid method that combines a genetic algorithm (GA) with stepwise-selection multiple linear regression (SWS-MLR) as the feature-selection tool, yielding correlation models between each carbon atom of the indole derivatives and the calculated descriptors. Each compound was described by molecular structural descriptors encoding constitutional, topological, geometrical, electrostatic, and quantum-chemical features. The accuracy of all developed models was confirmed using different types of internal and external validation procedures and various statistical tests. Furthermore, the domain of applicability of each model, which indicates the area of reliable predictions, was defined.
Yang, Jaw-Yen; Yan, Chih-Yuan; Diaz, Manuel; Huang, Juan-Chen; Li, Zhihui; Zhang, Hanxin
2014-01-08
The ideal quantum gas dynamics as manifested by the semiclassical ellipsoidal-statistical (ES) equilibrium distribution derived in Wu et al. (Wu et al . 2012 Proc. R. Soc. A 468 , 1799-1823 (doi:10.1098/rspa.2011.0673)) is numerically studied for particles of three statistics. This anisotropic ES equilibrium distribution was derived using the maximum entropy principle and conserves the mass, momentum and energy, but differs from the standard Fermi-Dirac or Bose-Einstein distribution. The present numerical method combines the discrete velocity (or momentum) ordinate method in momentum space and the high-resolution shock-capturing method in physical space. A decoding procedure to obtain the necessary parameters for determining the ES distribution is also devised. Computations of two-dimensional Riemann problems are presented, and various contours of the quantities unique to this ES model are illustrated. The main flow features, such as shock waves, expansion waves and slip lines and their complex nonlinear interactions, are depicted and found to be consistent with existing calculations for a classical gas.
Bayesian depth estimation from monocular natural images.
Su, Che-Chun; Cormack, Lawrence K; Bovik, Alan C
2017-05-01
Estimating an accurate and naturalistic dense depth map from a single monocular photographic image is a difficult problem. Nevertheless, human observers have little difficulty understanding the depth structure implied by photographs. Two-dimensional (2D) images of the real-world environment contain significant statistical information regarding the three-dimensional (3D) structure of the world that the vision system likely exploits to compute perceived depth, monocularly as well as binocularly. Toward understanding how this might be accomplished, we propose a Bayesian model of monocular depth computation that recovers detailed 3D scene structures by extracting reliable, robust, depth-sensitive statistical features from single natural images. These features are derived using well-accepted univariate natural scene statistics (NSS) models and recent bivariate/correlation NSS models that describe the relationships between 2D photographic images and their associated depth maps. This is accomplished by building a dictionary of canonical local depth patterns from which NSS features are extracted as prior information. The dictionary is used to create a multivariate Gaussian mixture (MGM) likelihood model that associates local image features with depth patterns. A simple Bayesian predictor is then used to form spatial depth estimates. The depth results produced by the model, despite its simplicity, correlate well with ground-truth depths measured by a current-generation terrestrial light detection and ranging (LIDAR) scanner. Such a strong form of statistical depth information could be used by the visual system when creating overall estimated depth maps incorporating stereopsis, accommodation, and other conditions. Indeed, even in isolation, the Bayesian predictor delivers depth estimates that are competitive with state-of-the-art "computer vision" methods that utilize highly engineered image features and sophisticated machine learning algorithms.
Image segmentation by hierarchical agglomeration of polygons using ecological statistics
Prasad, Lakshman; Swaminarayan, Sriram
2013-04-23
A method for rapid hierarchical image segmentation based on perceptually driven contour completion and scene statistics is disclosed. The method begins with an initial fine-scale segmentation of an image, such as obtained by perceptual completion of partial contours into polygonal regions using region-contour correspondences established by Delaunay triangulation of edge pixels as implemented in VISTA. The resulting polygons are analyzed with respect to their size and color/intensity distributions and the structural properties of their boundaries. Statistical estimates of granularity of size, similarity of color, texture, and saliency of intervening boundaries are computed and formulated into logical (Boolean) predicates. The combined satisfiability of these Boolean predicates by a pair of adjacent polygons at a given segmentation level qualifies them for merging into a larger polygon representing a coarser, larger-scale feature of the pixel image and collectively obtains the next level of polygonal segments in a hierarchy of fine-to-coarse segmentations. The iterative application of this process precipitates textured regions as polygons with highly convolved boundaries and helps distinguish them from objects which typically have more regular boundaries. The method yields a multiscale decomposition of an image into constituent features that enjoy a hierarchical relationship with features at finer and coarser scales. This provides a traversable graph structure from which feature content and context in terms of other features can be derived, aiding in automated image understanding tasks. The method disclosed is highly efficient and can be used to decompose and analyze large images.
High-resolution land cover classification using low resolution global data
NASA Astrophysics Data System (ADS)
Carlotto, Mark J.
2013-05-01
A fusion approach is described that combines texture features from high-resolution panchromatic imagery with land cover statistics derived from co-registered low-resolution global databases to obtain high-resolution land cover maps. The method does not require training data or any human intervention. We use an MxN Gabor filter bank consisting of M=16 oriented bandpass filters (0-180°) at N resolutions (3-24 meters/pixel). The size range of these spatial filters is consistent with the typical scale of manmade objects and patterns of cultural activity in imagery. Clustering reduces the complexity of the data by combining pixels that have similar texture into clusters (regions). Texture classification assigns a vector of class likelihoods to each cluster based on its textural properties. Classification is unsupervised and accomplished using a bank of texture anomaly detectors. Class likelihoods are modulated by land cover statistics derived from lower resolution global data over the scene. Preliminary results from a number of Quickbird scenes show our approach is able to classify general land cover features such as roads, built up area, forests, open areas, and bodies of water over a wide range of scenes.
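A minimal sketch of the oriented Gabor filter bank described above, using scikit-image. The frequency choices and the random stand-in image are illustrative assumptions, not the paper's exact configuration (which specifies M=16 orientations across 0-180° at N resolutions of 3-24 meters/pixel).

    # Sketch of an M-orientation, N-scale Gabor bank; parameters illustrative.
    import numpy as np
    from skimage.filters import gabor_kernel
    from scipy.ndimage import convolve

    orientations = np.deg2rad(np.arange(0, 180, 180 / 16))   # M = 16 orientations
    frequencies = [0.05, 0.1, 0.2, 0.4]                       # N = 4 scales (cycles/pixel)

    bank = [gabor_kernel(f, theta=t) for f in frequencies for t in orientations]

    image = np.random.rand(128, 128)          # stand-in for panchromatic imagery
    texture = np.stack([np.abs(convolve(image, np.real(k))) for k in bank], axis=-1)
    print(texture.shape)                      # per-pixel texture vector: (128, 128, 64)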
The value of parsing as feature generation for gene mention recognition
Smith, Larry H; Wilbur, W John
2009-01-01
We measured the extent to which information surrounding a base noun phrase reflects the presence of a gene name, and evaluated seven different parsers in their ability to provide information for that purpose. Using the GENETAG corpus as a gold standard, we performed machine learning to recognize from its context when a base noun phrase contained a gene name. Starting with the best lexical features, we assessed the gain of adding dependency or dependency-like relations from a full sentence parse. Features derived from parsers improved performance in this partial gene mention recognition task by a small but statistically significant amount. There were virtually no differences between parsers in these experiments. PMID:19345281
Classifiers utilized to enhance acoustic based sensors to identify round types of artillery/mortar
NASA Astrophysics Data System (ADS)
Grasing, David; Desai, Sachi; Morcos, Amir
2008-04-01
Feature extraction methods based on the statistical analysis of the change in event pressure levels over time, together with the level of ambient pressure excitation, facilitate the development of a robust classification algorithm. The resulting features reliably discriminate mortar and artillery variants via the acoustic signals produced during launch events. Acoustic sensors are utilized to exploit the sound waveform generated by the blast, identifying mortar and artillery variants (as type A, etcetera) through analysis of the waveform. Distinct characteristics arise among the different mortar/artillery variants because varying HE mortar payloads and related charges produce launch events of varying size. The waveform holds various harmonic properties distinct to a given mortar/artillery variant that, through advanced signal processing and data mining techniques, can be employed to classify a given type. Skewness and other statistical processing techniques are used to extract the predominant components from the acoustic signatures at ranges exceeding 3000 m. Exploiting these techniques helps develop a feature set highly independent of range, providing discrimination based on acoustic elements of the blast wave. Highly reliable discrimination is achieved with a feedforward neural network classifier trained on a feature space derived from the distribution of statistical coefficients, the frequency spectrum, and higher-frequency details found within different energy bands. The processes described herein extend current technologies that emphasize acoustic sensor systems, to provide such situational awareness.
Artillery/mortar type classification based on detected acoustic transients
NASA Astrophysics Data System (ADS)
Morcos, Amir; Grasing, David; Desai, Sachi
2008-04-01
Feature extraction methods based on the statistical analysis of the change in event pressure levels over time, together with the level of ambient pressure excitation, facilitate the development of a robust classification algorithm. The resulting features reliably discriminate mortar and artillery variants via the acoustic signals produced during launch events. Acoustic sensors are utilized to exploit the sound waveform generated by the blast, identifying mortar and artillery variants (as type A, etcetera) through analysis of the waveform. Distinct characteristics arise among the different mortar/artillery variants because varying HE mortar payloads and related charges produce launch events of varying size. The waveform holds various harmonic properties distinct to a given mortar/artillery variant that, through advanced signal processing and data mining techniques, can be employed to classify a given type. Skewness and other statistical processing techniques are used to extract the predominant components from the acoustic signatures at ranges exceeding 3000 m. Exploiting these techniques helps develop a feature set highly independent of range, providing discrimination based on acoustic elements of the blast wave. Highly reliable discrimination is achieved with a feed-forward neural network classifier trained on a feature space derived from the distribution of statistical coefficients, the frequency spectrum, and higher-frequency details found within different energy bands. The processes described herein extend current technologies that emphasize acoustic sensor systems, to provide such situational awareness.
Artillery/mortar round type classification to increase system situational awareness
NASA Astrophysics Data System (ADS)
Desai, Sachi; Grasing, David; Morcos, Amir; Hohil, Myron
2008-04-01
Feature extraction methods based on the statistical analysis of the change in event pressure levels over time, together with the level of ambient pressure excitation, facilitate the development of a robust classification algorithm. The resulting features reliably discriminate mortar and artillery variants via the acoustic signals produced during launch events. Acoustic sensors are utilized to exploit the sound waveform generated by the blast, identifying mortar and artillery variants (as type A, etcetera) through analysis of the waveform. Distinct characteristics arise among the different mortar/artillery variants because varying HE mortar payloads and related charges produce launch events of varying size. The waveform holds various harmonic properties distinct to a given mortar/artillery variant that, through advanced signal processing and data mining techniques, can be employed to classify a given type. Skewness and other statistical processing techniques are used to extract the predominant components from the acoustic signatures at ranges exceeding 3000 m. Exploiting these techniques helps develop a feature set highly independent of range, providing discrimination based on acoustic elements of the blast wave. Highly reliable discrimination is achieved with a feedforward neural network classifier trained on a feature space derived from the distribution of statistical coefficients, the frequency spectrum, and higher-frequency details found within different energy bands. The processes described herein extend current technologies that emphasize acoustic sensor systems, to provide such situational awareness.
Determination of precipitation profiles from airborne passive microwave radiometric measurements
NASA Technical Reports Server (NTRS)
Kummerow, Christian; Hakkarinen, Ida M.; Pierce, Harold F.; Weinman, James A.
1991-01-01
This study presents the first quantitative retrievals of vertical profiles of precipitation derived from multispectral passive microwave radiometry. Measurements of microwave brightness temperature (Tb) obtained by a NASA high-altitude research aircraft are related to profiles of rainfall rate through a multichannel piecewise-linear statistical regression procedure. Statistics for Tb are obtained from a set of cloud radiative models representing a wide variety of convective, stratiform, and anvil structures. The retrieval scheme itself determines which cloud model best fits the observed meteorological conditions. Retrieved rainfall rate profiles are converted to equivalent radar reflectivity for comparison with observed reflectivities from a ground-based research radar. Results for two case studies, a stratiform rain situation and an intense convective thunderstorm, show that the radiometrically derived profiles capture the major features of the observed vertical structure of hydrometeor density.
NASA Astrophysics Data System (ADS)
Xu, Y.; Jones, A. D.; Rhoades, A.
2017-12-01
Precipitation is a key component of the hydrologic cycle, and changing precipitation regimes contribute to more intense and frequent drought and flood events around the world. Numerical climate modeling is a powerful tool for studying climatology and predicting future changes. Despite continuous improvement in numerical models, long-term precipitation prediction remains a challenge, especially at regional scales. To improve numerical simulations of precipitation, it is important to find out where the uncertainty in precipitation simulations comes from. There are two types of uncertainty in numerical model predictions. One is related to uncertainty in the input data, such as the model's boundary and initial conditions. These uncertainties would propagate to the final model outcomes even if the numerical model exactly replicated the true world. But a numerical model cannot exactly replicate the true world. The other type of uncertainty is therefore related to errors in the model physics, such as the parameterization of sub-grid-scale processes: given precise input conditions, how much error is generated by the imprecise model itself? Here, we build two statistical models based on a neural network algorithm to predict long-term variation of precipitation over California: one uses "true world" information derived from observations, and the other uses "modeled world" information using model inputs and outputs from the North America Coordinated Regional Downscaling Project (NA CORDEX). We derive multiple climate feature metrics as predictors for the statistical model to represent the impact of global climate on local hydrology, and include topography as a predictor to represent local controls. We first compare the predictors between the true world and the modeled world to determine the errors contained in the input data. By perturbing the predictors in the statistical model, we estimate how much uncertainty in the model's final outcomes is accounted for by each predictor. By comparing the statistical models derived from true-world and modeled-world information, we assess the errors lying in the physics of the numerical models. This work provides unique insight for assessing the performance of numerical climate models and can be used to guide improvement of precipitation prediction.
NASA Astrophysics Data System (ADS)
Shi, Wenzhong; Deng, Susu; Xu, Wenbing
2018-02-01
For automatic landslide detection, landslide morphological features should be quantitatively expressed and extracted. High-resolution Digital Elevation Models (DEMs) derived from airborne Light Detection and Ranging (LiDAR) data allow fine-scale morphological features to be extracted, but noise in DEMs influences morphological feature extraction, and the multi-scale nature of landslide features should be considered. This paper proposes a method to extract landslide morphological features characterized by homogeneous spatial patterns. Both profile and tangential curvature are utilized to quantify land surface morphology, and a local Gi* statistic is calculated for each cell to identify significant patterns of clustering of similar morphometric values. The method was tested on both synthetic surfaces simulating natural terrain and airborne LiDAR data acquired over an area dominated by shallow debris slides and flows. The test results of the synthetic data indicate that the concave and convex morphologies of the simulated terrain features at different scales and distinctness could be recognized using the proposed method, even when random noise was added to the synthetic data. In the test area, cells with large local Gi* values were extracted at a specified significance level from the profile and the tangential curvature image generated from the LiDAR-derived 1-m DEM. The morphologies of landslide main scarps, source areas and trails were clearly indicated, and the morphological features were represented by clusters of extracted cells. A comparison with the morphological feature extraction method based on curvature thresholds proved the proposed method's robustness to DEM noise. When verified against a landslide inventory, the morphological features of almost all recent (< 5 years) landslides and approximately 35% of historical (> 10 years) landslides were extracted. This finding indicates that the proposed method can facilitate landslide detection, although the cell clusters extracted from curvature images should be filtered using a filtering strategy based on supplementary information provided by expert knowledge or other data sources.
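As a rough illustration of the hot-spot step above, here is a sketch of a local Getis-Ord Gi* statistic computed over a curvature raster with simple binary box-window weights. The window size, significance threshold, and random stand-in data are assumptions, and edge effects are ignored.

    # Sketch: local Getis-Ord Gi* over a curvature raster using box windows.
    # Simplified (binary weights, no edge correction); thresholds illustrative.
    import numpy as np
    from scipy.ndimage import uniform_filter

    def gi_star(x, size=5):
        """Z-scored local-sum statistic for each cell of raster x."""
        n = x.size
        w = size * size                          # cells per window (interior cells)
        mean, s = x.mean(), x.std(ddof=1)
        local_sum = uniform_filter(x, size=size) * w
        num = local_sum - mean * w
        den = s * np.sqrt((n * w - w**2) / (n - 1))
        return num / den

    curvature = np.random.randn(200, 200)        # stand-in DEM-derived curvature image
    z = gi_star(curvature)
    hotspots = np.abs(z) > 1.96                  # clusters of similar morphometric values
    print(hotspots.mean())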
Referenceless perceptual fog density prediction model
NASA Astrophysics Data System (ADS)
Choi, Lark Kwon; You, Jaehee; Bovik, Alan C.
2014-02-01
We propose a perceptual fog density prediction model based on natural scene statistics (NSS) and "fog aware" statistical features, which can predict the visibility in a foggy scene from a single image without reference to a corresponding fogless image, without side geographical camera information, without training on human-rated judgments, and without dependency on salient objects such as lane markings or traffic signs. The proposed fog density predictor only makes use of measurable deviations from statistical regularities observed in natural foggy and fog-free images. A fog aware collection of statistical features is derived from a corpus of foggy and fog-free images by using a space domain NSS model and observed characteristics of foggy images such as low contrast, faint color, and shifted intensity. The proposed model not only predicts perceptual fog density for the entire image but also provides a local fog density index for each patch. The predicted fog density of the model correlates well with the measured visibility in a foggy scene as measured by judgments taken in a human subjective study on a large foggy image database. As one application, the proposed model accurately evaluates the performance of defog algorithms designed to enhance the visibility of foggy images.
NASA Astrophysics Data System (ADS)
Bao, Zhenkun; Li, Xiaolong; Luo, Xiangyang
2017-01-01
Extracting informative statistical features is the most essential technical issue in steganalysis. Among various steganalysis methods, probability density function (PDF) and characteristic function (CF) moments are two important types of features due to their excellent ability to distinguish cover images from stego images. The two types of features are quite similar in definition. The only difference is that the PDF moments are computed in the spatial domain, while the CF moments are computed in the Fourier-transformed domain. The comparison between PDF and CF moments is thus an interesting question in steganalysis. Several theoretical results have been derived, and CF moments are proved better than PDF moments in some cases. However, in the log prediction error wavelet subband of wavelet decomposition, some experiments show the opposite result, which has lacked a rigorous explanation. To solve this problem, a comparison result based on rigorous proof is presented: the first-order PDF moment is proved better than the CF moment, while the second-order CF moment is better than the PDF moment. This result aims to open a theoretical discussion on steganalysis and on the question of finding suitable statistical features.
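For readers unfamiliar with the two feature families, the following sketch computes an n-th-order PDF moment and an n-th-order CF moment for a stand-in subband, following the definitions common in the steganalysis literature; the paper's exact normalization may differ.

    # Sketch: n-th PDF moment vs. n-th CF moment of a subband histogram.
    # Common steganalysis definitions; not this paper's exact notation.
    import numpy as np

    coeffs = np.random.laplace(scale=2.0, size=100_000)   # stand-in subband coefficients

    def pdf_moment(x, n):
        """n-th absolute central moment, computed in the spatial domain."""
        return np.mean(np.abs(x - x.mean()) ** n)

    def cf_moment(x, n, bins=1024):
        """n-th CF moment: frequency-weighted magnitude of the histogram's DFT."""
        h, _ = np.histogram(x, bins=bins, density=True)
        phi = np.abs(np.fft.rfft(h))[1:]                  # characteristic-function samples
        k = np.arange(1, phi.size + 1)
        return np.sum(k ** n * phi) / np.sum(phi)

    print(pdf_moment(coeffs, 1), cf_moment(coeffs, 1))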
Recurrence plot statistics and the effect of embedding
NASA Astrophysics Data System (ADS)
March, T. K.; Chapman, S. C.; Dendy, R. O.
2005-01-01
Recurrence plots provide a graphical representation of the recurrent patterns in a timeseries, the quantification of which is a relatively new field. Here we derive analytical expressions which relate the values of key statistics, notably determinism and entropy of line length distribution, to the correlation sum as a function of embedding dimension. These expressions are obtained by deriving the transformation which generates an embedded recurrence plot from an unembedded plot. A single unembedded recurrence plot thus provides the statistics of all possible embedded recurrence plots. If the correlation sum scales exponentially with embedding dimension, we show that these statistics are determined entirely by the exponent of the exponential. This explains the results of Iwanski and Bradley [J.S. Iwanski, E. Bradley, Recurrence plots of experimental data: to embed or not to embed? Chaos 8 (1998) 861-871] who found that certain recurrence plot statistics are apparently invariant to embedding dimension for certain low-dimensional systems. We also examine the relationship between the mutual information content of two timeseries and the common recurrent structure seen in their recurrence plots. This allows time-localized contributions to mutual information to be visualized. This technique is demonstrated using geomagnetic index data; we show that the AU and AL geomagnetic indices share half their information, and find the timescale on which mutual features appear.
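A minimal sketch of an unembedded recurrence plot and its determinism (DET) statistic follows. The signal, threshold, and minimum line length are illustrative choices, and the main diagonal is not excluded here as it usually would be in careful work.

    # Sketch: unembedded recurrence plot and its determinism (DET) statistic.
    import numpy as np

    x = np.sin(np.linspace(0, 20 * np.pi, 500)) + 0.1 * np.random.randn(500)
    eps = 0.2
    R = np.abs(x[:, None] - x[None, :]) < eps          # recurrence matrix (no embedding)

    def determinism(R, lmin=2):
        """Fraction of recurrence points lying on diagonals of length >= lmin."""
        n = R.shape[0]
        in_lines = total = 0
        for k in range(-(n - 1), n):                   # scan every diagonal
            d = np.diagonal(R, k).astype(int)
            total += d.sum()
            padded = np.concatenate(([0], d, [0]))     # find runs of consecutive ones
            starts = np.flatnonzero(np.diff(padded) == 1)
            ends = np.flatnonzero(np.diff(padded) == -1)
            in_lines += sum(e - s for s, e in zip(starts, ends) if e - s >= lmin)
        return in_lines / total if total else 0.0

    print("DET:", determinism(R))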
NASA Astrophysics Data System (ADS)
Bucy, Brandon R.
While much of physics education research (PER) has traditionally been conducted in introductory undergraduate courses, researchers have begun to study student understanding of physics concepts at the upper-level. In this dissertation, we describe investigations conducted in advanced undergraduate thermodynamics courses. We present and discuss results pertaining to student understanding of two topics: entropy and the role of mixed second-order partial derivatives in thermodynamics. Our investigations into student understanding of entropy consisted of an analysis of written student responses to researcher-designed diagnostic questions. Data gathered in clinical interviews is employed to illustrate and extend results gathered from written responses. The question sets provided students with several ideal gas processes, and asked students to determine and compare the entropy changes of these processes. We administered the question sets to students from six distinct populations, including students enrolled in classical thermodynamics, statistical mechanics, thermal physics, physical chemistry, and chemical engineering courses, as well as a sample of physics graduate students. Data was gathered both before and after instruction in several samples. Several noteworthy features of student reasoning are identified and discussed. These features include student ideas about entropy prior to instruction, as well as specific difficulties and other aspects of student reasoning evident after instruction. As an example, students from various populations tended to emphasize either the thermodynamic or the statistical definition of entropy. Both approaches present students with a unique set of benefits as well as challenges. We additionally studied student understanding of partial derivatives in a thermodynamics context. We identified specific difficulties related to the mixed second partial derivatives of a thermodynamic state function, based on an analysis of student responses to homework and exam problems. Students tended to set these partial derivatives identically equal to zero. Students also displayed difficulties in relating the physical description of a material property to a corresponding mathematical statement involving partial derivatives. We describe the development of a guided-inquiry tutorial activity designed to address these specific difficulties. This tutorial focused on the graphical interpretation of partial derivatives. Preliminary results suggest that the tutorial was effective in addressing several student difficulties related to partial derivatives.
Deriving Color-Color Transformations for VRI Photometry
NASA Astrophysics Data System (ADS)
Taylor, B. J.; Joner, M. D.
2006-12-01
In this paper, transformations between Cousins R-I and other indices are considered. New transformations to Cousins V-R and Johnson V-K are derived, a published transformation involving T1-T2 on the Washington system is rederived, and the basis for a transformation involving b-y is considered. In addition, a statistically rigorous procedure for deriving such transformations is presented and discussed in detail. Highlights of the discussion include (1) the need for statistical analysis when least-squares relations are determined and interpreted, (2) the permitted forms and best forms for such relations, (3) the essential role played by accidental errors, (4) the decision process for selecting terms to appear in the relations, (5) the use of plots of residuals, (6) detection of influential data, (7) a protocol for assessing systematic effects from absorption features and other sources, (8) the reasons for avoiding extrapolation of the relations, (9) a protocol for ensuring uniformity in data used to determine the relations, and (10) the derivation and testing of the accidental errors of those data. To put the last of these subjects in perspective, it is shown that rms errors for VRI photometry have been as small as 6 mmag for more than three decades and that standard errors for quantities derived from such photometry can be as small as 1 mmag or less.
Lee, Myungeun; Woo, Boyeong; Kuo, Michael D; Jamshidi, Neema; Kim, Jong Hyo
2017-01-01
The purpose of this study was to evaluate the reliability and quality of radiomic features in glioblastoma multiforme (GBM) derived from tumor volumes obtained with semi-automated tumor segmentation software. MR images of 45 GBM patients (29 males, 16 females) were downloaded from The Cancer Imaging Archive, in which post-contrast T1-weighted imaging and fluid-attenuated inversion recovery MR sequences were used. Two raters independently segmented the tumors using two semi-automated segmentation tools (TumorPrism3D and 3D Slicer). Regions of interest corresponding to contrast-enhancing lesion, necrotic portions, and non-enhancing T2 high signal intensity component were segmented for each tumor. A total of 180 imaging features were extracted, and their quality was evaluated in terms of stability, normalized dynamic range (NDR), and redundancy, using intra-class correlation coefficients, cluster consensus, and Rand Statistic. Our study results showed that most of the radiomic features in GBM were highly stable. Over 90% of 180 features showed good stability (intra-class correlation coefficient [ICC] ≥ 0.8), whereas only 7 features were of poor stability (ICC < 0.5). Most first order statistics and morphometric features showed moderate-to-high NDR (4 > NDR ≥1), while above 35% of the texture features showed poor NDR (< 1). Features were shown to cluster into only 5 groups, indicating that they were highly redundant. The use of semi-automated software tools provided sufficiently reliable tumor segmentation and feature stability; thus helping to overcome the inherent inter-rater and intra-rater variability of user intervention. However, certain aspects of feature quality, including NDR and redundancy, need to be assessed for determination of representative signature features before further development of radiomics.
NASA Astrophysics Data System (ADS)
Abidin, Anas Z.; Nagarajan, Mahesh B.; Checefsky, Walter A.; Coan, Paola; Diemoz, Paul C.; Hobbs, Susan K.; Huber, Markus B.; Wismüller, Axel
2015-03-01
Phase contrast X-ray computed tomography (PCI-CT) has recently emerged as a novel imaging technique that allows visualization of cartilage soft tissue, subsequent examination of chondrocyte patterns, and their correlation to osteoarthritis. Previous studies have shown that 2D texture features are effective at distinguishing between healthy and osteoarthritic regions of interest annotated in the radial zone of the cartilage matrix on PCI-CT images. In this study, we further extend the texture analysis to 3D and investigate the ability of volumetric texture features to characterize chondrocyte patterns in the cartilage matrix for purposes of classification. Here, we extracted volumetric texture features derived from Minkowski Functionals and gray-level co-occurrence matrices (GLCM) from 496 volumes of interest (VOI) annotated on PCI-CT images of human patellar cartilage specimens. The extracted features were then used in a machine-learning task involving support vector regression to classify VOIs as healthy or osteoarthritic. Classification performance was evaluated using the area under the receiver operating characteristic (ROC) curve (AUC). The best classification performance was observed with the GLCM features correlation (AUC = 0.83 ± 0.06) and homogeneity (AUC = 0.82 ± 0.07), which significantly outperformed all Minkowski Functionals (p < 0.05). These results suggest that quantitative analysis of chondrocyte patterns in the human patellar cartilage matrix using GLCM-derived statistical features can distinguish between healthy and osteoarthritic tissue with high accuracy.
Anomalous sea surface structures as an object of statistical topography
NASA Astrophysics Data System (ADS)
Klyatskin, V. I.; Koshel, K. V.
2015-06-01
By exploiting ideas of statistical topography, we analyze the stochastic boundary problem of the emergence of anomalously high structures on the sea surface. The kinematic boundary condition on the sea surface is assumed to be a closed stochastic quasilinear equation. Applying the stochastic Liouville equation, and presuming the stochastic nature of a given hydrodynamic velocity field within the diffusion approximation, we derive an equation for the spatially single-point, simultaneous joint probability density of the surface elevation field and its gradient. An important feature of the model is that it accounts for stochastic bottom irregularities as one perturbation among others, rather than as the sole perturbation. Hence, we adopt the assumption of an infinitely deep ocean to obtain statistical features of the surface elevation field and the squared elevation gradient field. According to the calculations, clustering in the absolute surface elevation gradient field happens with unit probability. This results in the emergence of rare events, such as anomalously high structures and deep gaps on the sea surface, in almost every realization of a stochastic velocity field.
Learning Midlevel Auditory Codes from Natural Sound Statistics.
Młynarski, Wiktor; McDermott, Josh H
2018-03-01
Interaction with the world requires an organism to transform sensory signals into representations in which behaviorally meaningful properties of the environment are made explicit. These representations are derived through cascades of neuronal processing stages in which neurons at each stage recode the output of preceding stages. Explanations of sensory coding may thus involve understanding how low-level patterns are combined into more complex structures. To gain insight into such midlevel representations for sound, we designed a hierarchical generative model of natural sounds that learns combinations of spectrotemporal features from natural stimulus statistics. In the first layer, the model forms a sparse convolutional code of spectrograms using a dictionary of learned spectrotemporal kernels. To generalize from specific kernel activation patterns, the second layer encodes patterns of time-varying magnitude of multiple first-layer coefficients. When trained on corpora of speech and environmental sounds, some second-layer units learned to group similar spectrotemporal features. Others instantiate opponency between distinct sets of features. Such groupings might be instantiated by neurons in the auditory cortex, providing a hypothesis for midlevel neuronal computation.
Cloud field classification based on textural features
NASA Technical Reports Server (NTRS)
Sengupta, Sailes Kumar
1989-01-01
An essential component of global climate research is accurate cloud cover and type determination. Of the two approaches to texture-based classification (statistical and structural), only the former is effective in the classification of natural scenes such as land, ocean, and atmosphere. In the statistical approach that was adopted, parameters characterizing the stochastic properties of the spatial distribution of grey levels in an image are estimated and then used as features for cloud classification. Two types of textural measures were used. One is based on the distribution of the grey level difference vector (GLDV), and the other on a set of textural features derived from the MaxMin cooccurrence matrix (MMCM). The GLDV method looks at the difference D of grey levels at pixels separated by a horizontal distance d and computes several statistics based on this distribution, which are then used as features in subsequent classification. The MaxMin textural features, on the other hand, are based on the MMCM, a matrix whose (I,J)th entry gives the relative frequency of occurrences of the grey level pair (I,J) that are consecutive and thresholded local extremes separated by a given pixel distance d. Textural measures are then computed from this matrix in much the same manner as in texture computation using the grey level cooccurrence matrix. The database consists of 37 cloud field scenes from LANDSAT imagery using a near-IR visible channel. The classification algorithm used is the well-known Stepwise Discriminant Analysis. The overall accuracy was estimated by the percentage of correct classifications in each case. It turns out that both types of classifiers, at their best combination of features and at any given spatial resolution, give approximately the same classification accuracy. A neural network based classifier with a feed-forward architecture and a back-propagation training algorithm is used to increase the classification accuracy, using these two classes of features. Preliminary results based on the GLDV textural features alone look promising.
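A sketch of the GLDV computation described above for a single horizontal displacement d; the specific feature set and the random stand-in scene are illustrative assumptions rather than the study's exact measures.

    # Sketch: grey level difference vector (GLDV) statistics for one displacement d.
    import numpy as np

    def gldv_features(img, d=1, levels=256):
        diff = np.abs(img[:, :-d].astype(int) - img[:, d:].astype(int))  # horizontal pairs
        p = np.bincount(diff.ravel(), minlength=levels) / diff.size      # GLDV distribution
        k = np.arange(levels)
        nz = p > 0
        return {
            "mean": float((k * p).sum()),
            "contrast": float((k**2 * p).sum()),
            "angular_second_moment": float((p**2).sum()),
            "entropy": float(-(p[nz] * np.log2(p[nz])).sum()),
        }

    scene = np.random.randint(0, 256, size=(64, 64), dtype=np.uint8)     # stand-in image
    print(gldv_features(scene))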
Hafen, G M; Hurst, C; Yearwood, J; Smith, J; Dzalilov, Z; Robinson, P J
2008-10-05
Cystic fibrosis is the most common fatal genetic disorder in the Caucasian population. Scoring systems for assessment of cystic fibrosis disease severity have been used for almost 50 years without being adapted to the milder phenotype of the disease in the 21st century. The aim of the current project is to develop a new scoring system using a database and employing various statistical tools. This study protocol reports the development of the statistical tools needed to create such a scoring system. The evaluation is based on the Cystic Fibrosis database of the cohort at the Royal Children's Hospital in Melbourne. Initially, unsupervised clustering of all data records was performed using a range of clustering algorithms, in particular incremental clustering algorithms. The clusters obtained were characterised using rules from decision trees, and the results were examined by clinicians. In order to obtain a clearer definition of classes, expert opinion of each individual's clinical severity was sought. After data preparation, including expert opinion of each individual's clinical severity on a 3-point scale (mild, moderate and severe disease), two multivariate techniques were used throughout the analysis to establish a method with better success in feature selection and model derivation: Canonical Analysis of Principal Coordinates (CAP) and Linear Discriminant Analysis (DA). A 3-step procedure was performed: (1) selection of features, (2) extraction of 5 severity classes from the 3 classes defined by expert opinion, and (3) establishment of calibration datasets. (1) Feature selection: CAP has a more effective "modelling" focus than DA. (2) Extraction of 5 severity classes: after variables were identified as important in discriminating contiguous CF severity groups on the 3-point scale (mild/moderate and moderate/severe), a Discriminant Function (DF) was used to determine the new groups: mild, intermediate moderate, moderate, intermediate severe and severe disease. (3) The generated confusion tables showed a misclassification rate of 19.1% for males and 16.5% for females, with a majority of misallocations into adjacent severity classes, particularly for males. Our preliminary data show that using CAP for feature selection and linear DA to derive the actual model in a CF database might be helpful in developing a scoring system. However, there are several limitations: in particular, more data entry points are needed to finalize a score, and the statistical tools have to be further refined and validated by re-running the statistical methods on the larger dataset.
Huynh-Thu, Vân Anh; Saeys, Yvan; Wehenkel, Louis; Geurts, Pierre
2012-07-01
Univariate statistical tests are widely used for biomarker discovery in bioinformatics. These procedures are simple and fast, and their output is easily interpretable by biologists, but they can only identify variables that provide a significant amount of information in isolation from the other variables. As biological processes are expected to involve complex interactions between variables, univariate methods thus potentially miss some informative biomarkers. Variable relevance scores provided by machine learning techniques, however, can potentially highlight multivariate interacting effects, but unlike the p-values returned by univariate tests, these relevance scores are usually not statistically interpretable. This lack of interpretability hampers the determination of a relevance threshold for extracting a feature subset from the rankings and also prevents the wide adoption of these methods by practitioners. We evaluated several existing and novel procedures that extract relevant features from rankings derived from machine learning approaches. These procedures replace the relevance scores with measures that can be interpreted in a statistical way, such as p-values, false discovery rates, or family-wise error rates, for which it is easier to determine a significance level. Experiments were performed on several artificial problems as well as on real microarray datasets. Although the methods differ in terms of computing times and the tradeoff they achieve between false positives and false negatives, some of them greatly help in the extraction of truly relevant biomarkers and should thus be of great practical interest for biologists and physicians. As a side conclusion, our experiments also clearly highlight that using model performance as a criterion for feature selection is often counter-productive. Python source code for all tested methods, as well as the MATLAB scripts used for data simulation, can be found in the Supplementary Material.
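One simple variant of this idea, converting relevance scores into empirical p-values by recomputing them under label permutations, can be sketched as follows. The random-forest scorer, synthetic data, and permutation count are illustrative assumptions, not the specific procedures evaluated in the paper.

    # Sketch: empirical p-values for feature relevance via label permutation.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    rng = np.random.default_rng(2)
    X = rng.standard_normal((100, 50))                 # 50 hypothetical biomarkers
    y = (X[:, 0] + X[:, 1] * X[:, 2] > 0).astype(int)  # interacting informative features

    def importances(X, y):
        rf = RandomForestClassifier(n_estimators=200, random_state=0)
        return rf.fit(X, y).feature_importances_

    obs = importances(X, y)
    null = np.array([importances(X, rng.permutation(y)) for _ in range(50)])

    # p-value per feature: how often a permuted relevance meets the observed one
    p = (1 + (null >= obs).sum(axis=0)) / (1 + null.shape[0])
    print(np.argsort(p)[:5])                           # most significant features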
Bremer, Peer-Timo; Weber, Gunther; Tierny, Julien; Pascucci, Valerio; Day, Marcus S; Bell, John B
2011-09-01
Large-scale simulations are increasingly being used to study complex scientific and engineering phenomena. As a result, advanced visualization and data analysis are also becoming an integral part of the scientific process. Often, a key step in extracting insight from these large simulations involves the definition, extraction, and evaluation of features in the space and time coordinates of the solution. However, in many applications, these features involve a range of parameters and decisions that will affect the quality and direction of the analysis. Examples include particular level sets of a specific scalar field, or local inequalities between derived quantities. A critical step in the analysis is to understand how these arbitrary parameters/decisions impact the statistical properties of the features, since such a characterization will help to evaluate the conclusions of the analysis as a whole. We present a new topological framework that, in a single pass, extracts and encodes entire families of possible feature definitions as well as their statistical properties. For each time step we construct a hierarchical merge tree, a highly compact yet flexible feature representation. While this data structure is more than two orders of magnitude smaller than the raw simulation data, it allows us to extract a set of features for any given parameter selection in a post-processing step. Furthermore, we augment the trees with additional attributes, making it possible to gather a large number of useful global, local, and conditional statistics that would otherwise be extremely difficult to compile. We also use this representation to create tracking graphs that describe the temporal evolution of the features over time. Our system provides a linked-view interface to explore the time-evolution of the graph interactively alongside the segmentation, thus making it possible to perform extensive data analysis in a very efficient manner. We demonstrate our framework by extracting and analyzing burning cells from a large-scale turbulent combustion simulation. In particular, we show how the statistical analysis enabled by our techniques provides new insight into the combustion process.
Multi-scale radiomic analysis of sub-cortical regions in MRI related to autism, gender and age
NASA Astrophysics Data System (ADS)
Chaddad, Ahmad; Desrosiers, Christian; Toews, Matthew
2017-03-01
We propose using multi-scale image textures to investigate links between neuroanatomical regions and clinical variables in MRI. Texture features are derived at multiple scales of resolution based on the Laplacian-of-Gaussian (LoG) filter. Three quantifier functions (Average, Standard Deviation, and Entropy) are used to summarize texture statistics within standard, automatically segmented neuroanatomical regions. Significance tests are performed to identify regional texture differences between ASD vs. TDC and male vs. female groups, as well as correlations with age (corrected p < 0.05). The open-access brain imaging data exchange (ABIDE) brain MRI dataset is used to evaluate texture features derived from 31 brain regions from 1112 subjects, including 573 typically developing control (TDC, 99 females, 474 males) and 539 autism spectrum disorder (ASD, 65 females, 474 males) subjects. Statistically significant texture differences between the ASD and TDC groups are identified asymmetrically in the right hippocampus, left choroid plexus, and corpus callosum (CC), and symmetrically in the cerebellar white matter. Sex-related texture differences in TDC subjects are found primarily in the left amygdala, left cerebellar white matter, and brain stem. Correlations between age and texture in TDC subjects are found in the thalamus-proper, caudate, and pallidum, most exhibiting bilateral symmetry.
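A sketch of the multi-scale LoG feature extraction with the three quantifier functions named above, using SciPy; the scales, bin count, stand-in image, and region mask are hypothetical.

    # Sketch: LoG texture at several scales, summarized by Average, SD, and
    # Entropy within a region mask. All parameter values are illustrative.
    import numpy as np
    from scipy.ndimage import gaussian_laplace

    def log_texture_features(img, mask, sigmas=(1.0, 2.0, 4.0), bins=64):
        feats = []
        for s in sigmas:
            resp = gaussian_laplace(img, sigma=s)[mask]    # LoG response in the region
            h, _ = np.histogram(resp, bins=bins)
            p = h / h.sum()
            nz = p > 0
            feats += [resp.mean(), resp.std(),
                      float(-(p[nz] * np.log2(p[nz])).sum())]
        return np.array(feats)        # 3 quantifiers x len(sigmas) scales

    img = np.random.rand(96, 96)                               # stand-in MRI slice
    mask = np.zeros_like(img, bool); mask[30:60, 30:60] = True # hypothetical region
    print(log_texture_features(img, mask))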
A Stochastic Model of Space-Time Variability of Mesoscale Rainfall: Statistics of Spatial Averages
NASA Technical Reports Server (NTRS)
Kundu, Prasun K.; Bell, Thomas L.
2003-01-01
A characteristic feature of rainfall statistics is that they depend on the space and time scales over which rain data are averaged. A previously developed spectral model of rain statistics, designed to capture this property, predicts power-law scaling behavior for the second-moment statistics of area-averaged rain rate on the averaging length scale L as L → 0. In the present work a more efficient method of estimating the model parameters is presented, and used to fit the model to the statistics of area-averaged rain rate derived from gridded radar precipitation data from TOGA COARE. Statistical properties of the data and the model predictions are compared over a wide range of averaging scales. An extension of the spectral model scaling relations to describe the dependence of the average fraction of grid boxes within an area containing nonzero rain (the "rainy area fraction") on the grid scale L is also explored.
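As an illustration of the scaling diagnostic implied above, the following sketch estimates how the second moment of area-averaged rain rate varies with the averaging box size L on a gridded field; the toy gamma-distributed field and dyadic box sizes are assumptions for demonstration only.

```python
import numpy as np

def second_moment_vs_scale(rain, box_sizes):
    """Second moment of rain rate averaged over non-overlapping L x L boxes."""
    out = []
    for L in box_sizes:
        n = rain.shape[0] // L
        boxes = rain[:n * L, :n * L].reshape(n, L, n, L).mean(axis=(1, 3))
        out.append(np.mean(boxes ** 2))
    return np.array(out)

rng = np.random.default_rng(1)
rain = rng.gamma(0.2, 5.0, size=(256, 256))      # toy skewed rain-rate field
Ls = np.array([1, 2, 4, 8, 16, 32])
m2 = second_moment_vs_scale(rain, Ls)
slope = np.polyfit(np.log(Ls), np.log(m2), 1)[0] # power-law exponent estimate
print(slope)
```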
A Set of Handwriting Features for Use in Automated Writer Identification.
Miller, John J; Patterson, Robert Bradley; Gantz, Donald T; Saunders, Christopher P; Walch, Mark A; Buscaglia, JoAnn
2017-05-01
A writer's biometric identity can be characterized through the distribution of physical feature measurements ("writer's profile"); a graph-based system that facilitates the quantification of these features is described. To accomplish this quantification, handwriting is segmented into basic graphical forms ("graphemes"), which are "skeletonized" to yield the graphical topology of the handwritten segment. The graph-based matching algorithm compares the graphemes first by their graphical topology and then by their geometric features. Graphs derived from known writers can be compared against graphs extracted from unknown writings. The process is computationally intensive and relies heavily upon statistical pattern recognition algorithms. This article focuses on the quantification of these physical features and the construction of the associated pattern recognition methods for using the features to discriminate among writers. The graph-based system described in this article has been implemented in a highly accurate and approximately language-independent biometric recognition system of writers of cursive documents. © 2017 American Academy of Forensic Sciences.
Texture analysis of pulmonary parenchyma in normal and emphysematous lung
NASA Astrophysics Data System (ADS)
Uppaluri, Renuka; Mitsa, Theophano; Hoffman, Eric A.; McLennan, Geoffrey; Sonka, Milan
1996-04-01
Tissue characterization using texture analysis is gaining increasing importance in medical imaging. We present a completely automated method for discriminating between normal and emphysematous regions in CT images. The method involves extracting seventeen features based on statistical, hybrid, and fractal texture models. The best subset of features is derived from the training set using the divergence technique. A minimum distance classifier is used to assign each sample to one of the two classes, normal or emphysema. Sensitivity, specificity, and accuracy values achieved were 80% or greater in most cases, showing that texture analysis holds great promise in identifying emphysema.
Stratiform/convective rain delineation for TRMM microwave imager
NASA Astrophysics Data System (ADS)
Islam, Tanvir; Srivastava, Prashant K.; Dai, Qiang; Gupta, Manika; Wan Jaafar, Wan Zurina
2015-10-01
This article investigates the potential of machine learning algorithms to delineate stratiform/convective (S/C) rain regimes for a passive microwave imager, taking calibrated brightness temperatures as the only spectral parameters. The algorithms have been implemented for the Tropical Rainfall Measuring Mission (TRMM) microwave imager (TMI), and calibrated as well as validated taking the Precipitation Radar (PR) S/C information as the target class variables. Two different algorithms are explored for the delineation. The first is the metaheuristic adaptive boosting algorithm, including the real, gentle, and modest versions of AdaBoost. The second is classical linear discriminant analysis, including Fisher's and penalized versions. Furthermore, prior to the development of the delineation algorithms, a feature selection analysis has been conducted for a total of 85 features, comprising combinations of brightness temperatures from 10 GHz to 85 GHz and some derived indexes, such as the scattering index, polarization corrected temperature, and polarization difference, using the mutual-information-aided minimal redundancy maximal relevance (mRMR) criterion. It has been found that the polarization corrected temperature at 85 GHz and the features derived from the "addition" operator associated with the 85 GHz channels have good statistical dependency on the S/C target class variables. Further, it has been shown how the mRMR feature selection technique helps to reduce the number of features without deteriorating the results when applied through the machine learning algorithms. The proposed scheme is able to delineate the S/C rain regimes with reasonable accuracy. Based on the statistical validation over the validation period, the Matthews correlation coefficients are in the range of 0.60-0.70. Since the proposed method does not rely on any a priori information, it is very suitable for other microwave sensors having channels similar to the TMI. The method could also benefit the constellation sensors in the Global Precipitation Measurement (GPM) mission era.
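A hedged sketch of a greedy mRMR-style selection using scikit-learn's mutual-information estimators; this is a generic rendition of the criterion (relevance to the class minus average redundancy with features already chosen), not the authors' code.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif, mutual_info_regression

def mrmr(X, y, k):
    """Greedily select k feature indices by relevance minus redundancy."""
    relevance = mutual_info_classif(X, y, random_state=0)
    selected = [int(np.argmax(relevance))]
    while len(selected) < k:
        best, best_score = None, -np.inf
        for j in range(X.shape[1]):
            if j in selected:
                continue
            redundancy = np.mean([
                mutual_info_regression(X[:, [j]], X[:, s], random_state=0)[0]
                for s in selected])
            score = relevance[j] - redundancy
            if score > best_score:
                best, best_score = j, score
        selected.append(best)
    return selected

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 12))
X[:, 1] = X[:, 0] + 0.01 * rng.normal(size=500)   # a redundant copy of feature 0
y = (X[:, 0] + X[:, 2] > 0).astype(int)
print(mrmr(X, y, k=3))    # tends to skip the redundant copy of feature 0
```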
Assessment of features for automatic CTG analysis based on expert annotation.
Chudácek, Vacláv; Spilka, Jirí; Lhotská, Lenka; Janku, Petr; Koucký, Michal; Huptych, Michal; Bursa, Miroslav
2011-01-01
Cardiotocography (CTG), the monitoring of fetal heart rate (FHR) and uterine contractions (TOCO), has been used routinely by obstetricians since the 1960s to detect fetal hypoxia. The evaluation of the FHR in clinical settings is based on an evaluation of macroscopic morphological features and so far has managed to avoid adopting any achievements from the HRV research field. In this work, most of the features ever used for FHR characterization, including FIGO, HRV, nonlinear, wavelet, and time and frequency domain features, are investigated, and the features are assessed based on their statistical significance in the task of classifying the FHR into three FIGO classes. Annotations derived from a panel of experts, instead of the commonly utilized pH values, were used for evaluation of the features on a large data set (552 records). We conclude the paper by presenting the best uncorrelated features and their individual ranks of importance according to a meta-analysis of three different ranking methods. The number of accelerations and decelerations, the interval index, as well as Lempel-Ziv complexity and Higuchi's fractal dimension are among the top five features.
NASA Technical Reports Server (NTRS)
Abbey, Craig K.; Eckstein, Miguel P.
2002-01-01
We consider estimation and statistical hypothesis testing on classification images obtained from the two-alternative forced-choice experimental paradigm. We begin with a probabilistic model of task performance for simple forced-choice detection and discrimination tasks. Particular attention is paid to general linear filter models, because these models lead to a direct interpretation of the classification image as an estimate of the filter weights. We then describe an estimation procedure for obtaining classification images from observer data. A number of statistical tests are presented for testing various hypotheses from classification images, based on a more compact set of features derived from them. As an example of how the methods we describe can be used, we present a case study investigating detection of a Gaussian bump profile.
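For concreteness, a sketch of the classification-image estimate in its simplified yes/no form (a difference of response-conditioned noise averages, a textbook construction rather than this paper's exact 2AFC estimator), with per-pixel z-scores usable for hypothesis testing.

```python
import numpy as np

def classification_image(noise_fields, chose_signal):
    """noise_fields: (n_trials, n_pixels) external noise shown on each trial;
    chose_signal: boolean array with the observer's binary response per trial."""
    ci = (noise_fields[chose_signal].mean(axis=0)
          - noise_fields[~chose_signal].mean(axis=0))
    # per-pixel standard error under the null of zero mean difference
    se = np.sqrt(noise_fields[chose_signal].var(axis=0) / chose_signal.sum()
                 + noise_fields[~chose_signal].var(axis=0) / (~chose_signal).sum())
    return ci, ci / se      # weight estimate and per-pixel z-scores

rng = np.random.default_rng(0)
noise = rng.normal(size=(2000, 64))
responses = (noise[:, 30] + rng.normal(size=2000)) > 0   # toy observer using pixel 30
ci, z = classification_image(noise, responses)
print(int(np.argmax(np.abs(z))))                         # recovers pixel 30
```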
Cloud and surface textural features in polar regions
NASA Technical Reports Server (NTRS)
Welch, Ronald M.; Kuo, Kwo-Sen; Sengupta, Sailes K.
1990-01-01
The study examines the textural signatures of clouds, ice-covered mountains, solid and broken sea ice and floes, and open water. The textural features are computed from sum and difference histogram and gray-level difference vector statistics defined at various pixel displacement distances derived from Landsat multispectral scanner data. Polar cloudiness, snow-covered mountainous regions, solid sea ice, glaciers, and open water have distinguishable texture features. This suggests that textural measures can be successfully applied to the detection of clouds over snow-covered mountains, an ability of considerable importance for the modeling of snow-melt runoff. However, broken stratocumulus cloud decks and thin cirrus over broken sea ice remain difficult to distinguish texturally. It is concluded that even with high spatial resolution imagery, it may not be possible to distinguish broken stratocumulus and thin clouds from sea ice in the marginal ice zone using the visible channel textural features alone.
Spectroscopic Diagnosis of Arsenic Contamination in Agricultural Soils
Shi, Tiezhu; Liu, Huizeng; Chen, Yiyun; Fei, Teng; Wang, Junjie; Wu, Guofeng
2017-01-01
This study investigated the abilities of pre-processing, feature selection and machine-learning methods for the spectroscopic diagnosis of soil arsenic contamination. The spectral data were pre-processed by using Savitzky-Golay smoothing, first and second derivatives, multiplicative scatter correction, standard normal variate, and mean centering. Principal component analysis (PCA) and the RELIEF algorithm were used to extract spectral features. Machine-learning methods, including random forests (RF), artificial neural network (ANN), and radial basis function- and linear function-based support vector machines (RBF- and LF-SVM), were employed for establishing diagnosis models. The model accuracies were evaluated and compared by using overall accuracies (OAs). The statistical significance of the difference between models was evaluated by using McNemar's test (Z value). The results showed that the OAs varied with the different combinations of pre-processing, feature selection, and classification methods. Feature selection methods could improve the modeling efficiencies and diagnosis accuracies, and RELIEF often outperformed PCA. The optimal models established by RF (OA = 86%), ANN (OA = 89%), RBF- (OA = 89%) and LF-SVM (OA = 87%) had no statistical difference in diagnosis accuracies (Z < 1.96, p > 0.05). These results indicated that it was feasible to diagnose soil arsenic contamination using reflectance spectroscopy. The appropriate combination of multivariate methods was important to improve diagnosis accuracies. PMID:28471412
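A sketch of the McNemar comparison used above, implemented from the test's standard definition with continuity correction: b and c count the samples the two diagnosis models classify differently, and the models differ at the 5% level when Z >= 1.96.

```python
import numpy as np
from scipy.stats import norm

def mcnemar_z(y_true, pred_a, pred_b):
    a_right = pred_a == y_true
    b_right = pred_b == y_true
    b = np.sum(a_right & ~b_right)    # model A right, model B wrong
    c = np.sum(~a_right & b_right)    # model A wrong, model B right
    z = (abs(b - c) - 1) / np.sqrt(b + c)     # continuity-corrected statistic
    p = 2 * (1 - norm.cdf(z))                 # two-sided p-value
    return z, p

rng = np.random.default_rng(0)
y = rng.integers(0, 2, 200)
pa = np.where(rng.random(200) < 0.86, y, 1 - y)   # ~86% accurate model
pb = np.where(rng.random(200) < 0.89, y, 1 - y)   # ~89% accurate model
print(mcnemar_z(y, pa, pb))
```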
Che Hasan, Rozaimi; Ierodiaconou, Daniel; Laurenson, Laurie; Schimel, Alexandre
2014-01-01
Multibeam echosounders (MBES) are increasingly becoming the tool of choice for marine habitat mapping applications. In turn, the rapid expansion of habitat mapping studies has resulted in a need for automated classification techniques to efficiently map benthic habitats, assess confidence in model outputs, and evaluate the importance of variables driving the patterns observed. The benthic habitat characterisation process often involves the analysis of MBES bathymetry, backscatter mosaic or angular response with observation data providing ground truth. However, studies that make use of the full range of MBES outputs within a single classification process are limited. We present an approach that integrates backscatter angular response with MBES bathymetry, backscatter mosaic and their derivatives in a classification process using a Random Forests (RF) machine-learning algorithm to predict the distribution of benthic biological habitats. This approach includes a method of deriving statistical features from backscatter angular response curves created from MBES data collated within homogeneous regions of a backscatter mosaic. Using the RF algorithm we assess the relative importance of each variable in order to optimise the classification process and simplify models applied. The results showed that the inclusion of the angular response features in the classification process improved the accuracy of the final habitat maps from 88.5% to 93.6%. The RF algorithm identified bathymetry and the angular response mean as the two most important predictors. However, the highest classification rates were only obtained after incorporating additional features derived from bathymetry and the backscatter mosaic. The angular response features were found to be more important to the classification process compared to the backscatter mosaic features. This analysis indicates that integrating angular response information with bathymetry and the backscatter mosaic, along with their derivatives, constitutes an important improvement for studying the distribution of benthic habitats, which is necessary for effective marine spatial planning and resource management. PMID:24824155
A graph-based watershed merging using fuzzy C-means and simulated annealing for image segmentation
NASA Astrophysics Data System (ADS)
Vadiveloo, Mogana; Abdullah, Rosni; Rajeswari, Mandava
2015-12-01
In this paper, we address the issue of over-segmented regions produced by watershed segmentation by merging the regions using global features. The global feature information is obtained by clustering the image in its feature space using Fuzzy C-Means (FCM) clustering. The over-segmented regions produced by performing watershed on the gradient of the image are then mapped to this global information in the feature space. Further, the global feature information is optimized using Simulated Annealing (SA). The optimal global feature information is used to derive the similarity criterion for merging the over-segmented watershed regions, which are represented by a region adjacency graph (RAG). The proposed method has been tested on a digital brain phantom simulated dataset to segment white matter (WM), gray matter (GM) and cerebrospinal fluid (CSF) soft tissue regions. The experiments showed that the proposed method performs statistically better than immersion watershed, with an average of 95.242% of regions merged, and yields an average accuracy improvement of 8.850% in comparison with RAG-based immersion watershed merging using global and local features.
Identifying sports videos using replay, text, and camera motion features
NASA Astrophysics Data System (ADS)
Kobla, Vikrant; DeMenthon, Daniel; Doermann, David S.
1999-12-01
Automated classification of digital video is emerging as an important piece of the puzzle in the design of content management systems for digital libraries. The ability to classify videos into various classes such as sports, news, movies, or documentaries increases the efficiency of indexing, browsing, and retrieval of video in large databases. In this paper, we discuss the extraction of features that enable identification of sports videos directly from the compressed domain of MPEG video. These features include detecting the presence of action replays, determining the amount of scene text in video, and calculating various statistics on camera and/or object motion. The features are derived from the macroblock, motion, and bit-rate information that is readily accessible from MPEG video with very minimal decoding, leading to substantial gains in processing speed. Full decoding of selective frames is required only for text analysis. A decision tree classifier built using these features is able to identify sports clips with an accuracy of about 93 percent.
Terrain-driven unstructured mesh development through semi-automatic vertical feature extraction
NASA Astrophysics Data System (ADS)
Bilskie, Matthew V.; Coggin, David; Hagen, Scott C.; Medeiros, Stephen C.
2015-12-01
A semi-automated vertical feature terrain extraction algorithm is described and applied to a two-dimensional, depth-integrated, shallow water equation inundation model. The extracted features describe what are commonly sub-mesh-scale elevation details (ridges and valleys), which may be ignored in standard practice because adequate mesh resolution cannot be afforded. The extraction algorithm is semi-automated, requires minimal human intervention, and is reproducible. A lidar-derived digital elevation model (DEM) of coastal Mississippi and Alabama serves as the source data for the vertical feature extraction. Unstructured mesh nodes and element edges are aligned to the vertical features, and an interpolation algorithm aimed at minimizing topographic elevation error assigns elevations to mesh nodes via the DEM. The end result is a mesh that accurately represents the bare earth surface as derived from lidar, with element resolution in the floodplain ranging from 15 m to 200 m. To examine the influence of the inclusion of vertical features on overland flooding, two additional meshes were developed, one without crest elevations of the features and another with vertical features withheld. All three meshes were incorporated into a SWAN+ADCIRC model simulation of Hurricane Katrina. Each of the three models resulted in similar validation statistics when compared to observed time-series water levels at gages and post-storm collected high water marks. Simulated water level peaks yielded an R2 of 0.97 and upper and lower 95% confidence intervals of ∼ ± 0.60 m. From the validation at the gages and high water mark locations, it was not clear which of the three model experiments performed best in terms of accuracy. Inundation extents from the three model results were compared to debris lines derived from NOAA post-event aerial imagery, and the mesh including vertical features showed higher accuracy. The comparison of model results to debris lines demonstrates that additional validation techniques are necessary for state-of-the-art flood inundation models. In addition, the semi-automated, unstructured mesh generation process presented herein increases the overall accuracy of simulated storm surge across the floodplain without reliance on hand digitization or sacrificing computational cost.
Recognition of speaker-dependent continuous speech with KEAL
NASA Astrophysics Data System (ADS)
Mercier, G.; Bigorgne, D.; Miclet, L.; Le Guennec, L.; Querre, M.
1989-04-01
A description of the speaker-dependent continuous speech recognition system KEAL is given. An unknown utterance is recognized by means of the following procedures: acoustic analysis, phonetic segmentation and identification, word and sentence analysis. The combination of feature-based, speaker-independent coarse phonetic segmentation with speaker-dependent statistical classification techniques is one of the main design features of the acoustic-phonetic decoder. The lexical access component is essentially based on a statistical dynamic programming technique which aims at matching a phonemic lexical entry containing various phonological forms against a phonetic lattice. Sentence recognition is achieved by use of a context-free grammar and a parsing algorithm derived from Earley's parser. A speaker adaptation module allows some of the system parameters to be adjusted by matching known utterances with their acoustical representation. The task to be performed, described by its vocabulary and its grammar, is given as a parameter of the system. Continuously spoken sentences extracted from a 'pseudo-Logo' language are analyzed and results are presented.
NASA Astrophysics Data System (ADS)
Zhang, Wenlan; Luo, Ting; Jiang, Gangyi; Jiang, Qiuping; Ying, Hongwei; Lu, Jing
2016-06-01
Visual comfort assessment (VCA) for stereoscopic images is a particularly significant yet challenging task in 3D quality of experience research field. Although the subjective assessment given by human observers is known as the most reliable way to evaluate the experienced visual discomfort, it is time-consuming and non-systematic. Therefore, it is of great importance to develop objective VCA approaches that can faithfully predict the degree of visual discomfort as human beings do. In this paper, a novel two-stage objective VCA framework is proposed. The main contribution of this study is that the important visual attention mechanism of human visual system is incorporated for visual comfort-aware feature extraction. Specifically, in the first stage, we first construct an adaptive 3D visual saliency detection model to derive saliency map of a stereoscopic image, and then a set of saliency-weighted disparity statistics are computed and combined to form a single feature vector to represent a stereoscopic image in terms of visual comfort. In the second stage, a high dimensional feature vector is fused into a single visual comfort score by performing random forest algorithm. Experimental results on two benchmark databases confirm the superior performance of the proposed approach.
Sturm, Marc; Quinten, Sascha; Huber, Christian G.; Kohlbacher, Oliver
2007-01-01
We propose a new model for predicting the retention time of oligonucleotides. The model is based on ν support vector regression using features derived from base sequence and predicted secondary structure of oligonucleotides. Because of the secondary structure information, the model is applicable even at relatively low temperatures where the secondary structure is not suppressed by thermal denaturing. This makes the prediction of oligonucleotide retention time for arbitrary temperatures possible, provided that the target temperature lies within the temperature range of the training data. We describe different possibilities of feature calculation from base sequence and secondary structure, present the results and compare our model to existing models. PMID:17567619
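A hedged sketch of ν-SVR on simple sequence-derived features; the descriptor set here (base counts, length, and a placeholder paired-base fraction standing in for predicted secondary structure) is an illustrative assumption, not the paper's exact feature calculation, and the retention times are synthetic.

```python
import numpy as np
from sklearn.svm import NuSVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

def features(seq, paired_fraction):
    # base composition + length + predicted fraction of paired bases
    return [seq.count(b) for b in "ACGT"] + [len(seq), paired_fraction]

train = [("ACGTACGT", 0.25), ("AAAATTTT", 0.50), ("GGGGCCCC", 0.75),
         ("ACGT", 0.00), ("AACCGGTT", 0.10)]
rt = np.array([7.2, 6.1, 9.4, 4.0, 6.8])        # toy retention times (min)

X = np.array([features(s, p) for s, p in train])
model = make_pipeline(StandardScaler(), NuSVR(nu=0.5, C=10.0))
model.fit(X, rt)
print(model.predict([features("ACGTAC", 0.2)]))  # predicted retention time
```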
IDH mutation assessment of glioma using texture features of multimodal MR images
NASA Astrophysics Data System (ADS)
Zhang, Xi; Tian, Qiang; Wu, Yu-Xia; Xu, Xiao-Pan; Li, Bao-Juan; Liu, Yi-Xiong; Liu, Yang; Lu, Hong-Bing
2017-03-01
Purpose: To 1) find effective texture features from multimodal MRI that can distinguish IDH-mutant from IDH-wild-type status, and 2) propose a radiomic strategy for preoperatively detecting IDH mutation in patients with glioma. Materials and Methods: 152 patients with glioma were retrospectively included from the Cancer Genome Atlas. Corresponding pre- and post-contrast T1-weighted images, T2-weighted images and fluid-attenuation inversion recovery images from the Cancer Imaging Archive were analyzed. Specific statistical tests were applied to analyze the different kinds of baseline information of LrGG patients. Finally, 168 texture features were derived from multimodal MRI per patient. The support vector machine-based recursive feature elimination (SVM-RFE) and classification strategy was then adopted to find the optimal feature subset and build the identification models for detecting IDH mutation. Results: Among 152 patients, 92 and 60 were confirmed to be IDH-wild-type and IDH-mutant, respectively. Statistical analysis showed that the patients without IDH mutation were significantly older than patients with IDH mutation (p<0.01), and the distribution of some histological subtypes was significantly different between the IDH wild-type and mutant groups (p<0.01). After SVM-RFE, 15 optimal features were determined for IDH mutation detection. The accuracy, sensitivity, specificity, and AUC after SVM-RFE and parameter optimization were 82.2%, 85.0%, 78.3%, and 0.841, respectively. Conclusion: This study presented a radiomic strategy for noninvasively discriminating IDH mutation status in patients with glioma. It effectively incorporated texture features from multimodal MRI and an SVM-based classification strategy. The results suggest that features selected by SVM-RFE have greater potential for identifying IDH mutation. The proposed radiomics strategy could facilitate clinical decision making in patients with glioma.
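A minimal SVM-RFE sketch with scikit-learn, consistent with (but not taken from) the pipeline above: a linear SVM ranks features by weight magnitude and the weakest is eliminated recursively until 15 remain. The data here are synthetic stand-ins for the 152 x 168 feature matrix.

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(152, 168))          # 152 patients x 168 texture features
y = rng.integers(0, 2, size=152)         # toy IDH mutant / wild-type labels

X = StandardScaler().fit_transform(X)
selector = RFE(SVC(kernel="linear", C=1.0), n_features_to_select=15, step=1)
selector.fit(X, y)
print(np.flatnonzero(selector.support_)) # indices of the 15 retained features
```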
Progress in Turbulence Detection via GNSS Occultation Data
NASA Technical Reports Server (NTRS)
Cornman, L. B.; Goodrich, R. K.; Axelrad, P.; Barlow, E.
2012-01-01
The increased availability of radio occultation (RO) data offers the ability to detect and study turbulence in the Earth's atmosphere. An analysis of how RO data can be used to determine the strength and location of turbulent regions is presented. This includes the derivation of a model for the power spectrum of the log-amplitude and phase fluctuations of the permittivity (or index of refraction) field. The bulk of the paper is then concerned with the estimation of the model parameters. Parameter estimators are introduced and some of their statistical properties are studied. These estimators are then applied to simulated log-amplitude RO signals. This includes the analysis of global statistics derived from a large number of realizations, as well as case studies that illustrate various specific aspects of the problem. Improvements to the basic estimation methods are discussed, and their beneficial properties are illustrated. The estimation techniques are then applied to real occultation data. Only two cases are presented, but they illustrate some of the salient features inherent in real data.
Semantic Indexing of Multimedia Content Using Visual, Audio, and Text Cues
NASA Astrophysics Data System (ADS)
Adams, W. H.; Iyengar, Giridharan; Lin, Ching-Yung; Naphade, Milind Ramesh; Neti, Chalapathy; Nock, Harriet J.; Smith, John R.
2003-12-01
We present a learning-based approach to the semantic indexing of multimedia content using cues derived from audio, visual, and text features. We approach the problem by developing a set of statistical models for a predefined lexicon. Novel concepts are then mapped in terms of the concepts in the lexicon. To achieve robust detection of concepts, we exploit features from multiple modalities, namely, audio, video, and text. Concept representations are modeled using Gaussian mixture models (GMM), hidden Markov models (HMM), and support vector machines (SVM). Models such as Bayesian networks and SVMs are used in a late-fusion approach to model concepts that are not explicitly modeled in terms of features. Our experiments indicate promise in the proposed classification and fusion methodologies: our proposed fusion scheme achieves more than 10% relative improvement over the best unimodal concept detector.
Gershunov, A.; Barnett, T.P.; Cayan, D.R.; Tubbs, T.; Goddard, L.
2000-01-01
Three long-range forecasting methods have been evaluated for prediction and downscaling of seasonal and intraseasonal precipitation statistics in California. Full-statistical, hybrid-dynamical - statistical and full-dynamical approaches have been used to forecast El Nin??o - Southern Oscillation (ENSO) - related total precipitation, daily precipitation frequency, and average intensity anomalies during the January - March season. For El Nin??o winters, the hybrid approach emerges as the best performer, while La Nin??a forecasting skill is poor. The full-statistical forecasting method features reasonable forecasting skill for both La Nin??a and El Nin??o winters. The performance of the full-dynamical approach could not be evaluated as rigorously as that of the other two forecasting schemes. Although the full-dynamical forecasting approach is expected to outperform simpler forecasting schemes in the long run, evidence is presented to conclude that, at present, the full-dynamical forecasting approach is the least viable of the three, at least in California. The authors suggest that operational forecasting of any intraseasonal temperature, precipitation, or streamflow statistic derivable from the available records is possible now for ENSO-extreme years.
Comparative docking and CoMFA analysis of curcumine derivatives as HIV-1 integrase inhibitors.
Gupta, Pawan; Garg, Prabha; Roy, Nilanjan
2011-08-01
The docking studies and comparative molecular field analysis (CoMFA) were performed on highly active molecules of curcumine derivatives against the 3' processing activity of the HIV-1 integrase (IN) enzyme. The optimum CoMFA model was selected with a statistically significant cross-validated r² value of 0.815 and non-cross-validated r² value of 0.99. The common pharmacophore of highly active molecules was used for screening of HIV-1 IN inhibitors. The high contribution of polar interactions in pharmacophore mapping is well supported by the docking and CoMFA results. The results of docking, CoMFA, and pharmacophore mapping give structural insights as well as important binding features of curcumine derivatives as HIV-1 IN inhibitors, which can provide guidance for the rational design of novel HIV-1 IN inhibitors.
Gao, Dashan; Vasconcelos, Nuno
2009-01-01
A decision-theoretic formulation of visual saliency, first proposed for top-down processing (object recognition) (Gao & Vasconcelos, 2005a), is extended to the problem of bottom-up saliency. Under this formulation, optimality is defined in the minimum probability of error sense, under a constraint of computational parsimony. The saliency of the visual features at a given location of the visual field is defined as the power of those features to discriminate between the stimulus at the location and a null hypothesis. For bottom-up saliency, this is the set of visual features that surround the location under consideration. Discrimination is defined in an information-theoretic sense and the optimal saliency detector derived for a class of stimuli that complies with known statistical properties of natural images. It is shown that under the assumption that saliency is driven by linear filtering, the optimal detector consists of what is usually referred to as the standard architecture of V1: a cascade of linear filtering, divisive normalization, rectification, and spatial pooling. The optimal detector is also shown to replicate the fundamental properties of the psychophysics of saliency: stimulus pop-out, saliency asymmetries for stimulus presence versus absence, disregard of feature conjunctions, and Weber's law. Finally, it is shown that the optimal saliency architecture can be applied to the solution of generic inference problems. In particular, for the class of stimuli studied, it performs the three fundamental operations of statistical inference: assessment of probabilities, implementation of Bayes decision rule, and feature selection.
Prediction of survival with multi-scale radiomic analysis in glioblastoma patients.
Chaddad, Ahmad; Sabri, Siham; Niazi, Tamim; Abdulkarim, Bassam
2018-06-19
We propose multiscale texture features based on the Laplacian-of-Gaussian (LoG) filter to predict progression-free survival (PFS) and overall survival (OS) in patients newly diagnosed with glioblastoma (GBM). Experiments use features extracted from 40 GBM patients with T1-weighted imaging (T1-WI) and fluid-attenuated inversion recovery (FLAIR) images that were segmented manually into areas of active tumor, necrosis, and edema. Multiscale texture features were extracted locally from each of these areas of interest using a LoG filter, and the relation of the features to OS and PFS was investigated using univariate (i.e., Spearman's rank correlation coefficient, log-rank test and Kaplan-Meier estimator) and multivariate (i.e., random forest classifier) analyses. Three and seven features were statistically correlated with PFS and OS, respectively, with absolute correlation values between 0.32 and 0.36 and p < 0.05. Three features derived from active tumor regions only were associated with OS (p < 0.05), with hazard ratios (HR) of 2.9, 3, and 3.24, respectively. Combined features showed AUC values of 85.37% and 85.54% for predicting the PFS and OS of GBM patients, respectively, using the random forest (RF) classifier. We presented multiscale texture features to characterize the GBM regions and predict the PFS and OS. The achievable performance suggests that this technique can be developed into a GBM MR analysis system suitable for clinical use after a thorough validation involving more patients. Graphical abstract: Scheme of the proposed model for characterizing the heterogeneity of GBM regions and predicting the overall survival and progression-free survival of GBM patients. (1) Acquisition of pretreatment MRI images; (2) affine registration of each T1-WI image with its corresponding FLAIR image, and GBM subtype (phenotype) labelling; (3) extraction of nine texture features from the three texture scales (fine, medium, and coarse) derived from each of the GBM regions; (4) comparison of heterogeneity between GBM regions by ANOVA test; survival analysis using univariate methods (Spearman rank correlation between features and survival, i.e., PFS and OS, based on each of the GBM regions; Kaplan-Meier estimator and log-rank test to predict the PFS and OS of patient groups split by the median of each feature) and a multivariate method (random forest model) for predicting the PFS and OS of patient groups split by the median PFS and OS.
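A sketch of the univariate survival step on synthetic data, assuming the lifelines package is available: patients are split at a feature's median, and the two groups are compared with a log-rank test and a Kaplan-Meier fit. The feature values, survival times and censoring flags below are toy stand-ins.

```python
import numpy as np
from lifelines import KaplanMeierFitter
from lifelines.statistics import logrank_test

rng = np.random.default_rng(0)
feature = rng.normal(size=40)                 # one texture feature for 40 patients
os_months = rng.exponential(15.0, size=40)    # toy overall-survival times
observed = rng.random(40) < 0.8               # True = death observed, False = censored

high = feature > np.median(feature)           # median split on the feature
res = logrank_test(os_months[high], os_months[~high],
                   event_observed_A=observed[high],
                   event_observed_B=observed[~high])
print("log-rank p-value:", res.p_value)

kmf = KaplanMeierFitter()
kmf.fit(os_months[high], observed[high], label="feature > median")
print(kmf.median_survival_time_)
```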
NASA Astrophysics Data System (ADS)
Walz, Michael; Leckebusch, Gregor C.
2016-04-01
Extratropical wind storms pose one of the most dangerous and loss-intensive natural hazards for Europe. However, with only 50 years of high-quality observational data, it is difficult to assess the statistical uncertainty of these sparse events based on observations alone. Over the last decade seasonal ensemble forecasts have become indispensable in quantifying the uncertainty of weather prediction on seasonal timescales. In this study seasonal forecasts are used in a climatological context: by making use of the up to 51 ensemble members, a broad and physically consistent statistical base can be created. This base can then be used to assess the statistical uncertainty of extreme wind storm occurrence more accurately. In order to determine the statistical uncertainty of storms with different paths of progression, a probabilistic clustering approach using regression mixture models is used to objectively assign storm tracks (based either on core pressure or on extreme wind speeds) to different clusters. The advantage of this technique is that the entire lifetime of a storm is considered in the clustering algorithm. Quadratic curves are found to describe the storm tracks most accurately. Three main clusters (diagonal, horizontal or vertical progression of the storm track) can be identified, each of which has its own particular features. Basic storm features like average velocity and duration are calculated and compared for each cluster. The main benefit of this clustering technique, however, is to evaluate whether the clusters show different degrees of uncertainty, e.g. more (less) spread for tracks approaching Europe horizontally (diagonally). This statistical uncertainty is compared across different seasonal forecast products.
NASA Astrophysics Data System (ADS)
Ruohoniemi, J. M.; Maimaiti, M.; Baker, J. B.; Ribeiro, A. J.
2017-12-01
Previous studies have shown that during quiet geomagnetic conditions F-region subauroral ionospheric plasma exhibits drifts of a few tens of m/s, predominantly in the westward direction. However, the exact driving mechanisms for this plasma motion are still not well understood. Recent expansion of SuperDARN radars into the mid-latitude region has provided new opportunities to study subauroral ionospheric convection over large areas and with greater spatial resolution and statistical significance than previously possible. Mid-latitude SuperDARN radars tend to observe subauroral ionospheric backscatter with low Doppler velocities on most geomagnetically quiet nights. In this study, we have used two years of data obtained from the six mid-latitude SuperDARN radars in the North American sector to derive a statistical model of quiet-time nightside mid-latitude plasma convection between 52° and 58° magnetic latitude. The model is organized in MLAT-MLT coordinates and has a spatial resolution of 1° × 7 min, with each grid cell typically counting thousands of velocity measurements. Our results show that the flow is predominantly westward (20-60 m/s) and weakly northward (0-20 m/s) near midnight, but with a strong seasonal dependence such that the flows tend to be strongest and most spatially variable in winter. These statistical results are in good agreement with previously reported observations from ISR measurements but also show some interesting new features, one being a significant latitudinal variation of zonal flow velocity near midnight in winter. In this presentation, we describe the derivation of the nightside quiet-time subauroral convection model, analyze its most prominent features, and discuss the results in terms of Ionosphere-Thermosphere coupling and penetration electric fields.
A statistical parts-based appearance model of inter-subject variability.
Toews, Matthew; Collins, D Louis; Arbel, Tal
2006-01-01
In this article, we present a general statistical parts-based model for representing the appearance of an image set, applied to the problem of inter-subject MR brain image matching. In contrast with global image representations such as active appearance models, the parts-based model consists of a collection of localized image parts whose appearance, geometry and occurrence frequency are quantified statistically. The parts-based approach explicitly addresses the case where one-to-one correspondence does not exist between subjects due to anatomical differences, as parts are not expected to occur in all subjects. The model can be learned automatically, discovering structures that appear with statistical regularity in a large set of subject images, and can be robustly fit to new images, all in the presence of significant inter-subject variability. As parts are derived from generic scale-invariant features, the framework can be applied in a wide variety of image contexts, in order to study the commonality of anatomical parts or to group subjects according to the parts they share. Experimentation shows that a parts-based model can be learned from a large set of MR brain images, and used to determine parts that are common within the group of subjects. Preliminary results indicate that the model can be used to automatically identify distinctive features for inter-subject image registration despite large changes in appearance.
Application of survival analysis methodology to the quantitative analysis of LC-MS proteomics data.
Tekwe, Carmen D; Carroll, Raymond J; Dabney, Alan R
2012-08-01
Protein abundance in quantitative proteomics is often based on observed spectral features derived from liquid chromatography mass spectrometry (LC-MS) or LC-MS/MS experiments. Peak intensities are largely non-normal in distribution. Furthermore, LC-MS-based proteomics data frequently have large proportions of missing peak intensities due to censoring mechanisms on low-abundance spectral features. Recognizing that the observed peak intensities detected with the LC-MS method are all positive, skewed and often left-censored, we propose using survival methodology to carry out differential expression analysis of proteins. Various standard statistical techniques, including non-parametric tests such as the Kolmogorov-Smirnov and Wilcoxon-Mann-Whitney rank sum tests, and parametric survival and accelerated failure time (AFT) models with log-normal, log-logistic and Weibull distributions, were used to detect differentially expressed proteins. The statistical operating characteristics of each method are explored using both real and simulated datasets. Survival methods generally have greater statistical power than standard differential expression methods when the proportion of missing protein level data is 5% or more. In particular, the AFT models we consider consistently achieve greater statistical power than standard testing procedures, with the discrepancy widening with increasing proportions of missingness. The testing procedures discussed in this article can all be performed using readily available software such as R. The R codes are provided as supplemental materials.
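A sketch of the core idea, written from first principles rather than taken from the paper's R code: a log-normal AFT-style fit in which intensities below a detection limit contribute the log-CDF (the probability of falling below the limit) to the likelihood instead of the log-density.

```python
import numpy as np
from scipy import optimize, stats

def fit_lognormal_left_censored(y, detect_limit, group):
    """Fit log(y) = b0 + b1*group + sigma*eps with left-censoring at detect_limit."""
    censored = y <= detect_limit
    logy = np.log(np.where(censored, detect_limit, y))

    def negloglik(theta):
        b0, b1, log_sigma = theta
        sigma = np.exp(log_sigma)
        z = (logy - b0 - b1 * group) / sigma
        ll_obs = stats.norm.logpdf(z[~censored]) - log_sigma   # observed intensities
        ll_cen = stats.norm.logcdf(z[censored])                # P(intensity below limit)
        return -(ll_obs.sum() + ll_cen.sum())

    return optimize.minimize(negloglik, x0=np.zeros(3), method="Nelder-Mead")

rng = np.random.default_rng(0)
grp = np.repeat([0.0, 1.0], 100)
raw = np.exp(0.5 * grp + rng.normal(scale=0.8, size=200))
fit = fit_lognormal_left_censored(np.maximum(raw, 0.5), 0.5, grp)
print(fit.x[:2])   # b0, b1; b1 recovers the group effect despite censoring
```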
2016-01-01
The goal of this study is to quantify the effects of vocal fold nodules on vibratory motion in children using high-speed videoendoscopy. Differences in vibratory motion were evaluated in 20 children with vocal fold nodules (5–11 years) and 20 age and gender matched typically developing children (5–11 years) during sustained phonation at typical pitch and loudness. Normalized kinematic features of vocal fold displacements from the mid-membranous vocal fold point were extracted from the steady-state high-speed video. A total of 12 kinematic features representing spatial and temporal characteristics of vibratory motion were calculated. Average values and standard deviations (cycle-to-cycle variability) of the following kinematic features were computed: normalized peak displacement, normalized average opening velocity, normalized average closing velocity, normalized peak closing velocity, speed quotient, and open quotient. Group differences between children with and without vocal fold nodules were statistically investigated. While a moderate effect size was observed for the spatial feature of speed quotient, and the temporal feature of normalized average closing velocity in children with nodules compared to vocally normal children, none of the features were statistically significant between the groups after Bonferroni correction. The kinematic analysis of the mid-membranous vocal fold displacement revealed that children with nodules primarily differ from typically developing children in closing phase kinematics of the glottal cycle, whereas the opening phase kinematics are similar. Higher speed quotients and similar opening phase velocities suggest greater relative forces are acting on vocal fold in the closing phase. These findings suggest that future large-scale studies should focus on spatial and temporal features related to the closing phase of the glottal cycle for differentiating the kinematics of children with and without vocal fold nodules. PMID:27124157
Isotope chirality in long-armed multifunctional organosilicon ("Cephalopod") molecules.
Barabás, Béla; Kurdi, Róbert; Zucchi, Claudia; Pályi, Gyula
2018-07-01
Long-armed multifunctional organosilicon molecules display self-replicating and self-perfecting behavior in asymmetric autocatalysis (Soai reaction). Two representatives of this class were studied by statistical methods aiming at determination of probabilities of natural abundance chiral isotopomers. The results, reported here, show an astonishing richness of possibilities of the formation of chiral isotopically substituted derivatives. This feature could serve as a model for the evolution of biological chirality in prebiotic and early biotic stereochemistry. © 2018 Wiley Periodicals, Inc.
EOG and EMG: two important switches in automatic sleep stage classification.
Estrada, E; Nazeran, H; Barragan, J; Burk, J R; Lucas, E A; Behbehani, K
2006-01-01
Sleep is a natural periodic state of rest for the body, in which the eyes are usually closed and consciousness is completely or partially lost. In this investigation we used the EOG and EMG signals acquired from 10 patients undergoing overnight polysomnography, with their sleep stages determined by expert sleep specialists based on RK rules. Differentiating between the Stage 1, Awake and REM stages challenged a well-trained neural network classifier when only EEG-derived signal features were used. To meet this challenge and improve the classification rate, extra features extracted from EOG and EMG signals were fed to the classifier. In this study, two simple feature extraction algorithms were applied to the EOG and EMG signals. The statistics of the results were calculated and displayed in an easy-to-visualize fashion to observe tendencies for each sleep stage. Inclusion of these features shows great promise for improving the classification rate towards the target rate of 100%.
Characterizing multivariate decoding models based on correlated EEG spectral features
McFarland, Dennis J.
2013-01-01
Objective: Multivariate decoding methods are popular techniques for analysis of neurophysiological data. The present study explored potential interpretative problems with these techniques when predictors are correlated. Methods: Data from sensorimotor rhythm-based cursor control experiments were analyzed offline with linear univariate and multivariate models. Features were derived from autoregressive (AR) spectral analysis of varying model order, which produced predictors that varied in their degree of correlation (i.e., multicollinearity). Results: The use of multivariate regression models resulted in much better prediction of target position as compared to univariate regression models. However, with lower-order AR features, interpretation of the spectral patterns of the weights was difficult. This is likely due to the high degree of multicollinearity present with lower-order AR features. Conclusions: Care should be exercised when interpreting the pattern of weights of multivariate models with correlated predictors. Comparison with univariate statistics is advisable. Significance: While multivariate decoding algorithms are very useful for prediction, their utility for interpretation may be limited when predictors are correlated. PMID:23466267
Khan, Adil Mehmood; Lee, Young-Koo; Lee, Sungyoung Y; Kim, Tae-Seong
2010-09-01
Physical-activity recognition via wearable sensors can provide valuable information regarding an individual's degree of functional ability and lifestyle. In this paper, we present an accelerometer sensor-based approach for human-activity recognition. Our proposed recognition method uses a hierarchical scheme. At the lower level, the state to which an activity belongs, i.e., static, transition, or dynamic, is recognized by means of statistical signal features and artificial-neural nets (ANNs). The upper level recognition uses the autoregressive (AR) modeling of the acceleration signals, thus, incorporating the derived AR-coefficients along with the signal-magnitude area and tilt angle to form an augmented-feature vector. The resulting feature vector is further processed by the linear-discriminant analysis and ANNs to recognize a particular human activity. Our proposed activity-recognition method recognizes three states and 15 activities with an average accuracy of 97.9% using only a single triaxial accelerometer attached to the subject's chest.
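A sketch of the AR-coefficient features named above, using standard Yule-Walker estimation with numpy; the model order and the signal-magnitude-area term appended at the end are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np

def ar_coefficients(signal, order=3):
    """Yule-Walker estimate of AR coefficients a_1..a_p for a 1-D window."""
    x = signal - signal.mean()
    # biased autocorrelation estimates at lags 0..order
    r = np.array([np.dot(x[:len(x) - k], x[k:]) for k in range(order + 1)]) / len(x)
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1:])

rng = np.random.default_rng(0)
accel_window = np.sin(np.linspace(0, 20, 256)) + 0.1 * rng.normal(size=256)
feature_vector = np.concatenate([
    ar_coefficients(accel_window),
    [np.abs(accel_window).sum() / len(accel_window)],  # crude signal-magnitude-area term
])
print(feature_vector)
```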
Financial Stylized Facts in the Word of Mouth Model
NASA Astrophysics Data System (ADS)
Misawa, Tadanobu; Watanabe, Kyoko; Shimokawa, Tetsuya
Recently, we proposed an agent-based model called the word of mouth model to analyze the influence of an information transmission process on price formation in financial markets. In particular, the short-term predictability of asset returns was focused on, and an explanation in terms of information transmission was provided for the question of why the predictability was most clearly observed in small-sized stocks. To extend the previous study, this paper demonstrates that the word of mouth model is also consistent with other important financial stylized facts. This strengthens the possibility that information transmission among investors plays a crucial role in price formation. Concretely, this paper addresses two famous statistical features of returns: the leptokurtic distribution of returns and the autocorrelation of return volatility. The reasons why these statistical facts receive special attention from researchers among financial stylized facts are their statistical robustness and practical importance, such as applications to derivative pricing problems.
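A sketch of the two stylized-fact checks on a synthetic return series: excess kurtosis for the leptokurtic distribution, and the slowly decaying autocorrelation of absolute returns for volatility clustering. The fat-tailed toy price process is an assumption for demonstration.

```python
import numpy as np
from scipy.stats import kurtosis

rng = np.random.default_rng(0)
prices = 100 * np.exp(np.cumsum(0.01 * rng.standard_t(3, size=2000)))  # fat-tailed toy series
returns = np.diff(np.log(prices))

def autocorr(x, lag):
    x = x - x.mean()
    return np.dot(x[:-lag], x[lag:]) / np.dot(x, x)

print("excess kurtosis:", kurtosis(returns))    # > 0 indicates fat tails
print("acf of |r|:", [round(autocorr(np.abs(returns), k), 3) for k in (1, 5, 10)])
```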
Camera-Model Identification Using Markovian Transition Probability Matrix
NASA Astrophysics Data System (ADS)
Xu, Guanshuo; Gao, Shang; Shi, Yun Qing; Hu, Ruimin; Su, Wei
Detecting the (brands and) models of digital cameras from given digital images has become a popular research topic in the field of digital forensics. As most images are JPEG compressed before they are output from cameras, we propose to use an effective image statistical model to characterize the difference JPEG 2-D arrays of Y and Cb components from the JPEG images taken by various camera models. Specifically, the transition probability matrices derived from four different directional Markov processes applied to the image difference JPEG 2-D arrays are used to identify the statistical difference caused by image formation pipelines inside different camera models. All elements of the transition probability matrices, after a thresholding technique, are directly used as features for classification purposes. Multi-class support vector machines (SVM) are used as the classification tool. The effectiveness of our proposed statistical model is demonstrated by large-scale experimental results.
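A generic sketch of transition-probability-matrix features for one direction (horizontal), under a thresholding assumption like the one described; this illustrates the construction, not the authors' exact pipeline across the four directions and both components.

```python
import numpy as np

def transition_matrix_features(diff_array, T=3):
    """Horizontal Markov transition probabilities of a thresholded difference array."""
    q = (np.clip(diff_array, -T, T) + T).astype(int)   # states 0 .. 2T
    n_states = 2 * T + 1
    M = np.zeros((n_states, n_states))
    src, dst = q[:, :-1].ravel(), q[:, 1:].ravel()
    np.add.at(M, (src, dst), 1)                        # count horizontal transitions
    M /= np.maximum(M.sum(axis=1, keepdims=True), 1)   # rows -> probabilities
    return M.ravel()                                   # (2T+1)^2 = 49 features for T = 3

rng = np.random.default_rng(0)
diff = rng.integers(-10, 11, size=(256, 256))          # toy difference 2-D array
print(transition_matrix_features(diff).shape)
```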
NASA Technical Reports Server (NTRS)
Melbourne, William G.
1986-01-01
In double differencing a regression system obtained from concurrent Global Positioning System (GPS) observation sequences, one either undersamples the system to avoid introducing colored measurement statistics, or one fully samples the system, incurring the resulting non-diagonal covariance matrix for the differenced measurement errors. A suboptimal estimation result will be obtained in the undersampling case, and will also be obtained in the fully sampled case unless the colored-noise statistics are taken into account. The latter approach requires a least squares weighting matrix derived from inversion of a non-diagonal covariance matrix for the differenced measurement errors, instead of inversion of the customary diagonal one associated with white noise processes. Presented is the so-called fully redundant double differencing algorithm for generating a weighted double differenced regression system that yields equivalent estimation results, but features, for certain cases, a diagonal weighting matrix even though the differenced measurement error statistics are highly colored.
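A sketch of the weighting step this implies: with a non-diagonal covariance C for the differenced measurement errors, the least squares estimate uses the weight W = C^{-1}; when the errors are white, C is diagonal and the customary diagonal weighting is recovered. (The direct inverse is fine for a sketch; a Cholesky solve would be preferred at scale.)

```python
import numpy as np

def gls(X, y, C):
    """Generalized least squares: minimize (y - X b)^T C^{-1} (y - X b)."""
    W = np.linalg.inv(C)                 # weight matrix from the error covariance
    A = X.T @ W @ X
    return np.linalg.solve(A, X.T @ W @ y)

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
C = 0.9 * np.ones((50, 50)) + 0.1 * np.eye(50)   # correlated (colored) errors
y = X @ np.array([1.0, -2.0, 0.5]) + rng.multivariate_normal(np.zeros(50), C)
print(gls(X, y, C))                     # estimate accounting for the correlation
```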
Computing group cardinality constraint solutions for logistic regression problems.
Zhang, Yong; Kwon, Dongjin; Pohl, Kilian M
2017-01-01
We derive an algorithm to directly solve logistic regression based on cardinality constraint and group sparsity, and use it to classify intra-subject MRI sequences (e.g., cine MRIs) of healthy versus diseased subjects. Group cardinality constraint models are often applied to medical images in order to avoid overfitting of the classifier to the training data. Solutions within these models are generally determined by relaxing the cardinality constraint to a weighted feature selection scheme. However, these solutions relate to the original sparse problem only under specific assumptions, which generally do not hold for medical image applications. In addition, inferring clinical meaning from features weighted by a classifier is an ongoing topic of discussion. To avoid weighting features, we propose to directly solve the group cardinality constraint logistic regression problem by generalizing the Penalty Decomposition method. To do so, we assume that an intra-subject series of images represents repeated samples of the same disease patterns. We model this assumption by combining the series of measurements created by a feature across time into a single group. Our algorithm then derives a solution within that model by decoupling the minimization of the logistic regression function from enforcing the group sparsity constraint. The minimum of the smooth and convex logistic regression problem is determined via gradient descent, while we derive a closed-form solution for finding a sparse approximation of that minimum. We apply our method to cine MRI of 38 healthy controls and 44 adult patients that received reconstructive surgery of Tetralogy of Fallot (TOF) during infancy. Our method correctly identifies regions impacted by TOF and generally obtains statistically significantly higher classification accuracy than alternative solutions to this model, i.e., ones relaxing group cardinality constraints. Copyright © 2016 Elsevier B.V. All rights reserved.
Kriete, A; Schäffer, R; Harms, H; Aus, H M
1987-06-01
Nuclei of the cells from the thyroid gland were analyzed in a transmission electron microscope by direct TV scanning and on-line image processing. The method uses the advantages of a visual-perception model to detect structures in noisy and low-contrast images. The features analyzed include area, a form factor and texture parameters from the second derivative stage. Three tumor-free thyroid tissues, three follicular adenomas, three follicular carcinomas and three papillary carcinomas were studied. The computer-aided cytophotometric method showed that the most significant differences were the statistics of the chromatin texture features of homogeneity and regularity. These findings document the possibility of an automated differentiation of tumors at the ultrastructural level.
Possible detection of an emission feature near 584 A in the direction of G191-B2B
NASA Technical Reports Server (NTRS)
Green, James; Bowyer, Stuart; Jelinsky, Patrick
1990-01-01
A possible spectral emission feature is reported in the direction of the nearby hot white dwarf G191-B2B at 581.5 ± 6 A with a significance of 3.8 sigma. This emission has been identified as He I 584.3 A. The emission cannot be due to local geocoronal emission or interplanetary backscatter of solar He I 584 A emission because the feature is not detected in a nearby sky exposure. Possible sources for this emission are examined, including the photosphere of G191-B2B, the comparison star G191-B2A, and a possible nebulosity near or around G191-B2B. The parameters required to explain the emission are derived for each case. All of these explanations require unexpected physical conditions; hence we believe this result must receive confirming verification despite the statistical likelihood of the detection.
NASA Astrophysics Data System (ADS)
Sofia, G.; Tarolli, P.; Dalla Fontana, G.
2012-04-01
In floodplains, massive investments in land reclamation have long played an important role in flood protection. In these contexts, human alteration is reflected by artificial features ('anthropogenic features'), such as banks, levees or road scarps, that constantly increase and change in response to the rapid growth of human populations. For these areas, various existing and emerging applications require up-to-date, accurate and sufficiently attributed digital data, but such information is usually lacking, especially when dealing with large-scale applications. More recently, national and local mapping agencies in Europe have been moving towards the generation of digital topographic information that conforms to reality and is highly reliable and up to date. LiDAR digital terrain models (DTMs) covering large areas are readily available to public authorities, and there is greater and more widespread interest, among agencies responsible for land management, in applying such information to the development of automated methods aimed at solving geomorphological and hydrological problems. Automatic feature recognition based upon DTMs can offer, for large-scale applications, a quick and accurate method that can help in improving topographic databases, and that can overcome some of the problems associated with traditional, field-based geomorphological mapping, such as restrictions on access and constraints of time or costs. Although anthropogenic features such as levees and road scarps are artificial structures that do not belong to what is usually defined as the bare ground surface, they are implicitly embedded in digital terrain models (DTMs). Automatic feature recognition based upon DTMs, therefore, can offer a quick and accurate method that does not require additional data, and that can help in improving flood defense asset information, flood modeling and other applications. In natural contexts, morphological indicators derived from high-resolution topography have proven reliable for practical applications. The use of statistical operators as thresholds for these geomorphic parameters, furthermore, has shown high reliability for feature extraction in mountainous environments. The goal of this research is to test whether these morphological indicators and objective thresholds are also feasible in floodplains, where features assume different characteristics and other artificial disturbances might be present. In this work, three different geomorphic parameters are tested and applied at different scales on a LiDAR DTM of a typical alluvial plain area in the northeast of Italy. The box-plot is applied to identify the threshold for feature extraction, and a filtering procedure is proposed to improve the quality of the final results. The effectiveness of the different geomorphic parameters is analyzed by comparing automatically derived features with surveyed ones. The results highlight the capability of high-resolution topography, geomorphic indicators and statistical thresholds for anthropogenic feature extraction and characterization in a floodplain context.
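A sketch of the box-plot thresholding step described above, using the standard outlier rule and treating the geomorphic index grid as given; the synthetic grid below is an assumption for demonstration.

```python
import numpy as np

def boxplot_threshold_mask(index_grid):
    """Flag cells whose geomorphic index exceeds Q3 + 1.5 * IQR."""
    q1, q3 = np.nanpercentile(index_grid, [25, 75])
    threshold = q3 + 1.5 * (q3 - q1)
    return index_grid > threshold      # boolean mask of candidate feature cells

rng = np.random.default_rng(0)
index_grid = rng.normal(size=(500, 500))
index_grid[240:260, :] += 5.0          # toy embedded levee-like feature
mask = boxplot_threshold_mask(index_grid)
print(mask.sum(), "candidate cells flagged")
```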
Time-nonlocal kinetic equations, jerk and hyperjerk in plasmas and solar physics
NASA Astrophysics Data System (ADS)
El-Nabulsi, Rami Ahmad
2018-06-01
The simulation and analysis of nonlocal effects in fluids and plasmas is an inherently complicated problem due to the massive breadth of physics required to describe the nonlocal dynamics. This is a multi-physics problem that draws upon several fields, such as electromagnetism and statistical mechanics. In this paper we focus on one narrow but well-motivated mathematical route: the derivation of nonlocal plasma-fluid equations from a generalized nonlocal Liouville derivative operator motivated by Suykens's nonlocal arguments. The paper aims to provide a guideline toward modeling nonlocal effects occurring in plasma-fluid systems by means of a generalized nonlocal Boltzmann equation. The generalized nonlocal equations of fluid dynamics are derived and their implications for plasma-fluid systems are addressed, discussed and analyzed. Three main topics are discussed: Landau damping in plasma electrodynamics, ideal MHD, and the solar wind. A number of features are revealed, analyzed and confronted with recent research results and observations.
NASA Astrophysics Data System (ADS)
Kuo, K. S.; Bosilovich, M. G.; Oloso, A.; Collow, A.
2016-12-01
A highly optimized method has been developed using Big Data technology, whereby events of Earth Science phenomena can be identified and tracked with unprecedented efficiency. An event in this context is an episode of a phenomenon evolving in both space and time, not just a snapshot in time. This method is applied to 36 years of both the MERRA and MERRA-2 datasets, with a visibility criterion related to blizzard-like conditions. Visibility is estimated using a formula that takes snowfall rate and near-surface winds into consideration. MERRA-2 features additional observing systems and improvements to the hydrological cycle not present in the original MERRA atmospheric reanalysis, including the forcing of the land surface with observations-based precipitation fields and the implementation of a moisture constraint that prevents an imbalance between global precipitation and surface evaporation. MERRA-2 also features numerous advancements in the underlying model, such as in the surface layer and boundary layer parameterizations and in the cumulus convection scheme. The data assimilation has been updated to the latest Gridpoint Statistical Interpolation version and includes global dry mass constraints that help minimize spurious temporal variability. In this presentation we therefore report and compare event-based statistics derived from MERRA and MERRA-2, highlighting the differences and, in particular, the improvements in MERRA-2. We also identify aspects of MERRA-2 that may need further improvement.
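The abstract does not give the visibility formula itself; as a hedged sketch, an inverse power-law dependence on snowfall rate with an additional reduction for strong near-surface winds is one plausible empirical form. The functional form and every coefficient below are placeholders, not values from the MERRA studies:

```python
import numpy as np

def blizzard_visibility_km(snow_rate_mm_hr, wind_m_s,
                           a=2.2, b=0.7, wind_ref=15.0):
    """Hypothetical empirical visibility model: visibility falls off as a
    power law in (liquid-equivalent) snowfall rate and is further reduced
    by strong near-surface winds. a, b, wind_ref are illustrative only."""
    snow_rate = np.maximum(snow_rate_mm_hr, 1e-3)
    vis = a * snow_rate ** (-b)  # km, snowfall-only estimate
    # crude blowing-snow penalty for winds above the reference speed
    vis *= np.clip(wind_ref / np.maximum(wind_m_s, 1e-3), 0.2, 1.0)
    return vis

# Blizzard-like criterion: low visibility together with strong wind
vis = blizzard_visibility_km(snow_rate_mm_hr=3.0, wind_m_s=18.0)
print(vis, vis < 0.4)
```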
RAId_DbS: Peptide Identification using Database Searches with Realistic Statistics
Alves, Gelio; Ogurtsov, Aleksey Y; Yu, Yi-Kuo
2007-01-01
Background The key to mass-spectrometry-based proteomics is peptide identification. A major challenge in peptide identification is to obtain realistic E-values when assigning statistical significance to candidate peptides. Results Using a simple scoring scheme, we propose a database search method with theoretically characterized statistics. Taking into account possible skewness in the random variable distribution and the effect of finite sampling, we provide a theoretical derivation for the tail of the score distribution. For every experimental spectrum examined, we collect the scores of peptides in the database and find good agreement between the collected score statistics and our theoretical distribution. Using Student's t-tests, we quantify the degree of agreement between the theoretical distribution and the collected score statistics. These t-tests may be used to measure the reliability of the reported statistics. When combined with the P-value reported for a peptide hit under a score distribution model, this new measure prevents exaggerated statistics. Another feature of RAId_DbS is its capability of detecting multiple co-eluted peptides. The peptide identification performance and statistical accuracy of RAId_DbS are assessed and compared with several other search tools. The executables and data related to RAId_DbS are freely available upon request. PMID:17961253
Zbilut, Joseph P.; Colosimo, Alfredo; Conti, Filippo; Colafranceschi, Mauro; Manetti, Cesare; Valerio, MariaCristina; Webber, Charles L.; Giuliani, Alessandro
2003-01-01
The problem of protein folding vs. aggregation was investigated in acylphosphatase and the amyloid protein Aβ(1–40) by means of nonlinear signal analysis of their chain hydrophobicity. Numerical descriptors of recurrence patterns provided the basis for statistical evaluation of folding/aggregation distinctive features. Static and dynamic approaches were used to elucidate conditions coincident with folding vs. aggregation using comparisons with known protein secondary structure classifications, site-directed mutagenesis studies of acylphosphatase, and molecular dynamics simulations of amyloid protein, Aβ(1–40). The results suggest that a feature derived from principal component space characterized by the smoothness of singular, deterministic hydrophobicity patches plays a significant role in the conditions governing protein aggregation. PMID:14645049
1983-08-01
drift speed of about 10 km/day, which places it at the circular drifter track on the release day (185). Note that all drifters south of Pt. Arena were... spatial resolution of 4 km. Data Collection: The NOAA satellites are polar-orbiting satellites which pass overhead twice per day. One pass is... then decreases so that the maximum signal-to-noise occurs for 10 km x 10 km boxes. Thus at a 12-hour separation in time, temperature features of...
NASA Technical Reports Server (NTRS)
Coggeshall, M. E.; Hoffer, R. M.
1973-01-01
Remote sensing equipment and automatic data processing techniques were employed as aids in instituting improved forest resource management methods. On the basis of automatically calculated statistics derived from manually selected training samples, the feature selection processor of LARSYS considered various groupings of the four available spectral regions and selected a series of channel combinations whose automatic classification performances (for six cover types, including both deciduous and coniferous forest) were tested, analyzed, and further compared with automatic classification results obtained from digitized color infrared photography.
Thermodynamics-based models of transcriptional regulation with gene sequence.
Wang, Shuqiang; Shen, Yanyan; Hu, Jinxing
2015-12-01
Quantitative models of gene regulatory activity have the potential to improve our mechanistic understanding of transcriptional regulation. However, the few models available today have been based on simplistic assumptions about the sequences being modeled or on heuristic approximations of the underlying regulatory mechanisms. In this work, we have developed a thermodynamics-based model to predict gene expression driven by any DNA sequence. The proposed model relies on a continuous-time, differential-equation description of transcriptional dynamics. The sequence features of the promoter are used to derive the binding affinity, which is obtained from statistical molecular thermodynamics. Experimental results show that the proposed model can effectively identify the activity levels of transcription factors and the regulatory parameters. Compared with previous models, the proposed model reveals more biologically meaningful information.
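A minimal sketch of the kind of continuous-time, thermodynamics-based description referred to above, assuming (hypothetically) that mRNA production is proportional to a Boltzmann-weighted promoter occupancy; the rate constants and the binding free energy are illustrative only, not the paper's fitted values:

```python
import numpy as np
from scipy.integrate import solve_ivp

R_T = 0.593  # RT in kcal/mol at ~298 K

def p_bound(tf_conc, dG_kcal):
    """Equilibrium promoter occupancy from statistical thermodynamics:
    Boltzmann weight of the bound state relative to the unbound state."""
    w = tf_conc * np.exp(-dG_kcal / R_T)
    return w / (1.0 + w)

def mrna_ode(t, m, k_max=10.0, gamma=0.5, tf_conc=2.0, dG_kcal=-1.0):
    # transcription proportional to occupancy, first-order mRNA decay
    return k_max * p_bound(tf_conc, dG_kcal) - gamma * m

sol = solve_ivp(mrna_ode, (0.0, 20.0), [0.0])
print("steady-state mRNA level ~", sol.y[0, -1])
```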
NASA Astrophysics Data System (ADS)
Jongkreangkrai, C.; Vichianin, Y.; Tocharoenchai, C.; Arimura, H.; Alzheimer's Disease Neuroimaging Initiative
2016-03-01
Several studies have differentiated Alzheimer's disease (AD) using cerebral image features derived from MR brain images. In this study, we were interested in combining hippocampus and amygdala volumes and entorhinal cortex thickness to improve the performance of AD differentiation. Thus, our objective was to investigate the useful features obtained from MRI for classification of AD patients using support vector machine (SVM). T1-weighted MR brain images of 100 AD patients and 100 normal subjects were processed using FreeSurfer software to measure hippocampus and amygdala volumes and entorhinal cortex thicknesses in both brain hemispheres. Relative volumes of hippocampus and amygdala were calculated to correct variation in individual head size. SVM was employed with five combinations of features (H: hippocampus relative volumes, A: amygdala relative volumes, E: entorhinal cortex thicknesses, HA: hippocampus and amygdala relative volumes and ALL: all features). Receiver operating characteristic (ROC) analysis was used to evaluate the method. AUC values of five combinations were 0.8575 (H), 0.8374 (A), 0.8422 (E), 0.8631 (HA) and 0.8906 (ALL). Although “ALL” provided the highest AUC, there were no statistically significant differences among them except for “A” feature. Our results showed that all suggested features may be feasible for computer-aided classification of AD patients.
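A hedged sketch of the SVM-with-ROC evaluation pattern this study describes, using scikit-learn on a hypothetical feature matrix (left/right hippocampus and amygdala relative volumes plus entorhinal thicknesses); this is the standard pipeline shape, not the authors' exact implementation:

```python
import numpy as np
from sklearn.model_selection import cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
# Hypothetical data: 200 subjects x 6 features
# (L/R hippocampus, L/R amygdala relative volumes, L/R entorhinal thickness)
X = rng.normal(size=(200, 6))
y = np.r_[np.ones(100, dtype=int), np.zeros(100, dtype=int)]  # AD vs normal
X[y == 1] -= 0.6  # simulated atrophy: smaller values in the AD group

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", probability=True))
scores = cross_val_predict(clf, X, y, cv=10, method="predict_proba")[:, 1]
print("AUC:", roc_auc_score(y, scores))
```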
Hu, Shan; Xu, Chao; Guan, Weiqiao; Tang, Yong; Liu, Yana
2014-01-01
Osteosarcoma is the most common malignant bone tumor among children and adolescents. In this study, image texture analysis was performed to extract texture features from bone CR images to evaluate the recognition rate of osteosarcoma. To obtain the optimal set of features, Sym4 and Db4 wavelet transforms and gray-level co-occurrence matrices (GLCM) were applied to the images, with statistical methods used to optimize the feature selection. To evaluate the performance of these methods, a support vector machine algorithm was used. The experimental results demonstrated that the Sym4 wavelet had a higher classification accuracy (93.44%) than the Db4 wavelet with respect to osteosarcoma occurrence in the epiphysis, whereas the Db4 wavelet had a higher classification accuracy (96.25%) for osteosarcoma occurrence in the diaphysis. The accuracy, sensitivity, specificity and ROC results obtained using the wavelets were all higher than those obtained using features derived from the GLCM method. It is concluded that a set of texture features can be extracted from the wavelets and used in computer-aided osteosarcoma diagnosis systems. In addition, this study confirms that multi-resolution analysis is a useful tool for texture feature extraction during bone CR image processing.
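A minimal sketch of the two feature families mentioned above, wavelet-subband statistics and GLCM texture measures, using PyWavelets and scikit-image on a synthetic image; the specific features chosen are illustrative, not the authors' exact set:

```python
import numpy as np
import pywt
from skimage.feature import graycomatrix, graycoprops

rng = np.random.default_rng(0)
img = (rng.random((128, 128)) * 255).astype(np.uint8)  # stand-in CR patch

# Wavelet features: energy of each db4 detail subband over 2 levels
coeffs = pywt.wavedec2(img.astype(float), "db4", level=2)
wavelet_feats = [float(np.mean(np.square(d)))
                 for level in coeffs[1:] for d in level]

# GLCM features: averaged over one distance and four angles
glcm = graycomatrix(img, distances=[1],
                    angles=[0, np.pi/4, np.pi/2, 3*np.pi/4],
                    levels=256, symmetric=True, normed=True)
glcm_feats = [float(graycoprops(glcm, p).mean())
              for p in ("contrast", "homogeneity", "energy")]

print(wavelet_feats + glcm_feats)
```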
Deriving Case, Agreement and Voice Phenomena in Syntax
ERIC Educational Resources Information Center
Sigurdsson, Einar Freyr
2017-01-01
This dissertation places case, agreement and Voice phenomena in syntax. It argues that the derivation is driven by so-called derivational features, that is, structure-building features (Merge) and probe features (Agree) (Heck and Muller 2007 and Muller 2010; see also Chomsky 2000, 2001). Both types are essential in deriving case and agreement in…
Texture as a basis for acoustic classification of substrate in the nearshore region
NASA Astrophysics Data System (ADS)
Dennison, A.; Wattrus, N. J.
2016-12-01
Segmentation and classification of substrate type at two locations in Lake Superior are predicted using multivariate statistical processing of textural measures derived from shallow-water, high-resolution multibeam bathymetric data. During a multibeam sonar survey, both bathymetric and backscatter data are collected. It is well documented that the statistical character of a sonar backscatter mosaic depends on substrate type. While classifying the bottom type on the basis of backscatter alone can accurately predict and map bottom type, it lacks the ability to resolve and capture fine textural details, an important factor in many habitat mapping studies. Statistical processing can capture pertinent details about the bottom type that are rich in textural information, and further multivariate statistical processing can then isolate characteristic features and provide the basis for an accurate classification scheme. Preliminary results from an analysis of bathymetric data and ground-truth samples collected from the Amnicon River, Superior, Wisconsin, and the Lester River, Duluth, Minnesota, demonstrate the ability to develop a novel classification scheme for the bottom type in two geomorphologically distinct areas.
Characterizing multivariate decoding models based on correlated EEG spectral features.
McFarland, Dennis J
2013-07-01
Multivariate decoding methods are popular techniques for the analysis of neurophysiological data. The present study explored potential interpretative problems with these techniques when predictors are correlated. Data from sensorimotor rhythm-based cursor control experiments were analyzed offline with linear univariate and multivariate models. Features were derived from autoregressive (AR) spectral analysis of varying model order, which produced predictors that varied in their degree of correlation (i.e., multicollinearity). The use of multivariate regression models resulted in much better prediction of target position than univariate regression models. However, with lower-order AR features, interpretation of the spectral patterns of the weights was difficult, likely due to the high degree of multicollinearity present with lower-order AR features. Care should be exercised when interpreting the pattern of weights of multivariate models with correlated predictors, and comparison with univariate statistics is advisable. While multivariate decoding algorithms are very useful for prediction, their utility for interpretation may be limited when predictors are correlated.
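A small illustration of the interpretability problem described above: with strongly correlated predictors, a multivariate model predicts well but its weight pattern is unstable, unlike the univariate correlations. The data are synthetic, not the study's EEG features:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
n = 500
latent = rng.normal(size=n)
# Two highly collinear "spectral" predictors driven by one latent signal
X = np.c_[latent + 0.05 * rng.normal(size=n),
          latent + 0.05 * rng.normal(size=n)]
y = latent + 0.1 * rng.normal(size=n)

multi = LinearRegression().fit(X, y)
print("multivariate weights:", multi.coef_)  # arbitrary split of credit
print("univariate correlations:",
      [np.corrcoef(X[:, j], y)[0, 1] for j in range(2)])  # both high, stable
```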
Public health information and statistics dissemination efforts for Indonesia on the Internet.
Hanani, Febiana; Kobayashi, Takashi; Jo, Eitetsu; Nakajima, Sawako; Oyama, Hiroshi
2011-01-01
To elucidate current issues related to health statistics dissemination efforts on the Internet in Indonesia and to propose a new dissemination website as a solution, a cross-sectional survey was conducted. Sources of statistics were identified using link relationships and Google™ search. For each site, the menus used to locate statistics, the mode of presentation, the means of access to statistics, and the available statistics were assessed. The assessment results were used to derive a design specification; a prototype system was developed and evaluated with a usability test. Forty-nine sources were identified on 18 governmental, 8 international and 5 non-governmental websites. Of the 49 menus identified, 33% used non-intuitive titles that led to inefficient searches; 69% of these were on government websites. Of the 31 websites, only 39% and 23% used graphs/charts and maps, respectively, for presentation. Further, only 32%, 39% and 19% provided query, export and print features, respectively. While more than 50% of the sources reported morbidity, risk factor and service provision statistics, fewer than 40% reported health resource and mortality statistics. A statistics portal website was developed using the Joomla!™ content management system, and a usability test demonstrated its potential to improve data accessibility. In this study, the government's efforts to disseminate statistics in Indonesia are supported by non-governmental and international organizations, but the existing information may not be very useful because it is: a) not widely distributed, b) difficult to locate, and c) not effectively communicated. Actions are needed to ensure information usability, and one such action is the development of a statistics portal website.
Safaie, Ammar; Wendzel, Aaron; Ge, Zhongfu; Nevers, Meredith; Whitman, Richard L.; Corsi, Steven R.; Phanikumar, Mantha S.
2016-01-01
Statistical and mechanistic models are popular tools for predicting the levels of indicator bacteria at recreational beaches. Researchers tend to use one class of model or the other, and it is difficult to generalize statements about their relative performance due to differences in how the models are developed, tested, and used. We describe a cooperative modeling approach for freshwater beaches impacted by point sources in which insights derived from mechanistic modeling were used to further improve the statistical models and vice versa. The statistical models provided a basis for assessing the mechanistic models which were further improved using probability distributions to generate high-resolution time series data at the source, long-term “tracer” transport modeling based on observed electrical conductivity, better assimilation of meteorological data, and the use of unstructured-grids to better resolve nearshore features. This approach resulted in improved models of comparable performance for both classes including a parsimonious statistical model suitable for real-time predictions based on an easily measurable environmental variable (turbidity). The modeling approach outlined here can be used at other sites impacted by point sources and has the potential to improve water quality predictions resulting in more accurate estimates of beach closures.
Self-Similar Spin Images for Point Cloud Matching
NASA Astrophysics Data System (ADS)
Pulido, Daniel
The rapid growth of Light Detection And Ranging (Lidar) technologies that collect, process, and disseminate 3D point clouds has allowed for increasingly accurate spatial modeling and analysis of the real world. Lidar sensors can generate massive 3D point clouds of a collection area that provide highly detailed spatial and radiometric information. However, a Lidar collection can be expensive and time consuming. Simultaneously, the growth of crowdsourced Web 2.0 data (e.g., Flickr, OpenStreetMap) has provided researchers with a wealth of freely available data sources that cover a variety of geographic areas. Crowdsourced data can be of varying quality and density. In addition, since it is typically not collected as part of a dedicated experiment but rather volunteered, when and where the data is collected is arbitrary. The integration of these two sources of geoinformation can give researchers the ability to generate products and derive intelligence that mitigate their respective disadvantages and combine their advantages. Therefore, this research addresses the problem of fusing two point clouds from potentially different sources. Specifically, we consider two problems: scale matching and feature matching. Scale matching consists of computing feature metrics of each point cloud and analyzing their distributions to determine scale differences. Feature matching consists of defining local descriptors that are invariant to common dataset distortions (e.g., rotation and translation). Additionally, after matching the point clouds they can be registered and processed further (e.g., change detection). The objective of this research is to develop novel methods to fuse and enhance two point clouds from potentially disparate sources (e.g., Lidar and crowdsourced Web 2.0 datasets). The scope of this research is to investigate both scale and feature matching between two point clouds, with a specific focus on developing a novel local descriptor based on the concept of self-similarity to aid in the scale and feature matching steps. An open problem in fusion is how best to extract features from two point clouds and then perform feature-based matching. The proposed approach for this matching step is the use of local self-similarity as an invariant measure to match features. In particular, the proposed approach combines the concept of local self-similarity with a well-known feature descriptor, Spin Images, and thereby defines "Self-Similar Spin Images". This approach is then extended to the case of matching two point clouds in very different coordinate systems (e.g., a geo-referenced Lidar point cloud and a stereo-image-derived point cloud without geo-referencing). The use of Self-Similar Spin Images is again applied to address this problem by introducing a "Self-Similar Keyscale" that matches the spatial scales of two point clouds. Another open problem is how best to detect changes in content between two point clouds. A method is proposed to find changes between two point clouds by analyzing the order statistics of the nearest neighbors between the two clouds, thereby defining the "Nearest Neighbor Order Statistic" method. Note that the well-known Hausdorff distance is a special case, being just the maximum order statistic. Therefore, by studying the entire histogram of these nearest neighbors, a more robust method to detect points that are present in one cloud but not the other is expected. This approach is applied at multiple resolutions.
Therefore, changes detected at the coarsest level will yield large missing targets and at finer levels will yield smaller targets.
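A hedged sketch of the "Nearest Neighbor Order Statistic" idea: compute each point's nearest-neighbor distance to the other cloud with a KD-tree and study the full sorted set of distances; the maximum recovers the one-sided Hausdorff distance. The clouds below are synthetic, not Lidar data:

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
cloud_a = rng.uniform(0, 10, size=(5000, 3))
cloud_b = np.vstack([rng.uniform(0, 10, size=(4800, 3)),
                     rng.uniform(20, 21, size=(200, 3))])  # "new" content

d, _ = cKDTree(cloud_a).query(cloud_b)  # NN distance of each B point to A
order_stats = np.sort(d)
print("one-sided Hausdorff (max order statistic):", order_stats[-1])
print("95th-percentile order statistic:",
      order_stats[int(0.95 * len(order_stats))])

changed = cloud_b[d > np.percentile(d, 95)]  # candidate change points
print("flagged points:", len(changed))
```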
Recurrence time statistics: versatile tools for genomic DNA sequence analysis.
Cao, Yinhe; Tung, Wen-Wen; Gao, J B
2004-01-01
With the completion of the human and a few model organisms' genomes, and with the genomes of many other organisms waiting to be sequenced, it has become increasingly important to develop faster computational tools capable of easily identifying structures and extracting features from DNA sequences. One of the more important classes of structure in a DNA sequence is repeat-related. Often repeats have to be masked before protein-coding regions along a DNA sequence are identified or redundant expressed sequence tags (ESTs) are sequenced. Here we report a novel recurrence-time-based method for sequence analysis. The method can conveniently study all kinds of periodicity and exhaustively find all repeat-related features in a genomic DNA sequence. An efficient codon index is also derived from the recurrence time statistics; it has the salient features of being largely species-independent and of working well on very short sequences. Efficient codon indices are key elements of successful gene-finding algorithms, and are particularly useful for determining whether a suspected EST belongs to a coding or non-coding region. We illustrate the power of the method by studying the genomes of E. coli, the yeast S. cerevisiae, the nematode worm C. elegans, and the human, Homo sapiens. Computationally, our method is very efficient: it allows us to carry out analysis of genomes on the whole-genomic scale on a PC.
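A minimal sketch of recurrence time statistics on a DNA string: record the gap between successive occurrences of each k-mer and histogram the gaps; periodicities and repeats show up as peaks. The sequence and the choice of k are toy assumptions:

```python
from collections import Counter

def recurrence_times(seq, k=3):
    """Histogram of recurrence times (gaps between successive
    occurrences of the same k-mer) over the whole sequence."""
    last_pos = {}
    hist = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in last_pos:
            hist[i - last_pos[kmer]] += 1
        last_pos[kmer] = i
    return hist

seq = "ATG" * 50 + "GATTACA" * 20  # toy sequence with 3- and 7-periodicity
hist = recurrence_times(seq, k=3)
print(sorted(hist.items())[:5])  # peaks at the underlying periods
```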
NASA Astrophysics Data System (ADS)
Garcia-Allende, P. Beatriz; Amygdalos, Iakovos; Dhanapala, Hiruni; Goldin, Robert D.; Hanna, George B.; Elson, Daniel S.
2012-01-01
Computer-aided diagnosis of ophthalmic diseases using optical coherence tomography (OCT) relies on the extraction of thickness and size measures from the OCT images, but such defined layers are usually not observed in emerging OCT applications aimed at "optical biopsy", such as pulmonology or gastroenterology. Mathematical methods such as Principal Component Analysis (PCA) or textural analyses, including both spatial textural analysis derived from the two-dimensional discrete Fourier transform (DFT) and statistical texture analysis obtained independently from center-symmetric auto-correlation (CSAC) and spatial grey-level dependency matrices (SGLDM), as well as quantitative measurements of the attenuation coefficient, have been previously proposed to overcome this problem. We recently proposed an alternative approach consisting of a region segmentation according to the intensity variation along the vertical axis and a purely statistical technique for feature quantification. OCT images were first segmented in the axial direction in an automated manner according to intensity. Afterwards, a morphological analysis of the segmented OCT images was employed to quantify the features that were then used for tissue classification. In this study, PCA processing of the extracted features is performed to combine their discriminative power in a lower number of dimensions. Ready discrimination of gastrointestinal surgical specimens is attained, demonstrating that the approach surpasses the algorithms previously reported and is feasible for tissue classification in the clinical setting.
Murugappan, Murugappan; Murugappan, Subbulakshmi; Zheng, Bong Siao
2013-01-01
[Purpose] Intelligent emotion assessment systems have been highly successful in a variety of applications, such as e-learning, psychology, and psycho-physiology. This study aimed to assess five different human emotions (happiness, disgust, fear, sadness, and neutral) using heart rate variability (HRV) signals derived from an electrocardiogram (ECG). [Subjects] Twenty healthy university students (10 males and 10 females) with a mean age of 23 years participated in this experiment. [Methods] All five emotions were induced by audio-visual stimuli (video clips). ECG signals were acquired using 3 electrodes and were preprocessed using a Butterworth 3rd order filter to remove noise and baseline wander. The Pan-Tompkins algorithm was used to derive the HRV signals from ECG. Discrete wavelet transform (DWT) was used to extract statistical features from the HRV signals using four wavelet functions: Daubechies6 (db6), Daubechies7 (db7), Symmlet8 (sym8), and Coiflet5 (coif5). The k-nearest neighbor (KNN) and linear discriminant analysis (LDA) were used to map the statistical features into corresponding emotions. [Results] KNN provided the maximum average emotion classification rate compared to LDA for five emotions (sadness − 50.28%; happiness − 79.03%; fear − 77.78%; disgust − 88.69%; and neutral − 78.34%). [Conclusion] The results of this study indicate that HRV may be a reliable indicator of changes in the emotional state of subjects and provides an approach to the development of a real-time emotion assessment system with a higher reliability than other systems. PMID:24259846
Murugappan, Murugappan; Murugappan, Subbulakshmi; Zheng, Bong Siao
2013-07-01
[Purpose] Intelligent emotion assessment systems have been highly successful in a variety of applications, such as e-learning, psychology, and psycho-physiology. This study aimed to assess five different human emotions (happiness, disgust, fear, sadness, and neutral) using heart rate variability (HRV) signals derived from an electrocardiogram (ECG). [Subjects] Twenty healthy university students (10 males and 10 females) with a mean age of 23 years participated in this experiment. [Methods] All five emotions were induced by audio-visual stimuli (video clips). ECG signals were acquired using 3 electrodes and were preprocessed using a Butterworth 3rd order filter to remove noise and baseline wander. The Pan-Tompkins algorithm was used to derive the HRV signals from ECG. Discrete wavelet transform (DWT) was used to extract statistical features from the HRV signals using four wavelet functions: Daubechies6 (db6), Daubechies7 (db7), Symmlet8 (sym8), and Coiflet5 (coif5). The k-nearest neighbor (KNN) and linear discriminant analysis (LDA) were used to map the statistical features into corresponding emotions. [Results] KNN provided the maximum average emotion classification rate compared to LDA for five emotions (sadness - 50.28%; happiness - 79.03%; fear - 77.78%; disgust - 88.69%; and neutral - 78.34%). [Conclusion] The results of this study indicate that HRV may be a reliable indicator of changes in the emotional state of subjects and provides an approach to the development of a real-time emotion assessment system with a higher reliability than other systems.
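A hedged sketch of the feature path both versions of this abstract describe, DWT subband statistics from an HRV series fed to a KNN classifier, run on synthetic signals; it is not the authors' exact preprocessing or feature list:

```python
import numpy as np
import pywt
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

def dwt_stats(sig, wavelet="db6", level=4):
    """Per-subband statistics (mean, std, mean |coef|) of a db6 DWT."""
    coeffs = pywt.wavedec(sig, wavelet, level=level)
    return np.array([f(c) for c in coeffs
                     for f in (np.mean, np.std,
                               lambda x: np.mean(np.abs(x)))])

# Synthetic "HRV" segments for two hypothetical emotion classes
X = np.array([dwt_stats(rng.normal(cls * 0.3, 1.0 + 0.2 * cls, 256))
              for cls in (0, 1) for _ in range(60)])
y = np.repeat([0, 1], 60)

print(cross_val_score(KNeighborsClassifier(5), X, y, cv=5).mean())
```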
NASA Astrophysics Data System (ADS)
Schumacher, R.; Schimpf, H.; Schiller, J.
2011-06-01
The most challenging problem of Automatic Target Recognition (ATR) is the extraction of robust and independent target features which describe the target unambiguously. These features have to be robust and invariant in different senses: in time, between aspect views (azimuth and elevation angle), between target motions (translation and rotation) and between different target variants. Especially for ground moving targets in military applications, an irregular target motion is typical, so that a strong variation of the backscattered radar signal with azimuth and elevation angle makes the extraction of stable and robust features most difficult. For ATR based on High Range Resolution (HRR) profiles and/or Inverse Synthetic Aperture Radar (ISAR) images, it is crucial that the reference dataset consists of stable and robust features, which will depend, among other things, on the target aspect and depression angle. Here it is important to find an adequate data grid for efficient data coverage in the reference dataset for ATR. In this paper the variability of the backscattered radar signals of target scattering centers is analyzed for different HRR profiles and ISAR images from measured turntable datasets of ground targets under controlled conditions. In particular, the dependency of the features on the elevation angle is analyzed with regard to ATR of large-strip SAR data covering a wide range of depression angles, using available (I)SAR datasets as reference. The robustness of these scattering centers is analyzed by extracting their amplitude, phase and position. For this purpose, turntable measurements under controlled conditions were performed on an artificial military reference object called STANDCAM. Measures of variability, similarity, robustness and separability for the scattering centers are defined, and the dependency of the scattering behaviour on azimuth and elevation variations is analyzed. Additionally, generic types of features (geometrical, statistical), which can be derived especially from (I)SAR images, are applied to the ATR task, and the dependence of individual feature values as well as the feature statistics on aspect (i.e., azimuth and elevation) is presented. The Kolmogorov-Smirnov distance is used to show how the feature statistics are influenced by varying elevation angles. Finally, confusion matrices are computed for the STANDCAM target at all eleven elevation angles. This helps to assess the robustness of ATR performance under the influence of aspect angle deviations between training set and test set.
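A minimal illustration of using the Kolmogorov-Smirnov distance to quantify how a feature's statistics shift with elevation angle, via SciPy; the feature samples are synthetic stand-ins, not STANDCAM measurements:

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
# Hypothetical feature (e.g., scatterer amplitude) at two elevation angles
feat_elev_10deg = rng.gamma(shape=2.0, scale=1.0, size=400)
feat_elev_30deg = rng.gamma(shape=2.4, scale=1.1, size=400)

stat, p = ks_2samp(feat_elev_10deg, feat_elev_30deg)
print(f"KS distance = {stat:.3f}, p = {p:.3g}")
# Small KS distance -> feature statistics robust to the elevation change
```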
ECG Identification System Using Neural Network with Global and Local Features
ERIC Educational Resources Information Center
Tseng, Kuo-Kun; Lee, Dachao; Chen, Charles
2016-01-01
This paper proposes a human identification system via extracted electrocardiogram (ECG) signals. Two hierarchical classification structures based on global shape feature and local statistical feature is used to extract ECG signals. Global shape feature represents the outline information of ECG signals and local statistical feature extracts the…
NASA Astrophysics Data System (ADS)
Wang, Lu; Wei, Wenzhao; Mei, Dongming; Cubed Collaboration
2015-10-01
Noble liquid xenon experiments, such as XENON100, LUX, XENON 1-Ton, and LZ, are large dark matter experiments that search directly for weakly interacting massive particles (WIMPs). One of their most important features is the ability to discriminate nuclear recoils from electronic recoils. Detector response is generally calibrated with different radioactive sources, including 83mKr, tritiated methane, 241AmBe, 252Cf, and DD-neutrons. The electronic recoil and nuclear recoil bands have been determined by these calibrations; however, the width of the nuclear recoil band needs to be fully understood. We derive a theoretical model to understand the correlation between the width of the nuclear recoil band and intrinsic statistical variation, and we conduct experiments to validate the theoretical model. In this paper, we present a study of the intrinsic statistical variation contributing to the width of the nuclear recoil band. This work is supported by DE-FG02-10ER46709 and the state of South Dakota.
An Adaptive Buddy Check for Observational Quality Control
NASA Technical Reports Server (NTRS)
Dee, Dick P.; Rukhovets, Leonid; Todling, Ricardo; DaSilva, Arlindo M.; Larson, Jay W.; Einaudi, Franco (Technical Monitor)
2000-01-01
An adaptive buddy check algorithm is presented that adjusts tolerances for outlier observations based on the variability of surrounding data. The algorithm derives from a statistical hypothesis test combined with maximum-likelihood covariance estimation. Its stability is shown to depend on the initial identification of outliers by a simple background check. The adaptive feature ensures that the final quality control decisions are not very sensitive to the prescribed statistics of first-guess and observation errors, nor to other approximations introduced into the algorithm. The implementation of the algorithm in a global atmospheric data assimilation system is described. Its performance is contrasted with that of a non-adaptive buddy check for the surface analysis of an extreme storm that took place in Europe on 27 December 1999. The adaptive algorithm allowed the inclusion of many important observations that differed greatly from the first guess and that would have been excluded on the basis of prescribed statistics. The analysis of the storm development was much improved as a result of these additional observations.
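A hedged sketch of the buddy-check idea with an adaptive tolerance: an observation's background innovation is compared against the spread of neighboring innovations, so tolerances loosen where surrounding data are highly variable. This is a deliberate simplification of the paper's maximum-likelihood formulation, with a hypothetical neighbor interface:

```python
import numpy as np

def adaptive_buddy_check(innov, neighbors, base_tol=3.0):
    """Flag observations whose innovation (obs minus first guess) exceeds
    a tolerance scaled by the local (buddy) variability. `neighbors[i]`
    lists buddy indices for observation i (hypothetical interface)."""
    flags = np.zeros(len(innov), dtype=bool)
    for i, buddies in enumerate(neighbors):
        local = innov[buddies]
        # adaptive tolerance: prescribed floor, inflated by buddy spread
        tol = base_tol * max(1.0, np.std(local) / np.std(innov))
        flags[i] = abs(innov[i] - np.mean(local)) > tol * np.std(innov)
    return flags

rng = np.random.default_rng(0)
innov = rng.normal(0, 1, 100)
innov[7] = 6.0  # isolated outlier among well-behaved buddies
neighbors = [[j for j in range(max(0, i - 5), min(100, i + 6)) if j != i]
             for i in range(100)]
print(np.where(adaptive_buddy_check(innov, neighbors))[0])
```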
Loop series for discrete statistical models on graphs
NASA Astrophysics Data System (ADS)
Chertkov, Michael; Chernyak, Vladimir Y.
2006-06-01
In this paper we present the derivation details, logic, and motivation for the three loop calculus introduced in Chertkov and Chernyak (2006 Phys. Rev. E 73 065102(R)). Generating functions for each of the three interrelated discrete statistical models are expressed in terms of a finite series. The first term in the series corresponds to the Bethe-Peierls belief-propagation (BP) contribution; the other terms are labelled by loops on the factor graph. All loop contributions are simple rational functions of spin correlation functions calculated within the BP approach. We discuss two alternative derivations of the loop series. One approach implements a set of local auxiliary integrations over continuous fields with the BP contribution corresponding to an integrand saddle-point value. The integrals are replaced by sums in the complementary approach, briefly explained in Chertkov and Chernyak (2006 Phys. Rev. E 73 065102(R)). Local gauge symmetry transformations that clarify an important invariant feature of the BP solution are revealed in both approaches. The individual terms change under the gauge transformation while the partition function remains invariant. The requirement for all individual terms to be nonzero only for closed loops in the factor graph (as opposed to paths with loose ends) is equivalent to fixing the first term in the series to be exactly equal to the BP contribution. Further applications of the loop calculus to problems in statistical physics, computer and information sciences are discussed.
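For reference, the central identity of the loop calculus can be written compactly (a sketch of the result the paper derives): the partition function factors into the Bethe-Peierls value times a finite series over generalized loops C of the factor graph,

```latex
Z \;=\; Z_{\mathrm{BP}} \Bigl( 1 + \sum_{C} r(C) \Bigr),
```

where each loop contribution r(C) is a rational function of the spin correlation functions computed within the BP approximation, and only closed loops (no loose ends) contribute.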
Estimations of ABL fluxes and other turbulence parameters from Doppler lidar data
NASA Technical Reports Server (NTRS)
Gal-Chen, Tzvi; Xu, Mei; Eberhard, Wynn
1989-01-01
Techniques for extracting boundary layer parameters from measurements with a short-pulse CO2 Doppler lidar are described. The measurements are those collected during the First International Satellite Land Surface Climatology Project (ISLSCP) Field Experiment (FIFE). By continuously operating the lidar for about an hour, stable statistics of the radial velocities can be extracted. Assuming that the turbulence is horizontally homogeneous, the mean wind, its standard deviations, and the momentum fluxes were estimated. Spectral analysis of the radial velocities is also performed, from which, by examining the amplitude of the power spectrum in the inertial range, the kinetic energy dissipation was deduced. Finally, using the statistical form of the Navier-Stokes equations, the surface heat flux is derived as the residual balance between the vertical gradient of the third moment of the vertical velocity and the kinetic energy dissipation. Combining many measurements would normally reduce the error, provided that it is unbiased and uncorrelated. The nature of some of the algorithms, however, is such that biased and correlated errors may be generated even though the raw measurements are not. Data processing procedures were developed that eliminate bias and minimize error correlation. Once bias and error correlations are accounted for, the large sample size is shown to reduce the errors substantially. The principal features of the derived turbulence statistics for the two cases studied are presented.
Kumar Myakalwar, Ashwin; Spegazzini, Nicolas; Zhang, Chi; Kumar Anubham, Siva; Dasari, Ramachandra R; Barman, Ishan; Kumar Gundawar, Manoj
2015-08-19
Despite its intrinsic advantages, translation of laser-induced breakdown spectroscopy for material identification has often been impeded by the lack of robustness of the classification models developed, often due to the presence of spurious correlations. While a number of classifiers exhibiting high discriminatory power have been reported, efforts to establish the subset of relevant spectral features that enable a fundamental interpretation of the segmentation capability and avoid the 'curse of dimensionality' have been lacking. Using LIBS data acquired from a set of secondary explosives, we investigate judicious feature selection approaches and architect two different chemometric classifiers, based on feature selection through prerequisite knowledge of the sample composition and through a genetic algorithm, respectively. While the full spectral input results in a classification rate of ca. 92%, selection of only the carbon-to-hydrogen spectral window results in near-identical performance. Importantly, the genetic-algorithm-derived classifier shows a statistically significant improvement to ca. 94% accuracy for prospective classification, even though the number of features used is an order of magnitude smaller. Our findings demonstrate the impact of rigorous feature selection in LIBS and also hint at the feasibility of using a discrete-filter-based detector, thereby enabling a cheaper and more compact system more amenable to field operations.
Kumar Myakalwar, Ashwin; Spegazzini, Nicolas; Zhang, Chi; Kumar Anubham, Siva; Dasari, Ramachandra R.; Barman, Ishan; Kumar Gundawar, Manoj
2015-01-01
Despite its intrinsic advantages, translation of laser-induced breakdown spectroscopy for material identification has often been impeded by the lack of robustness of the classification models developed, often due to the presence of spurious correlations. While a number of classifiers exhibiting high discriminatory power have been reported, efforts to establish the subset of relevant spectral features that enable a fundamental interpretation of the segmentation capability and avoid the 'curse of dimensionality' have been lacking. Using LIBS data acquired from a set of secondary explosives, we investigate judicious feature selection approaches and architect two different chemometric classifiers, based on feature selection through prerequisite knowledge of the sample composition and through a genetic algorithm, respectively. While the full spectral input results in a classification rate of ca. 92%, selection of only the carbon-to-hydrogen spectral window results in near-identical performance. Importantly, the genetic-algorithm-derived classifier shows a statistically significant improvement to ca. 94% accuracy for prospective classification, even though the number of features used is an order of magnitude smaller. Our findings demonstrate the impact of rigorous feature selection in LIBS and also hint at the feasibility of using a discrete-filter-based detector, thereby enabling a cheaper and more compact system more amenable to field operations. PMID:26286630
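A hedged sketch of genetic-algorithm feature (spectral channel) selection for a classifier, with cross-validated accuracy as the fitness; the population mechanics are deliberately minimal, the classifier is a stand-in, and all names and sizes are illustrative:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n, p, informative = 120, 200, 8  # spectra x channels, few useful channels
X = rng.normal(size=(n, p))
y = (X[:, :informative].sum(axis=1) > 0).astype(int)

def fitness(mask):
    if mask.sum() == 0:
        return 0.0
    clf = LogisticRegression(max_iter=500)
    return cross_val_score(clf, X[:, mask], y, cv=3).mean()

pop = rng.random((30, p)) < 0.05  # sparse random feature masks
for gen in range(15):
    scores = np.array([fitness(m) for m in pop])
    parents = pop[np.argsort(scores)[-10:]]  # keep the 10 fittest
    kids = []
    for _ in range(len(pop)):
        a, b = parents[rng.integers(10, size=2)]
        cut = rng.integers(p)
        child = np.r_[a[:cut], b[cut:]]   # one-point crossover
        child ^= rng.random(p) < 0.01     # bit-flip mutation
        kids.append(child)
    pop = np.array(kids)

best = pop[np.argmax([fitness(m) for m in pop])]
print("selected channels:", np.flatnonzero(best), "acc:", fitness(best))
```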
Volatility behavior of visibility graph EMD financial time series from Ising interacting system
NASA Astrophysics Data System (ADS)
Zhang, Bo; Wang, Jun; Fang, Wen
2015-08-01
A financial market dynamics model is developed and investigated using a stochastic Ising system, the Ising model being the most popular ferromagnetic model in statistical physics. Applying two graph-based analyses and the multiscale entropy method, we investigate and compare the statistical volatility behavior of return time series and the corresponding IMF series derived from the empirical mode decomposition (EMD) method, and real stock market indices are studied comparatively alongside the simulation data of the proposed model. Further, we find that the degree distribution of the visibility graph for the simulated series has power-law tails, and that the assortative network exhibits the mixing-pattern property. All these features are in agreement with the real market data; the research confirms that the financial model established on the Ising system is reasonable.
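A minimal natural-visibility-graph construction (the quadratic-time definition: nodes are time points, linked whenever the straight line between two samples clears every intermediate sample), plus the degree sequence whose tail behavior the paper examines. The series is a synthetic heavy-tailed stand-in for returns:

```python
import numpy as np

def visibility_graph_degrees(x):
    """Node degrees of the natural visibility graph of series x:
    points a and b are linked if every intermediate sample lies
    strictly below the line connecting (a, x[a]) and (b, x[b])."""
    n = len(x)
    deg = np.zeros(n, dtype=int)
    for a in range(n):
        for b in range(a + 1, n):
            t = np.arange(a + 1, b)
            line = x[a] + (x[b] - x[a]) * (t - a) / (b - a)
            if np.all(x[a + 1:b] < line):
                deg[a] += 1
                deg[b] += 1
    return deg

rng = np.random.default_rng(0)
returns = rng.standard_t(df=3, size=500)  # heavy-tailed toy return series
deg = visibility_graph_degrees(returns)
print("max degree:", deg.max(), "mean degree:", deg.mean())
```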
Traffic Flow of Interacting Self-Driven Particles: Rails and Trails, Vehicles and Vesicles
NASA Astrophysics Data System (ADS)
Chowdhury, Debashish
One common feature of a vehicle, an ant and a kinesin motor is that they all convert chemical energy, derived from fuel or food, into mechanical energy required for their forward movement; such objects have been modelled in recent years as self-driven particles. Cytoskeletal filaments, e.g., microtubules, form a rail network for intra-cellular transport of vesicular cargo by molecular motors like, for example, kinesins. Similarly, ants move along trails while vehicles move along lanes. Therefore, the traffic of vehicles and organisms as well as that of molecular motors can be modelled as systems of interacting self-driven particles; these are of current interest in non-equilibrium statistical mechanics. In this paper we point out the common features of these model systems and emphasize the crucial differences in their physical properties.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cavanaugh, J.E.; McQuarrie, A.D.; Shumway, R.H.
Conventional methods for discriminating between earthquakes and explosions at regional distances have concentrated on extracting specific features such as amplitude and spectral ratios from the waveforms of the P and S phases. We consider here an optimum nonparametric classification procedure derived from the classical approach to discriminating between two Gaussian processes with unequal spectra. Two robust variations based on the minimum discrimination information statistic and Renyi's entropy are also considered. We compare the optimum classification procedure with various amplitude and spectral ratio discriminants and show that its performance is superior when applied to a small population of 8 land-based earthquakes and 8 mining explosions recorded in Scandinavia. Several parametric characterizations of the notion of complexity, based on modeling earthquakes and explosions as autoregressive or modulated autoregressive processes, are also proposed and their performance compared with the nonparametric and feature extraction approaches.
Statistical properties of Chinese phonemic networks
NASA Astrophysics Data System (ADS)
Yu, Shuiyuan; Liu, Haitao; Xu, Chunshan
2011-04-01
The study of the properties of speech sound systems is of great significance in understanding the human cognitive mechanism and the working principles of speech sound systems. Some properties of speech sound systems, such as the listener-oriented feature and the talker-oriented feature, have been unveiled through the statistical study of phonemes in human languages and research on the interrelations between human articulatory gestures and the corresponding acoustic parameters. Treating all the phonemes of a speech sound system as a coherent whole, our research, which focuses on the dynamic properties of speech sound systems in operation, investigates some statistical parameters of Chinese phoneme networks based on real text and dictionaries. The findings are as follows: phonemic networks have high connectivity degrees and short average distances; the degrees obey a normal distribution and the weighted degrees obey a power-law distribution; vowels enjoy higher priority than consonants in the actual operation of speech sound systems; and the phonemic networks have high robustness against targeted attacks and random errors. In addition, to investigate the structural properties of a speech sound system, a statistical study of dictionaries is conducted, which shows the higher frequency of shorter words and syllables and the tendency that the longer a word is, the shorter the syllables composing it are. From these structural and dynamic properties one can derive the following conclusion: the static structure of a speech sound system tends to promote communication efficiency and save articulation effort, while the dynamic operation of this system gives preference to reliable transmission and easy recognition. In short, a speech sound system is an effective, efficient and reliable communication system optimized in many aspects.
Deep-learning derived features for lung nodule classification with limited datasets
NASA Astrophysics Data System (ADS)
Thammasorn, P.; Wu, W.; Pierce, L. A.; Pipavath, S. N.; Lampe, P. D.; Houghton, A. M.; Haynor, D. R.; Chaovalitwongse, W. A.; Kinahan, P. E.
2018-02-01
Only a few percent of the indeterminate nodules found in lung CT images are cancer. However, enabling earlier diagnosis is important to spare patients with benign nodules invasive procedures or long-term surveillance. We evaluate a classification framework using radiomics features derived with a machine learning approach from a small dataset of indeterminate CT lung nodule images. We used a retrospective analysis of 194 cases with pulmonary nodules in CT images, with or without contrast enhancement, from lung cancer screening clinics. The nodules were contoured by a radiologist and texture features of the lesion were calculated. In addition, semantic features describing shape were categorized. We also explored a Multiband network, a feature derivation path that uses a modified convolutional neural network (CNN) with a Triplet Network, trained to create discriminative feature representations useful for variable-sized nodule classification. The diagnostic accuracy was evaluated for multiple machine learning algorithms using texture, shape, and CNN features. In the CT contrast-enhanced group, the texture or semantic shape features yielded an overall diagnostic accuracy of 80%. Use of a standard deep learning network in the framework for feature derivation yielded features that substantially underperformed compared to texture and/or semantic features. However, the proposed Multiband approach to feature derivation produced results similar in diagnostic accuracy to the texture and semantic features. While the Multiband feature derivation approach did not outperform the texture and/or semantic features, its equivalent performance indicates promise for future improvements to increase diagnostic accuracy. Importantly, the Multiband approach adapts readily to lesions of different sizes without interpolation, and performed well with a relatively small amount of training data.
McGovern, Donna L; Mosier, Philip D; Roth, Bryan L; Westkaemper, Richard B
2010-04-01
The highly potent and kappa-opioid (KOP) receptor-selective hallucinogen Salvinorin A and selected analogs have been analyzed using the 3D quantitative structure-affinity relationship technique Comparative Molecular Field Analysis (CoMFA) in an effort to derive a statistically significant and predictive model of salvinorin affinity at the KOP receptor and to provide additional statistical support for the validity of previously proposed structure-based interaction models. Two CoMFA models of Salvinorin A analogs substituted at the C-2 position are presented. Separate models were developed based on the radioligand used in the kappa-opioid binding assay, [(3)H]diprenorphine or [(125)I]6 beta-iodo-3,14-dihydroxy-17-cyclopropylmethyl-4,5 alpha-epoxymorphinan ([(125)I]IOXY). For each dataset, three methods of alignment were employed: a receptor-docked alignment derived from the structure-based docking algorithm GOLD, another from the ligand-based alignment algorithm FlexS, and a rigid realignment of the poses from the receptor-docked alignment. The receptor-docked alignment produced statistically superior results compared to either the FlexS alignment or the realignment in both datasets. The [(125)I]IOXY set (Model 1) and [(3)H]diprenorphine set (Model 2) gave q(2) values of 0.592 and 0.620, respectively, using the receptor-docked alignment, and both models produced similar CoMFA contour maps that reflected the stereoelectronic features of the receptor model from which they were derived. Each model gave significantly predictive CoMFA statistics (Model 1 PSET r(2)=0.833; Model 2 PSET r(2)=0.813). Based on the CoMFA contour maps, a binding mode was proposed for amine-containing Salvinorin A analogs that provides a rationale for the observation that the beta-epimers (R-configuration) of protonated amines at the C-2 position have a higher affinity than the corresponding alpha-epimers (S-configuration).
Cluster expansion for ground states of local Hamiltonians
NASA Astrophysics Data System (ADS)
Bastianello, Alvise; Sotiriadis, Spyros
2016-08-01
A central problem in many-body quantum physics is the determination of the ground state of a thermodynamically large physical system. We construct a cluster expansion for ground states of local Hamiltonians, which naturally incorporates physical requirements inherited by locality as conditions on its cluster amplitudes. Applying a diagrammatic technique we derive the relation of these amplitudes to thermodynamic quantities and local observables. Moreover we derive a set of functional equations that determine the cluster amplitudes for a general Hamiltonian, verify the consistency with perturbation theory and discuss non-perturbative approaches. Lastly we verify the persistence of locality features of the cluster expansion under unitary evolution with a local Hamiltonian and provide applications to out-of-equilibrium problems: a simplified proof of equilibration to the GGE and a cumulant expansion for the statistics of work, for an interacting-to-free quantum quench.
Quantum weak turbulence with applications to semiconductor lasers
NASA Astrophysics Data System (ADS)
Lvov, Yuri Victorovich
Based on a model Hamiltonian appropriate for the description of fermionic systems such as semiconductor lasers, we describe a natural asymptotic closure of the BBGKY hierarchy in complete analogy with that derived for classical weak turbulence. The main features of the interaction Hamiltonian are the inclusion of full Fermi statistics containing Pauli blocking and a simple, phenomenological, uniformly weak two-particle interaction potential equivalent to the static screening approximation. The resulting asymptotic closure and quantum kinetic Boltzmann equation are derived in a self-consistent manner without resorting to a priori statistical hypotheses or cumulant discard assumptions. We find a new class of solutions to the quantum kinetic equation which are analogous to the Kolmogorov spectra of hydrodynamics and classical weak turbulence. They involve finite fluxes of particles and energy across momentum space and are particularly relevant for describing the behavior of systems containing sources and sinks. We explore these solutions using a differential approximation to the collision integral. We make a prima facie case that these finite-flux solutions can be important in the context of semiconductor lasers, and we show that semiconductor laser output efficiency can be improved by exciting them. Numerical simulations of the semiconductor Maxwell-Bloch equations support this claim.
Multiscale analysis of the CMB temperature derivatives
DOE Office of Scientific and Technical Information (OSTI.GOV)
Marcos-Caballero, A.; Martínez-González, E.; Vielva, P.
2017-02-01
We study the Planck CMB temperature at different scales through its derivatives up to second order, which allows one to characterize the local shape and isotropy of the field. The problem of having an incomplete sky in the calculation and statistical characterization of the derivatives is addressed in the paper. The analysis confirms the existence of a low variance in the CMB at large scales, which is also noticeable in the derivatives. Moreover, deviations from the standard model in the gradient, curvature and eccentricity tensor are studied in terms of extreme values in the data. As expected, the Cold Spot is detected as one of the most prominent peaks in terms of curvature, but additionally, when the information from the temperature and its Laplacian is combined, another feature with similar probability at the scale of 10° is also observed. However, the p-values of these two deviations increase above 6% when they are referred to the variance calculated from the theoretical fiducial model, indicating that these deviations can be associated with the low variance anomaly. Finally, an estimator of directional anisotropy for spinorial quantities is introduced and applied to the spinors derived from the field derivatives. An anisotropic direction whose probability is <1% is detected in the eccentricity tensor.
Weak shock propagation through a turbulent atmosphere
NASA Technical Reports Server (NTRS)
Pierce, Allan D.; Sparrow, Victor W.
1990-01-01
Consideration is given to the propagation through turbulence of transient pressure waveforms whose initial onset at any given point is an abrupt shock. The work is motivated by the desire to eventually develop a mathematical model for predicting statistical features, such as peak overpressures and spike widths, of sonic booms generated by supersonic aircraft. It is argued that the transient waveform received at points where x > 0 will begin with a pressure jump, and a formulation is developed for predicting the magnitude of this jump and the time derivatives of the pressure waveform immediately following it.
feets: feATURE eXTRACTOR for tIME sERIES
NASA Astrophysics Data System (ADS)
Cabral, Juan; Sanchez, Bruno; Ramos, Felipe; Gurovich, Sebastián; Granitto, Pablo; VanderPlas, Jake
2018-06-01
feets characterizes and analyzes light-curves from astronomical photometric databases for modelling, classification, data cleaning, outlier detection and data analysis. It uses machine learning algorithms to determine the numerical descriptors that characterize and distinguish the different variability classes of light-curves; these range from basic statistical measures such as the mean or standard deviation to complex time-series characteristics such as the autocorrelation function. The library is not restricted to the astronomical field and could also be applied to any kind of time series. This project is a derivative work of FATS (ascl:1711.017).
NASA Technical Reports Server (NTRS)
Prill, J. C. (Principal Investigator)
1979-01-01
The author has identified the following significant results. Level 2 forest features (softwood, hardwood, clear-cut, and water) can be classified with an overall accuracy of 71.6 percent plus or minus 6.7 percent at the 90 percent confidence level for the particular data and conditions existing at the time of the study. Signatures derived from training fields taken from only 10 percent of the site are not sufficient to adequately classify the site. The level 3 softwood age group classification appears reasonable, although no statistical evaluation was performed.
Combinatorial interpretation of Haldane-Wu fractional exclusion statistics.
Aringazin, A K; Mazhitov, M I
2002-08-01
Assuming that the maximal allowed number of identical particles in a state is an integer parameter, q, we derive the statistical weight and analyze the associated equation that defines the statistical distribution. The derived distribution covers the Fermi-Dirac and Bose-Einstein distributions in the particular cases q = 1 and q → ∞ (n_i/q → 1), respectively. We show that the derived statistical weight provides a natural combinatorial interpretation of Haldane-Wu fractional exclusion statistics, and we present exact solutions of the distribution equation.
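As a hedged illustration (this is the standard Gentile-type intermediate statistics, which matches the limiting cases quoted in the abstract; the paper's exact weight may differ in detail), the mean occupation for at most q particles per state can be written as

```latex
\langle n_i \rangle
  = \frac{1}{e^{\beta(\varepsilon_i-\mu)} - 1}
  - \frac{q+1}{e^{(q+1)\beta(\varepsilon_i-\mu)} - 1},
```

which reduces to the Fermi-Dirac form for q = 1 and approaches Bose-Einstein as q → ∞.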
Local Structure Theory for Cellular Automata.
NASA Astrophysics Data System (ADS)
Gutowitz, Howard Andrew
The local structure theory (LST) is a generalization of the mean field theory for cellular automata (CA). The mean field theory makes the assumption that iterative application of the rule does not introduce correlations between the states of cells in different positions. This assumption allows the derivation of a simple formula for the limit density of each possible state of a cell. The most striking feature of CA is that they may well generate correlations between the states of cells as they evolve. The LST takes the generation of correlation explicitly into account. It thus has the potential to describe statistical characteristics in detail. The basic assumption of the LST is that though correlation may be generated by CA evolution, this correlation decays with distance. This assumption allows the derivation of formulas for the estimation of the probability of large blocks of states in terms of smaller blocks of states. Given the probabilities of blocks of size n, probabilities may be assigned to blocks of arbitrary size such that these probability assignments satisfy the Kolmogorov consistency conditions and hence may be used to define a measure on the set of all possible (infinite) configurations. Measures defined in this way are called finite (or n-) block measures. A function called the scramble operator of order n maps a measure to an approximating n-block measure. The action of a CA on configurations induces an action on measures on the set of all configurations. The scramble operator is combined with the CA map on measure to form the local structure operator (LSO). The LSO of order n maps the set of n-block measures into itself. It is hypothesised that the LSO applied to n-block measures approximates the rule itself on general measures, and does so increasingly well as n increases. The fundamental advantage of the LSO is that its action is explicitly computable from a finite system of rational recursion equations. Empirical study of a number of CA rules demonstrates the potential of the LST to describe the statistical features of CA. The behavior of some simple rules is derived analytically. Other rules have more complex, chaotic behavior. Even for these rules, the LST yields an accurate portrait of both small and large time statistics.
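A minimal sketch of the order-1 case, i.e., the mean-field approximation that the LST generalizes, for an elementary CA: assuming independent cells with density p, the next-step density is the probability that a random three-cell neighborhood maps to 1. The rule number is an illustrative choice:

```python
def mean_field_map(rule, p):
    """Order-1 (mean-field) density map for an elementary CA: sum over
    the 8 neighborhoods that map to 1, weighting each by its probability
    under an independent (product) measure with density p."""
    p_next = 0.0
    for nbhd in range(8):
        if (rule >> nbhd) & 1:  # neighborhood `nbhd` maps to 1
            ones = bin(nbhd).count("1")
            p_next += p**ones * (1 - p)**(3 - ones)
    return p_next

# Iterate to the mean-field fixed point for rule 22 (example choice);
# for rule 22 the map is p' = 3 p (1-p)^2, with fixed point 1 - 1/sqrt(3).
p = 0.3
for _ in range(100):
    p = mean_field_map(22, p)
print("mean-field limit density:", round(p, 4))
```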
Fault Diagnosis for Rotating Machinery Using Vibration Measurement Deep Statistical Feature Learning
Li, Chuan; Sánchez, René-Vinicio; Zurita, Grover; Cerrada, Mariela; Cabrera, Diego
2016-01-01
Fault diagnosis is important for the maintenance of rotating machinery. The detection of faults and fault patterns is a challenging part of machinery fault diagnosis. To tackle this problem, a model for deep statistical feature learning from vibration measurements of rotating machinery is presented in this paper. Vibration sensor signals collected from rotating mechanical systems are represented in the time, frequency, and time-frequency domains, each of which is then used to produce a statistical feature set. For learning statistical features, real-valued Gaussian-Bernoulli restricted Boltzmann machines (GRBMs) are stacked to develop a Gaussian-Bernoulli deep Boltzmann machine (GDBM). The suggested approach is applied as a deep statistical feature learning tool for both gearbox and bearing systems. The fault classification performances in experiments using this approach are 95.17% for the gearbox and 91.75% for the bearing system. The proposed approach is compared to standard methods such as the support vector machine, the GRBM, and a combination model. In the experiments, the best fault classification rate was achieved by the proposed model. The results show that deep learning with statistical feature extraction has substantial potential for improving the diagnosis of rotating machinery faults. PMID:27322273
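The abstract does not enumerate the statistical features themselves; the sketch below shows one plausible feature set computed per domain with SciPy (the sampling rate, the choice of statistics, and the random stand-in record are our assumptions).

```python
import numpy as np
from scipy import stats, signal

# One candidate statistical feature set (mean, std, RMS, skewness,
# kurtosis, crest factor) applied to any 1-D array of values.
def stat_features(x):
    rms = np.sqrt(np.mean(x**2))
    return np.array([x.mean(), x.std(), rms,
                     stats.skew(x), stats.kurtosis(x),
                     np.max(np.abs(x)) / rms])

fs = 12_000                                  # assumed sampling rate (Hz)
x = np.random.randn(fs)                      # stand-in for one vibration record
f, Pxx = signal.welch(x, fs=fs)              # frequency-domain representation
f_t, t, Sxx = signal.spectrogram(x, fs=fs)   # time-frequency representation

features = np.concatenate([stat_features(x),             # time domain
                           stat_features(Pxx),           # frequency domain
                           stat_features(Sxx.ravel())])  # time-frequency domain
```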
Propelled microprobes in turbulence
NASA Astrophysics Data System (ADS)
Calzavarini, E.; Huang, Y. X.; Schmitt, F. G.; Wang, L. P.
2018-05-01
The temporal statistics of incompressible fluid velocity and passive scalar fields in developed turbulent conditions are investigated by means of direct numerical simulations along the trajectories of self-propelled pointlike probes drifting in a flow. Such probes are characterized by a propulsion velocity that is fixed in intensity and direction; however, like vessels in a flow, they are continuously deviated from their intended course by the local sweeping of the fluid flow. The time series recorded by these moving probes represent the simplest realization of transect measurements in a fluid flow environment. We investigate the nontrivial combination of Lagrangian and Eulerian statistical properties displayed by the transect time series. We show that, as a result of the homogeneity and isotropy of the flow, the single-point acceleration statistics of the probes follow a predictable trend as the propulsion speed is varied, a feature that is also present in the scalar time-derivative fluctuations. Further, by focusing on two-time statistics, we characterize how the Lagrangian-to-Eulerian transition occurs as the propulsion velocity increases. The analysis of the intermittency of temporal increments highlights in a striking way the opposite trends displayed by the fluid velocity and passive scalars.
NASA Astrophysics Data System (ADS)
Vallam, P.; Qin, X. S.
2017-10-01
Anthropogenically driven climate change would affect the global ecosystem and has become a worldwide concern. Numerous studies have been undertaken to determine the future trends of meteorological variables at different scales. Despite these studies, there remains significant uncertainty in the prediction of future climates. To examine the uncertainty arising from using different schemes to downscale the meteorological variables for future horizons, projections from different statistical downscaling schemes were examined. These schemes included the statistical downscaling method (SDSM), change factors incorporated with LARS-WG, and the bias-corrected disaggregation (BCD) method. Global circulation models (GCMs) based on CMIP3 (HadCM3) and CMIP5 (CanESM2) were utilized to perturb the changes in the future climate. Five study sites (i.e., Alice Springs, Edmonton, Frankfurt, Miami, and Singapore) with diverse climatic conditions were chosen to examine the spatial variability of applying various statistical downscaling schemes. The study results indicated that regions experiencing heavy precipitation intensities were the most likely to show divergence between the predictions from the various statistical downscaling methods. Also, the variance computed in projecting the weather extremes indicated the uncertainty derived from the selection of downscaling tools and climate models. This study could help to gain an improved understanding of the features of different downscaling approaches and of the overall downscaling uncertainty.
A Study of Feature Extraction Using Divergence Analysis of Texture Features
NASA Technical Reports Server (NTRS)
Hallada, W. A.; Bly, B. G.; Boyd, R. K.; Cox, S.
1982-01-01
An empirical study of texture analysis for feature extraction and classification of high spatial resolution remotely sensed imagery (10 meters) is presented in terms of specific land cover types. The principal method examined is the use of spatial gray tone dependence (SGTD). The SGTD method reduces the gray levels within a moving window into a two-dimensional spatial gray tone dependence matrix, which can be interpreted as a probability matrix of gray tone pairs. Haralick et al. (1973) used a number of information-theory measures to extract texture features from these matrices, including angular second moment (inertia), correlation, entropy, homogeneity, and energy. The derivation of the SGTD matrix is a function of: (1) the number of gray tones in an image; (2) the angle along which the frequency of SGTD is calculated; (3) the size of the moving window; and (4) the distance between gray tone pairs. The first three parameters were varied and tested on a 10 meter resolution panchromatic image of Maryville, Tennessee, using the five SGTD measures. A transformed divergence measure was used to determine the statistical separability between four land cover categories (forest, new residential, old residential, and industrial) for each variation in texture parameters.
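A minimal sketch of the SGTD computation for a single offset (distance 1, horizontal angle), together with three of the measures named above; window handling is omitted and all names are ours.

```python
import numpy as np

# Build a symmetric, normalized gray-tone co-occurrence matrix for one
# (dy, dx) offset; entry P[i, j] is the probability of gray-tone pair (i, j).
def sgtd_matrix(img, levels, dy=0, dx=1):
    P = np.zeros((levels, levels))
    h, w = img.shape
    for y in range(h - dy):
        for x in range(w - dx):
            P[img[y, x], img[y + dy, x + dx]] += 1
    P += P.T                       # count unordered pairs symmetrically
    return P / P.sum()

img = np.random.randint(0, 8, size=(64, 64))    # stand-in 8-level window
P = sgtd_matrix(img, levels=8)
i, j = np.indices(P.shape)
asm = np.sum(P**2)                               # angular second moment
entropy = -np.sum(P[P > 0] * np.log(P[P > 0]))   # entropy
homogeneity = np.sum(P / (1.0 + (i - j)**2))     # homogeneity
```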
A Santos, Jose C; Nassif, Houssam; Page, David; Muggleton, Stephen H; E Sternberg, Michael J
2012-07-11
There is a need for automated methods to learn general features of the interactions of a ligand class with its diverse set of protein receptors. An appropriate machine learning approach is Inductive Logic Programming (ILP), which automatically generates comprehensible rules in addition to prediction. The development of ILP systems that can learn rules of the complexity required for studies on protein structure remains a challenge. In this work we use a new ILP system, ProGolem, and demonstrate its performance on learning features of hexose-protein interactions. The rules induced by ProGolem detect interactions mediated by aromatics and by planar-polar residues, in addition to less common features such as the aromatic sandwich. The rules also reveal a previously unreported dependency involving the residues Cys and Leu, and specify interactions involving aromatic and hydrogen-bonding residues. This paper shows that Inductive Logic Programming implemented in ProGolem can derive rules giving structural features of protein/ligand interactions. Several of these rules are consistent with descriptions in the literature. In addition to confirming literature results, ProGolem's model has a 10-fold cross-validated predictive accuracy that is superior, at the 95% confidence level, to another ILP system previously used to study protein/hexose interactions and is comparable with state-of-the-art statistical learners.
Free-Form Region Description with Second-Order Pooling.
Carreira, João; Caseiro, Rui; Batista, Jorge; Sminchisescu, Cristian
2015-06-01
Semantic segmentation and object detection are nowadays dominated by methods operating on regions obtained from a bottom-up grouping process (segmentation), yet they use feature extractors developed for recognition on fixed-form (e.g., rectangular) patches, with full images as a special case. This is most likely suboptimal. In this paper we focus on feature extraction and description over free-form regions and study the relationship with their fixed-form counterparts. Our main contributions are novel pooling techniques that capture the second-order statistics of local descriptors inside such free-form regions. We introduce second-order generalizations of average and max-pooling that, together with appropriate non-linearities derived from the mathematical structure of their embedding space, lead to state-of-the-art recognition performance in semantic segmentation experiments without any type of local feature coding. In contrast, we show that codebook-based local feature coding is more important when feature extraction must operate over regions that include both foreground and large portions of the background, as is typical in image classification settings. For high-accuracy localization setups, second-order pooling over free-form regions produces results superior to those of the winning systems in the contemporary semantic segmentation challenges, with models that are much faster in both training and testing.
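A minimal sketch of second-order average-pooling with a log-Euclidean non-linearity, assuming local descriptors have already been extracted inside the free-form region; the ε-regularizer and dimensions are illustrative.

```python
import numpy as np
from scipy.linalg import logm

# Second-order average-pooling: aggregate outer products of descriptors,
# then map the resulting SPD matrix through the matrix logarithm.
def second_order_avg_pool(D, eps=1e-3):
    # D: (n_descriptors, d) local descriptors from inside the region
    G = D.T @ D / D.shape[0]                       # second-order statistic
    return logm(G + eps * np.eye(D.shape[1])).real.ravel()

region_descriptors = np.random.rand(500, 16)       # stand-in for e.g. SIFT
feature = second_order_avg_pool(region_descriptors)
```

A second-order max-pooling variant would replace the average of outer products with their element-wise maximum.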
Comparison of organs' shapes with geometric and Zernike 3D moments.
Broggio, D; Moignier, A; Ben Brahim, K; Gardumi, A; Grandgirard, N; Pierrat, N; Chea, M; Derreumaux, S; Desbrée, A; Boisserie, G; Aubert, B; Mazeron, J-J; Franck, D
2013-09-01
The morphological similarity of organs is studied with feature vectors based on geometric and Zernike 3D moments. In particular, it is investigated whether outliers and average models can be identified. For this purpose, the relative proximity to the mean feature vector is defined, and principal coordinate and clustering analyses are also performed. To study the consistency and usefulness of this approach, 17 liver and 76 heart voxel models from several sources are considered. In the liver case, models with similar morphological features are identified. For the limited number of cases studied, the liver of the ICRP male voxel model is identified as a better surrogate than the female one. For hearts, the clustering analysis shows that three heart shapes represent about 80% of the morphological variations. The relative proximity and clustering analyses rather consistently identify outliers and average models. For the two cases, identification of outliers and of surrogates for average models is rather robust. However, deeper classification of morphological features is subject to caution and can only be performed after cross-analysis of at least two kinds of feature vectors. Finally, the Zernike moments contain all the information needed to reconstruct the studied objects and thus appear to be a promising tool for deriving statistical organ shapes. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Tustison, Nicholas J; Shrinidhi, K L; Wintermark, Max; Durst, Christopher R; Kandel, Benjamin M; Gee, James C; Grossman, Murray C; Avants, Brian B
2015-04-01
Segmenting and quantifying gliomas from MRI is an important task for diagnosis, planning intervention, and tracking tumor changes over time. However, this task is complicated by the lack of prior knowledge concerning tumor location, spatial extent, shape, possible displacement of normal tissue, and intensity signature. To accommodate such complications, we introduce a framework for supervised segmentation based on multiple-modality intensity, geometry, and asymmetry feature sets. These features drive a supervised whole-brain and tumor segmentation approach based on random forest-derived probabilities. The asymmetry-related features (based on optimal symmetric multimodal templates) demonstrate excellent discriminative properties within this framework. We also gain performance by generating probability maps from random forest models and using these maps for a refining Markov random field regularized probabilistic segmentation. This strategy allows us to interface the supervised learning capabilities of the random forest model with regularized probabilistic segmentation using the recently developed ANTsR package, a comprehensive statistical and visualization interface between the popular Advanced Normalization Tools (ANTs) and the R statistical project. The reported algorithmic framework was the top-performing entry in the MICCAI 2013 Multimodal Brain Tumor Segmentation challenge. The challenge data varied widely, consisting of four-modality MRI of both high-grade and low-grade glioma tumors from five different institutions. Average Dice overlap measures for the final algorithmic assessment were 0.87, 0.78, and 0.74 for "complete", "core", and "enhanced" tumor components, respectively.
Qi, Da; Zhang, Huaizhong; Fan, Jun; Perkins, Simon; Pisconti, Addolorata; Simpson, Deborah M; Bessant, Conrad; Hubbard, Simon; Jones, Andrew R
2015-09-01
The mzQuantML standard has been developed by the Proteomics Standards Initiative for capturing, archiving and exchanging quantitative proteomic data derived from mass spectrometry. It is a rich XML-based format, capable of representing data about two-dimensional features from LC-MS data, and peptides, proteins or groups of proteins that have been quantified from multiple samples. In this article we report the development of an open-source Java-based library of routines for mzQuantML, called the mzqLibrary, and associated software for visualising data, called the mzqViewer. The mzqLibrary contains routines for mapping (peptide) identifications onto quantified features, inference of protein (group)-level quantification values from peptide-level values, normalisation, and basic statistics for differential expression. These routines can be accessed via the command line, via a Java programming interface, or via a basic graphical user interface. The mzqLibrary also contains several file format converters, including import converters (to mzQuantML) from OpenMS, Progenesis LC-MS and MaxQuant, and exporters (from mzQuantML) to other standards or useful formats (mzTab, HTML, csv). The mzqViewer contains in-built routines for viewing the tables of data (about features, peptides or proteins), and connects to the R statistical library for more advanced plotting options. The mzqLibrary and mzqViewer packages are available from https://code.google.com/p/mzq-lib/. © 2015 The Authors. PROTEOMICS Published by Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
Docking and multivariate methods to explore HIV-1 drug-resistance: a comparative analysis
NASA Astrophysics Data System (ADS)
Almerico, Anna Maria; Tutone, Marco; Lauria, Antonino
2008-05-01
In this paper we describe a comparative analysis between multivariate and docking methods in the study of drug resistance to reverse transcriptase and protease inhibitors. In our earlier papers we developed a simple but efficient method to evaluate the features of compounds that are less likely to trigger resistance or are effective against mutant HIV strains, using the multivariate statistical procedures PCA and DA. In an attempt to create a more solid background for the prediction of susceptibility or resistance, we carried out a comparative analysis between our previous multivariate approach and a molecular docking study. The intent of this paper is not only to find further support for the results obtained by the combined use of PCA and DA, but also to bring out the structural features, in terms of molecular descriptors, similarity, and energetic contributions derived from docking, which can account for the emergence of drug resistance against mutant strains.
NASA Astrophysics Data System (ADS)
Liang, Yun-Feng; Shen, Zhao-Qiang; Li, Xiang; Fan, Yi-Zhong; Huang, Xiaoyuan; Lei, Shi-Jun; Feng, Lei; Liang, En-Wei; Chang, Jin
2016-05-01
Galaxy clusters are the largest gravitationally bound objects in the Universe and may be suitable targets for indirect dark matter searches. With 85 months of Fermi LAT Pass 8 publicly available data, we analyze the gamma-ray emission in the direction of 16 nearby galaxy clusters with an unbinned likelihood analysis. No statistically (globally) significant γ-ray line feature is identified, and a tentative line signal may be present at ~43 GeV. The 95% confidence level upper limits on the velocity-averaged cross section of dark matter particles annihilating into double γ rays (i.e., ⟨σv⟩_{χχ→γγ}) are derived. Unless very optimistic boost factors of dark matter annihilation in these galaxy clusters are assumed, such constraints are much weaker than the bounds set by the Galactic γ-ray data.
NITPICK: peak identification for mass spectrometry data
Renard, Bernhard Y; Kirchner, Marc; Steen, Hanno; Steen, Judith AJ; Hamprecht, Fred A
2008-01-01
Background: The reliable extraction of features from mass spectra is a fundamental step in the automated analysis of proteomic mass spectrometry (MS) experiments. Results: This contribution proposes a sparse template regression approach to peak picking called NITPICK. NITPICK is a Non-greedy, Iterative Template-based peak PICKer that deconvolves complex overlapping isotope distributions in multicomponent mass spectra. NITPICK is based on fractional averagine, a novel extension to Senko's well-known averagine model, and on a modified version of sparse, non-negative least angle regression, for which a suitable, statistically motivated early stopping criterion has been derived. The strength of NITPICK is the deconvolution of overlapping mixture mass spectra. Conclusion: Extensive comparative evaluation has been carried out and results are provided for simulated and real-world data sets. NITPICK outperforms pepex, to date the only alternative, publicly available, non-greedy feature extraction routine. NITPICK is available as a software package for the R programming language and can be downloaded from . PMID:18755032
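To make the template-regression idea concrete, here is a hedged sketch that substitutes plain non-negative least squares (scipy.optimize.nnls) for NITPICK's sparse least-angle regression with its statistical stopping rule; the dictionary, signal, and threshold are random stand-ins, not averagine templates.

```python
import numpy as np
from scipy.optimize import nnls

# Model an observed spectrum y as a non-negative combination of
# isotope-pattern templates stacked as columns of A (one per candidate peak).
m, n = 200, 50
A = np.abs(np.random.rand(m, n))            # stand-in template dictionary
truth = np.zeros(n); truth[[5, 17]] = [3.0, 1.5]
y = A @ truth + 0.01 * np.random.rand(m)    # noisy synthetic spectrum

amplitudes, residual = nnls(A, y)           # non-negative least-squares fit
picked = np.nonzero(amplitudes > 0.1)[0]    # crude threshold in place of
print(picked)                               # NITPICK's model selection
```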
Kintsch, Walter; Mangalath, Praful
2011-04-01
We argue that word meanings are not stored in a mental lexicon but are generated in the context of working memory from long-term memory traces that record our experience with words. Current statistical models of semantics, such as latent semantic analysis and the Topic model, describe what is stored in long-term memory. The CI-2 model describes how this information is used to construct sentence meanings. This model is a dual-memory model, in that it distinguishes between a gist level and an explicit level. It also incorporates syntactic information about how words are used, derived from dependency grammar. The construction of meaning is conceptualized as feature sampling from the explicit memory traces, with the constraint that the sampling must be contextually relevant both semantically and syntactically. Semantic relevance is achieved by sampling topically relevant features; local syntactic constraints as expressed by dependency relations ensure syntactic relevance. Copyright © 2010 Cognitive Science Society, Inc.
Evidence for variability of the hard X-ray feature in the Hercules X-1 energy spectrum
NASA Technical Reports Server (NTRS)
Tueller, J.; Cline, T. L.; Teegarden, B. J.; Paciesas, W. S.; Boclet, D.; Durochoux, P.; Hameury, J. M.; Prantzos, N.; Haymes, R. C.
1983-01-01
The hard X-ray spectrum of Her X-1 was measured for the first time with a high-resolution (1.4 keV FWHM) germanium spectrometer. The observation was performed near the peak of the on-state of the 35 day cycle, and the 1.24 s pulsations were observed between the energies of 20 keV and 70 keV. The feature corresponds to an excess of 7.5 sigma over the low-energy continuum. Smooth continuum models are poor fits to the entire energy range (chance probabilities of 2 percent or less). The best-fit energies are 35 keV for an absorption line and 39 keV for an emission line. These energies are significantly lower than those derived from previous experiments. A direct comparison of our data with the results of the MPI/AIT group shows statistically significant variations, which strongly suggest variability in the source.
Derivative Free Optimization of Complex Systems with the Use of Statistical Machine Learning Models
2015-09-12
AFRL-AFOSR-VA-TR-2015-0278: Derivative Free Optimization of Complex Systems with the Use of Statistical Machine Learning Models. Principal investigator: Katya Scheinberg. Grant number: FA9550-11-1-0239. Subject terms: optimization, derivative-free optimization, statistical machine learning.
Statistical physics of community ecology: a cavity solution to MacArthur’s consumer resource model
NASA Astrophysics Data System (ADS)
Advani, Madhu; Bunin, Guy; Mehta, Pankaj
2018-03-01
A central question in ecology is to understand the ecological processes that shape community structure. Niche-based theories have emphasized the important role played by competition in maintaining species diversity. Many of these insights have been derived using MacArthur's consumer resource model (MCRM) or its generalizations. Most theoretical work on the MCRM has focused on small ecosystems with a few species and resources. However, theoretical insights derived from small ecosystems may not scale up to large ecosystems with many resources and species, because large systems with many interacting components often display new emergent behaviors that cannot be understood or deduced by analyzing smaller systems. To address these shortcomings, we develop a statistical-physics-inspired cavity method to analyze the MCRM when both the number of species and the number of resources are large. Unlike previous work in this limit, our theory addresses resource dynamics and resource depletion and demonstrates that species generically and consistently perturb their environments and significantly modify available ecological niches. We show how our cavity approach naturally generalizes niche theory to large ecosystems by accounting for the effect of collective phenomena on species invasion and ecological stability. Our theory suggests that such phenomena are a generic feature of large, natural ecosystems and must be taken into account when analyzing and interpreting community structure. It also highlights the important role that statistical-physics-inspired approaches can play in furthering our understanding of ecology.
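For orientation, a minimal forward simulation of a standard form of the MCRM in the many-species, many-resource regime analyzed above; all parameter distributions, sizes, and the time step are our illustrative choices, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)
S, M = 100, 100                       # number of species and resources
c = rng.random((S, M)) / M            # consumption preferences c_ia
m = rng.uniform(0.1, 0.3, S)          # species maintenance costs
K = rng.uniform(1.0, 2.0, M)          # resource carrying capacities

N, R, dt = np.ones(S), K.copy(), 0.01
for _ in range(50_000):
    N += dt * N * (c @ R - m)               # dN_i/dt = N_i (sum_a c_ia R_a - m_i)
    R += dt * (R * (K - R) - (N @ c) * R)   # logistic growth minus depletion
    N, R = np.clip(N, 0, None), np.clip(R, 0, None)

print("surviving species:", int(np.sum(N > 1e-3)))
```

The cavity method replaces such explicit simulation with self-consistent equations for the statistics of the surviving species and resource abundances.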
Exact and approximate graph matching using random walks.
Gori, Marco; Maggini, Marco; Sarti, Lorenzo
2005-07-01
In this paper, we propose a general framework for graph matching which is suitable for different problems of pattern recognition. The pattern representation we assume is at the same time highly structured, as in classic syntactic and structural approaches, and of subsymbolic nature with real-valued features, as in connectionist and statistical approaches. We show that random walk based models, inspired by Google's PageRank, give rise to a spectral theory that nicely enhances the graph topological features at node level. As a straightforward consequence, we derive a polynomial algorithm for the classic graph isomorphism problem, under the restriction of dealing with Markovian spectrally distinguishable (MSD) graphs, a class of graphs that does not seem to be easily reducible to others proposed in the literature. The experimental results that we found on different test-beds of the TC-15 graph database show that the defined MSD class "almost always" covers the database, and that the proposed algorithm is significantly more efficient than the top-scoring VF algorithm on the same data. Most interestingly, the proposed approach is very well suited to dealing with partial and approximate graph matching problems, derived for instance from image retrieval tasks. We consider the objects of the COIL-100 visual collection and provide a graph-based representation whose node labels contain appropriate visual features. We show that the adoption of classic bipartite graph matching algorithms offers a straightforward generalization of the algorithm given for graph isomorphism and, finally, we report very promising experimental results on the COIL-100 visual collection.
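A hedged sketch of the underlying random-walk idea (not the paper's algorithm): score nodes with a PageRank-style stationary distribution and align two graphs by score rank; for this to decide isomorphism, the scores must separate the nodes, which is the spirit of the MSD restriction.

```python
import numpy as np

# PageRank-style node scores via power iteration on a row-normalized walk.
def pagerank_scores(adj, d=0.85, iters=200):
    n = len(adj)
    out = adj.sum(axis=1, keepdims=True)
    P = np.where(out > 0, adj / np.where(out == 0, 1, out), 1.0 / n)
    r = np.full(n, 1.0 / n)
    for _ in range(iters):
        r = (1 - d) / n + d * (P.T @ r)
    return r

A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], float)
perm = np.random.permutation(4)
B = A[np.ix_(perm, perm)]                 # an isomorphic copy of A

match = np.empty(4, int)                  # i-th ranked node of A -> i-th of B
match[np.argsort(pagerank_scores(A))] = np.argsort(pagerank_scores(B))
print(np.array_equal(A, B[np.ix_(match, match)]))   # expect True here
```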
Continuity equation for probability as a requirement of inference over paths
NASA Astrophysics Data System (ADS)
González, Diego; Díaz, Daniela; Davis, Sergio
2016-09-01
Local conservation of probability, expressed as the continuity equation, is a central feature of non-equilibrium statistical mechanics. In the existing literature, the continuity equation is always motivated by heuristic arguments, with no derivation from first principles. In this work we show that the continuity equation is a logical consequence of the laws of probability and the application of the formalism of inference over paths for dynamical systems. That is, the simple postulate that a system moves continuously through time following paths implies the continuity equation. The translation from the language of dynamical paths to the usual representation in terms of probability densities of states is performed by means of an identity derived from Bayes' theorem. The formalism presented here is valid independently of the nature of the system studied: it is applicable to physical systems and also to more abstract dynamics such as financial indicators and population dynamics in ecology, among others.
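For reference, the continuity equation in question, in its standard form (ρ is the probability density and ρv the probability current):

```latex
\[
  \frac{\partial \rho(x,t)}{\partial t} + \nabla \cdot \bigl( \rho(x,t)\, v(x,t) \bigr) = 0
\]
```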
Personalized Modeling for Prediction with Decision-Path Models
Visweswaran, Shyam; Ferreira, Antonio; Ribeiro, Guilherme A.; Oliveira, Alexandre C.; Cooper, Gregory F.
2015-01-01
Deriving predictive models in medicine typically relies on a population approach, where a single model is developed from a dataset of individuals. In this paper we describe and evaluate a personalized approach in which we construct a new type of decision tree model, called the decision-path model, that takes advantage of the particular features of a given person of interest. We introduce three personalized methods that derive personalized decision-path models. We compared the performance of these methods to that of Classification And Regression Tree (CART), a population decision tree, in predicting seven different outcomes in five medical datasets. Two of the three personalized methods performed statistically significantly better than CART on area under the ROC curve (AUC) and Brier skill score. The personalized approach of learning decision-path models is a new approach to predictive modeling that can perform better than a population approach. PMID:26098570
Separation of man-made and natural patterns in high-altitude imagery of agricultural areas
NASA Technical Reports Server (NTRS)
Samulon, A. S.
1975-01-01
A nonstationary linear digital filter is designed and implemented which extracts the natural features from high-altitude imagery of agricultural areas. Essentially, from an original image a new image is created which displays information related to soil properties, drainage patterns, crop disease, and other natural phenomena, and contains no information about crop type or row spacing. A model is developed to express the recorded brightness in a narrow-band image in terms of man-made and natural contributions and which describes statistically the spatial properties of each. The form of the minimum mean-square error linear filter for estimation of the natural component of the scene is derived and a suboptimal filter is implemented. Nonstationarity of the two-dimensional random processes contained in the model requires a unique technique for deriving the optimum filter. Finally, the filter depends on knowledge of field boundaries. An algorithm for boundary location is proposed, discussed, and implemented.
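A hedged sketch of the stationary core of such a filter: the minimum mean-square error (Wiener) gain in the frequency domain. The paper's filter is nonstationary and model-based, so this is only the simplified idea; the two power spectra below are invented stand-ins for the natural and man-made components.

```python
import numpy as np

# MMSE linear estimate of the "natural" component: multiply each spatial
# frequency by S_nat / (S_nat + S_man), the Wiener gain.
def wiener_filter(image, S_nat, S_man):
    H = S_nat / (S_nat + S_man)
    return np.fft.ifft2(np.fft.fft2(image) * H).real

n = 128
fy, fx = np.meshgrid(np.fft.fftfreq(n), np.fft.fftfreq(n), indexing="ij")
S_nat = 1.0 / (1e-4 + fx**2 + fy**2)    # smooth, low-frequency natural field
S_man = np.ones((n, n))                 # broadband stand-in for row patterns
estimate = wiener_filter(np.random.rand(n, n), S_nat, S_man)
```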
Manoj Kumar, Palanivelu; Karthikeyan, Chandrabose; Hari Narayana Moorthy, Narayana Subbiah; Trivedi, Piyush
2006-11-01
In the present paper, a quantitative structure-activity relationship (QSAR) approach was applied to understand the affinity and selectivity of a novel series of triaryl imidazole derivatives towards the glucagon receptor. Statistically significant and highly predictive QSARs were derived for glucagon receptor inhibition by triaryl imidazoles using QuaSAR descriptors of the Molecular Operating Environment (MOE), employing a computer-assisted multiple regression procedure. The generated QSAR models revealed that factors related to hydrophobicity, molecular shape, and geometry predominantly influence the glucagon receptor binding affinity of the triaryl imidazoles, indicating the relevance of shape-specific steric interactions between the molecule and the receptor. Further, QSAR models formulated for selective inhibition of the glucagon receptor over p38 mitogen-activated protein (MAP) kinase highlight that the same structural features which influence glucagon receptor affinity also contribute to the compounds' selective inhibition.
Quantitative analysis of breast cancer diagnosis using a probabilistic modelling approach.
Liu, Shuo; Zeng, Jinshu; Gong, Huizhou; Yang, Hongqin; Zhai, Jia; Cao, Yi; Liu, Junxiu; Luo, Yuling; Li, Yuhua; Maguire, Liam; Ding, Xuemei
2018-01-01
Breast cancer is the most prevalent cancer in women in most countries of the world. Many computer-aided diagnostic methods have been proposed, but there are few studies on quantitative discovery of probabilistic dependencies among breast cancer data features and identification of the contribution of each feature to breast cancer diagnosis. This study aims to fill this void by utilizing a Bayesian network (BN) modelling approach. A K2 learning algorithm and statistical computation methods are used to construct BN structure and assess the obtained BN model. The data used in this study were collected from a clinical ultrasound dataset derived from a Chinese local hospital and a fine-needle aspiration cytology (FNAC) dataset from UCI machine learning repository. Our study suggested that, in terms of ultrasound data, cell shape is the most significant feature for breast cancer diagnosis, and the resistance index presents a strong probabilistic dependency on blood signals. With respect to FNAC data, bare nuclei are the most important discriminating feature of malignant and benign breast tumours, and uniformity of both cell size and cell shape are tightly interdependent. The BN modelling approach can support clinicians in making diagnostic decisions based on the significant features identified by the model, especially when some other features are missing for specific patients. The approach is also applicable to other healthcare data analytics and data modelling for disease diagnosis. Copyright © 2017 Elsevier Ltd. All rights reserved.
Chi, Wu-Cheng; Lee, W.H.K.; Aston, J.A.D.; Lin, C.J.; Liu, C.-C.
2011-01-01
We develop a new way to invert 2D translational waveforms using Jaeger's (1969) formula to derive rotational ground motions about one axis, and estimate the errors in them using techniques from statistical multivariate analysis. This procedure can be used to derive rotational ground motions and strains from arrayed translational data, thus providing an efficient way to calibrate the performance of rotational sensors. This approach does not require a priori information about the noise level of the translational data or the elastic properties of the media. The procedure also provides estimates of the standard deviations of the derived rotations and strains. In this study, we validated the code using synthetic translational waveforms from a seismic array. The results of the inversion of the synthetics for rotations were almost identical to the results derived using a well-tested inversion procedure by Spudich and Fletcher (2009). The 2D procedure can be applied three times to obtain the full, three-component rotations. Additional modifications can be implemented in the code in the future to study different features of the rotational ground motions and strains induced by the passage of seismic waves.
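A minimal sketch of the generic array approach for one time sample: fit a uniform displacement-gradient model to the translational records by least squares and read the rotation off the antisymmetric part. The geometry and numbers are invented, and the error-propagation step of the multivariate analysis is omitted.

```python
import numpy as np

def rotation_about_z(positions, u):
    # positions: (n, 2) station coordinates relative to a reference point
    # u: (n, 2) horizontal displacements recorded at one time sample
    A = np.hstack([np.ones((len(positions), 1)), positions])   # [1, x, y]
    gx = np.linalg.lstsq(A, u[:, 0], rcond=None)[0]   # u_x = a + b*x + c*y
    gy = np.linalg.lstsq(A, u[:, 1], rcond=None)[0]   # u_y = a + b*x + c*y
    return 0.5 * (gy[1] - gx[2])      # 0.5 * (du_y/dx - du_x/dy)

pos = np.array([[0.0, 0.0], [100.0, 0.0], [0.0, 100.0], [100.0, 100.0]])
disp = np.array([[1.0, 0.0], [1.0, 0.002], [0.999, 0.0], [0.999, 0.002]])
print(rotation_about_z(pos, disp))    # rotation (rad) about the vertical axis
```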
Bhargava, Dinesh; Karthikeyan, C; Moorthy, N S H N; Trivedi, Piyush
2009-09-01
A QSAR study was carried out on a series of piperazinyl phenylalanine derivatives exhibiting VLA-4/VCAM-1 inhibitory activity, to find the structural features responsible for the biological activity. The QSAR study was carried out with the V-life Molecular Design Suite software, and the best QSAR model, derived by the partial least squares (forward) regression method, explained 85.67% of the variation in biological activity. The statistically significant model with a high correlation coefficient (r2=0.85) was selected for further study, and the resulting validation parameters of the model, the cross-validated squared correlation coefficient (q2=0.76) and pred_r2=0.42, show that the model has good predictive ability. The model showed that the parameters SaaNEindex, SsClcount, slogP, and 4PathCount are highly correlated with the VLA-4/VCAM-1 inhibitory activity of the piperazinyl phenylalanine derivatives. The results of the study suggest that chlorine atoms in the molecule and fourth-order fragmentation patterns in the molecular skeleton favour the VLA-4/VCAM-1 inhibition shown by the title compounds, whereas lipophilicity and nitrogen bonded to an aromatic bond are not conducive to VLA-4/VCAM-1 inhibitory activity.
Qin, Jiang-Bo; Liu, Zhenyu; Zhang, Hui; Shen, Chen; Wang, Xiao-Chun; Tan, Yan; Wang, Shuo; Wu, Xiao-Feng; Tian, Jie
2017-05-07
BACKGROUND: Gliomas are the most common primary brain neoplasms. Misdiagnosis occurs in glioma grading due to an overlap in conventional MRI manifestations. The aim of the present study was to evaluate the power of radiomic features based on multiple MRI sequences - T2-Weighted-Imaging-FLAIR (FLAIR), T1-Weighted-Imaging-Contrast-Enhanced (T1-CE), and the Apparent Diffusion Coefficient (ADC) map - in glioma grading, and to improve the power of glioma grading by combining features. MATERIAL AND METHODS: Sixty-six patients with histopathologically proven gliomas underwent T2-FLAIR and T1WI-CE sequence scanning, with some patients (n=63) also undergoing DWI scanning. A total of 114 radiomic features were derived with radiomic methods using in-house software. All radiomic features were compared between high-grade gliomas (HGGs) and low-grade gliomas (LGGs). Features with significant statistical differences were selected for receiver operating characteristic (ROC) curve analysis. The relationships between significantly different radiomic features and glial fibrillary acidic protein (GFAP) expression were evaluated. RESULTS: A total of 8 radiomic features from the 3 MRI sequences displayed significant differences between LGGs and HGGs. FLAIR GLCM Cluster Shade, T1-CE GLCM Entropy, and ADC GLCM Homogeneity were the best features for differentiating LGGs and HGGs in each MRI sequence. The combined feature was best able to differentiate LGGs and HGGs, improving the accuracy of glioma grading relative to the above features in each MRI sequence. A significant correlation was found between GFAP and T1-CE GLCM Entropy, as well as between GFAP and ADC GLCM Homogeneity. CONCLUSIONS: The combined radiomic feature had the highest efficacy in distinguishing LGGs from HGGs.
Shi, Weiwei; Bugrim, Andrej; Nikolsky, Yuri; Nikolskya, Tatiana; Brennan, Richard J
2008-01-01
The ideal toxicity biomarker combines the properties of prediction (it is detected prior to traditional pathological signs of injury), accuracy (high sensitivity and specificity), and a mechanistic relationship to the endpoint measured (biological relevance). Gene expression-based toxicity biomarkers ("signatures") have shown good predictive power and accuracy, but are difficult to interpret biologically. We have compared different statistical methods of feature selection with knowledge-based approaches, using GeneGo's database of canonical pathway maps, to generate gene sets for the classification of renal tubule toxicity. The gene set selection algorithms include four univariate analyses: t-statistics, fold-change, B-statistics, and RankProd, and their combination and overlap for the identification of differentially expressed probes. Enrichment analysis following the results of the four univariate analyses, Hotelling's T-square test, and, finally, out-of-bag selection, a variant of cross-validation, were used to identify canonical pathway maps (sets of genes coordinately involved in key biological processes) with classification power. Differentially expressed genes identified by the different statistical univariate analyses all generated reasonably performing classifiers of tubule toxicity. Maps identified by enrichment analysis or Hotelling's T-square had lower classification power, but highlighted perturbed lipid homeostasis as a common discriminator of nephrotoxic treatments. The out-of-bag method yielded the best functionally integrated classifier. The map "ephrins signaling" performed comparably to a classifier derived using sparse linear programming, a machine learning algorithm, and represents a signaling network specifically involved in renal tubule development and integrity. Such functional descriptors of toxicity promise to better integrate predictive toxicogenomics with mechanistic analysis, facilitating the interpretation and risk assessment of predictive genomic investigations.
Digitise This! A Quick and Easy Remote Sensing Method to Monitor the Daily Extent of Dredge Plumes
Evans, Richard D.; Murray, Kathy L.; Field, Stuart N.; Moore, James A. Y.; Shedrawi, George; Huntley, Barton G.; Fearns, Peter; Broomhall, Mark; McKinna, Lachlan I. W.; Marrable, Daniel
2012-01-01
Technological advancements in remote sensing and GIS have improved natural resource managers' abilities to monitor large-scale disturbances. In a time when many processes are heading towards automation, this study has regressed to simple techniques to bridge a gap found in the advancement of technology. The near-daily monitoring of dredge plume extent is common practice using Moderate Resolution Imaging Spectroradiometer (MODIS) imagery and associated algorithms to predict the total suspended solids (TSS) concentration in surface waters originating from floods and dredge plumes. Unfortunately, these methods cannot distinguish between a dredge plume and benthic features in shallow, clear water. This case study at Barrow Island, Western Australia, uses hand digitising to demonstrate the ability of human interpretation to make this distinction with a level of confidence, and compares the method to contemporary TSS methods. Hand digitising was quick and cheap, and required very little training of staff. Results of ANOSIM R statistics show that remote sensing derived TSS provided similar spatial results when thresholded to at least 3 mg L−1. However, remote sensing derived TSS consistently provided false-positive readings of shallow benthic features as plume at thresholds up to a TSS of 6 mg L−1, and began providing false negatives (excluding actual plume) at thresholds as low as 4 mg L−1. Semi-automated processes that estimate plume concentration and distinguish between plumes and shallow benthic features without the arbitrary nature of human interpretation would be the preferred plume monitoring method. However, at this stage, the hand digitising method is very useful, is more accurate at determining plume boundaries over shallow benthic features, and is accessible to all levels of management with basic training. PMID:23240055
Recognition of large scale deep-seated landslides in vegetated areas of Taiwan
NASA Astrophysics Data System (ADS)
Lin, C. W.; Tarolli, P.; Tseng, C. M.; Tseng, Y. H.
2012-04-01
In August 2009, Typhoon Morakot triggered thousands of landslides and debris flows; according to government reports, 619 people were killed, 76 went missing, and the economic loss was estimated at hundreds of millions of USD. The large deep-seated landslides in particular are critical and deserve attention, since they can be reactivated during intense events and such reactivations can evolve into destructive failures. They are also difficult to recognize in the field, especially under dense forest cover. A detailed and constantly updated inventory map of such phenomena, and the recognition of their topographic signatures, represents a key tool for landslide mapping and risk mitigation. The aim of this work is to test the performance of a recently developed method for the automatic extraction of geomorphic features related to landslide crowns (Tarolli et al., 2010) in support of field surveys, in order to develop a detailed and accurate inventory map of such phenomena. The methodology is based on the detection of thresholds derived from the statistical analysis of the variability of landform curvature in high-resolution LiDAR-derived topography. The analysis suggested that the method performed well, relative to field analysis, in localizing and extracting features related to deep-seated landslides. Thanks to the capability of LiDAR to detect bare-ground elevation data even in forested areas, it was possible to recognize landslide features in detail also in remote regions that are difficult to access. Reference: Tarolli, P., Sofia, G., Dalla Fontana, G. (2010). Geomorphic features extraction from high-resolution topography: landslide crowns and bank erosion. Natural Hazards, doi:10.1007/s11069-010-9695-2.
NASA Astrophysics Data System (ADS)
Adib, Artur B.
In the last two decades or so, a collection of results in nonequilibrium statistical mechanics that departs from the traditional near-equilibrium framework introduced by Lars Onsager in 1931 has been derived, yielding new fundamental insights into far-from-equilibrium processes in general. Apart from offering a more quantitative statement of the second law of thermodynamics, some of these results, typified by the so-called "Jarzynski equality", have also offered novel means of estimating equilibrium quantities from nonequilibrium processes, such as free energy differences from single-molecule "pulling" experiments. This thesis contributes to such efforts by offering three novel results in nonequilibrium statistical mechanics: (a) the entropic analog of the Jarzynski equality; (b) a methodology for estimating free energies from "clamp-and-release" nonequilibrium processes; and (c) a directly measurable symmetry relation in chemical kinetics similar to (but more general than) chemical detailed balance. These results share the feature of remaining valid outside Onsager's near-equilibrium regime, and bear direct applicability to protein folding kinetics as well as single-molecule free energy estimation.
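For reference, the Jarzynski equality in its standard form (W is the work performed along a single nonequilibrium realization, ΔF the equilibrium free energy difference between the endpoints, and β = 1/kT):

```latex
\[
  \left\langle e^{-\beta W} \right\rangle = e^{-\beta \Delta F},
  \qquad\text{whence}\qquad
  \langle W \rangle \ge \Delta F
\]
```

The second relation, the quantitative second-law statement mentioned above, follows from Jensen's inequality.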
Computer-aided diagnosis of melanoma using border and wavelet-based texture analysis.
Garnavi, Rahil; Aldeen, Mohammad; Bailey, James
2012-11-01
This paper presents a novel computer-aided diagnosis system for melanoma. The novelty lies in the optimised selection and integration of features derived from textural, border-based, and geometrical properties of the melanoma lesion. The texture features are derived using wavelet decomposition, the border features are derived by constructing a boundary-series model of the lesion border and analysing it in the spatial and frequency domains, and the geometry features are derived from shape indexes. The optimised selection of features is achieved using the Gain-Ratio method, which is shown to be computationally efficient for the melanoma diagnosis application. Classification is done with four classifiers, namely Support Vector Machine, Random Forest, Logistic Model Tree, and Hidden Naive Bayes. The proposed diagnostic system is applied to a set of 289 dermoscopy images (114 malignant, 175 benign) partitioned into train, validation, and test image sets. The system achieves an accuracy of 91.26% and an AUC value of 0.937 when 23 features are used. Other important findings include (i) the clear advantage gained by complementing texture with border and geometry features, compared to using texture information only, and (ii) the higher contribution of texture features than border-based features in the optimised feature set.
Feature Statistics Modulate the Activation of Meaning during Spoken Word Processing
ERIC Educational Resources Information Center
Devereux, Barry J.; Taylor, Kirsten I.; Randall, Billi; Geertzen, Jeroen; Tyler, Lorraine K.
2016-01-01
Understanding spoken words involves a rapid mapping from speech to conceptual representations. One distributed feature-based conceptual account assumes that the statistical characteristics of concepts' features--the number of concepts they occur in ("distinctiveness/sharedness") and likelihood of co-occurrence ("correlational…
ERIC Educational Resources Information Center
Haberman, Shelby J.
2004-01-01
Statistical and measurement properties are examined for features used in essay assessment to determine the generalizability of the features across populations, prompts, and individuals. Data are employed from TOEFL® and GMAT® examinations and from writing for Criterion(SM).
Monitoring of bone regeneration process by means of texture analysis
NASA Astrophysics Data System (ADS)
Kokkinou, E.; Boniatis, I.; Costaridou, L.; Saridis, A.; Panagiotopoulos, E.; Panayiotakis, G.
2009-09-01
An image analysis method is proposed for monitoring the regeneration of the tibial bone. For this purpose, 130 digitized radiographs of 13 patients who had undergone tibial lengthening by the Ilizarov method were studied. For each patient, 10 radiographs, taken at an equal number of successive postoperative time points, were available. Employing available software, 3 Regions Of Interest (ROIs), corresponding to the (a) upper, (b) central, and (c) lower aspects of the gap, where bone regeneration was expected to occur, were determined on each radiograph. Employing custom-developed algorithms, (i) a number of textural features were generated from each of the ROIs, and (ii) a texture-feature-based regression model was designed for the quantitative monitoring of the bone regeneration process. Statistically significant differences (p < 0.05) were found between the initial and final textural feature values, generated from the first and last postoperatively obtained radiographs, respectively. A quadratic polynomial regression equation fitted the data adequately (r2 = 0.9, p < 0.001). The suggested method may contribute to the monitoring of the tibial bone regeneration process.
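A minimal sketch of the regression step, assuming a single scalar texture feature tracked over the 10 postoperative time points; the synthetic feature values below are ours.

```python
import numpy as np

t = np.arange(10)                                  # 10 successive radiographs
feature = 1.0 - 0.15 * t + 0.01 * t**2 + 0.02 * np.random.randn(10)

coeffs = np.polyfit(t, feature, deg=2)             # quadratic polynomial fit
fitted = np.polyval(coeffs, t)
ss_res = np.sum((feature - fitted) ** 2)
ss_tot = np.sum((feature - feature.mean()) ** 2)
print("r^2 =", 1 - ss_res / ss_tot)                # goodness of fit
```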
Multispectral and geomorphic studies of processed Voyager 2 images of Europa
NASA Technical Reports Server (NTRS)
Meier, T. A.
1984-01-01
High-resolution images of Europa taken by the Voyager 2 spacecraft were used to study a portion of Europa's dark lineations and the major white line feature Agenor Linea. Initial image processing of images 1195J2-001 (violet filter), 1198J2-001 (blue filter), 1201J2-001 (orange filter), and 1204J2-001 (ultraviolet filter) was performed at the U.S.G.S. Branch of Astrogeology in Flagstaff, Arizona. Processing was completed through the stages of image registration and color ratio image construction. Pixel printouts were used in a new technique of linear feature profiling to compensate for image misregistration through the mapping of features on the printouts. In all, 193 dark lineation segments were mapped and profiled. The more accurate multispectral data derived by this method were plotted using a new application of the ternary diagram, with orange, blue, and violet relative spectral reflectances serving as end members. Statistical techniques were then applied to the ternary diagram plots. The image products generated at LPI were used mainly to cross-check and verify the results of the ternary diagram analysis.
Using complex networks for text classification: Discriminating informative and imaginative documents
NASA Astrophysics Data System (ADS)
de Arruda, Henrique F.; Costa, Luciano da F.; Amancio, Diego R.
2016-01-01
Statistical methods have been widely employed in recent years to grasp many language properties. The application of such techniques has allowed an improvement of several linguistic applications, such as machine translation and document classification. In the latter, many approaches have emphasised the semantic content of texts, as is the case with bag-of-words language models. These approaches have certainly yielded reasonable performance. However, some potential features, such as the structural organization of texts, have been used in only a few studies. In this context, we probe how features derived from textual structure analysis can be effectively employed in a classification task. More specifically, we performed a supervised classification aimed at discriminating informative from imaginative documents. Using a networked model that describes the local topological/dynamical properties of function words, we achieved an accuracy rate of up to 95%, which is much higher than that of similar networked approaches. A systematic analysis of feature relevance revealed that symmetry and accessibility measurements are among the most prominent network measurements. Our results suggest that these measurements could be used in related language applications, as they play a complementary role in characterising texts.
Cai, Suxian; Yang, Shanshan; Zheng, Fang; Lu, Meng; Wu, Yunfeng; Krishnan, Sridhar
2013-01-01
Analysis of knee joint vibration (VAG) signals can provide quantitative indices for the detection of knee joint pathology at an early stage. In addition to the statistical features developed in related previous studies, we extracted two separable features: the number of atoms derived from the wavelet matching pursuit decomposition, and the number of significant signal turns detected with a fixed threshold in the time domain. To perform a better classification over the data set of 89 VAG signals, we applied a novel classifier fusion system based on the dynamic weighted fusion (DWF) method to improve the classification performance. For comparison, a single least-squares support vector machine (LS-SVM) and the Bagging ensemble were used for the classification task as well. The results, in terms of overall accuracy in percentage and area under the receiver operating characteristic curve, obtained with the DWF-based classifier fusion method reached 88.76% and 0.9515, respectively, which demonstrates the effectiveness and superiority of the DWF method with two distinct features for VAG signal analysis. PMID:23573175
EDA-gram: designing electrodermal activity fingerprints for visualization and feature extraction.
Chaspari, Theodora; Tsiartas, Andreas; Stein Duker, Leah I; Cermak, Sharon A; Narayanan, Shrikanth S
2016-08-01
Wearable technology permeates every aspect of our daily life, increasing the need for reliable and interpretable models for processing the large amount of biomedical data. We propose the EDA-Gram, a multidimensional fingerprint of the electrodermal activity (EDA) signal, inspired by the widely used notion of the spectrogram. The EDA-Gram is based on the sparse decomposition of EDA over a knowledge-driven set of dictionary atoms. The time axis reflects the analysis frames, the spectral dimension depicts the width of the selected dictionary atoms, while intensity values are computed from the atom coefficients. In this way, the EDA-Gram incorporates the amplitude and shape of Skin Conductance Responses (SCRs), which comprise an essential part of the signal. The EDA-Gram is further used as a foundation for signal-specific feature design. Our results indicate that the proposed representation can accentuate fine-grain signal fluctuations, which might not always be apparent through simple visual inspection. Statistical analysis and classification/regression experiments further suggest that the derived features can differentiate between multiple arousal levels and stress-eliciting environments for two datasets.
NASA Astrophysics Data System (ADS)
Petty, A.; Tsamados, M.; Kurtz, N. T.; Farrell, S. L.; Newman, T.; Harbeck, J.; Feltham, D. L.; Richter-Menge, J.
2015-12-01
Here we present a detailed analysis of Arctic sea ice topography using high-resolution, three-dimensional surface elevation data from the NASA Operation IceBridge Airborne Topographic Mapper (ATM) laser altimeter. We derive novel ice topography statistics from 2009-2014 across both first-year and multiyear ice regimes, including the height, area coverage, orientation, and spacing of distinct surface features. The sea ice topography exhibits strong spatial variability, including increased surface feature (e.g., pressure ridge) height and area coverage within the multiyear ice regions. The ice topography also shows a strong coastal dependency, with the feature height and area coverage increasing as a function of proximity to the nearest coastline, especially north of Greenland and the Canadian Archipelago. The ice topography data have also been used to explicitly calculate atmospheric drag coefficients over Arctic sea ice, utilizing existing relationships regarding ridge geometry and its impact on form drag. The results are being used to calibrate the recent drag parameterization scheme included in the sea ice model CICE.
NASA Astrophysics Data System (ADS)
Kushnir, A. F.; Troitsky, E. V.; Haikin, L. M.; Dainty, A.
1999-06-01
A semi-automatic procedure has been developed to achieve statistically optimum discrimination between earthquakes and explosions at local or regional distances, based on a learning set specific to a given region. The method is used for step-by-step testing of candidate discrimination features to find the optimum subset (combination) of features, with the decision taken on a rigorous statistical basis. Linear (LDF) and quadratic (QDF) discriminant functions based on Gaussian distributions of the discrimination features are implemented and statistically grounded; the features may be transformed by the Box-Cox transformation z = (1/α)(y^α − 1) to make them more Gaussian. Tests of the method were successfully conducted on seismograms from the Israel Seismic Network, using features consisting of spectral ratios between and within phases. Results showed that the QDF was more effective than the LDF and required five features out of 18 candidates for the optimum set. It was found that discrimination improved with increasing distance within the local range, and that eliminating the transformation of the features, or failing to correct for noise, led to degradation of discrimination.
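A hedged sketch of this pipeline using SciPy and scikit-learn: Box-Cox transform positive-valued spectral-ratio features toward Gaussianity, then compare linear and quadratic discriminant functions. The synthetic features and labels are stand-ins for a real learning set.

```python
import numpy as np
from scipy.stats import boxcox
from sklearn.discriminant_analysis import (LinearDiscriminantAnalysis,
                                           QuadraticDiscriminantAnalysis)

rng = np.random.default_rng(1)
X = rng.lognormal(size=(200, 5))              # stand-in spectral ratios (> 0)
y = (X[:, 0] * X[:, 1] > 1.0).astype(int)     # stand-in labels (0: quake, 1: blast)

# Box-Cox each feature, z = (y**a - 1) / a with a fitted per feature
Xt = np.column_stack([boxcox(X[:, j])[0] for j in range(X.shape[1])])

for clf in (LinearDiscriminantAnalysis(), QuadraticDiscriminantAnalysis()):
    print(type(clf).__name__, clf.fit(Xt, y).score(Xt, y))   # LDF vs. QDF
```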
NASA Technical Reports Server (NTRS)
Westwater, E. R.; Snider, J. B.; Falls, M. J.; Fionda, E.
1990-01-01
Two seasons of thermal emission measurements, running from December 1987 through February 1988 and from June through August 1988 and taken by a multi-channel, ground-based microwave radiometer, are used to derive single-station zenith attenuation statistics at 20.6 and 31.65 GHz. For the summer period, statistics are also derived for 52.85 GHz. In addition, data from two dual-channel radiometers, separated from Denver by baseline distances of 49 and 168 km, are used to derive two-station attenuation diversity statistics at 20.6 and 31.65 GHz. The multi-channel radiometer is operated at Denver, Colorado; the dual-channel devices are operated at Platteville and Flagler, Colorado. The diversity statistics are presented as cumulative distributions of maximum and minimum attenuation.
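The diversity statistics admit a simple empirical illustration. In the sketch below the attenuation series are synthetic, and the two-station statistics are taken as the pointwise minimum and maximum of simultaneous single-site attenuations.

```python
import numpy as np

def exceedance(atten_db, levels):
    """Fraction of time each attenuation level is exceeded (a cumulative statistic)."""
    atten_db = np.asarray(atten_db)
    return np.array([(atten_db > L).mean() for L in levels])

rng = np.random.default_rng(3)
site_a = rng.gamma(2.0, 0.15, 20000)   # toy zenith attenuation series (dB), site 1
site_b = rng.gamma(2.0, 0.15, 20000)   # toy series for a second site on the baseline
levels = np.arange(0.1, 2.0, 0.1)

single = exceedance(site_a, levels)
joint_min = exceedance(np.minimum(site_a, site_b), levels)  # better of the two paths
joint_max = exceedance(np.maximum(site_a, site_b), levels)  # worse of the two paths
print(single[5], joint_min[5], joint_max[5])                # exceedance at 0.6 dB
```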
Texture Classification by Texton: Statistical versus Binary
Guo, Zhenhua; Zhang, Zhongcheng; Li, Xiu; Li, Qin; You, Jane
2014-01-01
Using statistical textons for texture classification has shown great success recently. The maximal response 8 (Statistical_MR8), image patch (Statistical_Joint) and locally invariant fractal (Statistical_Fractal) are typical statistical texton algorithms and state-of-the-art texture classification methods. However, these methods have two limitations. First, they need a training stage to build a texton library, so recognition accuracy depends heavily on the training samples; second, during feature extraction, each local feature is assigned to a texton by searching for the nearest texton in the whole library, which is time consuming when the library is large and the feature dimension is high. To address these two issues, this paper proposes three binary texton counterparts, Binary_MR8, Binary_Joint, and Binary_Fractal. These methods do not require any training step but encode local features directly into a binary representation. The experimental results on the CUReT, UIUC and KTH-TIPS databases show that binary textons achieve sound results with fast feature extraction, especially when images are not large and image quality is not poor. PMID:24520346
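The training-free binary encoding can be illustrated with an LBP-style code; this mirrors the general idea of binary textons, not the exact Binary_MR8/Joint/Fractal schemes.

```python
import numpy as np

def binary_codes(img):
    """Encode each 3x3 patch as an 8-bit code: neighbor >= center -> bit set."""
    c = img[1:-1, 1:-1]
    neighbors = [img[:-2, :-2], img[:-2, 1:-1], img[:-2, 2:],
                 img[1:-1, 2:], img[2:, 2:], img[2:, 1:-1],
                 img[2:, :-2], img[1:-1, :-2]]
    code = np.zeros_like(c, dtype=np.uint8)
    for bit, n in enumerate(neighbors):
        code |= (n >= c).astype(np.uint8) << bit
    return code

def texture_histogram(img):
    """256-bin normalized histogram of binary codes: the texture descriptor."""
    h = np.bincount(binary_codes(img).ravel(), minlength=256)
    return h / h.sum()

img = np.random.default_rng(4).integers(0, 256, (64, 64))
print(texture_histogram(img).shape)  # (256,)
```

No texton library is learned or searched: the code itself indexes the histogram bin, which is exactly the speed advantage claimed over nearest-texton lookup.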
Tandon, Disha; Haque, Mohammed Monzoorul; Mande, Sharmila S.
2016-01-01
The nature of inter-microbial metabolic interactions defines the stability of microbial communities residing in any ecological niche. Deciphering these interaction patterns is crucial for understanding the mode/mechanism(s) through which an individual microbial community transitions from one state to another (e.g. from a healthy to a diseased state). Statistical correlation techniques have traditionally been employed for mining microbial interaction patterns from taxonomic abundance data corresponding to a given microbial community. In spite of their efficiency, these correlation techniques can capture only 'pair-wise interactions'. Moreover, their emphasis on statistical significance can potentially result in missing several interactions that are relevant from a biological standpoint. This study explores the applicability of one of the earliest association rule mining algorithms, the 'Apriori algorithm', for deriving 'microbial association rules' from the taxonomic profile of a given microbial community. The classical Apriori approach derives association rules by analysing patterns of co-occurrence/co-exclusion between various '(subsets of) features/items' across various samples. Using real-world microbiome data, the efficiency/utility of this rule mining approach in deciphering multiple (biologically meaningful) association patterns between 'subsets/subgroups' of microbes (constituting microbiome samples) is demonstrated. As an example, association rules derived from publicly available gut microbiome datasets indicate an association between a group of microbes (Faecalibacterium, Dorea, and Blautia) that are known to have mutualistic metabolic associations among themselves. Application of the rule mining approach on gut microbiomes (sourced from the Human Microbiome Project) further indicated similar microbial association patterns in gut microbiomes irrespective of the gender of the subjects. A Linux implementation of the Association Rule Mining (ARM) software (customised for deriving 'microbial association rules' from microbiome data) is freely available for download from the following link: http://metagenomics.atc.tcs.com/arm. PMID:27124399
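The classical Apriori level-wise search is easy to sketch over presence/absence microbiome samples (the genera and the support threshold below are illustrative, and this is not the ARM software itself):

```python
from itertools import combinations

def apriori(transactions, min_support=0.5):
    """Frequent itemsets by level-wise candidate generation (classical Apriori)."""
    n = len(transactions)
    items = sorted({i for t in transactions for i in t})
    frequent, k_sets = {}, [frozenset([i]) for i in items]
    while k_sets:
        counts = {s: sum(s <= t for t in transactions) for s in k_sets}
        survivors = {s: c / n for s, c in counts.items() if c / n >= min_support}
        frequent.update(survivors)
        # Join step: size k+1 candidates whose size-k subsets are all frequent.
        keys = list(survivors)
        cand = {a | b for a, b in combinations(keys, 2) if len(a | b) == len(a) + 1}
        k_sets = [s for s in cand
                  if all(frozenset(sub) in survivors
                         for sub in combinations(s, len(s) - 1))]
    return frequent

# Hypothetical gut microbiome samples as sets of detected genera.
samples = [{"Faecalibacterium", "Dorea", "Blautia"},
           {"Faecalibacterium", "Dorea", "Blautia", "Prevotella"},
           {"Faecalibacterium", "Blautia"},
           {"Dorea", "Blautia", "Bacteroides"}]
for itemset, support in sorted(apriori(samples).items(), key=lambda kv: -kv[1]):
    print(set(itemset), support)
```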
Peng, Fei; Li, Jiao-ting; Long, Min
2015-03-01
To discriminate the acquisition pipelines of digital images, a novel scheme for the identification of natural images and computer-generated graphics is proposed based on statistical and textural features. First, the differences between them are investigated from the viewpoint of statistics and texture, and a 31-dimensional feature vector is derived for identification. Then, LIBSVM is used for the classification. Finally, the experimental results are presented. The results show that the scheme achieves an identification accuracy of 97.89% for computer-generated graphics and 97.75% for natural images. The analyses also demonstrate that the proposed method performs well compared with existing methods based only on statistical features or other features. The method has great potential for the identification of natural images and computer-generated graphics.
Gene Profiling in Experimental Models of Eye Growth: Clues to Myopia Pathogenesis
Stone, Richard A.; Khurana, Tejvir S.
2010-01-01
To understand the complex regulatory pathways that underlie the development of refractive errors, expression profiling has evaluated gene expression in ocular tissues of well-characterized experimental models that alter postnatal eye growth and induce refractive errors. Derived from a variety of platforms (e.g. differential display, spotted microarrays or Affymetrix GeneChips), gene expression patterns are now being identified in species that include chicken, mouse and primate. Reconciling available results is hindered by varied experimental designs and analytical/statistical features. Continued application of these methods offers promise to provide the much-needed mechanistic framework to develop therapies to normalize refractive development in children. PMID:20363242
Statistical Feature Extraction for Artifact Removal from Concurrent fMRI-EEG Recordings
Liu, Zhongming; de Zwart, Jacco A.; van Gelderen, Peter; Kuo, Li-Wei; Duyn, Jeff H.
2011-01-01
We propose a set of algorithms for sequentially removing artifacts related to MRI gradient switching and cardiac pulsations from electroencephalography (EEG) data recorded during functional magnetic resonance imaging (fMRI). Special emphasis is placed on the use of statistical metrics and methods for the extraction and selection of features that characterize gradient and pulse artifacts. To remove gradient artifacts, we use channel-wise filtering based on singular value decomposition (SVD). To remove pulse artifacts, we first decompose the data into temporally independent components and then select a compact cluster of components that possess sustained high mutual information with the electrocardiogram (ECG). After the removal of these components, the time courses of the remaining components are filtered by SVD to remove the temporal patterns phase-locked to the cardiac timing markers derived from the ECG. The filtered component time courses are then inversely transformed into multi-channel EEG time series free of pulse artifacts. Evaluation based on a large set of simultaneous EEG-fMRI data obtained during a variety of behavioral tasks, sensory stimulations and resting conditions showed excellent data quality and robust performance attainable with the proposed methods. These algorithms have been implemented as a Matlab-based toolbox made freely available for public access and research use. PMID:22036675
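The gradient-artifact step can be reduced to a few lines: stack one channel into artifact-locked epochs, remove the leading singular components, and reassemble. This is a simplified sketch, not the toolbox code; the epoch length and number of removed components are arbitrary here.

```python
import numpy as np

def svd_artifact_filter(channel, epoch_len, n_remove=2):
    """Channel-wise SVD filtering of epochs time-locked to a repeating artifact."""
    n_epochs = len(channel) // epoch_len
    M = channel[:n_epochs * epoch_len].reshape(n_epochs, epoch_len).astype(float)
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    s[:n_remove] = 0.0            # drop the artifact-dominated components
    return (U * s) @ Vt           # cleaned epochs (epochs x samples)

rng = np.random.default_rng(5)
artifact = np.sin(np.linspace(0, 20 * np.pi, 500)) * 50  # repeating gradient waveform
eeg = rng.standard_normal((40, 500)) + artifact          # 40 epochs of one channel
clean = svd_artifact_filter(eeg.ravel(), 500)
print(np.abs(eeg).mean(), np.abs(clean).mean())          # amplitude before/after
```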
Ludeña-Choez, Jimmy; Quispe-Soncco, Raisa; Gallardo-Antolín, Ascensión
2017-01-01
Feature extraction for Acoustic Bird Species Classification (ABSC) tasks has traditionally been based on parametric representations that were specifically developed for speech signals, such as Mel Frequency Cepstral Coefficients (MFCC). However, the discrimination capabilities of these features for ABSC could be enhanced by accounting for the vocal production mechanisms of birds and, in particular, the spectro-temporal structure of bird sounds. In this paper, a new front-end for ABSC is proposed that incorporates this specific information through the non-negative decomposition of bird sound spectrograms. It consists of two stages: short-time feature extraction and temporal feature integration. In the first stage, which aims at providing a better spectral representation of bird sounds on a frame-by-frame basis, two methods are evaluated. In the first method, cepstral-like features (NMF_CC) are extracted using a filter bank that is automatically learned by applying Non-Negative Matrix Factorization (NMF) to bird audio spectrograms. In the second method, the features are derived directly from the activation coefficients of the spectrogram decomposition performed by NMF (H_CC). The second stage summarizes the most relevant information contained in the short-time features by computing several statistical measures over long segments. The experiments show that the use of NMF_CC and H_CC in conjunction with temporal integration significantly improves the performance of a Support Vector Machine (SVM)-based ABSC system with respect to conventional MFCC.
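A minimal sketch of the activation-based variant on a toy spectrogram (component count, log compression and the integration statistics are illustrative choices, not the paper's exact configuration):

```python
import numpy as np
from scipy.fft import dct
from sklearn.decomposition import NMF

rng = np.random.default_rng(6)
S = np.abs(rng.standard_normal((257, 400)))  # toy magnitude spectrogram (freq x frames)

# Non-negative decomposition S ~ W H: W holds spectral patterns (a learned
# "filter bank"), H the frame-wise activations.
model = NMF(n_components=20, init="nndsvda", max_iter=400, random_state=0)
W = model.fit_transform(S)        # (257, 20)
H = model.components_             # (20, 400)

# H_CC-style short-time features: log-compressed, decorrelated activations.
h_cc = dct(np.log(H + 1e-9), axis=0, norm="ortho")

# Temporal integration: summarize a long segment with simple statistics.
segment = h_cc[:, :100]
clip_features = np.concatenate([segment.mean(axis=1), segment.std(axis=1)])
print(clip_features.shape)  # (40,) per-segment feature vector for the SVM
```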
Kong, Jun; Wang, Fusheng; Teodoro, George; Cooper, Lee; Moreno, Carlos S; Kurc, Tahsin; Pan, Tony; Saltz, Joel; Brat, Daniel
2013-12-01
In this paper, we present a novel framework for microscopic image analysis of nuclei, data management, and high performance computation to support translational research involving nuclear morphometry features, molecular data, and clinical outcomes. Our image analysis pipeline consists of nuclei segmentation and feature computation facilitated by high performance computing with coordinated execution on multi-core CPUs and Graphics Processing Units (GPUs). All data derived from image analysis are managed in a spatial relational database supporting highly efficient scientific queries. We applied our image analysis workflow to 159 glioblastomas (GBM) from The Cancer Genome Atlas dataset. Through integrative studies, we found that statistics of four specific nuclear features were significantly associated with patient survival. Additionally, we correlated nuclear features with molecular data and found results that support pathologic domain knowledge. Proneural-subtype GBMs had the smallest mean nuclear Eccentricity and the largest mean nuclear Extent and MinorAxisLength. We also found that gene expression of the stem cell marker MYC and the cell proliferation marker MKI67 correlated with nuclear features. To complement and inform pathologists of relevant diagnostic features, we queried the most representative nuclear instances from each patient population based on genetic and transcriptional classes. Our results demonstrate that specific nuclear features carry prognostic significance and associations with transcriptional and genetic classes, highlighting the potential of high throughput pathology image analysis as a complementary approach to human-based review and translational research.
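The three highlighted morphometry features can be computed per nucleus from a label mask, e.g. with scikit-image; the toy mask below stands in for the segmentation output:

```python
import numpy as np
from skimage.measure import label, regionprops

# Toy binary nuclei mask; in practice this comes from the segmentation stage.
mask = np.zeros((128, 128), dtype=bool)
yy, xx = np.ogrid[:128, :128]
for cy, cx, r in [(30, 30, 8), (80, 60, 12), (50, 100, 6)]:
    mask |= (yy - cy) ** 2 + (xx - cx) ** 2 <= r ** 2

# Per-nucleus morphometry, including the three features highlighted above.
for region in regionprops(label(mask)):
    print(region.label,
          round(region.eccentricity, 3),
          round(region.extent, 3),
          round(region.minor_axis_length, 2))
```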
Shiozaki, H; Yoshinaga, K; Kondo, T; Imai, Y; Shiseki, M; Mori, N; Teramura, M; Motoji, T
2014-01-01
Donor cell-derived leukemia (DCL) is a rare complication of SCT. Here, we present a case of DCL following cord blood transplantation (CBT) and review the clinical features of previously reported DCL. To our knowledge, this is the first report comparing the clinical characteristics of DCL by transplant source, umbilical cord blood versus BM. AML and myelodysplastic syndrome (MDS) were recognized more frequently in DCL after CBT, whereas the incidence of AML and ALL was similar after BMT. The median time to DCL was 14.5 months after CBT and 36 months after BMT; DCL thus occurred in a significantly shorter period after CBT than after BMT. Abnormal karyotypes involving chromosome 7 were observed in 52.4% of CBT recipients and 17.3% of BMT recipients, a statistically significant difference. In particular, the frequency of monosomy 7 was significantly higher in DCL after CBT than after BMT. The types of abnormal karyotypes in DCL following BMT were similar to those characteristically observed in adult de novo AML and MDS. DCL patients generally have a poor prognosis in both groups, and SCT is the best treatment for curing DCL. DCL thus appears to have different clinical features according to the transplant source.
Lamart, Stephanie; Griffiths, Nina M; Tchitchek, Nicolas; Angulo, Jaime F; Van der Meeren, Anne
2017-03-01
The aim of this work was to develop a computational tool that integrates several statistical analysis features for biodistribution data from internal contamination experiments. These data represent actinide levels in biological compartments as a function of time and are derived from activity measurements in tissues and excreta. Such experiments aim at assessing the influence of different contamination conditions (e.g. intake route or radioelement) on the biological behavior of the contaminant. The ever-increasing number of datasets and the diversity of experimental conditions make the handling and analysis of biodistribution data difficult. This work sought to facilitate the statistical analysis of a large number of datasets and the comparison of results from diverse experimental conditions. Functional modules were developed using the open-source programming language R to support specific operations: descriptive statistics, visual comparison, curve fitting, and implementation of biokinetic models. In addition, the structure of the datasets was harmonized using a common table format. Analysis outputs can be written to text files and updated data can be written in the consistent table format; hence, a data repository is built progressively, which is essential for the optimal use of animal data. Graphical representations can be automatically generated and saved as image files. The resulting computational tool was applied to data derived from wound contamination experiments conducted under different conditions. By facilitating biodistribution data handling and statistical analyses, this computational tool ensures faster analyses and better reproducibility compared with the use of multiple office software applications. Furthermore, re-analysis of archival data and comparison of data from different sources are made much easier. This tool will therefore help to better understand the influence of contamination characteristics on actinide biokinetics. Our approach can aid the optimization of treatment protocols and therefore contribute to improving the medical response after internal contamination with actinides.
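A Python stand-in for two of the modules (the original tool is written in R; the column names, harmonized table and single-exponential retention model below are assumptions for illustration):

```python
import numpy as np
import pandas as pd
from scipy.optimize import curve_fit

rng = np.random.default_rng(8)
days = np.array([1, 1, 7, 7, 30, 30, 90, 90] * 2)
# Harmonized table format (hypothetical columns): one row per measurement.
df = pd.DataFrame({
    "condition": ["wound"] * 8 + ["inhalation"] * 8,
    "tissue":    ["liver", "bone"] * 8,
    "day":       days,
    "activity":  12 * np.exp(-0.02 * days) * rng.lognormal(0.0, 0.1, 16),
})

# Descriptive-statistics module: summaries per condition/tissue/time point.
print(df.groupby(["condition", "tissue", "day"])["activity"].agg(["mean", "std"]))

# Curve-fitting module: single-exponential retention for one compartment.
sub = df[(df.condition == "wound") & (df.tissue == "liver")]
popt, _ = curve_fit(lambda t, a, k: a * np.exp(-k * t),
                    sub["day"], sub["activity"], p0=(10.0, 0.05))
print("retention half-time (days):", round(np.log(2) / popt[1], 1))
```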
Statistical appearance models based on probabilistic correspondences.
Krüger, Julia; Ehrhardt, Jan; Handels, Heinz
2017-04-01
Model-based image analysis is indispensable in medical image processing. One key aspect of building statistical shape and appearance models is the determination of one-to-one correspondences in the training data set; at the same time, the identification of these correspondences is the most challenging part of such methods. In our earlier work, we developed an alternative method using correspondence probabilities instead of exact one-to-one correspondences for a statistical shape model (Hufnagel et al., 2008). In this work, a new approach for statistical appearance models without one-to-one correspondences is proposed. A sparse image representation is used to build a model that combines point position and appearance information at the same time. Probabilistic correspondences between the derived multi-dimensional feature vectors are used to avoid the need for extensive preprocessing to find landmarks and correspondences, and to reduce the dependence of the generated model on the landmark positions. Model generation and model fitting can now be expressed by optimizing a single global criterion derived from a maximum a-posteriori (MAP) approach with respect to model parameters that directly affect both shape and appearance of the considered objects inside the images. The proposed approach describes statistical appearance modeling in a concise and flexible mathematical framework. Besides eliminating the demand for costly correspondence determination, the method allows for additional constraints such as topological regularity in the modeling process. In the evaluation, the model was applied to segmentation and landmark identification in hand X-ray images. The results demonstrate the feasibility of the model to detect hand contours as well as the positions of the joints between finger bones for unseen test images. Further, we evaluated the model on brain data of stroke patients to show its ability to handle partially corrupted data and to demonstrate a possible use of the correspondence probabilities to indicate these corrupted/pathological areas.
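Stripped to its core, the correspondence-probability idea is a row-normalized Gaussian affinity between model and image feature vectors (position plus appearance). The sketch below shows only this soft-correspondence step, not the full MAP model fit:

```python
import numpy as np

def correspondence_probabilities(model_feats, image_feats, sigma=1.0):
    """Row-stochastic matrix P[i, j]: probability that image feature j
    corresponds to model feature i, from Gaussian affinities in feature space.

    Feature vectors may combine point position and appearance; this is only
    the correspondence step of the approach, not the model optimization.
    """
    d2 = ((model_feats[:, None, :] - image_feats[None, :, :]) ** 2).sum(-1)
    A = np.exp(-d2 / (2.0 * sigma ** 2))
    return A / A.sum(axis=1, keepdims=True)

rng = np.random.default_rng(9)
model = rng.standard_normal((5, 4))              # 5 features: (x, y, app1, app2)
image = model + 0.1 * rng.standard_normal((5, 4))
P = correspondence_probabilities(model, image, sigma=0.5)
print(P.round(2))   # near-identity: each model feature matches its noisy copy
```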
NASA Astrophysics Data System (ADS)
Jia, Huizhen; Sun, Quansen; Ji, Zexuan; Wang, Tonghan; Chen, Qiang
2014-11-01
The goal of no-reference/blind image quality assessment (NR-IQA) is to devise a perceptual model that can accurately predict the quality of a distorted image in agreement with human opinion, in which feature extraction is an important issue. However, the features used in state-of-the-art "general purpose" NR-IQA algorithms are usually either natural scene statistics (NSS) based or perceptually relevant, which limits the performance of these models. To further improve NR-IQA performance, we propose a general purpose NR-IQA algorithm that combines NSS-based features with perceptually relevant features. The new method extracts features in both the spatial and gradient domains. In the spatial domain, we extract point-wise statistics of single pixel values, characterized by a generalized Gaussian distribution model, to form the underlying features. In the gradient domain, statistical features based on neighboring gradient magnitude similarity are extracted. A mapping from features to quality scores is then learned using support vector regression. The experimental results on benchmark image databases demonstrate that the proposed algorithm correlates highly with human judgments of quality and leads to significant performance improvements over state-of-the-art methods.
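For the spatial-domain features, generalized Gaussian shape and scale are commonly fit by the standard moment-matching estimator, inverting r(alpha) = Gamma(1/alpha) Gamma(3/alpha) / Gamma(2/alpha)^2 over a grid. The sketch assumes this common estimator, not necessarily the authors' exact fit:

```python
import numpy as np
from scipy.special import gamma

def fit_ggd(x):
    """Generalized Gaussian shape/scale via moment matching:
    invert r(a) = Gamma(1/a)*Gamma(3/a)/Gamma(2/a)**2 over a grid of a."""
    alphas = np.arange(0.2, 10.0, 0.001)
    r_alpha = gamma(1 / alphas) * gamma(3 / alphas) / gamma(2 / alphas) ** 2
    x = np.asarray(x, dtype=float)
    rho = np.mean(x ** 2) / np.mean(np.abs(x)) ** 2
    alpha = alphas[np.argmin(np.abs(r_alpha - rho))]
    sigma = np.sqrt(np.mean(x ** 2))
    return alpha, sigma   # shape and scale: two NSS features per map

rng = np.random.default_rng(10)
coeffs = rng.laplace(0, 1, 100000)  # Laplacian data corresponds to shape = 1
print(fit_ggd(coeffs))              # shape estimate near 1.0
```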
NASA Astrophysics Data System (ADS)
Priya, R. Kanmani Shanmuga; Balaguru, B.; Ramakrishnan, S.
2013-10-01
The capabilities of evolving satellite remote sensing technology, combined with conventional data collection techniques, provide a powerful tool for efficient and cost-effective management of living marine resources. Fish are valuable living marine resources providing food, profit and pleasure to the human community, and variations in oceanic conditions play a role in natural fluctuations of fish stocks. The satellite-altimeter-derived Merged Sea Level Anomaly (MSLA) improves the understanding of ocean variability and mesoscale oceanography and offers a good possibility of revealing zones of high dynamic activity. This study comprised a synergistic analysis of signatures of SeaWiFS-derived chlorophyll concentration, National Oceanic and Atmospheric Administration-Advanced Very High Resolution Radiometer (NOAA-AVHRR) derived Sea Surface Temperature (SST), and monthly MSLA data derived from the Topex/Poseidon, Jason-1 and ERS-1 altimeters over the 7 years from 1998 to 2004. Overlapping chlorophyll, SST and MSLA signatures were used to delineate Potential Fishing Zones (PFZs). The chlorophyll and SST data sets were found to show short-term persistence, from days to a week, while MSLA signatures of the respective features persisted for a longer duration. Hence, the study used altimeter-derived MSLA as an index for long-term variability detection of fish catches, along with chlorophyll and SST images, and maps showing PFZs of the study area were generated. Real-time fishing statistics for the same duration were procured from FSI Mumbai. Catch contours were generated with respect to peak spectra of chlorophyll variation and trough spectra of MSLA and SST variation; the inverse patterns were observed in the poor-catch contours. The Catch Per Unit Effort (CPUE) for each fishing trial was calculated to normalize the fish catch. Based on the statistical analysis, the actual CPUEs were classified at each probable MSLA depth zone and plotted on the same images.
NASA Astrophysics Data System (ADS)
Singh, Ravindra P.; Pallamraju, Duggirala
2017-08-01
This paper describes the development of a new Near InfraRed Imaging Spectrograph (NIRIS) which is capable of simultaneous measurements of OH(6-2) Meinel and O2(0-1) atmospheric band nightglow emission intensities. In this spectrographic technique, rotational line ratios are obtained to derive temperatures corresponding to the emission altitudes of 87 and 94 km. NIRIS has been in continuous operation at the optical aeronomy observatory, Gurushikhar, Mount Abu (24.6°N, 72.8°E) since January 2013. NIRIS uses a diffraction grating of 1200 lines/mm and a 1024 × 1024 pixel thermoelectrically cooled CCD camera, and has a large field-of-view (FOV) of 80° along the slit orientation. The data analysis methodology adopted for the derivation of mesospheric temperatures is also described in detail. The observed NIRIS temperatures show good correspondence with satellite (SABER) derived temperatures and exhibit both tidal and gravity wave (GW) like features. From the time taken for phase propagation in the emission intensities between these two altitudes, the vertical phase speed of gravity waves, c_z, is calculated, and together with the coherent GW time period τ, the vertical wavelength λ_z is obtained. Using the large FOV observations from NIRIS, the meridional wavelengths λ_y are also calculated. We have used one year of data to study the possible cause(s) of the occurrences of mesospheric temperature inversions (MTIs). From the statistics obtained for 234 nights, it appears that in situ chemical heating, rather than vertical propagation of the waves, is mainly responsible for the observed MTIs. Thus, this paper describes a novel near infrared imaging spectrograph, its working principle, the data analysis method for deriving OH and O2 emission intensities and the corresponding rotational temperatures at these altitudes, the derivation of gravity wave parameters (τ, c_z, λ_z, and λ_y), and results of the statistical study of MTIs in the earth's mesosphere.
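The line-ratio temperature derivation reduces to a two-line Boltzmann relation: assuming the upper rotational states are thermalized, I_i is proportional to w_i exp(-E_i/kT) (w_i lumping line strength and degeneracy), so the ratio gives T = (E2 - E1) / (k ln(I1 w2 / (I2 w1))). In the sketch below the line parameters are placeholders, not the actual OH(6-2) constants:

```python
import numpy as np

K_B = 1.380649e-23  # Boltzmann constant (J/K)

def rotational_temperature(I1, I2, E1, E2, w1, w2):
    """Temperature from the ratio of two rotational line intensities,
    assuming Boltzmann populations: I_i proportional to w_i * exp(-E_i / kT)."""
    return (E2 - E1) / (K_B * np.log((I1 * w2) / (I2 * w1)))

# Placeholder line parameters (illustrative only, not OH(6-2) values):
E1, E2 = 1.0e-21, 3.0e-21   # upper-state rotational energies (J)
w1, w2 = 1.0, 0.8           # lumped line strengths
T_true = 190.0
I1 = w1 * np.exp(-E1 / (K_B * T_true))
I2 = w2 * np.exp(-E2 / (K_B * T_true))
print(rotational_temperature(I1, I2, E1, E2, w1, w2))  # recovers ~190 K
```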
Entropy, pricing and productivity of pumped-storage
NASA Astrophysics Data System (ADS)
Karakatsanis, Georgios; Tyralis, Hristos; Tzouka, Katerina
2016-04-01
Pumped-storage constitutes today a mature method of bulk electricity storage in the form of hydropower. This bulk electricity storability upgrades the economic value of hydropower, as it may mitigate, or even neutralize, stochastic effects deriving from various geophysical and socioeconomic factors, which produce numerous load balance inefficiencies due to increased uncertainty. Pumped-storage further holds a key role in unifying intermittent renewable (i.e. wind, solar) units with controllable non-renewable (i.e. nuclear, coal) fuel electricity generation plants into integrated energy systems. We develop a set of indicators for measuring the performance of pumped-storage, in terms of the latter's energy and financial contribution to the energy system. More specifically, we use the concept of entropy to examine: (1) the statistical features, and correlations, of the energy system's intermittent components and (2) the statistical features of electricity demand prediction deviations. In this way, the macroeconomics of pumped-storage emerges naturally from its statistical features (Karakatsanis et al. 2014). In addition, these findings are combined with actual daily loads. Hence, not only the amount of energy harvested from the pumped-storage component is expected to be important, but the harvesting time as well, as the intraday price of electricity varies significantly. The structure of the pumped-storage market also proves to be a significant factor in the system's energy and financial performance (Paine et al. 2014). Based on the above, we aim at postulating a set of general rules on the productivity of pumped-storage for (integrated) energy systems.
Keywords: pumped-storage, storability, economic value of hydropower, stochastic effects, uncertainty, energy systems, entropy, intraday electricity price, productivity
References:
1. Karakatsanis, G. et al. (2014), Entropy, pricing and macroeconomics of pumped-storage systems, European Geosciences Union General Assembly, "The Face of the Earth - Process and Form", Vienna, Austria, 27 April - 2 May 2014.
2. Paine, N. et al. (2014), Why market rules matter: Optimizing pumped hydroelectric storage when compensation rules differ, Energy Economics 46, 10-19.
Approach for Input Uncertainty Propagation and Robust Design in CFD Using Sensitivity Derivatives
NASA Technical Reports Server (NTRS)
Putko, Michele M.; Taylor, Arthur C., III; Newman, Perry A.; Green, Lawrence L.
2002-01-01
An implementation of the approximate statistical moment method for uncertainty propagation and robust optimization for a quasi-3D Euler CFD code is presented. Given uncertainties in statistically independent, random, normally distributed input variables, first- and second-order statistical moment procedures are performed to approximate the uncertainty in the CFD output. Efficient calculation of both first- and second-order sensitivity derivatives is required. To assess the validity of the approximations, these moments are compared with statistical moments generated through Monte Carlo simulations. The uncertainties in the CFD input variables are also incorporated into a robust optimization procedure. For this optimization, statistical moments involving first-order sensitivity derivatives appear in the objective function and system constraints. Second-order sensitivity derivatives are used in a gradient-based search to successfully execute a robust optimization. The approximate methods used throughout the analyses are found to be valid when considering robustness about input parameter mean values.
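The first-order moment method is compact enough to sketch with a toy function standing in for the CFD output and finite differences standing in for the code's efficiently computed sensitivity derivatives; the Monte Carlo run plays the role of the validity check described above:

```python
import numpy as np

def f(x):
    """Toy stand-in for a CFD output as a function of the input vector x."""
    return x[0] ** 2 + np.sin(x[1]) + x[0] * x[2]

mu = np.array([1.0, 0.5, 2.0])     # means of independent, normal input variables
sig = np.array([0.05, 0.02, 0.1])  # their standard deviations

# First-order sensitivity derivatives by central finite differences.
h = 1e-6
grad = np.array([(f(mu + h * e) - f(mu - h * e)) / (2 * h) for e in np.eye(3)])

# Approximate statistical moment method, first order:
# mean ~ f(mu), variance ~ sum_i (df/dx_i * sigma_i)^2.
mean_fo = f(mu)
var_fo = np.sum((grad * sig) ** 2)

# Monte Carlo check of the approximation.
rng = np.random.default_rng(11)
out = np.array([f(s) for s in rng.normal(mu, sig, size=(20000, 3))])
print(mean_fo, var_fo, out.mean(), out.var())
```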
Feature-Based Statistical Analysis of Combustion Simulation Data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bennett, J; Krishnamoorthy, V; Liu, S
2011-11-18
We present a new framework for feature-based statistical analysis of large-scale scientific data and demonstrate its effectiveness by analyzing features from Direct Numerical Simulations (DNS) of turbulent combustion. Turbulent flows are ubiquitous and account for transport and mixing processes in combustion, astrophysics, fusion, and climate modeling, among other disciplines. They are also characterized by coherent structure or organized motion, i.e. nonlocal entities whose geometrical features can directly impact molecular mixing and reactive processes. While traditional multi-point statistics provide correlative information, they lack nonlocal structural information and hence fail to provide mechanistic causality information between organized fluid motion and mixing and reactive processes. Hence, it is of great interest to capture and track flow features and their statistics together with their correlation with relevant scalar quantities, e.g. temperature or species concentrations. In our approach we encode the set of all possible flow features by pre-computing merge trees augmented with attributes, such as statistical moments of various scalar fields, e.g. temperature, as well as length scales computed via spectral analysis. The computation is performed in an efficient streaming manner in a pre-processing step and results in a collection of meta-data that is orders of magnitude smaller than the original simulation data. This meta-data is sufficient to support a fully flexible and interactive analysis of the features, allowing for arbitrary thresholds, providing per-feature statistics, and creating various global diagnostics such as cumulative distribution functions (CDFs), histograms, or time series. We combine the analysis with a rendering of the features in a linked-view browser that enables scientists to interactively explore, visualize, and analyze the equivalent of one terabyte of simulation data. We highlight the utility of this new framework for combustion science; however, it is applicable to many other science domains.
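A drastically simplified sketch of per-feature statistics: one fixed threshold instead of the full merge tree, and connected-component labeling instead of the streaming computation. Each labeled component stands in for a flow feature, to which statistics of a second scalar are attached:

```python
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(12)
mixing = ndimage.gaussian_filter(rng.standard_normal((256, 256)), 8)       # toy field
temperature = ndimage.gaussian_filter(rng.standard_normal((256, 256)), 8) + 3.0

# One threshold from the family a merge tree would encode all at once.
features, n = ndimage.label(mixing > 0.05)

# Per-feature statistics of a correlated scalar (here: temperature) and sizes.
idx = range(1, n + 1)
means = ndimage.mean(temperature, labels=features, index=idx)
sizes = ndimage.sum(np.ones_like(features), labels=features, index=idx)
print(n, np.histogram(means, bins=5)[0])   # feature count and a global diagnostic
```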
2012-01-01
Background: There is a need for automated methods to learn general features of the interactions of a ligand class with its diverse set of protein receptors. An appropriate machine learning approach is Inductive Logic Programming (ILP), which automatically generates comprehensible rules in addition to prediction. The development of ILP systems which can learn rules of the complexity required for studies on protein structure remains a challenge. In this work we use a new ILP system, ProGolem, and demonstrate its performance on learning features of hexose-protein interactions.
Results: The rules induced by ProGolem detect interactions mediated by aromatics and by planar-polar residues, in addition to less common features such as the aromatic sandwich. The rules also reveal a previously unreported dependency for residues cys and leu. They also specify interactions involving aromatic and hydrogen bonding residues. This paper shows that Inductive Logic Programming implemented in ProGolem can derive rules giving structural features of protein/ligand interactions. Several of these rules are consistent with descriptions in the literature.
Conclusions: In addition to confirming literature results, ProGolem's model has a 10-fold cross-validated predictive accuracy that is superior, at the 95% confidence level, to another ILP system previously used to study protein/hexose interactions and is comparable with state-of-the-art statistical learners. PMID:22783946
Robust k-mer frequency estimation using gapped k-mers
Ghandi, Mahmoud; Mohammad-Noori, Morteza; Beer, Michael A.
2013-01-01
Oligomers of fixed length, k, commonly known as k-mers, are often used as fundamental elements in the description of DNA sequence features of diverse biological function, or as intermediate elements in the construction of more complex descriptors of sequence features such as position weight matrices. k-mers are very useful as general sequence features because they constitute a complete and unbiased feature set, and do not require parameterization based on incomplete knowledge of biological mechanisms. However, a fundamental limitation in the use of k-mers as sequence features is that as k is increased, larger spatial correlations in DNA sequence elements can be described, but the frequency of observing any specific k-mer becomes very small, and rapidly approaches a sparse matrix of binary counts. Thus any statistical learning approach using k-mers will be susceptible to noisy estimation of k-mer frequencies once k becomes large. Because all molecular DNA interactions have limited spatial extent, gapped k-mers often carry the relevant biological signal. Here we use gapped k-mer counts to more robustly estimate the ungapped k-mer frequencies, by deriving an equation for the minimum norm estimate of k-mer frequencies given an observed set of gapped k-mer frequencies. We demonstrate that this approach provides a more accurate estimate of the k-mer frequencies in real biological sequences using a sample of CTCF binding sites in the human genome. PMID:23861010
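The counting step is straightforward to sketch; the minimum-norm recovery of ungapped frequencies is then a linear inversion over these counts (per the paper), which is omitted here. The sequence, l and k below are illustrative:

```python
from collections import Counter
from itertools import combinations

def gapped_kmer_counts(seq, l=6, k=4):
    """Count gapped k-mers: every length-l window contributes one count to each
    pattern with k informative positions, the other l-k wildcarded ('.')."""
    counts = Counter()
    position_sets = list(combinations(range(l), k))
    for i in range(len(seq) - l + 1):
        window = seq[i:i + l]
        for pos in position_sets:
            pattern = "".join(window[p] if p in pos else "." for p in range(l))
            counts[pattern] += 1
    return counts

counts = gapped_kmer_counts("ACGTACGTTGCAACGT")
print(len(counts), counts.most_common(3))
```

Because each gapped pattern pools many ungapped l-mers, its counts are far less sparse than raw l-mer counts, which is what makes the downstream frequency estimate more robust.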
Phenotypic characterization of glioblastoma identified through shape descriptors
NASA Astrophysics Data System (ADS)
Chaddad, Ahmad; Desrosiers, Christian; Toews, Matthew
2016-03-01
This paper proposes quantitatively describing the shape of glioblastoma (GBM) tissue phenotypes as a set of shape features derived from segmentations, for the purposes of discriminating between GBM phenotypes and monitoring tumor progression. GBM patients were identified from the Cancer Genome Atlas, and quantitative MR imaging data were obtained from the Cancer Imaging Archive. Three GBM tissue phenotypes are considered: necrosis, active tumor and edema/invasion. Volumetric tissue segmentations are obtained from registered T1-weighted (T1-WI) post-contrast and fluid-attenuated inversion recovery (FLAIR) MRI modalities. Shape features are computed from the respective tissue phenotype segmentations, and a Kruskal-Wallis test was employed to select features capable of classification at a significance level of p < 0.05. Several classifier models were employed to distinguish phenotypes, with leave-one-out cross-validation. Eight features were found statistically significant for classifying GBM phenotypes at p < 0.05; orientation was uninformative. Quantitative evaluations show that the SVM yields the highest classification accuracy of 87.50%, sensitivity of 94.59% and specificity of 92.77%. In summary, the shape descriptors proposed in this work show high performance in predicting GBM tissue phenotypes. They are thus closely linked to morphological characteristics of GBM phenotypes and could potentially be used in a computer-assisted labeling system.
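The Kruskal-Wallis screening step, sketched on hypothetical feature values for the three phenotypes (group means and sizes are illustrative):

```python
import numpy as np
from scipy.stats import kruskal

rng = np.random.default_rng(13)
# Hypothetical values of one shape feature (e.g. solidity) per phenotype.
necrosis = rng.normal(0.60, 0.10, 30)
active   = rng.normal(0.75, 0.10, 30)
edema    = rng.normal(0.74, 0.10, 30)

stat, p = kruskal(necrosis, active, edema)
if p < 0.05:
    print(f"feature retained for classification (H={stat:.2f}, p={p:.4f})")
```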
Parametric Time-Frequency Analysis and Its Applications in Music Classification
NASA Astrophysics Data System (ADS)
Shen, Ying; Li, Xiaoli; Ma, Ngok-Wah; Krishnan, Sridhar
2010-12-01
Analysis of nonstationary signals, such as music signals, is a challenging task. The purpose of this study is to explore an efficient and powerful technique to analyze and classify music signals sampled at a higher rate (44.1 kHz). Pursuit methods are good tools for this purpose, but they aim at representing the signals rather than classifying them (Y. Paragakin et al., 2009). Among the pursuit methods, matching pursuit (MP), an adaptive, truly nonstationary time-frequency signal analysis tool, is applied for music classification. First, MP decomposes the sample signals into time-frequency functions, or atoms. Atom parameters are then analyzed and manipulated, and discriminant features are extracted from the atom parameters. Besides the parameters obtained using MP, an additional feature, the central energy, is also derived. Linear discriminant analysis and the leave-one-out method are used to evaluate the classification accuracy for different feature sets. The study is one of the very few works that analyze atoms statistically and extract discriminant features directly from the parameters. From our experiments, it is evident that the MP algorithm with the Gabor dictionary decomposes nonstationary signals, such as music signals, into atoms whose parameters contain strong discriminant information sufficient for accurate and efficient signal classification.
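The evaluation protocol (linear discriminant analysis with leave-one-out) is nearly a one-liner with scikit-learn; the three per-clip atom-parameter features below are hypothetical stand-ins (e.g. mean atom scale, mean atom frequency, central energy):

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import LeaveOneOut, cross_val_score

rng = np.random.default_rng(14)
# Hypothetical per-clip features derived from MP atom parameters, two genres.
X = np.vstack([rng.normal(0.0, 1.0, (25, 3)), rng.normal(1.0, 1.0, (25, 3))])
y = np.repeat([0, 1], 25)

acc = cross_val_score(LinearDiscriminantAnalysis(), X, y, cv=LeaveOneOut()).mean()
print(f"leave-one-out accuracy: {acc:.2f}")
```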
A Stochastic Fractional Dynamics Model of Space-time Variability of Rain
NASA Technical Reports Server (NTRS)
Kundu, Prasun K.; Travis, James E.
2013-01-01
Rainfall varies in space and time in a highly irregular manner and is described naturally in terms of a stochastic process. A characteristic feature of rainfall statistics is that they depend strongly on the space-time scales over which rain data are averaged. A spectral model of precipitation has been developed, based on a stochastic differential equation of fractional order for the point rain rate, that allows a concise description of the second-moment statistics of rain at any prescribed space-time averaging scale. The model is thus capable of providing a unified description of the statistics of both radar and rain gauge data. The underlying dynamical equation can be expressed in terms of space-time derivatives of fractional orders that are adjusted, together with other model parameters, to fit the data. The form of the resulting spectrum gives the model adequate flexibility to capture the subtle interplay between the spatial and temporal scales of variability of rain, but strongly constrains the predicted statistical behavior as a function of the averaging length and time scales. We test the model with radar and gauge data collected contemporaneously at the NASA TRMM ground validation sites located near Melbourne, Florida and in Kwajalein Atoll, Marshall Islands in the tropical Pacific. We estimate the parameters by tuning them to the second-moment statistics of the radar data. The model predictions are then found to fit the second-moment statistics of the gauge data reasonably well without any further adjustment.
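The scale dependence of the second-moment statistics is easy to demonstrate empirically: block-average a toy correlated rain field over growing windows and watch the variance contract. This illustrates only the phenomenon the model describes, not the fractional spectral model itself:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def block_average_variance(field, scale):
    """Variance of the field after averaging over scale x scale blocks."""
    n = field.shape[0] // scale * scale
    blocks = field[:n, :n].reshape(n // scale, scale, n // scale, scale)
    return blocks.mean(axis=(1, 3)).var()

rng = np.random.default_rng(15)
rain = np.exp(gaussian_filter(rng.standard_normal((512, 512)), 10))  # toy field

for scale in (1, 4, 16, 64):
    print(scale, round(block_average_variance(rain, scale), 5))
# The variance contracts as the averaging scale grows - the scale dependence
# the spectral model is tuned to reproduce across radar and gauge data.
```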
Zhang, Hanyuan; Tian, Xuemin; Deng, Xiaogang; Cao, Yuping
2018-05-16
As an attractive nonlinear dynamic data analysis tool, global preserving kernel slow feature analysis (GKSFA) has achieved great success in extracting the high nonlinearity and inherently time-varying dynamics of batch processes. However, GKSFA is an unsupervised feature extraction method and lacks the ability to utilize batch process class label information, which may not offer the most effective means for batch process monitoring. To overcome this problem, we propose a novel batch process monitoring method based on a modified GKSFA, referred to as discriminant global preserving kernel slow feature analysis (DGKSFA), which closely integrates discriminant analysis and GKSFA. The proposed DGKSFA method can extract discriminant features of a batch process as well as preserve the global and local geometrical structure of the observed data. For fault detection, a monitoring statistic is constructed based on the distance between the optimal kernel feature vectors of test data and normal data. To tackle the challenging issue of nonlinear fault variable identification, a new nonlinear contribution plot method, derived from the idea of variable pseudo-sample trajectory projection in the DGKSFA nonlinear biplot, is also developed to help identify the fault variables after a fault is detected. Simulation results on a numerical nonlinear dynamic system and the benchmark fed-batch penicillin fermentation process demonstrate that the proposed process monitoring and fault diagnosis approach can effectively detect faults and distinguish fault variables from normal variables.
Research of facial feature extraction based on MMC
NASA Astrophysics Data System (ADS)
Xue, Donglin; Zhao, Jiufen; Tang, Qinhong; Shi, Shaokun
2017-07-01
Based on the maximum margin criterion (MMC), a new algorithm for statistically uncorrelated optimal discriminant vectors and a new algorithm for orthogonal optimal discriminant vectors for feature extraction are proposed. The purpose of the maximum margin criterion is to maximize the inter-class scatter while simultaneously minimizing the intra-class scatter after the projection. Compared with the original MMC method and principal component analysis (PCA), the proposed methods are better at reducing or eliminating the statistical correlation between features and at improving the recognition rate. The experimental results on the Olivetti Research Laboratory (ORL) face database show that the new feature extraction method of the statistically uncorrelated maximum margin criterion (SUMMC) is better in terms of recognition rate and stability. Besides, the relations between the maximum margin criterion and the Fisher criterion for feature extraction are revealed.
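In its standard formulation, MMC takes the projection vectors as the leading eigenvectors of S_b - S_w (between-class minus within-class scatter). The sketch below shows this baseline criterion only, without the uncorrelatedness/orthogonality constraints of the two proposed variants:

```python
import numpy as np

def mmc_projection(X, y, n_components=2):
    """Leading eigenvectors of S_b - S_w: the baseline MMC discriminant vectors."""
    mean_all = X.mean(axis=0)
    d = X.shape[1]
    Sb, Sw = np.zeros((d, d)), np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        diff = (Xc.mean(axis=0) - mean_all)[:, None]
        Sb += len(Xc) * (diff @ diff.T)               # between-class scatter
        Sw += (Xc - Xc.mean(axis=0)).T @ (Xc - Xc.mean(axis=0))  # within-class
    vals, vecs = np.linalg.eigh(Sb - Sw)              # symmetric eigenproblem
    return vecs[:, np.argsort(vals)[::-1][:n_components]]

rng = np.random.default_rng(16)
X = np.vstack([rng.normal(0, 1, (40, 10)), rng.normal(1.5, 1, (40, 10))])
y = np.repeat([0, 1], 40)
W = mmc_projection(X, y)
print((X @ W).shape)   # samples projected onto the MMC discriminant vectors
```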
NASA Astrophysics Data System (ADS)
Nickelsen, Daniel
2017-07-01
The statistics of velocity increments in homogeneous and isotropic turbulence exhibit universal features in the limit of infinite Reynolds numbers. Since Kolmogorov's scaling law of 1941, many turbulence models have aimed to capture these universal features; some are known to have an equivalent formulation in terms of Markov processes. We derive the Markov process equivalent to the particularly successful scaling law postulated by She and Leveque. The Markov process is a jump process for velocity increments u(r) in scale r in which the jumps occur randomly but with deterministic width in u. From its master equation we establish a prescription to simulate the She-Leveque process and compare it with Kolmogorov scaling. To put the She-Leveque process into the context of other established turbulence models on the Markov level, we derive a diffusion process for u(r) using two properties of the Navier-Stokes equation. This diffusion process already includes Kolmogorov scaling, extended self-similarity and a class of random cascade models. The fluctuation theorem of this Markov process implies a 'second law' that puts a loose bound on the multipliers of the random cascade models. This bound explicitly allows for instances of inverse cascades, which are necessary to satisfy the fluctuation theorem. By adding a jump process to the diffusion process, we go beyond Kolmogorov scaling and formulate the most general scaling law for the class of Markov processes having both diffusion and jump parts. This Markov scaling law includes She-Leveque scaling and a scaling law derived by Yakhot.
Feature selection from a facial image for distinction of sasang constitution.
Koo, Imhoi; Kim, Jong Yeol; Kim, Myoung Geun; Kim, Keun Ho
2009-09-01
Recently, oriental medicine has received attention for providing personalized medicine through consideration of the unique nature and constitution of individual patients. With the eventual goal of globalization, the current trend in oriental medicine research is standardization through the adoption of Western scientific methods, which could represent a scientific revolution. The purpose of this study is to establish methods for finding statistically significant features in a facial image with respect to distinguishing constitution and to show the meaning of those features. From facial photo images, facial elements are analyzed in terms of distances, angles and distance ratios, for which there are 1225, 61,250 and 749,700 features, respectively. Due to the very large number of facial features, it is quite difficult to determine the truly meaningful ones. We suggest a process for the efficient analysis of facial features, including the removal of outliers and the control of missing data to guarantee data confidence, with statistical significance calculated by ANOVA. We show the statistical properties of the selected features according to the different constitutions, using the nine distance, 10 angle and 10 distance-ratio features that are finally established. Additionally, the Sasang constitutional meaning of the selected features is shown.
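The ANOVA screening with prior outlier removal can be sketched as follows (the constitution groups used, their means and the z-score trim rule are all hypothetical):

```python
import numpy as np
from scipy.stats import f_oneway

rng = np.random.default_rng(17)
# Hypothetical values of one facial-distance feature for three constitution groups.
taeeum = rng.normal(10.2, 0.8, 40)
soeum  = rng.normal(9.6, 0.8, 40)
soyang = rng.normal(9.9, 0.8, 40)

def trim(x, z=3.0):
    """Simple outlier removal before testing, per the suggested analysis process."""
    return x[np.abs(x - x.mean()) < z * x.std()]

F, p = f_oneway(trim(taeeum), trim(soeum), trim(soyang))
print(f"F={F:.2f}, p={p:.4f} -> {'keep' if p < 0.05 else 'drop'} this feature")
```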
Taxonomy-aware feature engineering for microbiome classification.
Oudah, Mai; Henschel, Andreas
2018-06-15
What is a healthy microbiome? The pursuit of this and many related questions, especially in light of the recently recognized microbial component in a wide range of diseases, has sparked a surge in metagenomic studies. Such conditions are often not simply attributable to a single pathogen but rather are the result of complex ecological processes. Relatedly, the increasing DNA sequencing depth and number of samples in metagenomic case-control studies have enabled the application of powerful statistical methods, e.g. Machine Learning approaches. For the latter, the feature space is typically shaped by the relative abundances of operational taxonomic units, as determined by cost-effective phylogenetic marker gene profiles. While a substantial body of microbiome/microbiota research involves unsupervised and supervised Machine Learning, very little attention has been paid to feature selection and engineering. We here propose the first algorithm to exploit phylogenetic hierarchy (i.e. an all-encompassing taxonomy) in feature engineering for microbiota classification. The rationale is to exploit the often mono- or oligophyletic distribution of relevant (but hidden) traits by virtue of taxonomic abstraction. The algorithm is embedded in a comprehensive microbiota classification pipeline, which we applied to a diverse range of datasets, distinguishing healthy from diseased microbiota samples. We demonstrate substantial improvements over state-of-the-art microbiota classification tools in terms of classification accuracy, regardless of the actual Machine Learning technique, while using drastically reduced feature spaces. Moreover, the generalized features bear great explanatory value: they provide a concise description of conditions and thus help to provide pathophysiological insights. Indeed, the automatically and reproducibly derived features are consistent with previously published domain expert analyses.
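The core of such taxonomy-aware feature engineering, aggregating OTU abundances up the lineage so that rank-level features can surface oligophyletic traits, can be sketched with pandas (table and lineage below are hypothetical, and this is a simplification of the published algorithm):

```python
import pandas as pd

# OTU abundance table (samples x OTUs) and a taxonomy lineage per OTU.
otus = pd.DataFrame({"otu1": [5, 0, 2], "otu2": [3, 1, 0], "otu3": [0, 8, 4]},
                    index=["s1", "s2", "s3"])
lineage = pd.DataFrame({
    "genus":  ["Faecalibacterium", "Blautia", "Blautia"],
    "family": ["Ruminococcaceae", "Lachnospiraceae", "Lachnospiraceae"],
}, index=["otu1", "otu2", "otu3"])

# One aggregated feature block per taxonomic rank: traits hidden at OTU level
# can become visible once abundances are pooled at genus or family level.
features = pd.concat(
    [otus.T.groupby(lineage[rank]).sum().T.add_prefix(f"{rank}__")
     for rank in ("genus", "family")], axis=1)
print(features)
```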
NASA Astrophysics Data System (ADS)
Cao, Kunlin; Bhagalia, Roshni; Sood, Anup; Brogi, Edi; Mellinghoff, Ingo K.; Larson, Steven M.
2015-03-01
Positron emission tomography (PET) using fluorodeoxyglucose (18F-FDG) is commonly used in the assessment of breast lesions by computing voxel-wise standardized uptake value (SUV) maps. Simple metrics derived from ensemble properties of SUVs within each identified breast lesion are routinely used for disease diagnosis. The maximum SUV within the lesion (SUVmax) is the most popular of these metrics, but such simple metrics are known to be error-prone and susceptible to image noise. Finding reliable SUV-map-based features that correlate with established molecular phenotypes of breast cancer (viz. estrogen receptor (ER), progesterone receptor (PR) and human epidermal growth factor receptor 2 (HER2) expression) would enable non-invasive disease management. This study investigated 36 SUV features based on first- and second-order statistics, local histograms and texture of segmented lesions to predict ER and PR expression in 51 breast cancer patients. True ER and PR expression was obtained via immunohistochemistry (IHC) of tissue samples from each lesion. A supervised learning, adaptive boosting-support vector machine (AdaBoost-SVM) framework was used to select a subset of features to classify breast lesions into distinct phenotypes. Performance of the trained multi-feature classifier was compared against a baseline single-feature SUVmax classifier using receiver operating characteristic (ROC) curves. Results show that texture features encoding local lesion homogeneity, extracted from gray-level co-occurrence matrices, are the strongest discriminators of lesion ER expression. In particular, classifiers including these features increased prediction accuracy from 0.75 (baseline) to 0.82 and the area under the ROC curve from 0.64 (baseline) to 0.75.
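The homogeneity feature comes from a gray-level co-occurrence matrix over the quantized SUV map; a sketch with scikit-image (quantization to 16 levels and the distance/angle set are illustrative choices, not the study's configuration):

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

rng = np.random.default_rng(18)
# Toy quantized SUV map of a lesion; real input would be the segmented
# lesion's SUVs rebinned to a small number of gray levels.
suv_q = rng.integers(0, 16, (32, 32), dtype=np.uint8)

glcm = graycomatrix(suv_q, distances=[1], angles=[0, np.pi / 2],
                    levels=16, symmetric=True, normed=True)
homogeneity = graycoprops(glcm, "homogeneity").mean()
print(round(float(homogeneity), 3))   # one texture feature for the classifier
```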
NASA Astrophysics Data System (ADS)
Elgeti, Jens; Gompper, Gerhard
2016-11-01
Both in their natural environment and in controlled experimental setups, microswimmers regularly interact with surfaces. These surfaces provide a steric boundary for the swimming motion and for the hydrodynamic flow pattern. These effects typically imply a strong accumulation of microswimmers near surfaces. While some generic features can be derived, details of the swimmer shape and propulsion mechanism matter; they give rise to a broad range of adhesion phenomena and have to be taken into account to predict the surface accumulation for a given swimmer. We show in this minireview how numerical simulations and analytic theory can be used to predict the accumulation statistics for different systems, with an emphasis on swimmer shape, hydrodynamic interactions, and the type of noisy dynamics.
NASA Astrophysics Data System (ADS)
Alekseev, V. A.; Krylova, D. D.
1996-02-01
The analytical investigation of the Bloch equations is used to describe the main features of the 1D velocity-selective coherent population trapping cooling scheme. For the initial stage of cooling, the fraction of cooled atoms is derived for a Gaussian initial velocity distribution. At very long interaction times, the fraction of cooled atoms and the velocity distribution function are described by simple analytical formulae and do not depend on the initial distribution. These results are in good agreement with those of Bardou, Bouchaud, Emile, Aspect and Cohen-Tannoudji based on statistical analysis in terms of Levy flights, and with Monte-Carlo simulations of the process.
Fluctuations of global energy release and crackling in nominally brittle heterogeneous fracture.
Barés, J; Hattali, M L; Dalmas, D; Bonamy, D
2014-12-31
The temporal evolution of mechanical energy and spatially averaged crack speed are both monitored in slowly fracturing artificial rocks. Both signals display irregular burst-like dynamics, with power-law-distributed fluctuations spanning a broad range of scales. Yet the elastic power released at each time step is proportional to the global velocity throughout the process, which permits the definition of a material-constant fracture energy. We characterize the intermittent dynamics by computing the burst statistics. The latter display the scale-free features that are the signature of crackling dynamics, in qualitative but not quantitative agreement with the depinning interface models derived for fracture problems. The possible sources of discrepancy are pointed out and discussed.
Extraction of lead and ridge characteristics from SAR images of sea ice
NASA Technical Reports Server (NTRS)
Vesecky, John F.; Smith, Martha P.; Samadani, Ramin
1990-01-01
Image-processing techniques for extracting the characteristics of lead and pressure ridge features in SAR images of sea ice are reported. The methods are applied to a SAR image of the Beaufort Sea collected from the Seasat satellite on October 3, 1978. Estimates of lead and ridge statistics are made, e.g., lead and ridge density (number of lead or ridge pixels per unit area of image) and the distribution of lead area and orientation as well as ridge length and orientation. The information derived is useful in both ice science and polar operations for such applications as albedo and heat and momentum transfer estimates, as well as ship routing and offshore engineering.
ms 2: A molecular simulation tool for thermodynamic properties, release 3.0
NASA Astrophysics Data System (ADS)
Rutkai, Gábor; Köster, Andreas; Guevara-Carrion, Gabriela; Janzen, Tatjana; Schappals, Michael; Glass, Colin W.; Bernreuther, Martin; Wafai, Amer; Stephan, Simon; Kohns, Maximilian; Reiser, Steffen; Deublein, Stephan; Horsch, Martin; Hasse, Hans; Vrabec, Jadran
2017-12-01
A new version release (3.0) of the molecular simulation tool ms 2 (Deublein et al., 2011; Glass et al., 2014) is presented. Version 3.0 of ms 2 features two additional ensembles, i.e. microcanonical (NVE) and isobaric-isoenthalpic (NpH), various Helmholtz energy derivatives in the NVE ensemble, thermodynamic integration as a method for calculating the chemical potential, the osmotic pressure for calculating the activity of solvents, the six Maxwell-Stefan diffusion coefficients of quaternary mixtures, statistics for sampling hydrogen bonds, smooth-particle mesh Ewald summation, as well as the ability to carry out molecular dynamics runs for an arbitrary number of state points in a single program execution.
2009-04-01
Empirically derived chl a increases were observed in the Tortugas Gyre, a recognized circulation feature in the southern Gulf of Mexico, and in adjacent waters following the hurricane interaction (fragmentary search-result snippet; remainder of record truncated).
Rand, R.S.; Clark, R.N.; Livo, K.E.
2011-01-01
The Deepwater Horizon oil spill covered a very large geographical area in the Gulf of Mexico creating potentially serious environmental impacts on both marine life and the coastal shorelines. Knowing the oil's areal extent and thickness as well as denoting different categories of the oil's physical state is important for assessing these impacts. High spectral resolution data in hyperspectral imagery (HSI) sensors such as Airborne Visible and Infrared Imaging Spectrometer (AVIRIS) provide a valuable source of information that can be used for analysis by semi-automatic methods for tracking an oil spill's areal extent, oil thickness, and oil categories. However, the spectral behavior of oil in water is inherently a highly non-linear and variable phenomenon that changes depending on oil thickness and oil/water ratios. For certain oil thicknesses there are well-defined absorption features, whereas for very thin films sometimes there are almost no observable features. Feature-based imaging spectroscopy methods are particularly effective at classifying materials that exhibit specific well-defined spectral absorption features. Statistical methods are effective at classifying materials with spectra that exhibit a considerable amount of variability and that do not necessarily exhibit well-defined spectral absorption features. This study investigates feature-based and statistical methods for analyzing oil spills using hyperspectral imagery. The appropriate use of each approach is investigated and a combined feature-based and statistical method is proposed.
Evolution of statistical properties for a nonlinearly propagating sinusoid.
Shepherd, Micah R; Gee, Kent L; Hanford, Amanda D
2011-07-01
The nonlinear propagation of a pure sinusoid is considered using time domain statistics. The probability density function, standard deviation, skewness, kurtosis, and crest factor are computed for both the amplitude and amplitude time derivatives as a function of distance. The amplitude statistics vary only in the postshock realm, while the amplitude derivative statistics vary rapidly in the preshock realm. The statistical analysis also suggests that the sawtooth onset distance can be considered to be earlier than previously realized. © 2011 Acoustical Society of America
A method for automatic feature points extraction of human vertebrae three-dimensional model
NASA Astrophysics Data System (ADS)
Wu, Zhen; Wu, Junsheng
2017-05-01
A method for automatic extraction of the feature points of the human vertebrae three-dimensional model is presented. First, a statistical model of vertebrae feature points is established based on the results of manual feature point extraction. Then an anatomical axial analysis of the vertebrae model is performed according to the physiological and morphological characteristics of the vertebrae. Using the axial information obtained from this analysis, a projection relationship between the statistical model and the vertebrae model to be processed is established. According to this projection relationship, the statistical model is matched with the vertebrae model to obtain estimated positions of the feature points. Finally, by analyzing the curvature in a spherical neighborhood around each estimated position, the final positions of the feature points are obtained. In benchmark results on multiple test models, the mean relative errors of the feature point positions are less than 5.98%. At more than half of the positions the error rate is less than 3%, and the minimum mean relative error is 0.19%, which verifies the effectiveness of the method.
Hamill, Daniel; Buscombe, Daniel; Wheaton, Joseph M
2018-01-01
Side scan sonar in low-cost 'fishfinder' systems has become popular in aquatic ecology and sedimentology for imaging submerged riverbed sediment at coverages and resolutions sufficient to relate bed texture to grain size. Traditional methods to map bed texture (i.e. physical samples) are relatively high-cost and low in spatial coverage compared to sonar, which can continuously image several kilometers of channel in a few hours. Towards the goal of automating the classification of bed habitat features, we investigate relationships between substrates and statistical descriptors of bed textures in side scan sonar echograms of alluvial deposits. We develop a method for automated segmentation of bed textures into two to five grain-size classes. Second-order texture statistics are used in conjunction with a Gaussian mixture model to classify the heterogeneous bed into small homogeneous patches of sand, gravel, and boulders with average accuracies of 80%, 49%, and 61%, respectively. Reach-averaged proportions of these sediment types were within 3% of similar maps derived from multibeam sonar.
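A minimal sketch of the clustering step described above, assuming a per-pixel stack of texture descriptors has already been computed; the array shapes, feature choice, and class count are illustrative assumptions rather than the authors' exact configuration.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def segment_textures(feature_stack, n_classes=3, seed=0):
    """Cluster per-pixel texture descriptors (e.g. second-order GLCM
    statistics from a sliding window) into grain-size classes."""
    h, w, d = feature_stack.shape            # rows x cols x features
    X = feature_stack.reshape(-1, d)
    gmm = GaussianMixture(n_components=n_classes, covariance_type='full',
                          random_state=seed).fit(X)
    # Label map: each pixel assigned to one texture class
    return gmm.predict(X).reshape(h, w)
```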
NASA Astrophysics Data System (ADS)
Pilger, Christoph; Schmidt, Carsten; Bittner, Michael
2013-02-01
The detection of infrasonic signals in temperature time series of the mesopause altitude region (at about 80-100 km) is performed at the German Remote Sensing Data Center of the German Aerospace Center (DLR-DFD) using GRIPS instrumentation (GRound-based Infrared P-branch Spectrometers). Mesopause temperature values with a temporal resolution of up to 10 s are derived from the observation of nocturnal airglow emissions and permit the identification of signals within the long-period infrasound range. Spectral intensities of wave signatures with periods between 2.5 and 10 min are estimated by applying the wavelet analysis technique to one-minute mean temperature values. Selected events as well as the statistical distribution of 40 months of observation are presented and discussed with respect to resonant modes of the atmosphere. The mechanism of acoustic resonance generated by strong infrasonic sources is a potential explanation of distinct features with periods between 3 and 5 min observed in the dataset.
Photon-number statistics in resonance fluorescence
NASA Astrophysics Data System (ADS)
Lenstra, D.
1982-12-01
The theory of photon-number statistics in resonance fluorescence is treated, starting with the general formula for the emission probability of n photons during a given time interval T. The results fully confirm results formerly obtained by Cook that were based on the theory of atomic motion in a traveling wave. General expressions for the factorial moments are derived and explicit results for the mean and the variance are given. It is explicitly shown that the distribution function tends to a Gaussian when T becomes much larger than the natural lifetime of the excited atom. The speed of convergence towards the Gaussian is found to be typically slow; that is, the third normalized central moment (the skewness) is proportional to T^(-1/2). However, numerical results illustrate that the overall features of the distribution function are already well represented by a Gaussian when T is only a few natural lifetimes, at least if the intensity of the exciting field is not too small and its detuning is not too large.
Feature discrimination/identification based upon SAR return variations
NASA Technical Reports Server (NTRS)
Rasco, W. A., Sr.; Pietsch, R.
1978-01-01
The look-to-look variation statistics in the returns recorded in flight by a digital, real-time SAR system are analyzed. That the look-to-look variations in returns from different classes carry information content unique to those classes was demonstrated by a model based on four variants derived from the four-look in-flight SAR data under study. The model was limited to four classes of returns: mowed grass on an athletic field, rough unmowed grass and weeds on a large vacant field, young fruit trees in a large orchard, and metal mobile homes and storage buildings in a large mobile home park. The data population, in excess of 1000 returns, represented over 250 individual pixels from the four classes. The multivariate discriminant model operated on the set of returns for each pixel and assigned that pixel to one of the four classes, based on the target variants and the probability distribution function of the four variants for each class.
Nonlinear scalar forcing based on a reaction analogy
NASA Astrophysics Data System (ADS)
Daniel, Don; Livescu, Daniel
2017-11-01
We present a novel reaction analogy (RA) based forcing method for generating stationary passive scalar fields in incompressible turbulence. The new method can produce more general scalar PDFs (e.g. double-delta) than current methods, while ensuring that scalar fields remain bounded, unlike existing forcing methodologies that can potentially violate naturally existing bounds. Such features are useful for generating initial fields in non-premixed combustion or for studying non-Gaussian scalar turbulence. The RA method mathematically models hypothetical chemical reactions that convert reactants in a mixed state back into their pure unmixed components. Various types of chemical reactions are formulated and the corresponding mathematical expressions derived. For large values of the scalar dissipation rate, the method produces statistically steady double-delta scalar PDFs. Gaussian scalar statistics are recovered for small values of the scalar dissipation rate. In contrast, classical forcing methods consistently produce unimodal Gaussian scalar fields. The ability of the new method to produce fully developed scalar fields is discussed using 256^3, 512^3, and 1024^3 periodic box simulations.
The Small World of Psychopathology
Borsboom, Denny; Cramer, Angélique O. J.; Schmittmann, Verena D.; Epskamp, Sacha; Waldorp, Lourens J.
2011-01-01
Background Mental disorders are highly comorbid: people having one disorder are likely to have another as well. We explain empirical comorbidity patterns based on a network model of psychiatric symptoms, derived from an analysis of symptom overlap in the Diagnostic and Statistical Manual of Mental Disorders-IV (DSM-IV). Principal Findings We show that a) half of the symptoms in the DSM-IV network are connected, b) the architecture of these connections conforms to a small world structure, featuring a high degree of clustering but a short average path length, and c) distances between disorders in this structure predict empirical comorbidity rates. Network simulations of Major Depressive Episode and Generalized Anxiety Disorder show that the model faithfully reproduces empirical population statistics for these disorders. Conclusions In the network model, mental disorders are inherently complex. This explains the limited successes of genetic, neuroscientific, and etiological approaches to unravel their causes. We outline a psychosystems approach to investigate the structure and dynamics of mental disorders. PMID:22114671
Calibrating genomic and allelic coverage bias in single-cell sequencing.
Zhang, Cheng-Zhong; Adalsteinsson, Viktor A; Francis, Joshua; Cornils, Hauke; Jung, Joonil; Maire, Cecile; Ligon, Keith L; Meyerson, Matthew; Love, J Christopher
2015-04-16
Artifacts introduced in whole-genome amplification (WGA) make it difficult to derive accurate genomic information from single-cell genomes and require different analytical strategies from bulk genome analysis. Here, we describe statistical methods to quantitatively assess the amplification bias resulting from whole-genome amplification of single-cell genomic DNA. Analysis of single-cell DNA libraries generated by different technologies revealed universal features of the genome coverage bias predominantly generated at the amplicon level (1-10 kb). The magnitude of coverage bias can be accurately calibrated from low-pass sequencing (~0.1×) to predict the depth-of-coverage yield of single-cell DNA libraries sequenced at arbitrary depths. We further provide a benchmark comparison of single-cell libraries generated by multi-strand displacement amplification (MDA) and multiple annealing and looping-based amplification cycles (MALBAC). Finally, we develop statistical models to calibrate allelic bias in single-cell whole-genome amplification and demonstrate a census-based strategy for efficient and accurate variant detection from low-input biopsy samples.
Calibrating genomic and allelic coverage bias in single-cell sequencing
Francis, Joshua; Cornils, Hauke; Jung, Joonil; Maire, Cecile; Ligon, Keith L.; Meyerson, Matthew; Love, J. Christopher
2016-01-01
Artifacts introduced in whole-genome amplification (WGA) make it difficult to derive accurate genomic information from single-cell genomes and require different analytical strategies from bulk genome analysis. Here, we describe statistical methods to quantitatively assess the amplification bias resulting from whole-genome amplification of single-cell genomic DNA. Analysis of single-cell DNA libraries generated by different technologies revealed universal features of the genome coverage bias predominantly generated at the amplicon level (1–10 kb). The magnitude of coverage bias can be accurately calibrated from low-pass sequencing (~0.1 ×) to predict the depth-of-coverage yield of single-cell DNA libraries sequenced at arbitrary depths. We further provide a benchmark comparison of single-cell libraries generated by multi-strand displacement amplification (MDA) and multiple annealing and looping-based amplification cycles (MALBAC). Finally, we develop statistical models to calibrate allelic bias in single-cell whole-genome amplification and demonstrate a census-based strategy for efficient and accurate variant detection from low-input biopsy samples. PMID:25879913
NITPICK: peak identification for mass spectrometry data.
Renard, Bernhard Y; Kirchner, Marc; Steen, Hanno; Steen, Judith A J; Hamprecht, Fred A
2008-08-28
The reliable extraction of features from mass spectra is a fundamental step in the automated analysis of proteomic mass spectrometry (MS) experiments. This contribution proposes a sparse template regression approach to peak picking called NITPICK. NITPICK is a Non-greedy, Iterative Template-based peak PICKer that deconvolves complex overlapping isotope distributions in multicomponent mass spectra. NITPICK is based on fractional averagine, a novel extension to Senko's well-known averagine model, and on a modified version of sparse, non-negative least angle regression, for which a suitable, statistically motivated early stopping criterion has been derived. The strength of NITPICK is the deconvolution of overlapping mixture mass spectra. Extensive comparative evaluation has been carried out and results are provided for simulated and real-world data sets. NITPICK outperforms pepex, to date the only alternative, publicly available, non-greedy feature extraction routine. NITPICK is available as a software package for the R programming language and can be downloaded from http://hci.iwr.uni-heidelberg.de/mip/proteomics/.
NASA Astrophysics Data System (ADS)
Howell, R. H.; Sterne, P. A.; Fluss, M. J.; Kaiser, J. H.; Kitazawa, K.; Kojima, H.
1994-05-01
We have measured and calculated the electron-positron momentum distribution of La2-xSrxCuO4 samples for Sr concentrations of 0, 0.1, 0.13, and 0.2. Measured distributions were obtained at room temperature with high statistical precision, greater than 4×10^8 events, in the Lawrence Livermore National Laboratory positron-annihilation angular correlation spectrometer on single-crystal samples fabricated using the traveling solvent floating zone technique. Corresponding theoretical momentum-density calculations were performed using the linear muffin-tin-orbital method. The momentum distribution of all samples contained features derived from the overlap of the positron distribution with the valence electrons. In addition, discontinuities typical of a Fermi surface are seen in the doped samples. The form and position of these features are in general agreement with the Fermi surface and overall momentum distributions as predicted by band theory. However, the evolution of the Fermi surface with doping differed significantly from expectations based on single electron band theories.
Crystal surface analysis using matrix textural features classified by a probabilistic neural network
NASA Astrophysics Data System (ADS)
Sawyer, Curry R.; Quach, Viet; Nason, Donald; van den Berg, Lodewijk
1991-12-01
A system is under development in which surface quality of a growing bulk mercuric iodide crystal is monitored by video camera at regular intervals for early detection of growth irregularities. Mercuric iodide single crystals are employed in radiation detectors. A microcomputer system is used for image capture and processing. The digitized image is divided into multiple overlapping sub-images and features are extracted from each sub-image based on statistical measures of the gray tone distribution, according to the method of Haralick. Twenty parameters are derived from each sub-image and presented to a probabilistic neural network (PNN) for classification. This number of parameters was found to be optimal for the system. The PNN is a hierarchical, feed-forward network that can be rapidly reconfigured as additional training data become available. Training data is gathered by reviewing digital images of many crystals during their growth cycle and compiling two sets of images, those with and without irregularities.
Visual analysis of the effects of load carriage on gait
NASA Astrophysics Data System (ADS)
Wittman, Michael G.; Ward, James M.; Flynn, Patrick J.
2005-03-01
As early as the 1970s it was determined that gait, or the "manner of walking", is an identifying feature of a human being. Since then, extensive research has been done in the field of computer vision to determine how accurately a subject can be identified by gait characteristics. This has necessarily led to the study of how various data collection conditions, such as terrain type, varying camera angles, or a carried briefcase, may affect the identifying features of gait. However, little or no research has been done to ask whether such conditions may be inferred from gait analysis. For example, is it possible to determine characteristics of the walking surface simply by looking at statistics derived from the subject's gait? The question addressed here is whether significant concealed weight distributed on the subject's torso can be discovered through analysis of his gait. Individual trends in subjects in response to increasing concealed weight are explored, with the objective of finding universal trends that would have obvious security applications.
Use of MODIS Cloud Top Pressure to Improve Assimilation Yields of AIRS Radiances in GSI
NASA Technical Reports Server (NTRS)
Zavodsky, Bradley; Srikishen, Jayanthi
2014-01-01
Radiances from hyperspectral sounders such as the Atmospheric Infrared Sounder (AIRS) are routinely assimilated both globally and regionally in operational numerical weather prediction (NWP) systems using the Gridpoint Statistical Interpolation (GSI) data assimilation system. However, only thinned, cloud-free radiances from a 281-channel subset are used, so the overall percentage of these observations that are assimilated is on the order of 5%. Cloud checks are performed within GSI to determine which channels peak above cloud top; inaccuracies may lead to fewer assimilated radiances or to the introduction of biases from cloud-contaminated radiances. The relatively large footprint of AIRS may not optimally represent small-scale cloud features that might be better resolved by higher-resolution imagers like the Moderate Resolution Imaging Spectroradiometer (MODIS). The objective of this project is to "swap" the MODIS-derived cloud top pressure (CTP) for that designated by the AIRS-only quality control within GSI, to test the hypothesis that better representation of cloud features will result in higher assimilated radiance yields and improved forecasts.
NASA Astrophysics Data System (ADS)
Carnes, Michael R.; Mitchell, Jim L.; de Witt, P. Webb
1990-10-01
Synthetic temperature profiles are computed from altimeter-derived sea surface heights in the Gulf Stream region. The required relationships between surface height (dynamic height at the surface relative to 1000 dbar) and subsurface temperature are provided from regression relationships between dynamic height and amplitudes of empirical orthogonal functions (EOFs) of the vertical structure of temperature derived by de Witt (1987). Relationships were derived for each month of the year from historical temperature and salinity profiles from the region surrounding the Gulf Stream northeast of Cape Hatteras. Sea surface heights are derived using two different geoid estimates, the feature-modeled geoid and the air-dropped expendable bathythermograph (AXBT) geoid, both described by Carnes et al. (1990). The accuracy of the synthetic profiles is assessed by comparison to 21 AXBT profile sections which were taken during three surveys along 12 Geosat ERM ground tracks nearly contemporaneously with Geosat overflights. The primary error statistic considered is the root-mean-square (rms) difference between AXBT and synthetic isotherm depths. The two sources of error are the EOF relationship and the altimeter-derived surface heights. EOF-related and surface height-related errors in synthetic temperature isotherm depth are of comparable magnitude; each translates into about a 60-m rms isotherm depth error, or a combined 80 m to 90 m error for isotherms in the permanent thermocline. EOF-related errors are responsible for the absence of the near-surface warm core of the Gulf Stream and for the reduced volume of Eighteen Degree Water in the upper few hundred meters of (apparently older) cold-core rings in the synthetic profiles. The overall rms difference between surface heights derived from the altimeter and those computed from AXBT profiles is 0.15 dyn m when the feature-modeled geoid is used and 0.19 dyn m when the AXBT geoid is used; the portion attributable to altimeter-derived surface height errors alone is 0.03 dyn m less for each. In most cases, the deeper structure of the Gulf Stream and eddies is reproduced well by vertical sections of synthetic temperature, with largest errors typically in regions of high horizontal gradient such as across rings and the Gulf Stream front.
Quantifying risks with exact analytical solutions of derivative pricing distribution
NASA Astrophysics Data System (ADS)
Zhang, Kun; Liu, Jing; Wang, Erkang; Wang, Jin
2017-04-01
Derivative (i.e. option) pricing is essential for modern financial instruments. Despite previous efforts, exact analytical forms of derivative pricing distributions remain challenging to obtain. In this study, we established a quantitative framework using path integrals to obtain exact analytical solutions of the statistical distribution for bond and bond option pricing under the Vasicek model. We discuss the importance of statistical fluctuations away from the expected option pricing, characterized by the distribution tail, and their association with value at risk (VaR). The framework established here is general and can be applied to other financial derivatives for quantifying the underlying statistical distributions.
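For context, the Vasicek short-rate model admits a textbook closed-form zero-coupon bond price; the sketch below implements that standard result (not the paper's path-integral machinery), with illustrative parameter values.

```python
import numpy as np

def vasicek_zcb_price(r0, a, theta, sigma, T):
    """Zero-coupon bond price P(0, T) under the Vasicek model
    dr = a*(theta - r) dt + sigma dW (standard closed form)."""
    B = (1.0 - np.exp(-a * T)) / a
    A = np.exp((B - T) * (a**2 * theta - sigma**2 / 2.0) / a**2
               - sigma**2 * B**2 / (4.0 * a))
    return A * np.exp(-B * r0)

# Example: 5-year bond with a 3% initial short rate (illustrative numbers)
print(vasicek_zcb_price(r0=0.03, a=0.5, theta=0.04, sigma=0.01, T=5.0))
```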
NASA Astrophysics Data System (ADS)
Wang, Dong
2016-03-01
Gears are the most commonly used components in mechanical transmission systems. Their failures may cause transmission system breakdown and result in economic loss. Identification of different gear crack levels is important to prevent unexpected gear failure, because gear cracks lead to gear tooth breakage. Signal processing based methods mainly require expertise to interpret gear fault signatures, which is usually not easy for ordinary users. In order to identify different gear crack levels automatically, intelligent gear crack identification methods should be developed. Previous case studies experimentally proved that K-nearest neighbors based methods exhibit high prediction accuracies for identification of 3 different gear crack levels under different motor speeds and loads. In this short communication, to further enhance the prediction accuracies of existing K-nearest neighbors based methods and to extend identification from 3 to 5 different gear crack levels, redundant statistical features are constructed using the Daubechies 44 (db44) binary wavelet packet transform at different wavelet decomposition levels, prior to the use of a K-nearest neighbors method. The dimensionality of the redundant statistical features is 620, which provides richer gear fault signatures. Since many of these statistical features are redundant and highly correlated with each other, dimensionality reduction is conducted to obtain new significant statistical features. Finally, the K-nearest neighbors method is used to identify 5 different gear crack levels under different motor speeds and loads. A case study including 3 experiments demonstrates that the developed method provides higher prediction accuracies than the existing K-nearest neighbors based methods for recognizing different gear crack levels under different motor speeds and loads. Based on the new significant statistical features, some other popular statistical models, including linear discriminant analysis, quadratic discriminant analysis, classification and regression trees, and the naive Bayes classifier, are compared with the developed method. The results show that the developed method has the highest prediction accuracies among these statistical models. Additionally, selection of the number of new significant features and parameter selection for K-nearest neighbors are thoroughly investigated.
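A minimal sketch of the feature-construction step described above, using PyWavelets and scikit-learn; PyWavelets ships lower-order Daubechies filters, so db4 stands in for the paper's db44, and the per-node statistics and neighbor count are illustrative assumptions.

```python
import numpy as np
import pywt
from sklearn.neighbors import KNeighborsClassifier

def wpt_statistical_features(signal, wavelet='db4', level=3):
    """Statistical features of every terminal node of a binary
    wavelet packet decomposition of a 1-D vibration signal."""
    wp = pywt.WaveletPacket(data=signal, wavelet=wavelet, maxlevel=level)
    feats = []
    for node in wp.get_level(level, order='natural'):
        c = np.asarray(node.data)
        # mean, standard deviation, mean absolute value, fourth moment
        feats += [c.mean(), c.std(), np.abs(c).mean(), (c**4).mean()]
    return np.array(feats)

# Hypothetical usage: X_train holds raw signals, y_train the crack levels
# F = np.vstack([wpt_statistical_features(s) for s in X_train])
# clf = KNeighborsClassifier(n_neighbors=5).fit(F, y_train)
```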
Feature extraction and classification algorithms for high dimensional data
NASA Technical Reports Server (NTRS)
Lee, Chulhee; Landgrebe, David
1993-01-01
Feature extraction and classification algorithms for high dimensional data are investigated. Developments with regard to sensors for Earth observation are moving in the direction of providing much higher dimensional multispectral imagery than is now possible. In analyzing such high dimensional data, processing time becomes an important factor. With large increases in dimensionality and the number of classes, processing time will increase significantly. To address this problem, a multistage classification scheme is proposed which reduces the processing time substantially by eliminating unlikely classes from further consideration at each stage. Several truncation criteria are developed and the relationship between thresholds and the error caused by the truncation is investigated. Next an approach to feature extraction for classification is proposed based directly on the decision boundaries. It is shown that all the features needed for classification can be extracted from decision boundaries. A characteristic of the proposed method arises by noting that only a portion of the decision boundary is effective in discriminating between classes, and the concept of the effective decision boundary is introduced. The proposed feature extraction algorithm has several desirable properties: it predicts the minimum number of features necessary to achieve the same classification accuracy as in the original space for a given pattern recognition problem; and it finds the necessary feature vectors. The proposed algorithm does not deteriorate under the circumstances of equal means or equal covariances as some previous algorithms do. In addition, the decision boundary feature extraction algorithm can be used both for parametric and non-parametric classifiers. Finally, some problems encountered in analyzing high dimensional data are studied and possible solutions are proposed. First, the increased importance of the second order statistics in analyzing high dimensional data is recognized. By investigating the characteristics of high dimensional data, the reason why the second order statistics must be taken into account in high dimensional data is suggested. Recognizing the importance of the second order statistics, there is a need to represent the second order statistics. A method to visualize statistics using a color code is proposed. By representing statistics using color coding, one can easily extract and compare the first and the second statistics.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Thor, M; Tyagi, N; Deasy, J
2015-06-15
Purpose: The aim of this study was to explore the use of Magnetic Resonance Imaging (MRI)-derived features as indicators of Radiotherapy (RT)-induced normal tissue morbidity. We also investigate the relationship between these features and RT dose in four critical structures. Methods: We demonstrate our approach for four patients treated with RT for base of tongue cancer in 2005-2007. For each patient, two MRI scans (T1-weighted pre (T1pre) and post (T1post) gadolinium contrast-enhancement) were acquired within the first six months after RT. The assessed morbidity endpoint, observed in 2/4 patients, was Grade 2+ CTCAEv.3 trismus. Four ipsilateral masticatory-related structures (the masseter, lateral and medial pterygoid, and temporal muscles) were delineated on both T1pre and T1post, and these scans were co-registered to the treatment planning CT using a deformable demons algorithm. For each structure, the maximum and mean RT dose and six MRI-derived features (the second-order texture features entropy and homogeneity, and the first-order mean, median, kurtosis, and skewness) were extracted and compared structure-wise between patients with and without trismus. All MRI-derived features were calculated as the difference between T1pre and T1post, ΔS. Results: For 5/6 features and all structures, ΔS diverged between trismus and non-trismus patients, particularly for the masseter, lateral pterygoid, and temporal muscles using the kurtosis feature (-0.2 vs. 6.4 for the lateral pterygoid). Both the maximum and mean RT dose in all four muscles were higher among the trismus patients (with the maximum dose being up to 25 Gy higher). Conclusion: Using MRI-derived features to quantify RT-induced normal tissue complications is feasible. We showed that several features differ between patients with and without morbidity and that the RT dose in all investigated structures is higher among patients with morbidity. MRI-derived features, therefore, have the potential to improve predictions of normal tissue morbidity.
NASA Astrophysics Data System (ADS)
Gábor Hatvani, István; Kern, Zoltán; Leél-Őssy, Szabolcs; Demény, Attila
2018-01-01
Uneven spacing is a common feature of sedimentary paleoclimate records, in many cases causing difficulties in the application of classical statistical and time series methods. Although special statistical tools do exist to assess unevenly spaced data directly, the transformation of such data into a temporally equidistant time series which may then be examined using commonly employed statistical tools remains, however, an unachieved goal. The present paper, therefore, introduces an approach to obtain evenly spaced time series (using cubic spline fitting) from unevenly spaced speleothem records with the application of a spectral guidance to avoid the spectral bias caused by interpolation and retain the original spectral characteristics of the data. The methodology was applied to stable carbon and oxygen isotope records derived from two stalagmites from the Baradla Cave (NE Hungary) dating back to the late 18th century. To show the benefit of the equally spaced records to climate studies, their coherence with climate parameters is explored using wavelet transform coherence and discussed. The obtained equally spaced time series are available at https://doi.org/10.1594/PANGAEA.875917.
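A minimal sketch of the interpolation step underlying this approach (plain cubic spline resampling with SciPy); note that it omits the paper's spectral-guidance correction, which is the point of the method, and the step size in the usage comment is an illustrative assumption.

```python
import numpy as np
from scipy.interpolate import CubicSpline

def resample_evenly(t, y, dt):
    """Interpolate an unevenly spaced record (t, y) onto an evenly
    spaced time axis with step dt via cubic spline fitting.
    Assumes t is strictly increasing."""
    cs = CubicSpline(t, y)
    t_even = np.arange(t[0], t[-1], dt)
    return t_even, cs(t_even)

# Hypothetical usage with an uneven isotope record:
# t_even, d18O_even = resample_evenly(age_yr, d18O, dt=1.0)  # 1-year step
```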
Modin, Bitte; Plenty, Stephanie; Låftman, Sara B.; Bergström, Malin; Berlin, Marie; Hjern, Anders
2018-01-01
This study addressed school-contextual features of social disorder in relation to sixth-grade students’ experiences of bullying victimization and mental health complaints. It investigated, firstly, whether the school’s concentrations of behavioural problems were associated with individual students’ likelihood of being bullied, and secondly, whether the school’s concentrations of behavioural problems and bullying victimization predicted students’ emotional and psychosomatic health complaints. The data were derived from the Swedish National Survey of Mental Health among Children and Young People, carried out among sixth-grade students (approximately 12–13 years old) in Sweden in 2009. The analyses were based on information from 59,510 students distributed across 1999 schools. The statistical method used was multilevel modelling. While students’ own behavioural problems were associated with an elevated risk of being bullied, attending a school with a higher concentration of students with behavioural problems also increased the likelihood of being bullied. Attending a school with higher levels of bullying victimization and behavioural problems predicted more emotional and psychosomatic complaints, even when adjusting for their individual level analogues. The findings indicate that school-level features of social disorder influence bullying victimization and mental health complaints among students. PMID:29351244
Accurate Identification of MCI Patients via Enriched White-Matter Connectivity Network
NASA Astrophysics Data System (ADS)
Wee, Chong-Yaw; Yap, Pew-Thian; Brownyke, Jeffery N.; Potter, Guy G.; Steffens, David C.; Welsh-Bohmer, Kathleen; Wang, Lihong; Shen, Dinggang
Mild cognitive impairment (MCI), often a prodromal phase of Alzheimer's disease (AD), is frequently considered to be a good target for early diagnosis and therapeutic interventions of AD. The recent emergence of reliable network characterization techniques has made it possible to understand neurological disorders at a whole-brain connectivity level. Accordingly, we propose a network-based multivariate classification algorithm, using a collection of measures derived from white-matter (WM) connectivity networks, to accurately identify MCI patients from normal controls. An enriched description of WM connections, utilizing six physiological parameters, i.e., fiber penetration count, fractional anisotropy (FA), mean diffusivity (MD), and principal diffusivities (λ1, λ2, λ3), results in six connectivity networks for each subject to account for the connection topology and the biophysical properties of the connections. Upon parcellating the brain into 90 regions-of-interest (ROIs), the average statistics of each ROI in relation to the remaining ROIs are extracted as features for classification. These features are then sieved to select the most discriminant subset for building an MCI classifier via support vector machines (SVMs). Cross-validation results indicate better diagnostic power of the proposed enriched WM connection description than a simple description with any single physiological parameter.
Content-Aware Video Adaptation under Low-Bitrate Constraint
NASA Astrophysics Data System (ADS)
Hsiao, Ming-Ho; Chen, Yi-Wen; Chen, Hua-Tsung; Chou, Kuan-Hung; Lee, Suh-Yin
2007-12-01
With the development of wireless networks and the improvement of mobile device capability, video streaming is more and more widespread in such environments. Under conditions of limited resources and inherent constraints, appropriate video adaptation has become one of the most important and challenging issues in wireless multimedia applications. In this paper, we propose a novel content-aware video adaptation scheme in order to effectively utilize resources and improve visual perceptual quality. First, the attention model is derived by analyzing the characteristics of brightness, location, motion vector, and energy features in the compressed domain to reduce computation complexity. Then, through the integration of the attention model, the capability of the client device, and a correlational statistical model, attractive regions of video scenes are derived. An information object (IOB)-weighted rate-distortion model is used for adjusting the bit allocation. Finally, the video adaptation scheme dynamically adjusts the video bitstream at the frame level and object level. Experimental results validate that the proposed scheme achieves better visual quality effectively and efficiently.
Toropova, Alla P; Schultz, Terry W; Toropov, Andrey A
2016-03-01
Toxicity toward Tetrahymena pyriformis is an indicator of a substance's applicability in ecological and pharmaceutical contexts. Quantitative structure-activity relationships (QSARs) between the molecular structure of benzene derivatives and toxicity toward T. pyriformis (expressed as the negative logarithm of the population growth inhibition dose, mmol/L) are established. The available data were randomly distributed three times into visible training and calibration sets and invisible validation sets. The statistical characteristics for the validation set are the following: r^2 = 0.8179 and s = 0.338 (first distribution); r^2 = 0.8682 and s = 0.341 (second distribution); r^2 = 0.8435 and s = 0.323 (third distribution). These models are built up using only information on the molecular structure: no data on physicochemical parameters, 3D features of the molecular structure, or quantum mechanics descriptors are involved in the modeling process. Copyright © 2016 Elsevier B.V. All rights reserved.
Model-Based Anomaly Detection for a Transparent Optical Transmission System
NASA Astrophysics Data System (ADS)
Bengtsson, Thomas; Salamon, Todd; Ho, Tin Kam; White, Christopher A.
In this chapter, we present an approach for anomaly detection at the physical layer of networks where detailed knowledge about the devices and their operations is available. The approach combines physics-based process models with observational data models to characterize the uncertainties and derive the alarm decision rules. We formulate and apply three different methods based on this approach for a well-defined problem in optical network monitoring that features many typical challenges for this methodology. Specifically, we address the problem of monitoring optically transparent transmission systems that use dynamically controlled Raman amplification systems. We use models of amplifier physics together with statistical estimation to derive alarm decision rules and use these rules to automatically discriminate between measurement errors, anomalous losses, and pump failures. Our approach has led to an efficient tool for systematically detecting anomalies in the system behavior of a deployed network, where pro-active measures to address such anomalies are key to preventing unnecessary disturbances to the system's continuous operation.
Multi-frequency complex network from time series for uncovering oil-water flow structure.
Gao, Zhong-Ke; Yang, Yu-Xuan; Fang, Peng-Cheng; Jin, Ning-De; Xia, Cheng-Yi; Hu, Li-Dan
2015-02-04
Uncovering complex oil-water flow structure represents a challenge in diverse scientific disciplines. This challenge stimulates us to develop a new distributed conductance sensor for measuring local flow signals at different positions and then propose a novel approach based on multi-frequency complex networks to uncover the flow structures from experimental multivariate measurements. In particular, based on the fast Fourier transform, we demonstrate how to derive a multi-frequency complex network from multivariate time series. We construct complex networks at different frequencies and then detect community structures. Our results indicate that the community structures faithfully represent the structural features of oil-water flow patterns. Furthermore, we investigate the network statistics at different frequencies for each derived network and find that the frequency clustering coefficient makes it possible to uncover the evolution of flow patterns and yields deep insights into the formation of flow structures. These results present a first step towards a network visualization of complex flow patterns from a community structure perspective.
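A minimal sketch of one plausible reading of this construction, assuming channel-wise FFT band filtering followed by correlation thresholding; the band edges, correlation measure, and threshold are illustrative assumptions, not the authors' exact recipe.

```python
import numpy as np

def band_network(signals, fs, band, threshold=0.5):
    """Build one frequency-band network from multichannel time series:
    band-limit each channel via the FFT, correlate the filtered
    signals, and threshold |corr| into a binary adjacency matrix."""
    n_ch, n_samp = signals.shape
    freqs = np.fft.rfftfreq(n_samp, d=1.0 / fs)
    spectra = np.fft.rfft(signals, axis=1)
    mask = (freqs >= band[0]) & (freqs < band[1])
    filtered = np.fft.irfft(spectra * mask, n=n_samp, axis=1)
    adj = (np.abs(np.corrcoef(filtered)) > threshold).astype(int)
    np.fill_diagonal(adj, 0)                 # no self-loops
    return adj

# Hypothetical usage: one network per frequency band
# A_low = band_network(flow_signals, fs=1000.0, band=(0.0, 10.0))
```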
Learning optimal features for visual pattern recognition
NASA Astrophysics Data System (ADS)
Labusch, Kai; Siewert, Udo; Martinetz, Thomas; Barth, Erhardt
2007-02-01
The optimal coding hypothesis proposes that the human visual system has adapted to the statistical properties of the environment by the use of relatively simple optimality criteria. We here (i) discuss how the properties of different models of image coding, i.e. sparseness, decorrelation, and statistical independence, are related to each other, (ii) propose to evaluate the different models by verifiable performance measures, and (iii) analyse the classification performance on images of handwritten digits (MNIST database). We first employ the SPARSENET algorithm (Olshausen, 1998) to derive a local filter basis (on 13 × 13 pixel windows). We then filter the images in the database (28 × 28 pixel images of digits) and reduce the dimensionality of the resulting feature space by selecting the locally maximal filter responses. We then train a support vector machine on a training set to classify the digits and report results obtained on a separate test set. Currently, the best state-of-the-art result on the MNIST database has an error rate of 0.4%. This result, however, has been obtained by using explicit knowledge that is specific to the data (an elastic distortion model for digits). We here obtain an error rate of 0.55%, which is second best but does not use explicit data-specific knowledge. In particular, it outperforms by far all methods that do not use data-specific knowledge.
Statistical distribution of wind speeds and directions globally observed by NSCAT
NASA Astrophysics Data System (ADS)
Ebuchi, Naoto
1999-05-01
In order to validate wind vectors derived from the NASA scatterometer (NSCAT), statistical distributions of wind speeds and directions over the global oceans are investigated by comparing with European Centre for Medium-Range Weather Forecasts (ECMWF) wind data. Histograms of wind speeds and directions are calculated from the preliminary and reprocessed NSCAT data products for a period of 8 weeks. For wind speed of the preliminary data products, excessive low wind distribution is pointed out through comparison with ECMWF winds. A hump at the lower wind speed side of the peak in the wind speed histogram is discernible. The shape of the hump varies with incidence angle. Incompleteness of the prelaunch geophysical model function, SASS 2, tentatively used to retrieve wind vectors of the preliminary data products, is considered to cause the skew of the wind speed distribution. On the contrary, histograms of wind speeds of the reprocessed data products show consistent features over the whole range of incidence angles. Frequency distribution of wind directions relative to spacecraft flight direction is calculated to assess self-consistency of the wind directions. It is found that wind vectors of the preliminary data products exhibit systematic directional preference relative to antenna beams. This artificial directivity is also considered to be caused by imperfections in the geophysical model function. The directional distributions of the reprocessed wind vectors show less directivity and consistent features, except for very low wind cases.
Banerjee, Imon; Malladi, Sadhika; Lee, Daniela; Depeursinge, Adrien; Telli, Melinda; Lipson, Jafi; Golden, Daniel; Rubin, Daniel L
2018-01-01
Dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) is sensitive but not specific to determining treatment response in early stage triple-negative breast cancer (TNBC) patients. We propose an efficient computerized technique for assessing treatment response, specifically the residual tumor (RT) status and pathological complete response (pCR), in response to neoadjuvant chemotherapy. The proposed approach is based on Riesz wavelet analysis of pharmacokinetic maps derived from noninvasive DCE-MRI scans obtained before and after treatment. We compared the performance of Riesz features with traditional gray-level co-occurrence matrices and with a comprehensive characterization of the lesion that includes a wide range of quantitative features (e.g., shape and boundary). We investigated a set of predictive models ([Formula: see text]) incorporating distinct combinations of quantitative characterizations and statistical models at different time points of the treatment, and several of the reported area under the receiver operating characteristic curve (AUC) values are above 0.8. The most efficient models are based on first-order statistics and Riesz wavelets, which predicted RT with an AUC value of 0.85 and pCR with an AUC value of 0.83, improving results reported in a previous study by [Formula: see text]. Our findings suggest that Riesz texture analysis of TNBC lesions can be considered a potential framework for optimizing TNBC patient care.
Approach for Uncertainty Propagation and Robust Design in CFD Using Sensitivity Derivatives
NASA Technical Reports Server (NTRS)
Putko, Michele M.; Newman, Perry A.; Taylor, Arthur C., III; Green, Lawrence L.
2001-01-01
This paper presents an implementation of the approximate statistical moment method for uncertainty propagation and robust optimization for a quasi 1-D Euler CFD (computational fluid dynamics) code. Given uncertainties in statistically independent, random, normally distributed input variables, a first- and second-order statistical moment matching procedure is performed to approximate the uncertainty in the CFD output. Efficient calculation of both first- and second-order sensitivity derivatives is required. In order to assess the validity of the approximations, the moments are compared with statistical moments generated through Monte Carlo simulations. The uncertainties in the CFD input variables are also incorporated into a robust optimization procedure. For this optimization, statistical moments involving first-order sensitivity derivatives appear in the objective function and system constraints. Second-order sensitivity derivatives are used in a gradient-based search to successfully execute a robust optimization. The approximate methods used throughout the analyses are found to be valid when considering robustness about input parameter mean values.
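A minimal sketch of generic first-order moment matching of the kind described above, with sensitivity derivatives approximated by central differences; the helper name and step size are illustrative, and this stands in for (rather than reproduces) the quasi 1-D Euler code.

```python
import numpy as np

def propagate_moments(f, mu, sigma, h=1e-6):
    """First-order mean/variance propagation through y = f(x) for
    independent normal inputs x_i ~ N(mu_i, sigma_i^2)."""
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    grad = np.empty_like(mu)
    for i in range(mu.size):
        e = np.zeros_like(mu)
        e[i] = h
        grad[i] = (f(mu + e) - f(mu - e)) / (2.0 * h)   # df/dx_i
    return f(mu), np.sum((grad * sigma)**2)             # mean, variance

# Hypothetical usage with a surrogate for a CFD output:
# mean_y, var_y = propagate_moments(lambda x: x[0] * np.sin(x[1]),
#                                   mu=[1.0, 0.5], sigma=[0.05, 0.02])
```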
A Stochastic Fractional Dynamics Model of Rainfall Statistics
NASA Astrophysics Data System (ADS)
Kundu, Prasun; Travis, James
2013-04-01
Rainfall varies in space and time in a highly irregular manner and is described naturally in terms of a stochastic process. A characteristic feature of rainfall statistics is that they depend strongly on the space-time scales over which rain data are averaged. A spectral model of precipitation has been developed based on a stochastic differential equation of fractional order for the point rain rate, that allows a concise description of the second moment statistics of rain at any prescribed space-time averaging scale. The model is designed to faithfully reflect the scale dependence and is thus capable of providing a unified description of the statistics of both radar and rain gauge data. The underlying dynamical equation can be expressed in terms of space-time derivatives of fractional orders that are adjusted together with other model parameters to fit the data. The form of the resulting spectrum gives the model adequate flexibility to capture the subtle interplay between the spatial and temporal scales of variability of rain but strongly constrains the predicted statistical behavior as a function of the averaging length and times scales. The main restriction is the assumption that the statistics of the precipitation field is spatially homogeneous and isotropic and stationary in time. We test the model with radar and gauge data collected contemporaneously at the NASA TRMM ground validation sites located near Melbourne, Florida and in Kwajalein Atoll, Marshall Islands in the tropical Pacific. We estimate the parameters by tuning them to the second moment statistics of the radar data. The model predictions are then found to fit the second moment statistics of the gauge data reasonably well without any further adjustment. Some data sets containing periods of non-stationary behavior that involves occasional anomalously correlated rain events, present a challenge for the model.
Rosenkrantz, Andrew B; Doshi, Ankur M; Ginocchio, Luke A; Aphinyanaphongs, Yindalon
2016-12-01
This study aimed to assess the performance of a text classification machine-learning model in predicting highly cited articles within the recent radiological literature and to identify the model's most influential article features. We downloaded from PubMed the title, abstract, and medical subject heading terms for 10,065 articles published in 25 general radiology journals in 2012 and 2013. Three machine-learning models were applied to predict the top 10% of included articles in terms of the number of citations to the article in 2014 (reflecting the 2-year time window in conventional impact factor calculations). The model having the highest area under the curve was selected to derive a list of article features (words) predicting high citation volume, which was iteratively reduced to identify the smallest possible core feature list maintaining predictive power. Overall themes were qualitatively assigned to the core features. The regularized logistic regression (Bayesian binary regression) model had highest performance, achieving an area under the curve of 0.814 in predicting articles in the top 10% of citation volume. We reduced the initial 14,083 features to 210 features that maintain predictivity. These features corresponded with topics relating to various imaging techniques (eg, diffusion-weighted magnetic resonance imaging, hyperpolarized magnetic resonance imaging, dual-energy computed tomography, computed tomography reconstruction algorithms, tomosynthesis, elastography, and computer-aided diagnosis), particular pathologies (prostate cancer; thyroid nodules; hepatic adenoma, hepatocellular carcinoma, non-alcoholic fatty liver disease), and other topics (radiation dose, electroporation, education, general oncology, gadolinium, statistics). Machine learning can be successfully applied to create specific feature-based models for predicting articles likely to achieve high influence within the radiological literature. Copyright © 2016 The Association of University Radiologists. Published by Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Wan, Xiaoqing; Zhao, Chunhui; Wang, Yanchun; Liu, Wu
2017-11-01
This paper proposes a novel classification paradigm for hyperspectral images (HSI) using feature-level fusion and deep learning-based methodologies. The operation is carried out in three main steps. First, during a pre-processing stage, wave atoms are introduced into a bilateral filter to smooth the HSI; this strategy can effectively attenuate noise and restore texture information. Meanwhile, high-quality spectral-spatial features can be extracted from the HSI by taking geometric closeness and photometric similarity among pixels into consideration simultaneously. Second, higher-order statistics techniques are introduced into hyperspectral data classification for the first time to characterize the phase correlations of spectral curves. Third, multifractal spectrum features are extracted to characterize the singularities and self-similarities of spectra shapes. To this end, feature-level fusion is applied to the extracted spectral-spatial features along with the higher-order statistics and multifractal spectrum features. Finally, a stacked sparse autoencoder is utilized to learn more abstract and invariant high-level features from the multiple feature sets, and a random forest classifier is then employed to perform supervised fine-tuning and classification. Experimental results on two real hyperspectral data sets demonstrate that the proposed method outperforms some traditional alternatives.
Statistical analysis for validating ACO-KNN algorithm as feature selection in sentiment analysis
NASA Astrophysics Data System (ADS)
Ahmad, Siti Rohaidah; Yusop, Nurhafizah Moziyana Mohd; Bakar, Azuraliza Abu; Yaakub, Mohd Ridzwan
2017-10-01
This research paper proposes a hybrid of ant colony optimization (ACO) and k-nearest neighbor (KNN) algorithms as a feature selection method for selecting relevant features from customer review datasets. Information gain (IG), genetic algorithm (GA), and rough set attribute reduction (RSAR) were used as baseline algorithms in a performance comparison with the proposed algorithm. This paper also discusses the significance test, which was used to evaluate the performance differences between the ACO-KNN, IG-GA, and IG-RSAR algorithms. The study evaluated the performance of the ACO-KNN algorithm using precision, recall, and F-score, which were validated using parametric statistical significance tests. The evaluation process has statistically proven that the ACO-KNN algorithm is significantly improved compared to the baseline algorithms. In addition, the experimental results have shown that ACO-KNN can be used as a feature selection technique in sentiment analysis to obtain a quality, optimal feature subset that can represent the actual data in customer review data.
QSAR models for thiophene and imidazopyridine derivatives inhibitors of the Polo-Like Kinase 1.
Comelli, Nieves C; Duchowicz, Pablo R; Castro, Eduardo A
2014-10-01
The inhibitory activity of 103 thiophene and 33 imidazopyridine derivatives against Polo-Like Kinase 1 (PLK1), expressed as pIC50 (-log IC50), was predicted by QSAR modeling. Multivariate linear regression (MLR) was employed to model the relationship between 0D and 3D molecular descriptors and the biological activities of the molecules, using the replacement method (MR) as the variable selection tool. The 136 compounds were separated into several training and test sets. Two splitting approaches, based on the distribution of biological data and on structural diversity, and the statistical experimental design procedure D-optimal distance were applied to the dataset. The significance of the training set models was confirmed by statistically high values of the internal leave-one-out cross-validated coefficient of determination (Q2) and the external predictive coefficient of determination for the test set (Rtest2). The model developed from a training set obtained with the D-optimal distance protocol, using 3D descriptor space along with activity values, separated chemical features that allowed high and low pIC50 values to be distinguished reasonably well. We then verified that this model was sufficient to reliably and accurately predict the activity of external, structurally diverse compounds. The model's robustness was characterized by standard procedures, and its applicability domain (AD) was analyzed by the leverage method.
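For reference, the leave-one-out Q2 statistic used to validate such MLR models can be computed as below on synthetic descriptor data; the coefficients and noise level are illustrative only.

    # LOO cross-validated Q^2 for a multivariate linear regression QSAR model:
    # Q^2 = 1 - PRESS / TSS, on synthetic descriptors.
    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import LeaveOneOut, cross_val_predict

    rng = np.random.default_rng(1)
    X = rng.normal(size=(40, 4))                       # 4 selected descriptors
    y = X @ np.array([0.8, -0.5, 0.3, 0.1]) + rng.normal(scale=0.2, size=40)  # pIC50

    pred = cross_val_predict(LinearRegression(), X, y, cv=LeaveOneOut())
    q2 = 1.0 - np.sum((y - pred) ** 2) / np.sum((y - y.mean()) ** 2)
    print(f"Q^2 (LOO) = {q2:.3f}")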
Landmark Detection in Orbital Images Using Salience Histograms
NASA Technical Reports Server (NTRS)
Wagstaff, Kiri L.; Panetta, Julian; Schorghofer, Norbert; Greeley, Ronald; Pendleton Hoffer, Mary; Bunte, Melissa
2010-01-01
NASA's planetary missions have collected, and continue to collect, massive volumes of orbital imagery. The volume is such that it is difficult to manually review all of the data and determine its significance. As a result, images are indexed and searchable by location and date but generally not by their content. A new automated method analyzes images and identifies "landmarks," or visually salient features such as gullies, craters, dust devil tracks, and the like. This technique uses a statistical measure of salience derived from information theory, so it is not associated with any specific landmark type. It identifies regions that are unusual or that stand out from their surroundings, so the resulting landmarks are context-sensitive areas that can be used to recognize the same area when it is encountered again. A machine learning classifier is used to identify the type of each discovered landmark. Using a specified window size, an intensity histogram is computed for each such window within the larger image (sliding the window across the image). Next, a salience map is computed that specifies, for each pixel, the salience of the window centered at that pixel. The salience map is thresholded to identify landmark contours (polygons) using the upper quartile of salience values. Descriptive attributes are extracted for each landmark polygon: size, perimeter, mean intensity, standard deviation of intensity, and shape features derived from an ellipse fit.
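One plausible realization of this windowed salience statistic is sketched below, using the Kullback-Leibler divergence between each window's intensity histogram and the global image histogram, thresholded at the upper quartile as the abstract describes. The exact information-theoretic measure used in the paper may differ.

    # Histogram-based salience sketch: KL divergence between each sliding
    # window's histogram and the global histogram; upper-quartile threshold
    # outlines candidate landmark regions.
    import numpy as np

    def salience_map(img, win=16, bins=32):
        global_h, edges = np.histogram(img, bins=bins, density=True)
        global_h = global_h + 1e-9
        sal = np.zeros(img.shape)
        half = win // 2
        for i in range(half, img.shape[0] - half):
            for j in range(half, img.shape[1] - half):
                h, _ = np.histogram(img[i - half:i + half, j - half:j + half],
                                    bins=edges, density=True)
                h = h + 1e-9
                sal[i, j] = np.sum(h * np.log(h / global_h))  # KL divergence
        return sal

    img = np.random.rand(64, 64)                 # stand-in for an orbital image tile
    sal = salience_map(img)
    landmark_mask = sal > np.quantile(sal, 0.75) # upper-quartile threshold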
Moving Beyond ERP Components: A Selective Review of Approaches to Integrate EEG and Behavior
Bridwell, David A.; Cavanagh, James F.; Collins, Anne G. E.; Nunez, Michael D.; Srinivasan, Ramesh; Stober, Sebastian; Calhoun, Vince D.
2018-01-01
Relationships between neuroimaging measures and behavior provide important clues about brain function and cognition in healthy and clinical populations. While electroencephalography (EEG) provides a portable, low-cost measure of brain dynamics, it has been somewhat underrepresented in the emerging field of model-based inference. We seek to address this gap in this article by highlighting the utility of linking EEG and behavior, with an emphasis on approaches for EEG analysis that move beyond focusing on peaks or “components” derived from averaging EEG responses across trials and subjects (generating the event-related potential, ERP). First, we review methods for deriving features from EEG in order to enhance the signal within single trials. These methods include filtering based on user-defined features (i.e., frequency decomposition, time-frequency decomposition), filtering based on data-driven properties (i.e., blind source separation, BSS), and generating more abstract representations of data (e.g., using deep learning). We then review cognitive models which extract latent variables from experimental tasks, including the drift diffusion model (DDM) and reinforcement learning (RL) approaches. Next, we discuss ways to assess associations among these measures, including statistical models, data-driven joint models and cognitive joint modeling using hierarchical Bayesian models (HBMs). We think that these methodological tools are likely to contribute to theoretical advancements, and will help inform our understanding of brain dynamics that contribute to moment-to-moment cognitive function. PMID:29632480
2-D multiline spectroscopy of the solar photosphere
NASA Astrophysics Data System (ADS)
Berrilli, F.; Consolini, G.; Pietropaolo, E.; Caccin, B.; Penza, V.; Lepreti, F.
2002-01-01
The structure and dynamics of the photosphere are investigated, with time series of broadband and monochromatic images of quiet granulation, at the solar disk center. Images were acquired with the IPM observing mode at the THEMIS telescope. Velocity and line center intensity fields, derived from the observation of three different photospheric lines, are used to study velocity and intensity patterns at different heights in the photosphere. Automatic segmentation procedures are applied to velocity and intensity frames to extract solar features, and to investigate the dependence of their properties at different scales and heights. We find a dependence of the statistical properties of upflow and downflow regions on the atmospheric height. Larger granules, passing through a great part of the photosphere, are used to investigate the damping of convective motions in stably stratified layers. The results suggest the occurrence of an intense braking in the deep photosphere (first ~ 120 km). Furthermore, we investigate the temporal and spatial evolution of velocity fields, deriving typical time scales of dynamical processes relative to different solar features. In particular, for two selected isolated exploders, we reveal a velocity deceleration in the central region since the early phase of their fragmentation. Based on observations made with THEMIS-CNRS/INSU-CNR operated on the island of Tenerife by THEMIS S.L. in the Spanish Observatorio del Teide of the Instituto de Astrofisica de Canarias.
Derivation of exact master equation with stochastic description: dissipative harmonic oscillator.
Li, Haifeng; Shao, Jiushu; Wang, Shikuan
2011-11-01
A systematic procedure for deriving the master equation of a dissipative system is reported in the framework of stochastic description. For the Caldeira-Leggett model of the harmonic-oscillator bath, a detailed and elementary derivation of the bath-induced stochastic field is presented. The dynamics of the system is thereby fully described by a stochastic differential equation, and the desired master equation would be acquired with statistical averaging. It is shown that the existence of a closed-form master equation depends on the specificity of the system as well as the feature of the dissipation characterized by the spectral density function. For a dissipative harmonic oscillator it is observed that the correlation between the stochastic field due to the bath and the system can be decoupled, and the master equation naturally results. Such an equation possesses the Lindblad form in which time-dependent coefficients are determined by a set of integral equations. It is proved that the obtained master equation is equivalent to the well-known Hu-Paz-Zhang equation based on the path-integral technique. The procedure is also used to obtain the master equation of a dissipative harmonic oscillator in time-dependent fields.
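For orientation, the Caldeira-Leggett model referenced here couples the system coordinate q bilinearly to a bath of harmonic oscillators; a standard textbook form of the Hamiltonian and of the spectral density function (stated for context, not reproduced from this paper) is

    H = H_S(q, p) + \sum_j \left[ \frac{p_j^2}{2 m_j}
        + \frac{1}{2} m_j \omega_j^2 \left( x_j - \frac{c_j q}{m_j \omega_j^2} \right)^2 \right],
    \qquad
    J(\omega) = \frac{\pi}{2} \sum_j \frac{c_j^2}{m_j \omega_j} \, \delta(\omega - \omega_j),

where J(omega) is the spectral density whose form, as the abstract notes, helps determine whether a closed-form master equation exists.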
Covariance analyses of satellite-derived mesoscale wind fields
NASA Technical Reports Server (NTRS)
Maddox, R. A.; Vonder Haar, T. H.
1979-01-01
Statistical structure functions have been computed independently for nine satellite-derived mesoscale wind fields that were obtained on two different days. Small cumulus clouds were tracked at 5 min intervals, but since these clouds occurred primarily in the warm sectors of midlatitude cyclones the results cannot be considered representative of the circulations within cyclones in general. The field structure varied considerably with time and was especially affected if mesoscale features were observed. The wind fields on the 2 days studied were highly anisotropic with large gradients in structure occurring approximately normal to the mean flow. Structure function calculations for the combined set of satellite winds were used to estimate random error present in the fields. It is concluded for these data that the random error in vector winds derived from cumulus cloud tracking using high-frequency satellite data is less than 1.75 m/s. Spatial correlation functions were also computed for the nine data sets. Normalized correlation functions were considerably different for u and v components and decreased rapidly as data point separation increased for both components. The correlation functions for transverse and longitudinal components decreased less rapidly as data point separation increased.
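In standard form, the structure function of a wind component u is

    D_u(\mathbf{r}) = \left\langle \left[ u(\mathbf{x} + \mathbf{r}) - u(\mathbf{x}) \right]^2 \right\rangle,

and if each observation carries uncorrelated random error of variance \sigma_\varepsilon^2, the observed structure function is offset by 2\sigma_\varepsilon^2, so extrapolating D_u(r) to r -> 0 yields an estimate of the random error. This is the standard rationale for such error estimates; the authors' exact procedure may differ in detail.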
A quality function deployment framework for the service quality of health information websites.
Chang, Hyejung; Kim, Dohoon
2010-03-01
This research was conducted to identify both the users' service requirements on health information websites (HIWs) and the key functional elements for running HIWs. With the quality function deployment framework, the derived service attributes (SAs) are mapped into the suppliers' functional characteristics (FCs) to derive the most critical FCs for the users' satisfaction. Using the survey data from 228 respondents, the SAs, FCs and their relationships were analyzed using various multivariate statistical methods such as principal component factor analysis, discriminant analysis, correlation analysis, etc. Simple and compound FC priorities were derived by matrix calculation. Nine factors of SAs and five key features of FCs were identified, and these served as the basis for the house of quality model. Based on the compound FC priorities, the functional elements pertaining to security and privacy, and usage support should receive top priority in the course of enhancing HIWs. The quality function deployment framework can improve the FCs of the HIWs in an effective, structured manner, and it can also be utilized for critical success factors together with their strategic implications for enhancing the service quality of HIWs. Therefore, website managers could efficiently improve website operations by considering this study's results.
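The matrix calculation behind the compound FC priorities can be sketched as below, with a hypothetical relationship matrix on the usual 9/3/1 QFD scale; the actual SA factors and FC features are those derived in the paper from the survey data.

    # House-of-quality sketch: compound functional-characteristic priorities
    # as service-attribute weights propagated through the relationship matrix.
    import numpy as np

    rel = np.array([[9, 3, 0, 1],    # rows: service attributes (SAs)
                    [1, 9, 3, 0],    # cols: functional characteristics (FCs)
                    [0, 3, 9, 3]])   # entries: 9/3/1 relationship strengths
    sa_weight = np.array([0.5, 0.3, 0.2])   # SA importance from the survey

    fc_priority = sa_weight @ rel           # compound FC priorities
    print(np.argsort(fc_priority)[::-1])    # FCs ranked for improvement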
Holmes, Tyson H.; He, Xiao-Song
2016-01-01
Small, wide data sets are commonplace in human immunophenotyping research. As defined here, a small, wide data set is constructed by sampling a small to modest quantity n, 1 < n < 50, of human participants for the purpose of estimating many parameters p, such that n < p < 1,000. We offer a set of prescriptions that are designed to facilitate low-variance (i.e. stable), low-bias, interpretive regression modeling of small, wide data sets. These prescriptions are distinctive in their especially heavy emphasis on minimizing use of out-of-sample information for conducting statistical inference. That allows the working immunologist to proceed without being encumbered by imposed and often untestable statistical assumptions. Problems of unmeasured confounders, confidence-interval coverage, feature selection, and shrinkage/denoising are defined clearly and treated in detail. We propose an extension of an existing nonparametric technique for improved small-sample confidence-interval tail coverage from the univariate case (single immune feature) to the multivariate (many, possibly correlated immune features). An important role for derived features in the immunological interpretation of regression analyses is stressed. Areas of further research are discussed. Presented principles and methods are illustrated through application to a small, wide data set of adults spanning a wide range in ages and multiple immunophenotypes that were assayed before and after immunization with inactivated influenza vaccine (IIV). Our regression modeling prescriptions identify some potentially important topics for future immunological research. 1) Immunologists may wish to distinguish age-related differences in immune features from changes in immune features caused by aging. 2) A form of the bootstrap that employs linear extrapolation may prove to be an invaluable analytic tool because it allows the working immunologist to obtain accurate estimates of the stability of immune parameter estimates with a bare minimum of imposed assumptions. 3) Liberal inclusion of immune features in phenotyping panels can facilitate accurate separation of biological signal of interest from noise. In addition, through a combination of denoising and potentially improved confidence interval coverage, we identify some candidate immune correlates (frequency of cell subset and concentration of cytokine) with B cell response as measured by quantity of IIV-specific IgA antibody-secreting cells and quantity of IIV-specific IgG antibody-secreting cells. PMID:27196789
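A bare-bones version of the shrinkage-plus-bootstrap idea for small, wide data is sketched below with a plain percentile bootstrap; the authors' linear-extrapolation bootstrap and coverage corrections are more refined than this, and the data are synthetic.

    # Small, wide data sketch (n = 30 subjects, p = 200 immune features):
    # ridge shrinkage stabilizes the n < p fit, and a nonparametric bootstrap
    # gives a rough percentile interval for one coefficient of interest.
    import numpy as np
    from sklearn.linear_model import Ridge

    rng = np.random.default_rng(2)
    n, p = 30, 200
    X = rng.normal(size=(n, p))
    y = 0.9 * X[:, 0] + rng.normal(size=n)     # feature 0 carries the signal

    boot = []
    for _ in range(500):
        idx = rng.integers(0, n, size=n)       # resample subjects with replacement
        boot.append(Ridge(alpha=10.0).fit(X[idx], y[idx]).coef_[0])
    lo, hi = np.percentile(boot, [2.5, 97.5])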
Association between pathology and texture features of multiparametric MRI of the prostate
NASA Astrophysics Data System (ADS)
Kuess, Peter; Andrzejewski, Piotr; Nilsson, David; Georg, Petra; Knoth, Johannes; Susani, Martin; Trygg, Johan; Helbich, Thomas H.; Polanec, Stephan H.; Georg, Dietmar; Nyholm, Tufve
2017-10-01
The role of multi-parametric (mp)MRI in the diagnosis and treatment of prostate cancer has increased considerably. An alternative to visual inspection of mpMRI is evaluation using histogram-based parameters (first-order statistics) and textural features (second-order statistics). The aims of the present work were to investigate the relationship between benign and malignant sub-volumes of the prostate and textures obtained from mpMR images. The performance of tumor prediction was investigated based on the combination of histogram-based and textural parameters. Subsequently, the relative importance of mpMR images was assessed and the benefit of additional imaging analyzed. Finally, sub-structures based on the PI-RADS classification were investigated as potential regions for automatically detecting malignant lesions. Twenty-five patients who received mpMRI prior to radical prostatectomy were included in the study. The imaging protocol included T2, DWI, and DCE. Delineation of tumor regions was performed based on pathological information. First- and second-order statistics were derived from each structure and for all image modalities. The resulting data were processed with multivariate analysis, using PCA (principal component analysis) and OPLS-DA (orthogonal partial least squares discriminant analysis) for separation of malignant and healthy tissue. PCA showed a clear difference between tumor and healthy regions in the peripheral zone for all investigated images. The predictive ability of the OPLS-DA models increased for all image modalities when first- and second-order statistics were combined. The predictive value reached a plateau after adding ADC and T2, and did not increase further with the addition of other image information. The present study indicates a distinct difference in the signatures between malignant and benign prostate tissue. This is an absolute prerequisite for automatic tumor segmentation, but only the first step in that direction. For the specific identified signature, DCE did not add complementary information to T2 and ADC maps.
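The first- and second-order statistics in question can be computed as below with scikit-image's gray-level co-occurrence matrix (function names as in skimage >= 0.19; the image patch is synthetic).

    # Histogram-based (first-order) and co-occurrence (second-order) texture
    # features for an image patch, as used for the mpMRI sub-volumes.
    import numpy as np
    from skimage.feature import graycomatrix, graycoprops

    patch = (np.random.rand(64, 64) * 255).astype(np.uint8)  # stand-in ROI

    first_order = {"mean": patch.mean(), "std": patch.std()}

    glcm = graycomatrix(patch, distances=[1], angles=[0, np.pi / 2],
                        levels=256, symmetric=True, normed=True)
    second_order = {p: graycoprops(glcm, p).mean()
                    for p in ("contrast", "homogeneity", "energy", "correlation")}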
Dupree, Jean A.; Crowfoot, Richard M.
2012-01-01
This geodatabase and its component datasets are part of U.S. Geological Survey Digital Data Series 650 and were generated to store basin boundaries for U.S. Geological Survey streamgages and other sites in Colorado. The geodatabase and its components were created by the U.S. Geological Survey, Colorado Water Science Center, and are used to derive the numeric drainage areas for Colorado that are input into the U.S. Geological Survey's National Water Information System (NWIS) database and also published in the Annual Water Data Report and on NWISWeb. The foundational dataset used to create the basin boundaries in this geodatabase was the National Watershed Boundary Dataset. This geodatabase accompanies a U.S. Geological Survey Techniques and Methods report (Book 11, Section C, Chapter 6) entitled "Digital Database Architecture and Delineation Methodology for Deriving Drainage Basins, and Comparison of Digitally and Non-Digitally Derived Numeric Drainage Areas." The Techniques and Methods report details the geodatabase architecture, describes the delineation methodology and workflows used to develop these basin boundaries, and compares digitally derived numeric drainage areas in this geodatabase to non-digitally derived areas.
1. COBasins.gdb: This geodatabase contains site locations and basin boundaries for Colorado. It includes a single feature dataset, called BasinsFD, which groups the component feature classes and topology rules.
2. BasinsFD: This feature dataset in the "COBasins.gdb" geodatabase is a digital container that holds the feature classes used to archive site locations and basin boundaries as well as the topology rules that govern spatial relations within and among component feature classes. This feature dataset includes three feature classes: the sites for which basins have been delineated (the "Sites" feature class), basin bounding lines (the "BasinLines" feature class), and polygonal basin areas (the "BasinPolys" feature class). The feature dataset also stores the topology rules (the "BasinsFD_Topology") that constrain the relations within and among component feature classes. The feature dataset also forces any feature classes inside it to have a consistent projection system, which is, in this case, an Albers-Equal-Area projection system.
3. BasinsFD_Topology: This topology contains four persistent topology rules that constrain the spatial relations within the "BasinLines" feature class and between the "BasinLines" feature class and the "BasinPolys" feature classes.
4. Sites: This point feature class contains the digital representations of the site locations for which Colorado Water Science Center basin boundaries have been delineated. This feature class includes point locations for Colorado Water Science Center active (as of September 30, 2009) gages and for other sites.
5. BasinLines: This line feature class contains the perimeters of basins delineated for features in the "Sites" feature class, and it also contains information regarding the sources of lines used for the basin boundaries.
6. BasinPolys: This polygon feature class contains the polygonal basin areas delineated for features in the "Sites" feature class, and it is used to derive the numeric drainage areas published by the Colorado Water Science Center.
Testing the Isotropic Universe Using the Gamma-Ray Burst Data of Fermi/GBM
NASA Astrophysics Data System (ADS)
Řípa, Jakub; Shafieloo, Arman
2017-12-01
The sky distribution of gamma-ray bursts (GRBs) has been intensively studied by various groups for more than two decades. Most of these studies test the isotropy of GRBs based on their sky number density distribution. In this work, we propose an approach to test the isotropy of the universe through inspecting the isotropy of the properties of GRBs such as their duration, fluences, and peak fluxes at various energy bands and different timescales. We apply this method on the Fermi/Gamma-ray Burst Monitor (GBM) data sample containing 1591 GRBs. The most noticeable feature we found is near the Galactic coordinates l ≈ 30°, b ≈ 15°, and radius r ≈ 20°–40°. The inferred probability for the occurrence of such an anisotropic signal (in a random isotropic sample) is derived to be less than a percent in some of the tests while the other tests give results consistent with isotropy. These are based on the comparison of the results from the real data with the randomly shuffled data samples. Considering the large number of statistics we used in this work (some of which are correlated with each other), we can anticipate that the detected feature could be a result of statistical fluctuations. Moreover, we noticed a considerably low number of GRBs in this particular patch, which might be due to some instrumentation or observational effects that can consequently affect our statistics through some systematics. Further investigation is highly desirable in order to clarify this result, e.g., utilizing a larger future Fermi/GBM data sample as well as data samples of other GRB missions and also looking for possible systematics.
How language production shapes language form and comprehension
MacDonald, Maryellen C.
2012-01-01
Language production processes can provide insight into how language comprehension works and language typology—why languages tend to have certain characteristics more often than others. Drawing on work in memory retrieval, motor planning, and serial order in action planning, the Production-Distribution-Comprehension (PDC) account links work in the fields of language production, typology, and comprehension: (1) faced with substantial computational burdens of planning and producing utterances, language producers implicitly follow three biases in utterance planning that promote word order choices that reduce these burdens, thereby improving production fluency. (2) These choices, repeated over many utterances and individuals, shape the distributions of utterance forms in language. The claim that language form stems in large degree from producers' attempts to mitigate utterance planning difficulty is contrasted with alternative accounts in which form is driven by language use more broadly, language acquisition processes, or producers' attempts to create language forms that are easily understood by comprehenders. (3) Language perceivers implicitly learn the statistical regularities in their linguistic input, and they use this prior experience to guide comprehension of subsequent language. In particular, they learn to predict the sequential structure of linguistic signals, based on the statistics of previously-encountered input. Thus, key aspects of comprehension behavior are tied to lexico-syntactic statistics in the language, which in turn derive from utterance planning biases promoting production of comparatively easy utterance forms over more difficult ones. This approach contrasts with classic theories in which comprehension behaviors are attributed to innate design features of the language comprehension system and associated working memory. The PDC instead links basic features of comprehension to a different source: production processes that shape language form. PMID:23637689
Asteroid shape and spin statistics from convex models
NASA Astrophysics Data System (ADS)
Torppa, J.; Hentunen, V.-P.; Pääkkönen, P.; Kehusmaa, P.; Muinonen, K.
2008-11-01
We introduce techniques for characterizing convex shape models of asteroids with a small number of parameters, and apply these techniques to a set of 87 models from convex inversion. We present three different approaches for determining the overall dimensions of an asteroid. With the first technique, we measured the dimensions of the shapes in the direction of the rotation axis and in the equatorial plane, and with the two other techniques we derived the best-fit ellipsoid. We also computed the inertia matrix of the model shape to test how well it represents the target asteroid, i.e., to find indications of possible non-convex features or albedo variegation, which the convex shape model cannot reproduce. We used shape models for 87 asteroids to perform statistical analyses and to study dependencies between shape and rotation period, size, and taxonomic type. We detected correlations, but more data are required, especially on small and large objects, as well as slow and fast rotators, to reach a more thorough understanding of the dependencies. The results show, e.g., that convex models of asteroids are not that far from ellipsoids in the root-mean-square sense, even though clearly irregular features are present. We also present new spin and shape solutions for Asteroids (31) Euphrosyne, (54) Alexandra, (79) Eurynome, (93) Minerva, (130) Elektra, (376) Geometria, (471) Papagena, and (776) Berbericia. We used a so-called semi-statistical approach to obtain a set of possible spin state solutions. The number of solutions depends on the abundance of the data, which for Eurynome, Elektra, and Geometria was extensive enough to determine an unambiguous spin and shape solution. Data for Euphrosyne, on the other hand, yielded a wide distribution of possible spin solutions, whereas the rest of the targets have two or three possible solutions.
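The inertia-based characterization can be roughly approximated from a shape model's vertices as below; a rigorous treatment integrates over the polyhedron volume, so treat this vertex-cloud version as a crude stand-in, not the authors' method.

    # Rough sketch: second moments of a convex shape model's vertex cloud give
    # principal axes; their ratios summarize the best-fit-ellipsoid dimensions.
    import numpy as np

    verts = np.random.normal(size=(500, 3))   # stand-in for model vertices
    verts -= verts.mean(axis=0)

    C = np.cov(verts.T)                       # 3x3 second-moment (inertia-like) matrix
    evals, evecs = np.linalg.eigh(C)          # ascending eigenvalues
    semiaxes = np.sqrt(evals)                 # proportional to ellipsoid semiaxes
    elongation = semiaxes[2] / semiaxes[0]    # longest over shortest axis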
Jones, David T; Kandathil, Shaun M
2018-04-26
In addition to substitution frequency data from protein sequence alignments, many state-of-the-art methods for contact prediction rely on additional sources of information, or features, of protein sequences in order to predict residue-residue contacts, such as solvent accessibility, predicted secondary structure, and scores from other contact prediction methods. It is unclear how much of this information is needed to achieve state-of-the-art results. Here, we show that using deep neural network models, simple alignment statistics contain sufficient information to achieve state-of-the-art precision. Our prediction method, DeepCov, uses fully convolutional neural networks operating on amino-acid pair frequency or covariance data derived directly from sequence alignments, without using global statistical methods such as sparse inverse covariance or pseudolikelihood estimation. Comparisons against CCMpred and MetaPSICOV2 show that using pairwise covariance data calculated from raw alignments as input allows us to match or exceed the performance of both of these methods. Almost all of the achieved precision is obtained when considering relatively local windows (around 15 residues) around any member of a given residue pairing; larger window sizes have comparable performance. Assessment on a set of shallow sequence alignments (fewer than 160 effective sequences) indicates that the new method is substantially more precise than CCMpred and MetaPSICOV2 in this regime, suggesting that improved precision is attainable on smaller sequence families. Overall, the performance of DeepCov is competitive with the state of the art, and our results demonstrate that global models, which employ features from all parts of the input alignment when predicting individual contacts, are not strictly needed in order to attain precise contact predictions. DeepCov is freely available at https://github.com/psipred/DeepCov.
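The covariance featurization that DeepCov feeds to its fully convolutional network can be sketched as follows on a toy alignment; the real pipeline applies sequence weighting and works at scale, so this is illustrative only.

    # Pair-frequency / covariance features from a multiple sequence alignment:
    # an L x L x 441 "image" (21 x 21 residue-pair channels) for a conv net.
    import numpy as np

    rng = np.random.default_rng(0)
    N, L, A = 100, 50, 21                    # sequences, columns, alphabet incl. gap
    msa = rng.integers(0, A, size=(N, L))    # toy alignment, integer-encoded

    onehot = np.eye(A)[msa]                                  # N x L x 21
    f_i = onehot.mean(axis=0)                                # single-site frequencies
    f_ij = np.einsum("nia,njb->ijab", onehot, onehot) / N    # pair frequencies
    cov = f_ij - np.einsum("ia,jb->ijab", f_i, f_i)          # L x L x 21 x 21
    features = cov.reshape(L, L, A * A)                      # network input channels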
Giordano, Bruno L.; Kayser, Christoph; Rousselet, Guillaume A.; Gross, Joachim; Schyns, Philippe G.
2016-01-01
Abstract We begin by reviewing the statistical framework of information theory as applicable to neuroimaging data analysis. A major factor hindering wider adoption of this framework in neuroimaging is the difficulty of estimating information theoretic quantities in practice. We present a novel estimation technique that combines the statistical theory of copulas with the closed form solution for the entropy of Gaussian variables. This results in a general, computationally efficient, flexible, and robust multivariate statistical framework that provides effect sizes on a common meaningful scale, allows for unified treatment of discrete, continuous, unidimensional and multidimensional variables, and enables direct comparisons of representations from behavioral and brain responses across any recording modality. We validate the use of this estimate as a statistical test within a neuroimaging context, considering both discrete stimulus classes and continuous stimulus features. We also present examples of analyses facilitated by these developments, including application of multivariate analyses to MEG planar magnetic field gradients, and pairwise temporal interactions in evoked EEG responses. We show the benefit of considering the instantaneous temporal derivative together with the raw values of M/EEG signals as a multivariate response, how we can separately quantify modulations of amplitude and direction for vector quantities, and how we can measure the emergence of novel information over time in evoked responses. Open-source Matlab and Python code implementing the new methods accompanies this article. PMID:27860095
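For two scalar variables, the copula-Gaussian estimate described here reduces to rank-transforming each variable to a standard normal and applying the closed-form Gaussian mutual information; a minimal version is sketched below (the accompanying toolboxes implement the general multivariate case).

    # Gaussian-copula mutual information sketch for two scalar variables:
    # rank-transform to standard normal, then use the Gaussian closed form.
    import numpy as np
    from scipy.special import ndtri

    def copnorm(x):
        ranks = np.argsort(np.argsort(x))             # 0 .. n-1
        return ndtri((ranks + 1) / (len(x) + 1))      # inverse normal CDF

    rng = np.random.default_rng(3)
    x = rng.normal(size=1000)
    y = x + 0.5 * rng.normal(size=1000)

    r = np.corrcoef(copnorm(x), copnorm(y))[0, 1]
    mi = -0.5 * np.log(1.0 - r ** 2)                  # MI lower bound, in nats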
Interpretable Topic Features for Post-ICU Mortality Prediction.
Luo, Yen-Fu; Rumshisky, Anna
2016-01-01
Electronic health records provide valuable resources for understanding the correlation between various diseases and mortality. The analysis of post-discharge mortality is critical for healthcare professionals to follow up on potential causes of death after a patient is discharged from the hospital and to provide prompt treatment. Moreover, it may reduce costs arising from readmissions and improve the quality of healthcare. Our work focused on post-discharge ICU mortality prediction. In addition to features derived from physiological measurements, we incorporated the ICD-9-CM hierarchy into Bayesian topic model learning and extracted topic features from medical notes. We achieved the highest AUCs of 0.835 and 0.829 for 30-day and 6-month post-discharge mortality prediction using baseline features and topic proportions derived from Labeled-LDA. Moreover, our work emphasized the interpretability of topic features derived from the topic model, which may facilitate the understanding and investigation of the complex relationships between mortality and diseases.
NASA Astrophysics Data System (ADS)
Antropova, Natasha; Huynh, Benjamin; Giger, Maryellen
2017-03-01
Intuitive segmentation-based CADx/radiomic features, calculated from the lesion segmentations of dynamic contrast-enhanced magnetic resonance images (DCE-MRIs), have been utilized in the task of distinguishing between malignant and benign lesions. Additionally, transfer learning with pre-trained deep convolutional neural networks (CNNs) allows for an alternative method of radiomics extraction, where the features are derived directly from the image data. However, the comparison of computer-extracted segmentation-based and CNN features in MRI breast lesion characterization has not yet been conducted. In our study, we used a DCE-MRI database of 640 breast cases: 191 benign and 449 malignant. Thirty-eight segmentation-based features were extracted automatically using our quantitative radiomics workstation. Also, 2D ROIs were selected around each lesion on the DCE-MRIs and directly input into a pre-trained CNN AlexNet, yielding CNN features. Each method was investigated separately and in combination in terms of performance in the task of distinguishing between benign and malignant lesions. Area under the ROC curve (AUC) served as the figure of merit. Both methods yielded promising classification performance with round-robin cross-validated AUC values of 0.88 (se = 0.01) and 0.76 (se = 0.02) for segmentation-based and deep learning methods, respectively. Combination of the two methods enhanced the performance in malignancy assessment, resulting in an AUC value of 0.91 (se = 0.01), a statistically significant improvement over the performance of the CNN method alone.
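The transfer-learning arm of this comparison amounts to pushing each ROI through a fixed pre-trained AlexNet and keeping the activations, roughly as below (recent torchvision API; the ROI is synthetic and the layer choice is an assumption, not necessarily the layer used in the study).

    # CNN feature extraction sketch: a pre-trained AlexNet trunk as a fixed
    # featurizer for 2D lesion ROIs (no fine-tuning).
    import torch
    from torchvision import models

    alexnet = models.alexnet(weights="IMAGENET1K_V1").eval()
    extractor = torch.nn.Sequential(alexnet.features, alexnet.avgpool,
                                    torch.nn.Flatten())

    roi = torch.rand(1, 3, 224, 224)       # stand-in DCE-MRI ROI, resized/tiled
    with torch.no_grad():
        cnn_features = extractor(roi)      # 1 x 9216 feature vector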
Rock classification based on resistivity patterns in electrical borehole wall images
NASA Astrophysics Data System (ADS)
Linek, Margarete; Jungmann, Matthias; Berlage, Thomas; Pechnig, Renate; Clauser, Christoph
2007-06-01
Electrical borehole wall images represent grey-level-coded micro-resistivity measurements at the borehole wall. Different scientific methods have been implemented to transform image data into quantitative log curves. We introduce a pattern recognition technique applying texture analysis, which uses second-order statistics based on studying the occurrence of pixel pairs. We calculate so-called Haralick texture features such as contrast, energy, entropy and homogeneity. The supervised classification method is used for assigning characteristic texture features to different rock classes and assessing the discriminative power of these image features. We use classifiers obtained from training intervals to characterize the entire image data set recovered in ODP hole 1203A. This yields a synthetic lithology profile based on computed texture data. We show that Haralick features accurately classify 89.9% of the training intervals. We obtained misclassification for vesicular basaltic rocks. Hence, further image analysis tools are used to improve the classification reliability. We decompose the 2D image signal by the application of wavelet transformation in order to enhance image objects horizontally, diagonally and vertically. The resulting filtered images are used for further texture analysis. This combined classification based on Haralick features and wavelet transformation improved our classification up to a level of 98%. The application of wavelet transformation increases the consistency between standard logging profiles and texture-derived lithology. Texture analysis of borehole wall images offers the potential to facilitate objective analysis of multiple boreholes with the same lithology.
A stochastic fractional dynamics model of space-time variability of rain
NASA Astrophysics Data System (ADS)
Kundu, Prasun K.; Travis, James E.
2013-09-01
Rainfall varies in space and time in a highly irregular manner and is described naturally in terms of a stochastic process. A characteristic feature of rainfall statistics is that they depend strongly on the space-time scales over which rain data are averaged. A spectral model of precipitation has been developed based on a stochastic differential equation of fractional order for the point rain rate, which allows a concise description of the second moment statistics of rain at any prescribed space-time averaging scale. The model is thus capable of providing a unified description of the statistics of both radar and rain gauge data. The underlying dynamical equation can be expressed in terms of space-time derivatives of fractional orders that are adjusted together with other model parameters to fit the data. The form of the resulting spectrum gives the model adequate flexibility to capture the subtle interplay between the spatial and temporal scales of variability of rain but strongly constrains the predicted statistical behavior as a function of the averaging length and time scales. We test the model with radar and gauge data collected contemporaneously at the NASA TRMM ground validation sites located near Melbourne, Florida and on the Kwajalein Atoll, Marshall Islands in the tropical Pacific. We estimate the parameters by tuning them to fit the second moment statistics of radar data at the smaller spatiotemporal scales. The model predictions are then found to fit the second moment statistics of the gauge data reasonably well at these scales without any further adjustment.
The impact on midlevel vision of statistically optimal divisive normalization in V1.
Coen-Cagli, Ruben; Schwartz, Odelia
2013-07-15
The first two areas of the primate visual cortex (V1, V2) provide a paradigmatic example of hierarchical computation in the brain. However, neither the functional properties of V2 nor the interactions between the two areas are well understood. One key aspect is that the statistics of the inputs received by V2 depend on the nonlinear response properties of V1. Here, we focused on divisive normalization, a canonical nonlinear computation that is observed in many neural areas and modalities. We simulated V1 responses with (and without) different forms of surround normalization derived from statistical models of natural scenes, including canonical normalization and a statistically optimal extension that accounted for image nonhomogeneities. The statistics of the V1 population responses differed markedly across models. We then addressed how V2 receptive fields pool the responses of V1 model units with different tuning. We assumed this is achieved by learning without supervision a linear representation that removes correlations, which could be accomplished with principal component analysis. This approach revealed V2-like feature selectivity when we used the optimal normalization and, to a lesser extent, the canonical one but not in the absence of both. We compared the resulting two-stage models on two perceptual tasks; while models encompassing V1 surround normalization performed better at object recognition, only statistically optimal normalization provided systematic advantages in a task more closely matched to midlevel vision, namely figure/ground judgment. Our results suggest that experiments probing midlevel areas might benefit from using stimuli designed to engage the computations that characterize V1 optimality.
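For concreteness, canonical divisive normalization takes the familiar form

    R_i = \frac{\gamma \, E_i^{n}}{\sigma^{n} + \sum_j w_{ij} E_j^{n}},

where the E_j are the driving (e.g., linear filter) responses and the weights w_{ij} define the normalization pool. This is the textbook form, quoted for context; the statistically optimal extension studied in the paper additionally gates the surround pool according to the inferred homogeneity of the image.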
NASA Astrophysics Data System (ADS)
Sandborn, A.; Engstrom, R.; Yu, Q.
2014-12-01
Mapping urban areas via satellite imagery is an important task for detecting and anticipating land cover and land use change at multiple scales. As developing countries experience substantial urban growth and expansion, remotely sensed based estimates of population and quality of life indicators can provide timely and spatially explicit information to researchers and planners working to determine how cities are changing. In this study, we use commercial high spatial resolution satellite imagery in combination with fine resolution census data to determine the ability of using remotely sensed data to reveal the spatial patterns of quality of life in Accra, Ghana. Traditionally, spectral characteristics are used on a per-pixel basis to determine land cover; however, in this study, we test a new methodology that quantifies spatial characteristics using a variety of spatial features observed in the imagery to determine the properties of an urban area. The spatial characteristics used in this study include histograms of oriented gradients, PanTex, Fourier transform, and line support regions. These spatial features focus on extracting structural and textural patterns of built-up areas, such as homogeneous building orientations and straight line indices. Information derived from aggregating the descriptive statistics of the spatial features at both the fine-resolution census unit and the larger neighborhood level are then compared to census derived quality of life indicators including information about housing, education, and population estimates. Preliminary results indicate that there are correlations between straight line indices and census data including available electricity and literacy rates. Results from this study will be used to determine if this methodology provides a new and improved way to measure a city structure in developing cities and differentiate between residential and commercial land use zones, as well as formal versus informal housing areas.
ERIC Educational Resources Information Center
Deane, Paul; Zhang, Mo
2015-01-01
In this report, we examine the feasibility of characterizing writing performance using process features derived from a keystroke log. Using data derived from a set of "CBAL"™ writing assessments, we examine the following research questions: (a) How stable are the keystroke timing and process features across testing occasions?; (b) How…
Robust photometric invariant features from the color tensor.
van de Weijer, Joost; Gevers, Theo; Smeulders, Arnold W M
2006-01-01
Luminance-based features are widely used as low-level input for computer vision applications, even when color data is available. The extension of feature detection to the color domain prevents information loss due to isoluminance and allows us to exploit the photometric information. To fully exploit the extra information in the color data, the vector nature of color data has to be taken into account and a sound framework is needed to combine feature and photometric invariance theory. In this paper, we focus on the structure tensor, or color tensor, which adequately handles the vector nature of color images. Further, we combine the features based on the color tensor with photometric invariant derivatives to arrive at photometric invariant features. We circumvent the drawback of unstable photometric invariants by deriving an uncertainty measure to accompany the photometric invariant derivatives. The uncertainty is incorporated in the color tensor, hereby allowing the computation of robust photometric invariant features. The combination of the photometric invariance theory and tensor-based features allows for detection of a variety of features such as photometric invariant edges, corners, optical flow, and curvature. The proposed features are tested for noise characteristics and robustness to photometric changes. Experiments show that the proposed features are robust to scene incidental events and that the proposed uncertainty measure improves the applicability of full invariants.
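A minimal numerical sketch of the color structure tensor is given below: per-channel derivative products are summed over color channels and smoothed, and the tensor's eigenvalues yield edge strength. Filter choices (Sobel, Gaussian) are illustrative, and the photometric-invariant and uncertainty-weighted variants from the paper are omitted.

    # Color (structure) tensor sketch: sum per-channel derivative products,
    # smooth locally, then read edge strength from the largest eigenvalue.
    import numpy as np
    from scipy import ndimage

    img = np.random.rand(64, 64, 3)   # stand-in RGB image
    fx = np.stack([ndimage.sobel(img[..., c], axis=1) for c in range(3)], axis=-1)
    fy = np.stack([ndimage.sobel(img[..., c], axis=0) for c in range(3)], axis=-1)

    smooth = lambda a: ndimage.gaussian_filter(a, sigma=2.0)
    Jxx = smooth((fx * fx).sum(axis=2))
    Jxy = smooth((fx * fy).sum(axis=2))
    Jyy = smooth((fy * fy).sum(axis=2))

    tr = Jxx + Jyy
    det = Jxx * Jyy - Jxy ** 2
    lam1 = 0.5 * (tr + np.sqrt(np.maximum(tr ** 2 - 4.0 * det, 0.0)))  # edge strength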
A National Crop Progress Monitoring System Based on NASA Earth Science Results
NASA Astrophysics Data System (ADS)
Di, L.; Yu, G.; Zhang, B.; Deng, M.; Yang, Z.
2011-12-01
Crop progress is an important piece of information for food security and agricultural commodities. Timely monitoring and reporting are mandated for the operation of agricultural statistical agencies. Traditionally, the weekly reporting issued by the National Agricultural Statistics Service (NASS) of the United States Department of Agriculture (USDA) is based on reports from knowledgeable state and county agricultural officials and farmers. The results are spatially coarse and subjective. In this project, a remote-sensing-supported crop progress monitoring system is being developed using the data and derived products from NASA Earth Observing satellites. The Moderate Resolution Imaging Spectroradiometer (MODIS) Level 3 product MOD09 (Surface Reflectance) is used for deriving the daily normalized difference vegetation index (NDVI), vegetation condition index (VCI), and mean vegetation condition index (MVCI). Ratio changes relative to the previous year and to the multiple-year mean can also be produced on demand. The time-series vegetation condition indices are further combined with NASS's remote-sensing-derived Cropland Data Layer (CDL) to estimate crop condition and progress crop by crop. To meet operational requirements and increase the accessibility of data and products for different users, each component of the system has been developed and implemented following open specifications under the Web Service reference model of the Open Geospatial Consortium, Inc. Sensor observations and data are accessed through the Web Coverage Service (WCS), Web Feature Service (WFS), or Sensor Observation Service (SOS) if available. Products are also served through such open-specification-compliant services. For rendering and presentation, the Web Map Service (WMS) is used. A Web-service-based system is set up and deployed at dss.csiss.gmu.edu/NDVIDownload. Further development will adopt crop growth models, feed the models with remotely sensed precipitation and soil moisture information, and incorporate the model results with vegetation-index time series for crop progress stage estimation.
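The index computations at the core of the system are simple per-pixel operations; a sketch with synthetic reflectances is below (the VCI follows the standard Kogan formulation, with placeholder historical extremes).

    # NDVI and Vegetation Condition Index (VCI) from surface reflectance bands.
    import numpy as np

    red = np.random.rand(100, 100) * 0.3      # stand-in MOD09 red reflectance
    nir = np.random.rand(100, 100) * 0.6      # stand-in MOD09 NIR reflectance

    ndvi = (nir - red) / (nir + red + 1e-9)

    ndvi_min = ndvi - 0.1                     # placeholders for multi-year
    ndvi_max = ndvi + 0.1                     # historical min/max at each pixel
    vci = 100.0 * (ndvi - ndvi_min) / (ndvi_max - ndvi_min)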
NASA Astrophysics Data System (ADS)
Lesieur, Thibault; Krzakala, Florent; Zdeborová, Lenka
2017-07-01
This article is an extended version of previous work of Lesieur et al (2015 IEEE Int. Symp. on Information Theory Proc. pp 1635-9 and 2015 53rd Annual Allerton Conf. on Communication, Control and Computing (IEEE) pp 680-7) on low-rank matrix estimation in the presence of constraints on the factors into which the matrix is factorized. Low-rank matrix factorization is one of the basic methods used in data analysis for unsupervised learning of relevant features and other types of dimensionality reduction. We present a framework to study the constrained low-rank matrix estimation for a general prior on the factors, and a general output channel through which the matrix is observed. We draw a parallel with the study of vector-spin glass models—presenting a unifying way to study a number of problems considered previously in separate statistical physics works. We present a number of applications for the problem in data analysis. We derive in detail a general form of the low-rank approximate message passing (Low-RAMP) algorithm, that is known in statistical physics as the TAP equations. We thus unify the derivation of the TAP equations for models as different as the Sherrington-Kirkpatrick model, the restricted Boltzmann machine, the Hopfield model or vector (xy, Heisenberg and other) spin glasses. The state evolution of the Low-RAMP algorithm is also derived, and is equivalent to the replica symmetric solution for the large class of vector-spin glass models. In the section devoted to results, we study in detail phase diagrams and phase transitions for the Bayes-optimal inference in low-rank matrix estimation. We present a typology of phase transitions and their relation to performance of algorithms such as the Low-RAMP or commonly used spectral methods.
Kumar, Surendra; Singh, Vineet; Tiwari, Meena
2007-07-01
Selective inhibition of the ciliary-process enzyme carbonic anhydrase II (CA-II) is an excellent approach to reducing elevated intraocular pressure and thus treating glaucoma. Owing to their characteristic physicochemical properties, sulphonamides (inhibitors of carbonic anhydrase) are clinically effective against glaucoma, but the non-specificity of sulphonamide derivatives for particular isozymes leads to a range of side effects. At present, the absence of comparative studies of the binding of sulphonamides as inhibitors to CA isozymes limits their use. In this paper we present a three-dimensional quantitative structure-activity relationship study to characterize the structural features of sulfamide derivatives [RR'NSO(2)NH(2)] that are required for selective binding to the carbonic anhydrase isozymes CA-I and CA-II. In the analysis, stepwise multiple linear regression was performed using physicochemical parameters as independent variables and CA-I and CA-II inhibitory activities as dependent variables. The best multiparametric QSAR model obtained for CA-I inhibitory activity shows good statistical significance (r = 0.9714) and predictability (Q(2) = 0.8921), involving electronic descriptors (highest occupied molecular orbital, lowest unoccupied molecular orbital) and a steric descriptor (principal moment of inertia about the X axis). Similarly, the CA-II inhibitory activity model shows good statistical significance (r = 0.9644) and predictability (Q(2) = 0.8699), involving the aforementioned descriptors. The predictive power of the models was successfully tested externally using a set of six compounds as the test set for CA-I inhibitory activity and a set of seven compounds for CA-II inhibitory activity, with good predictive squared correlation coefficients, r(2)(pred) = 0.6016 and 0.7662, respectively. Overall, the analysis favours substituents with high electronegativity and low bulk at the R and R' positions of the parent nucleus, providing a basis for designing new sulfamide derivatives possessing potent and selective carbonic anhydrase II inhibitory activity.
NASA Astrophysics Data System (ADS)
Kharkar, Prashant S.; Reith, Maarten E. A.; Dutta, Aloke K.
2008-01-01
Three-dimensional quantitative structure-activity relationship (3D QSAR) using comparative molecular field analysis (CoMFA) was performed on a series of substituted tetrahydropyran (THP) derivatives possessing serotonin (SERT) and norepinephrine (NET) transporter inhibitory activities. The study aimed to rationalize the potency of these inhibitors for SERT and NET as well as the observed selectivity differences for NET over SERT. The dataset consisted of 29 molecules, of which 23 molecules were used as the training set for deriving CoMFA models for SERT and NET uptake inhibitory activities. Superimpositions were performed using atom-based fitting and 3-point pharmacophore-based alignment. Two charge calculation methods, Gasteiger-Hückel and semiempirical PM3, were tried. Both alignment methods were analyzed in terms of their predictive abilities and produced comparable results with high internal and external predictivities. The models obtained using the 3-point pharmacophore-based alignment outperformed the models with atom-based fitting in terms of relevant statistics and interpretability of the generated contour maps. Steric fields dominated electrostatic fields in terms of contribution. The selectivity analysis (NET over SERT), though it yielded models with good internal predictivity, showed very poor external test set predictions. The analysis was repeated with 24 molecules after systematically excluding so-called outliers (5 out of 29) from the model derivation process. The resulting CoMFA model using the atom-based fitting exhibited good statistics and was able to explain most of the selectivity (NET over SERT)-discriminating factors. The presence of an -OH substituent on the THP ring was found to be one of the most important factors governing the NET selectivity over SERT. Thus, a 4-point NET-selective pharmacophore, after introducing this newly found H-bond donor/acceptor feature in addition to the initial 3-point pharmacophore, was proposed.
Important LiDAR metrics for discriminating forest tree species in Central Europe
NASA Astrophysics Data System (ADS)
Shi, Yifang; Wang, Tiejun; Skidmore, Andrew K.; Heurich, Marco
2018-03-01
Numerous airborne LiDAR-derived metrics have been proposed for classifying tree species. Yet an in-depth ecological and biological understanding of the significance of these metrics for tree species mapping remains largely unexplored. In this paper, we evaluated the performance of 37 frequently used LiDAR metrics derived under leaf-on and leaf-off conditions, respectively, for discriminating six different tree species in a natural forest in Germany. We firstly assessed the correlation between these metrics. Then we applied a Random Forest algorithm to classify the tree species and evaluated the importance of the LiDAR metrics. Finally, we identified the most important LiDAR metrics and tested their robustness and transferability. Our results indicated that about 60% of LiDAR metrics were highly correlated to each other (|r| > 0.7). There was no statistically significant difference in tree species mapping accuracy between the use of leaf-on and leaf-off LiDAR metrics. However, combining leaf-on and leaf-off LiDAR metrics significantly increased the overall accuracy from 58.2% (leaf-on) and 62.0% (leaf-off) to 66.5% as well as the kappa coefficient from 0.47 (leaf-on) and 0.51 (leaf-off) to 0.58. Radiometric features, especially intensity related metrics, provided more consistent and significant contributions than geometric features for tree species discrimination. Specifically, the mean intensity of first-or-single returns as well as the mean value of echo width were identified as the most robust LiDAR metrics for tree species discrimination. These results indicate that metrics derived from airborne LiDAR data, especially radiometric metrics, can aid in discriminating tree species in a mixed temperate forest, and represent candidate metrics for tree species classification and monitoring in Central Europe.
Observational limitations of Bose-Einstein photon statistics and radiation noise in thermal emission
NASA Astrophysics Data System (ADS)
Lee, Y.-J.; Talghader, J. J.
2018-01-01
For many decades, theory has predicted that Bose-Einstein statistics are a fundamental feature of thermal emission into one or a few optical modes; however, the resulting Bose-Einstein-like photon noise has never been experimentally observed. There are at least two reasons for this: (1) Relationships to describe the thermal radiation noise for an arbitrary mode structure have yet to be set forth, and (2) the mode and detector constraints necessary for the detection of such light are extremely hard to fulfill. Herein, photon statistics and radiation noise relationships are developed for systems with any number of modes and couplings to an observing space. The results are shown to reproduce existing special cases of thermal emission and are then applied to resonator systems to discuss physically realizable conditions under which Bose-Einstein-like thermal statistics might be observed. Examples include a single isolated cavity and an emitter cavity coupled to a small detector space. Low-mode-number noise theory shows major deviations from solely Bose-Einstein or Poisson treatments and has particular significance because of recent advances in perfect absorption and subwavelength structures both in the long-wave infrared and terahertz regimes. These microresonator devices tend to utilize a small volume with few modes, a regime where the current theory of thermal emission fluctuations and background noise, which was developed decades ago for free-space or single-mode cavities, has no derived solutions.
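For reference, single-mode thermal (Bose-Einstein) photon statistics take the standard form

    p(n) = \frac{\bar{n}^{n}}{(1 + \bar{n})^{n+1}}, \qquad
    \bar{n} = \frac{1}{e^{\hbar\omega/k_B T} - 1}, \qquad
    \mathrm{Var}(n) = \bar{n}(1 + \bar{n}),

so the excess term \bar{n}^2 over the Poisson variance \bar{n} is what distinguishes few-mode thermal light from the Poisson-like limit recovered when many weakly occupied modes are summed. These are textbook relations quoted for context; the paper generalizes such relationships to arbitrary mode structures and couplings.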
SEGMENTING CT PROSTATE IMAGES USING POPULATION AND PATIENT-SPECIFIC STATISTICS FOR RADIOTHERAPY.
Feng, Qianjin; Foskey, Mark; Tang, Songyuan; Chen, Wufan; Shen, Dinggang
2009-08-07
This paper presents a new deformable model using both population and patient-specific statistics to segment the prostate from CT images. There are two novelties in the proposed method. First, a modified scale invariant feature transform (SIFT) local descriptor, which is more distinctive than general intensity and gradient features, is used to characterize the image features. Second, an online training approach is used to build the shape statistics for accurately capturing intra-patient variation, which is more important than inter-patient variation for prostate segmentation in clinical radiotherapy. Experimental results show that the proposed method is robust and accurate, suitable for clinical application.
SEGMENTING CT PROSTATE IMAGES USING POPULATION AND PATIENT-SPECIFIC STATISTICS FOR RADIOTHERAPY
Feng, Qianjin; Foskey, Mark; Tang, Songyuan; Chen, Wufan; Shen, Dinggang
2010-01-01
This paper presents a new deformable model using both population and patient-specific statistics to segment the prostate from CT images. There are two novelties in the proposed method. First, a modified scale invariant feature transform (SIFT) local descriptor, which is more distinctive than general intensity and gradient features, is used to characterize the image features. Second, an online training approach is used to build the shape statistics for accurately capturing intra-patient variation, which is more important than inter-patient variation for prostate segmentation in clinical radiotherapy. Experimental results show that the proposed method is robust and accurate, suitable for clinical application. PMID:21197416
Das, Dev Kumar; Ghosh, Madhumala; Pal, Mallika; Maiti, Asok K; Chakraborty, Chandan
2013-02-01
The aim of this paper is to address the development of computer-assisted malaria parasite characterization and classification using a machine learning approach based on light microscopic images of peripheral blood smears. In doing so, microscopic image acquisition from stained slides, illumination correction and noise reduction, erythrocyte segmentation, feature extraction, feature selection, and finally classification of different stages of malaria (Plasmodium vivax and Plasmodium falciparum) have been investigated. The erythrocytes are segmented using marker-controlled watershed transformation, and subsequently a total of ninety-six features describing the shape-size and texture of erythrocytes are extracted for parasitemia-infected versus non-infected cells. Ninety-four features are found to be statistically significant in discriminating the six classes. A feature selection-cum-classification scheme has been devised by combining the F-statistic with statistical learning techniques, i.e., Bayesian learning and the support vector machine (SVM), in order to provide higher classification accuracy using the best set of discriminating features. Results show that the Bayesian approach provides the highest accuracy, i.e., 84%, for malaria classification by selecting the 19 most significant features, while SVM provides its highest accuracy, i.e., 83.5%, with the 9 most significant features. Finally, the performance of these two classifiers under the feature selection framework has been compared for malaria parasite classification.
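The selection-cum-classification scheme can be mirrored with standard tools as below, using the ANOVA F-statistic to rank features and cross-validating Bayesian and SVM classifiers on the retained subsets; the data here are synthetic, and the subset sizes (19 and 9) simply echo the paper's findings.

    # F-statistic feature ranking followed by Bayesian and SVM classification,
    # on synthetic six-class data standing in for the erythrocyte features.
    from sklearn.datasets import make_classification
    from sklearn.feature_selection import SelectKBest, f_classif
    from sklearn.model_selection import cross_val_score
    from sklearn.naive_bayes import GaussianNB
    from sklearn.pipeline import make_pipeline
    from sklearn.svm import SVC

    X, y = make_classification(n_samples=300, n_features=96, n_informative=20,
                               n_classes=6, n_clusters_per_class=1, random_state=0)

    for k, clf in [(19, GaussianNB()), (9, SVC(kernel="rbf"))]:
        pipe = make_pipeline(SelectKBest(f_classif, k=k), clf)
        acc = cross_val_score(pipe, X, y, cv=5).mean()
        print(k, type(clf).__name__, round(acc, 3))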
a Statistical Texture Feature for Building Collapse Information Extraction of SAR Image
NASA Astrophysics Data System (ADS)
Li, L.; Yang, H.; Chen, Q.; Liu, X.
2018-04-01
Synthetic Aperture Radar (SAR) has become one of the most important ways to extract post-disaster collapsed-building information, due to its versatility and almost all-weather, day-and-night working capability. Given that the inherent statistical distribution of speckle in SAR images is typically not used to extract collapsed-building information, this paper proposes a novel texture feature based on statistical models of SAR images to extract collapsed buildings. In the proposed feature, the texture parameter of the G0 distribution of SAR images is used to reflect the uniformity of the target and thereby extract collapsed buildings. This feature not only considers the statistical distribution of SAR images, providing a more accurate description of object texture, but can also be applied to extract collapsed-building information from single-, dual- or full-polarization SAR data. RADARSAT-2 data of the Yushu earthquake, acquired on April 21, 2010, are used to present and analyze the performance of the proposed method. In addition, the applicability of this feature to SAR data with different polarizations is also analysed, which provides decision support for data selection in collapsed-building information extraction.
Mukherjee, Subhendu; Nagar, Shuchi; Mullick, Sanchita; Mukherjee, Arup; Saha, Achintya
2008-01-01
Considering the worth of developing non-steroidal estrogen analogs, the present study explores the pharmacophore features of arylbenzothiophene derivatives for inhibitory activity against MCF-7 cells using classical QSAR and 3D space modeling approaches. The analysis shows that the presence of a phenolic hydroxyl group and a ketonic linkage in the basic side chain of the 2-arylbenzothiophene core of raloxifene derivatives is crucial. Additionally, a piperidine ring connected through an ether linkage is favorable for inhibition of the breast cancer cell line. These features for inhibitory activity are also highlighted through the 3D space modeling approach, which revealed the importance of critical inter-feature distances among the HB-acceptor lipid, hydrophobic and HB-donor features in the arylbenzothiophene scaffold for activity.
Decision rules for unbiased inventory estimates
NASA Technical Reports Server (NTRS)
Argentiero, P. D.; Koch, D.
1979-01-01
An efficient and accurate procedure for estimating inventories from remote sensing scenes is presented. In place of the conventional and expensive full dimensional Bayes decision rule, a one-dimensional feature extraction and classification technique was employed. It is shown that this efficient decision rule can be used to develop unbiased inventory estimates and that for large sample sizes typical of satellite derived remote sensing scenes, resulting accuracies are comparable or superior to more expensive alternative procedures. Mathematical details of the procedure are provided in the body of the report and in the appendix. Results of a numerical simulation of the technique using statistics obtained from an observed LANDSAT scene are included. The simulation demonstrates the effectiveness of the technique in computing accurate inventory estimates.
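The report's mathematical details are not reproduced in the abstract; one standard construction consistent with its goal corrects the bias of raw classified counts by inverting the classifier's confusion matrix. A sketch with hypothetical numbers:

```python
import numpy as np

# Confusion matrix P[i, j] = Pr(classified as j | true class i),
# estimated from labeled training fields (hypothetical values)
P = np.array([[0.90, 0.10],
              [0.20, 0.80]])

c = np.array([0.46, 0.54])     # classified proportions observed in the scene

# Raw classified proportions are biased: c = P^T t. Solving the linear
# system yields an (asymptotically) unbiased inventory estimate t_hat.
t_hat = np.linalg.solve(P.T, c)
print(t_hat)
```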
Spectrophotometry of comets Giacobini-Zinner and Halley
NASA Technical Reports Server (NTRS)
Tegler, Stephen C.; O'Dell, C. R.
1987-01-01
Optical window spectrophotometry was performed on comets Giacobini-Zinner and Halley over the interval 300-1000 nm. Band and band-sequence fluxes were obtained for the brightest features of OH, CN, NH, and C2, special care having been given to determinations of extinction, instrumental sensitivities, and corrections for Fraunhofer lines. C2 Swan band-sequence flux ratios were determined with unprecedented accuracy and compared with the predictions of the detailed equilibrium models of Krishna Swamy et al. (1977, 1979, 1981, and 1987). It is found that these band sequences do not agree with the predictions, which calls into question the assumptions made in deriving the model, namely resonance fluorescence statistical equilibrium. Suggestions are made as to how to resolve this discrepancy.
NASA Astrophysics Data System (ADS)
Attallah, Bilal; Serir, Amina; Chahir, Youssef; Boudjelal, Abdelwahhab
2017-11-01
Palmprint recognition systems are dependent on feature extraction. A method of feature extraction using higher discrimination information was developed to characterize palmprint images. In this method, two individual feature extraction techniques are applied to a discrete wavelet transform of a palmprint image, and their outputs are fused. The two techniques used in the fusion are the histogram of gradients and the binarized statistical image features. The fused features are then evaluated using an extreme learning machine classifier, following feature selection based on principal component analysis. Three palmprint databases, the Hong Kong Polytechnic University (PolyU) Multispectral Palmprint Database, Hong Kong PolyU Palmprint Database II, and the Delhi Touchless (IIDT) Palmprint Database, are used in this study. The study shows that our method effectively identifies and verifies palmprints and outperforms other methods based on feature extraction.
NASA Astrophysics Data System (ADS)
Shi, Bibo; Grimm, Lars J.; Mazurowski, Maciej A.; Marks, Jeffrey R.; King, Lorraine M.; Maley, Carlo C.; Hwang, E. Shelley; Lo, Joseph Y.
2017-03-01
Reducing the overdiagnosis and overtreatment associated with ductal carcinoma in situ (DCIS) requires accurate prediction of invasive potential at cancer screening. In this work, we investigated the utility of pre-operative histologic and mammographic features to predict upstaging of DCIS. The goal was to provide an intentionally conservative baseline performance using readily available data from radiologists and pathologists and only linear models. We conducted a retrospective analysis on 99 patients with DCIS, of whom 25 were upstaged to invasive cancer at the time of definitive surgery. Pre-operative factors, including both histologic features extracted from stereotactic core needle biopsy (SCNB) reports and mammographic features annotated by an expert breast radiologist, were investigated with statistical analysis. Furthermore, we built classification models based on those features in an attempt to predict the presence of an occult invasive component in DCIS, with generalization performance assessed by receiver operating characteristic (ROC) curve analysis. Histologic features including nuclear grade and DCIS subtype did not show statistically significant differences between cases with pure DCIS and with DCIS plus invasive disease. However, three mammographic features, i.e., the major axis length of the DCIS lesion, the BI-RADS level of suspicion, and the radiologist's assessment, did achieve statistical significance. Using those three statistically significant features as input, a linear discriminant model was able to distinguish patients with DCIS plus invasive disease from those with pure DCIS, with an AUC-ROC of 0.62. Overall, mammograms used for breast screening contain useful information that can be perceived by radiologists and can help predict occult invasive components in DCIS.
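A minimal sketch of the linear-discriminant-plus-ROC analysis described above, using synthetic stand-ins for the three significant mammographic features (not the study's data):

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)
n = 99                                        # cohort size from the abstract
X = np.column_stack([
    rng.normal(size=n),                       # stand-in: DCIS lesion major axis length
    rng.integers(3, 6, size=n),               # stand-in: BI-RADS level of suspicion
    rng.integers(1, 4, size=n),               # stand-in: radiologist's assessment
]).astype(float)
y = (rng.random(n) < 25 / 99).astype(int)     # 1 = upstaged to invasive disease

scores = cross_val_predict(LinearDiscriminantAnalysis(), X, y,
                           cv=5, method="decision_function")
print("cross-validated AUC:", roc_auc_score(y, scores))
```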
NASA Astrophysics Data System (ADS)
Carini, R.; Brocato, E.; Raimondo, G.; Marconi, M.
2017-08-01
This work analyses the effect of the helium content on synthetic period-luminosity relations (PLRs) and period-Wesenheit relations (PWRs) of Cepheids, and the systematic uncertainties that a hidden population of He-enhanced Cepheids may introduce into derived distances. We use new stellar and pulsation models to build a homogeneous and consistent framework for deriving Cepheid features. The Cepheid populations expected in synthetic colour-magnitude diagrams of young stellar systems (from 20 to 250 Myr) are computed in several photometric bands for Y = 0.25 and 0.35, at a fixed metallicity (Z = 0.008). The PLRs appear to be very similar in the two cases, with negligible effects (a few per cent) on distances, while the PWRs differ somewhat, with systematic uncertainties in derived distances as high as ˜ 7 per cent at log P < 1.5. Statistical effects due to the number of variables used to determine the relations contribute a distance systematic error of the order of a few per cent, with values decreasing from optical to near-infrared bands. The empirical PWRs derived from multiwavelength data sets for the Large Magellanic Cloud (LMC) are in very good agreement with our theoretical PWRs obtained with a standard He content, supporting the evidence that LMC Cepheids do not show any He effect.
NASA Technical Reports Server (NTRS)
Gramenopoulos, N. (Principal Investigator)
1973-01-01
The author has identified the following significant results. For the recognition of terrain types, spatial signatures are developed from the diffraction patterns of small areas of ERTS-1 images. This knowledge is exploited for the measurement of a small number of meaningful spatial features from the digital Fourier transforms of ERTS-1 image cells containing 32 x 32 picture elements. Using these spatial features and a heuristic algorithm, the terrain types in the vicinity of Phoenix, Arizona were recognized by the computer with high accuracy. When the spatial features were combined with spectral features under the maximum likelihood criterion, the recognition accuracy of terrain types increased substantially. It was determined that the recognition accuracy with the maximum likelihood criterion depends on the statistics of the feature vectors. Nonlinear transformations of the feature vectors are required so that the terrain class statistics become approximately Gaussian. It was also determined that, for a given geographic area, the statistics of the classes remain stable for a period of a month but vary substantially between seasons.
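As a rough illustration of Fourier-domain spatial features of this kind, the power spectrum of a 32 x 32 cell can be integrated over rings (radial frequency) and wedges (orientation); the specific bins below are assumptions, not the report's values:

```python
import numpy as np

def spatial_features(cell):
    """Ring- and wedge-integrated Fourier power of a 32 x 32 image cell,
    summarizing its diffraction-pattern texture."""
    power = np.abs(np.fft.fftshift(np.fft.fft2(cell))) ** 2
    yy, xx = np.indices(cell.shape)
    cy, cx = (np.array(cell.shape) - 1) / 2
    r = np.hypot(yy - cy, xx - cx)            # radial spatial frequency
    theta = np.arctan2(yy - cy, xx - cx)      # orientation
    rings = [power[(r >= a) & (r < b)].sum() for a, b in [(0, 4), (4, 8), (8, 16)]]
    wedges = [power[(theta >= a) & (theta < b)].sum()
              for a, b in [(-np.pi, -np.pi / 2), (-np.pi / 2, 0),
                           (0, np.pi / 2), (np.pi / 2, np.pi)]]
    return np.array(rings + wedges)

print(spatial_features(np.random.rand(32, 32)))
```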
Taylor, Kirsten I.; Devereux, Barry J.; Acres, Kadia; Randall, Billi; Tyler, Lorraine K.
2013-01-01
Conceptual representations are at the heart of our mental lives, involved in every aspect of cognitive functioning. Despite their centrality, a long-standing debate persists as to how the meanings of concepts are represented and processed. Many accounts agree that the meanings of concrete concepts are represented by their individual features, but disagree about the importance of different feature-based variables: some views stress the importance of the information carried by distinctive features in conceptual processing, others the features which are shared over many concepts, and still others the extent to which features co-occur. We suggest that previously disparate theoretical positions and experimental findings can be unified by an account which claims that task demands determine how concepts are processed in addition to the effects of feature distinctiveness and co-occurrence. We tested these predictions in a basic-level naming task which relies on distinctive feature information (Experiment 1) and a domain decision task which relies on shared feature information (Experiment 2). Both used large-scale regression designs with the same visual objects, and mixed-effects models incorporating participant, session, stimulus-related and feature statistic variables to model the performance. We found that concepts with relatively more distinctive and more highly correlated distinctive relative to shared features facilitated basic-level naming latencies, while concepts with relatively more shared and more highly correlated shared relative to distinctive features speeded domain decisions. These findings demonstrate that the feature statistics of distinctiveness (shared vs. distinctive) and correlational strength, as well as the task demands, determine how concept meaning is processed in the conceptual system. PMID:22137770
NASA Astrophysics Data System (ADS)
Chen, K.; Weinmann, M.; Gao, X.; Yan, M.; Hinz, S.; Jutzi, B.; Weinmann, M.
2018-05-01
In this paper, we address the deep semantic segmentation of aerial imagery based on multi-modal data. Given multi-modal data composed of true orthophotos and the corresponding Digital Surface Models (DSMs), we extract a variety of hand-crafted radiometric and geometric features which are provided separately and in different combinations as input to a modern deep learning framework. The latter is represented by a Residual Shuffling Convolutional Neural Network (RSCNN) combining the characteristics of a Residual Network with the advantages of atrous convolution and a shuffling operator to achieve a dense semantic labeling. Via performance evaluation on a benchmark dataset, we analyze the value of different feature sets for the semantic segmentation task. The derived results reveal that the use of radiometric features yields better classification results than the use of geometric features for the considered dataset. Furthermore, the consideration of data on both modalities leads to an improvement of the classification results. However, the derived results also indicate that the use of all defined features is less favorable than the use of selected features. Consequently, data representations derived via feature extraction and feature selection techniques still provide a gain if used as the basis for deep semantic segmentation.
Improved classification accuracy by feature extraction using genetic algorithms
NASA Astrophysics Data System (ADS)
Patriarche, Julia; Manduca, Armando; Erickson, Bradley J.
2003-05-01
A feature extraction algorithm has been developed for the purpose of improving classification accuracy. The algorithm uses a genetic algorithm / hill-climber hybrid to generate a set of linearly recombined features, which may be of reduced dimensionality compared with the original set. The genetic algorithm performs the global exploration, and a hill climber explores local neighborhoods. Hybridizing the genetic algorithm with a hill climber improves both the rate of convergence and the final overall cost function value; it also reduces the sensitivity of the genetic algorithm to parameter selection. The genetic algorithm includes the operators crossover, mutation, and deletion / reactivation - the last of these effects dimensionality reduction. The feature extractor is supervised and is capable of deriving a separate feature space for each tissue (these are reintegrated during classification). A non-anatomical digital phantom was developed as a gold standard for testing purposes. In tests with the phantom and with images of multiple sclerosis patients, classification with feature-extractor-derived features yielded lower error rates than classification using standard pulse sequences or features derived using principal components analysis. Using the multiple sclerosis patient data, the algorithm produced a mean 31% reduction in classification error for pure tissues.
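A compact sketch of the hybrid idea, assuming a nearest-centroid error as the cost function and toy data; the operator rates and the deletion scheme are illustrative, not the paper's settings:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 8))
y = rng.integers(0, 3, size=300)
X[y == 1, :2] += 2.0                       # give the classes some linear structure
X[y == 2, 2:4] -= 2.0

def cost(W):
    """Nearest-centroid training error in the recombined feature space."""
    Z = X @ W.T
    cents = np.array([Z[y == c].mean(0) for c in range(3)])
    pred = np.argmin(((Z[:, None, :] - cents) ** 2).sum(-1), axis=1)
    return (pred != y).mean()

pop = [rng.normal(size=(3, 8)) for _ in range(20)]     # 3 derived features from 8 inputs
for gen in range(30):
    pop.sort(key=cost)
    elite, children = pop[:10], []
    for _ in range(10):
        a, b = rng.choice(10, size=2, replace=False)
        child = np.where(rng.random((3, 8)) < 0.5, elite[a], elite[b])   # crossover
        child = child + rng.normal(scale=0.1, size=(3, 8)) * (rng.random((3, 8)) < 0.2)  # mutation
        if rng.random() < 0.05:
            child[rng.integers(3)] = 0.0   # deletion: zero out a derived feature
        children.append(child)
    pop = elite + children

best = min(pop, key=cost)
for _ in range(100):                        # hill-climber refinement of the GA solution
    trial = best + rng.normal(scale=0.02, size=best.shape)
    if cost(trial) < cost(best):
        best = trial
print("final training error:", cost(best))
```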
High Dimensional Classification Using Features Annealed Independence Rules.
Fan, Jianqing; Fan, Yingying
2008-01-01
Classification using high-dimensional features arises frequently in many contemporary statistical studies, such as tumor classification using microarray or other high-throughput data. The impact of dimensionality on classification is poorly understood. In a seminal paper, Bickel and Levina (2004) show that the Fisher discriminant performs poorly due to diverging spectra, and they propose using the independence rule to overcome the problem. We first demonstrate that even for the independence classification rule, classification using all the features can be as bad as random guessing due to noise accumulation in estimating population centroids in high-dimensional feature space. In fact, we demonstrate further that almost all linear discriminants can perform as badly as random guessing. Thus, it is critically important to select a subset of important features for high-dimensional classification, resulting in Features Annealed Independence Rules (FAIR). The conditions under which all the important features can be selected by the two-sample t-statistic are established. The choice of the optimal number of features, or equivalently, the threshold value of the test statistic, is proposed based on an upper bound of the classification error. Simulation studies and real data analysis support our theoretical results and demonstrate convincingly the advantage of our new classification procedure.
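The FAIR construction is concrete enough to sketch directly: rank features by two-sample t-statistics, keep the top m, and classify with the diagonal independence rule (synthetic data; the value of m is assumed rather than chosen by the paper's error bound):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 5000
y = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, p))
X[y == 1, :20] += 0.8                        # only 20 features are truly informative

# Two-sample t-statistic for every feature
n1, n0 = (y == 1).sum(), (y == 0).sum()
m1, m0 = X[y == 1].mean(0), X[y == 0].mean(0)
v1, v0 = X[y == 1].var(0, ddof=1), X[y == 0].var(0, ddof=1)
t = (m1 - m0) / np.sqrt(v1 / n1 + v0 / n0)

# FAIR: keep the m features with the largest |t|, then apply the
# diagonal (independence-rule) discriminant on that subset
m = 20
keep = np.argsort(-np.abs(t))[:m]
w = (m1 - m0)[keep] / X.var(0, ddof=1)[keep]
scores = (X[:, keep] - 0.5 * (m1 + m0)[keep]) @ w
print("training error:", ((scores > 0).astype(int) != y).mean())
```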
NASA Astrophysics Data System (ADS)
Goto, Shin-itiro; Umeno, Ken
2018-03-01
Maps on a parameter space for expressing distribution functions are exactly derived from the Perron-Frobenius equations for a generalized Boole transform family. Here the generalized Boole transform family is a one-parameter family of maps, where it is defined on a subset of the real line and its probability distribution function is the Cauchy distribution with some parameters. With this reduction, some relations between the statistical picture and the orbital one are shown. From the viewpoint of information geometry, the parameter space can be identified with a statistical manifold, and then it is shown that the derived maps can be characterized. Also, with an induced symplectic structure from a statistical structure, symplectic and information geometric aspects of the derived maps are discussed.
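A numerical check in the spirit of the paper, assuming the common parameterization of the generalized Boole transform T(x) = α(x − 1/x) with 0 < α < 1, whose invariant law is a Cauchy distribution with scale √(α/(1−α)):

```python
import numpy as np

alpha = 0.7                                   # family parameter, assumed in (0, 1)
gamma = np.sqrt(alpha / (1 - alpha))          # scale of the invariant Cauchy law

rng = np.random.default_rng(0)
u = rng.random(200000)
x = gamma * np.tan(np.pi * (u - 0.5))         # sample from Cauchy(0, gamma)

for _ in range(50):                           # iterate the generalized Boole map
    x = alpha * (x - 1.0 / x)

# Quartiles of Cauchy(0, gamma) are +/- gamma; invariance keeps them fixed
print(np.percentile(x, [25, 75]), "expected:", (-gamma, gamma))
```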
Menzerath-Altmann Law: Statistical Mechanical Interpretation as Applied to a Linguistic Organization
NASA Astrophysics Data System (ADS)
Eroglu, Sertac
2014-10-01
The distribution behavior described by the empirical Menzerath-Altmann law is frequently encountered during the self-organization of linguistic and non-linguistic natural organizations at various structural levels. This study presents a statistical mechanical derivation of the law based on the analogy between the classical particles of a statistical mechanical organization and the distinct words of a textual organization. The derived model, a transformed (generalized) form of the Menzerath-Altmann model, is termed the statistical mechanical Menzerath-Altmann model. The derived model allows the model parameters to be interpreted in terms of physical concepts. We also propose that many organizations exhibiting Menzerath-Altmann law behavior, whether linguistic or not, can be methodically examined with the transformed distribution model through a properly defined structure-dependent parameter and the energy-associated states.
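For reference, the classical Menzerath-Altmann form is y(x) = a x^b e^(−cx), relating constituent size y to construct size x; a minimal fit of that form (not the paper's transformed model) to hypothetical data:

```python
import numpy as np
from scipy.optimize import curve_fit

def menzerath_altmann(x, a, b, c):
    """Classical Menzerath-Altmann form: constituent size y vs. construct size x."""
    return a * x**b * np.exp(-c * x)

# hypothetical data: mean syllable duration vs. word length in syllables
rng = np.random.default_rng(2)
x = np.arange(1.0, 9.0)
y = menzerath_altmann(x, 4.2, -0.35, 0.05) * (1 + 0.03 * rng.normal(size=x.size))

params, _ = curve_fit(menzerath_altmann, x, y, p0=(1.0, -0.5, 0.1))
print("a, b, c =", params)
```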
Thermodynamics and statistical mechanics. [thermodynamic properties of gases
NASA Technical Reports Server (NTRS)
1976-01-01
The basic thermodynamic properties of gases are reviewed and the relations between them are derived from the first and second laws. The elements of statistical mechanics are then formulated and the partition function is derived. The classical form of the partition function is used to obtain the Maxwell-Boltzmann distribution of kinetic energies in the gas phase and the equipartition of energy theorem is given in its most general form. The thermodynamic properties are all derived as functions of the partition function. Quantum statistics are reviewed briefly and the differences between the Boltzmann distribution function for classical particles and the Fermi-Dirac and Bose-Einstein distributions for quantum particles are discussed.
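In canonical-ensemble notation, the standard relations that express the thermodynamic properties as functions of the partition function Z read:

```latex
F = -kT\ln Z, \qquad
U = kT^2\left(\frac{\partial \ln Z}{\partial T}\right)_V, \qquad
S = \frac{U-F}{T}, \qquad
p = kT\left(\frac{\partial \ln Z}{\partial V}\right)_T
```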
Zhang, Xuetao; Huang, Jie; Yigit-Elliott, Serap; Rosenholtz, Ruth
2015-03-16
Observers can quickly search among shaded cubes for one lit from a unique direction. However, replace the cubes with similar 2-D patterns that do not appear to have a 3-D shape, and search difficulty increases. These results have challenged models of visual search and attention. We demonstrate that cube search displays differ from those with "equivalent" 2-D search items in terms of the informativeness of fairly low-level image statistics. This informativeness predicts peripheral discriminability of target-present from target-absent patches, which in turn predicts visual search performance, across a wide range of conditions. Comparing model performance on a number of classic search tasks, cube search does not appear unexpectedly easy. Easy cube search, per se, does not provide evidence for preattentive computation of 3-D scene properties. However, search asymmetries derived from rotating and/or flipping the cube search displays cannot be explained by the information in our current set of image statistics. This may merely suggest a need to modify the model's set of 2-D image statistics. Alternatively, it may be difficult cube search that provides evidence for preattentive computation of 3-D scene properties. By attributing 2-D luminance variations to a shaded 3-D shape, 3-D scene understanding may slow search for 2-D features of the target. © 2015 ARVO.
NASA Astrophysics Data System (ADS)
Noda, Isao
2014-07-01
A comprehensive survey review of new and noteworthy developments that advanced the frontiers of the field of 2D correlation spectroscopy during the last four years is compiled. This review covers books, proceedings, and review articles published on 2D correlation spectroscopy; a number of significant conceptual developments in the field; data pretreatment methods and other pertinent topics; as well as patent and publication trends and citation activities. Developments discussed include projection 2D correlation analysis, concatenated 2D correlation, and correlation under multiple perturbation effects, as well as orthogonal sample design, prediction of 2D correlation spectra, manipulation and comparison of 2D spectra, correlation strategies based on segmented data blocks such as moving-window analysis, features like determination of sequential order and enhanced spectral resolution, statistical 2D spectroscopy using covariance and other statistical metrics, hetero-correlation analysis, and the sample-sample correlation technique. Data pretreatment operations prior to 2D correlation analysis are discussed, including correction for physical effects, background and baseline subtraction, selection of the reference spectrum, normalization and scaling of data, derivative spectra and deconvolution techniques, and smoothing and noise reduction. Other pertinent topics include chemometrics and statistical considerations, peak position shift phenomena, variable sampling increments, computation and software, and display schemes, such as color-coded formats, slice and power spectra, tabulation, and other schemes.
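The synchronous/asynchronous correlation pair at the core of many of these methods can be sketched with the standard Hilbert-Noda construction:

```python
import numpy as np

def two_d_correlation(Y):
    """Synchronous and asynchronous 2D correlation spectra from a
    perturbation-ordered data matrix Y (m spectra x n spectral channels)."""
    m = Y.shape[0]
    Yd = Y - Y.mean(axis=0)                   # dynamic spectra
    sync = Yd.T @ Yd / (m - 1)
    j, k = np.indices((m, m))
    with np.errstate(divide="ignore"):
        noda = np.where(j == k, 0.0, 1.0 / (np.pi * (k - j)))   # Hilbert-Noda matrix
    asyn = Yd.T @ noda @ Yd / (m - 1)
    return sync, asyn

sync, asyn = two_d_correlation(np.random.rand(15, 200))
print(sync.shape, asyn.shape)                 # both (200, 200)
```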
Kim, Won Hwa; Singh, Vikas; Chung, Moo K.; Hinrichs, Chris; Pachauri, Deepti; Okonkwo, Ozioma C.; Johnson, Sterling C.
2014-01-01
Statistical analysis on arbitrary surface meshes such as the cortical surface is an important approach to understanding brain diseases such as Alzheimer’s disease (AD). Surface analysis may be able to identify specific cortical patterns that relate to certain disease characteristics or exhibit differences between groups. Our goal in this paper is to make group analysis of signals on surfaces more sensitive. To do this, we derive multi-scale shape descriptors that characterize the signal around each mesh vertex, i.e., its local context, at varying levels of resolution. In order to define such a shape descriptor, we make use of recent results from harmonic analysis that extend traditional continuous wavelet theory from the Euclidean to a non-Euclidean setting (i.e., a graph, mesh or network). Using this descriptor, we conduct experiments on two different datasets, the Alzheimer’s Disease NeuroImaging Initiative (ADNI) data and images acquired at the Wisconsin Alzheimer’s Disease Research Center (W-ADRC), focusing on individuals labeled as having Alzheimer’s disease (AD), mild cognitive impairment (MCI) and healthy controls. In particular, we contrast traditional univariate methods with our multi-resolution approach, which shows increased sensitivity and improved statistical power to detect group-level effects. We also provide an open source implementation. PMID:24614060
Fundamental Statistical Descriptions of Plasma Turbulence in Magnetic Fields
DOE Office of Scientific and Technical Information (OSTI.GOV)
John A. Krommes
2001-02-16
A pedagogical review of the historical development and current status (as of early 2000) of systematic statistical theories of plasma turbulence is undertaken. Emphasis is on conceptual foundations and methodology, not practical applications. Particular attention is paid to equations and formalism appropriate to strongly magnetized, fully ionized plasmas. Extensive reference to the literature on neutral-fluid turbulence is made, but the unique properties and problems of plasmas are emphasized throughout. Discussions are given of quasilinear theory, weak-turbulence theory, resonance-broadening theory, and the clump algorithm. Those are developed independently, then shown to be special cases of the direct-interaction approximation (DIA), which provides a central focus for the article. Various methods of renormalized perturbation theory are described, then unified with the aid of the generating-functional formalism of Martin, Siggia, and Rose. A general expression for the renormalized dielectric function is deduced and discussed in detail. Modern approaches such as decimation and PDF methods are described. Derivations of DIA-based Markovian closures are discussed. The eddy-damped quasinormal Markovian closure is shown to be nonrealizable in the presence of waves, and a new realizable Markovian closure is presented. The test-field model and a realizable modification thereof are also summarized. Numerical solutions of various closures for some plasma-physics paradigms are reviewed. The variational approach to bounds on transport is developed. Miscellaneous topics include Onsager symmetries for turbulence, the interpretation of entropy balances for both kinetic and fluid descriptions, self-organized criticality, statistical interactions between disparate scales, and the roles of both mean and random shear. Appendices are provided on Fourier transform conventions, dimensional and scaling analysis, the derivations of nonlinear gyrokinetic and gyrofluid equations, stochasticity criteria for quasilinear theory, formal aspects of resonance-broadening theory, Novikov's theorem, the treatment of weak inhomogeneity, the derivation of the Vlasov weak-turbulence wave kinetic equation from a fully renormalized description, some features of a code for solving the direct-interaction approximation and related Markovian closures, the details of the solution of the EDQNM closure for a solvable three-wave model, and the notation used in the article.
NASA Astrophysics Data System (ADS)
Kang, Pilsang; Koo, Changhoi; Roh, Hokyu
2017-11-01
Since simple linear regression theory was established at the beginning of the 1900s, it has been used in a variety of fields. Unfortunately, it cannot be used directly for calibration. In practical calibrations, the observed measurements (the inputs) are subject to errors, and hence they vary, thus violating the assumption that the inputs are fixed. Therefore, in the case of calibration, the regression line fitted using the method of least squares is not consistent with the statistical properties of simple linear regression as already established based on this assumption. To resolve this problem, "classical regression" and "inverse regression" have been proposed. However, they do not completely resolve the problem. As a fundamental solution, we introduce "reversed inverse regression" along with a new methodology for deriving its statistical properties. In this study, the statistical properties of this regression are derived using the "error propagation rule" and the "method of simultaneous error equations" and are compared with those of the existing regression approaches. The accuracy of the statistical properties thus derived is investigated in a simulation study. We conclude that the newly proposed regression and methodology constitute the complete regression approach for univariate linear calibrations.
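For context, the two existing approaches the paper builds on differ in which variable is regressed on which; a minimal sketch with synthetic calibration data:

```python
import numpy as np

rng = np.random.default_rng(3)
x = np.linspace(0.0, 10.0, 30)                            # reference standards
y = 2.0 + 0.5 * x + rng.normal(scale=0.1, size=x.size)    # instrument responses

y_new = 4.6                                               # response of an unknown

# Classical calibration: regress y on x, then invert the fitted line
b1, b0 = np.polyfit(x, y, 1)
x_classical = (y_new - b0) / b1

# Inverse calibration: regress x directly on y and predict
d1, d0 = np.polyfit(y, x, 1)
x_inverse = d0 + d1 * y_new

print(x_classical, x_inverse)                 # the two estimators differ slightly
```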
NASA Astrophysics Data System (ADS)
Saadi, Saad
2017-04-01
Characterizing the complexity and heterogeneity of the geometries and deposits in a meandering river system is an important concern for reservoir modelling of fluvial environments. Re-examination of the Long Nab member in the Scalby formation of the Ravenscar Group (Yorkshire, UK), integrating digital outcrop data and forward modelling approaches, will lead to a geologically realistic numerical model of the meandering river geometry. The methodology is based on extracting geostatistics from modern analogue meandering rivers that exemplify both the confined and non-confined meandering point-bar deposits and morphodynamics of the Long Nab member. The parameters derived from the modern systems (i.e. channel width, amplitude, radius of curvature, sinuosity, wavelength, channel length and migration rate) are used as a statistical control for the forward simulation and the resulting object-oriented channel models. The statistical data derived from the modern analogues are multi-dimensional in nature, making analysis difficult. We apply data mining techniques such as parallel coordinates to investigate and identify the important relationships within the modern analogue data, which can then be used to drive the development of, and serve as input to, the forward model. This work will increase our understanding of meandering river morphodynamics, planform architecture and the stratigraphic signature of various fluvial deposits and features. We will then use these forward-modelled channel objects to build reservoir models, and compare the behaviour of the forward-modelled channels with traditional object modelling in hydrocarbon flow simulations.
Probing cosmic strings with satellite CMB measurements
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jeong, E.; Baccigalupi, Carlo; Smoot, G.F., E-mail: ehjeong@sissa.it, E-mail: bacci@sissa.it, E-mail: gfsmoot@lbl.gov
2010-09-01
We study the problem of searching for cosmic string signal patterns in the present high resolution and high sensitivity observations of the Cosmic Microwave Background (CMB). This article discusses a technique capable of recognizing Kaiser-Stebbins effect signatures in total intensity anisotropy maps from isolated strings. We derive the statistical distributions of null detections from purely Gaussian fluctuations and instrumental performances of the operating satellites, and show that the biggest factor that produces confusion is represented by the acoustic oscillation features of the scale comparable to the size of the horizon at recombination. Simulations show that the distribution of null detections converges to a χ² distribution, with a detectability threshold at 99% confidence level corresponding to a string-induced step signal with an amplitude of about 100 μK, which corresponds to a limit of roughly Gμ ∼ 1.5 × 10⁻⁶. We implement simulations for deriving the statistics of spurious detections caused by extra-Galactic and Galactic foregrounds. For diffuse Galactic foregrounds, which represent the dominant source of contamination, we construct sky masks outlining the available region of the sky where the Galactic confusion is sub-dominant, specializing our analysis to the case represented by the frequency coverage and nominal sensitivity and resolution of the Planck experiment. As for other CMB measurements, the maximum available area, corresponding to 7%, is reached where the foreground emission is expected to be minimum, in the 70–100 GHz interval.
Pal, Sandip
2016-06-01
The convective boundary layer (CBL) turbulence is the key process for exchanging heat, momentum, moisture and trace gases between the earth's surface and the lower part of the troposphere. The turbulence parameterization of the CBL is a challenging but important component in numerical models. In particular, correct estimation of CBL turbulence features, their parameterization, and determination of the contribution of eddy diffusivity are important for simulating convection initiation and the dispersion of health-hazardous air pollutants and greenhouse gases. In general, measurements of higher-order moments of water vapor mixing ratio (q) variability yield unique estimates of turbulence in the CBL. Using high-resolution lidar-derived profiles of q variance, third-order moment and skewness, and analyzing concurrent profiles of vertical velocity, potential temperature and horizontal wind, together with time series of near-surface measurements of surface flux and meteorological parameters, a conceptual framework based on a bottom-up approach is proposed here for the first time for a robust characterization of the turbulent structure of the CBL over land, to improve our understanding of the processes governing CBL q turbulence. Finally, principal component analyses will be applied to the lidar-derived long-term data sets of q turbulence statistics to identify the meteorological factors and the dominant physical mechanisms governing the CBL turbulence features. Copyright © 2016 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Slepian, Zachary; Eisenstein, Daniel J.; Brownstein, Joel R.; Chuang, Chia-Hsun; Gil-Marín, Héctor; Ho, Shirley; Kitaura, Francisco-Shu; Percival, Will J.; Ross, Ashley J.; Rossi, Graziano; Seo, Hee-Jong; Slosar, Anže; Vargas-Magaña, Mariana
2017-08-01
We present the large-scale three-point correlation function (3PCF) of the Sloan Digital Sky Survey DR12 Constant stellar Mass (CMASS) sample of 777 202 Luminous Red Galaxies, the largest-ever sample used for a 3PCF or bispectrum measurement. We make the first high-significance (4.5σ) detection of baryon acoustic oscillations (BAO) in the 3PCF. Using these acoustic features in the 3PCF as a standard ruler, we measure the distance to z = 0.57 to 1.7 per cent precision (statistical plus systematic). We find DV = 2024 ± 29 Mpc (stat) ± 20 Mpc (sys) for our fiducial cosmology (consistent with Planck 2015) and bias model. This measurement extends the use of the BAO technique from the two-point correlation function (2PCF) and power spectrum to the 3PCF and opens an avenue for deriving additional cosmological distance information from future large-scale structure redshift surveys such as DESI. Our measured distance scale from the 3PCF is fairly independent from that derived from the pre-reconstruction 2PCF and is equivalent to increasing the length of BOSS by roughly 10 per cent; reconstruction appears to lower the independence of the distance measurements. Fitting a model including tidal tensor bias yields a moderate-significance (2.6σ) detection of this bias with a value in agreement with the prediction from local Lagrangian biasing.
[Road Extraction in Remote Sensing Images Based on Spectral and Edge Analysis].
Zhao, Wen-zhi; Luo, Li-qun; Guo, Zhou; Yue, Jun; Yu, Xue-ying; Liu, Hui; Wei, Jing
2015-10-01
Roads are typically man-made objects in urban areas. Road extraction from high-resolution images has important applications for urban planning and transportation development. However, due to confusable spectral characteristics, it is difficult to distinguish roads from other objects by merely using traditional classification methods that depend mainly on spectral information. Edges are an important feature for the identification of linear objects (e.g., roads), and the distribution patterns of edges vary greatly among different objects. It is therefore crucial to merge edge statistical information with spectral information. In this study, a new method that combines spectral information and edge statistical features has been proposed. First, edge detection is conducted by using a self-adaptive mean-shift algorithm on the panchromatic band, which greatly reduces pseudo-edges and noise effects. Then, edge statistical features are obtained from the edge statistical model, which measures the length and angle distribution of edges. Finally, by integrating the spectral and edge statistical features, the SVM algorithm is used to classify the image and roads are ultimately extracted. A series of experiments show that the overall accuracy of the proposed method is 93%, compared with only 78% for the traditional method. The results demonstrate that the proposed method is efficient and valuable for road extraction, especially from high-resolution images.
Buscombe, Daniel; Wheaton, Joseph M.
2018-01-01
Side scan sonar in low-cost ‘fishfinder’ systems has become popular in aquatic ecology and sedimentology for imaging submerged riverbed sediment at coverages and resolutions sufficient to relate bed texture to grain-size. Traditional methods to map bed texture (i.e. physical samples) are relatively high-cost and low spatial coverage compared to sonar, which can continuously image several kilometers of channel in a few hours. Towards a goal of automating the classification of bed habitat features, we investigate relationships between substrates and statistical descriptors of bed textures in side scan sonar echograms of alluvial deposits. We develop a method for automated segmentation of bed textures into between two to five grain-size classes. Second-order texture statistics are used in conjunction with a Gaussian Mixture Model to classify the heterogeneous bed into small homogeneous patches of sand, gravel, and boulders with an average accuracy of 80%, 49%, and 61%, respectively. Reach-averaged proportions of these sediment types were within 3% compared to similar maps derived from multibeam sonar. PMID:29538449
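A rough sketch of second-order texture statistics feeding a Gaussian Mixture Model, in the spirit of the method (synthetic stand-in echogram; window size and GLCM properties are assumptions; the GLCM helpers are spelled greycomatrix/greycoprops in older scikit-image releases):

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops
from sklearn.mixture import GaussianMixture

def texture_features(img, win=32, step=16):
    """Second-order (GLCM) texture statistics over sliding windows of an 8-bit echogram."""
    feats = []
    for i in range(0, img.shape[0] - win, step):
        for j in range(0, img.shape[1] - win, step):
            glcm = graycomatrix(img[i:i + win, j:j + win], distances=[1],
                                angles=[0, np.pi / 2], levels=256,
                                symmetric=True, normed=True)
            feats.append([graycoprops(glcm, p).mean()
                          for p in ("contrast", "homogeneity", "energy", "correlation")])
    return np.array(feats)

img = (np.random.rand(256, 512) * 255).astype(np.uint8)   # stand-in sonar echogram
F = texture_features(img)
labels = GaussianMixture(n_components=3, random_state=0).fit_predict(F)  # e.g. sand/gravel/boulders
print(np.bincount(labels))
```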
Curve Boxplot: Generalization of Boxplot for Ensembles of Curves.
Mirzargar, Mahsa; Whitaker, Ross T; Kirby, Robert M
2014-12-01
In simulation science, computational scientists often study the behavior of their simulations by repeated solutions with variations in parameters and/or boundary values or initial conditions. Through such simulation ensembles, one can try to understand or quantify the variability or uncertainty in a solution as a function of the various inputs or model assumptions. In response to a growing interest in simulation ensembles, the visualization community has developed a suite of methods for allowing users to observe and understand the properties of these ensembles in an efficient and effective manner. An important aspect of visualizing simulations is the analysis of derived features, often represented as points, surfaces, or curves. In this paper, we present a novel, nonparametric method for summarizing ensembles of 2D and 3D curves. We propose an extension of a method from descriptive statistics, data depth, to curves. We also demonstrate a set of rendering and visualization strategies for showing rank statistics of an ensemble of curves, which is a generalization of traditional whisker plots or boxplots to multidimensional curves. Results are presented for applications in neuroimaging, hurricane forecasting and fluid dynamics.
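One concrete data-depth notion commonly used for curve boxplots is the modified band depth; a minimal sketch (assuming this variant, which the abstract does not specify):

```python
import numpy as np
from itertools import combinations

def modified_band_depth(curves):
    """Modified band depth of each curve in an ensemble (n_curves x n_points):
    the average fraction of the domain on which the curve lies inside the band
    spanned by each pair of curves."""
    n = curves.shape[0]
    depth = np.zeros(n)
    for i, j in combinations(range(n), 2):
        lo = np.minimum(curves[i], curves[j])
        hi = np.maximum(curves[i], curves[j])
        depth += ((curves >= lo) & (curves <= hi)).mean(axis=1)
    return depth / (n * (n - 1) / 2)

t = np.linspace(0.0, 1.0, 100)
ens = np.sin(2 * np.pi * t) + 0.3 * np.random.default_rng(4).normal(size=(40, 100))
d = modified_band_depth(ens)
print("deepest ('median') curve index:", int(np.argmax(d)))
```

Ranking curves by depth generalizes the boxplot directly: the deepest curve plays the role of the median, the deepest 50% form the box, and outliers fall outside a fence derived from that band.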
Photoresist thin-film effects on alignment process capability
NASA Astrophysics Data System (ADS)
Flores, Gary E.; Flack, Warren W.
1993-08-01
Two photoresists were selected for alignment characterization based on their dissimilar coating properties and observed differences in alignment capability. The materials are Dynachem OFPR-800 and Shipley System 8. Both photoresists were examined on two challenging alignment levels in a submicron CMOS process, a nitride level and a planarized second-level metal. An Ultratech Stepper model 1500, which features a darkfield alignment system with broadband green light for alignment signal detection, was used for this project. Initially, statistically designed linear screening experiments were performed to examine six process factors for each photoresist: viscosity, spin acceleration, spin speed, spin time, softbake time, and softbake temperature. Using the results derived from the screening experiments, a more thorough examination of the statistically significant process factors was performed. A full quadratic experimental design was conducted to examine the effects of viscosity, spin speed, and spin time coating properties on alignment. This included a characterization of both intra- and inter-wafer alignment control and alignment process capability. The different alignment behaviors are analyzed in terms of photoresist material properties and the physical nature of the alignment detection system.
Statistics of initial density perturbations in heavy ion collisions and their fluid dynamic response
NASA Astrophysics Data System (ADS)
Floerchinger, Stefan; Wiedemann, Urs Achim
2014-08-01
An interesting opportunity to determine thermodynamic and transport properties in more detail is to identify generic statistical properties of initial density perturbations. Here we study event-by-event fluctuations in terms of correlation functions for two models that can be solved analytically. The first assumes Gaussian fluctuations around a distribution that is fixed by the collision geometry but leads to non-Gaussian features after averaging over the reaction plane orientation at non-zero impact parameter. In this context, we derive a three-parameter extension of the commonly used Bessel-Gaussian event-by-event distribution of harmonic flow coefficients. Secondly, we study a model of N independent point sources for which connected n-point correlation functions of initial perturbations scale like 1/N^(n-1). This scaling is violated for non-central collisions in a way that can be characterized by its impact parameter dependence. We discuss to what extent these are generic properties that can be expected to hold for any model of initial conditions, and how this can improve the fluid dynamical analysis of heavy ion collisions.
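For reference, the standard two-parameter Bessel-Gaussian density of a flow-harmonic magnitude v, which the paper extends to three parameters, is p(v) = (v/σ²) exp(−(v² + v̄²)/2σ²) I₀(v v̄/σ²); a quick numerical normalization check:

```python
import numpy as np
from scipy.special import i0

def bessel_gaussian_pdf(v, v_bar, sigma):
    """Two-parameter Bessel-Gaussian density of a flow-harmonic magnitude v."""
    return (v / sigma**2) * np.exp(-(v**2 + v_bar**2) / (2 * sigma**2)) \
        * i0(v * v_bar / sigma**2)

v = np.linspace(0.0, 0.3, 3000)
pdf = bessel_gaussian_pdf(v, v_bar=0.08, sigma=0.03)
area = np.sum(np.diff(v) * (pdf[1:] + pdf[:-1]) / 2)   # trapezoidal integration
print(area)                                            # should be close to 1
```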
Numerical solutions of the semiclassical Boltzmann ellipsoidal-statistical kinetic model equation
Yang, Jaw-Yen; Yan, Chin-Yuan; Huang, Juan-Chen; Li, Zhihui
2014-01-01
Computations of rarefied gas dynamical flows governed by the semiclassical Boltzmann ellipsoidal-statistical (ES) kinetic model equation using an accurate numerical method are presented. The semiclassical ES model was derived through the maximum entropy principle and conserves not only the mass, momentum and energy, but also contains additional higher order moments that differ from the standard quantum distributions. A different decoding procedure to obtain the necessary parameters for determining the ES distribution is also devised. The numerical method in phase space combines the discrete-ordinate method in momentum space and the high-resolution shock-capturing method in physical space. Numerical solutions of two-dimensional Riemann problems for two configurations covering various degrees of rarefaction are presented, and various contours of the quantities unique to this new model are illustrated. When the relaxation time becomes very small, the main flow features display behavior similar to that of ideal quantum gas dynamics, and the present solutions are found to be consistent with existing calculations for classical gas. The effect of a parameter that permits an adjustable Prandtl number in the flow is also studied. PMID:25104904
Paroxysmal atrial fibrillation prediction method with shorter HRV sequences.
Boon, K H; Khalil-Hani, M; Malarvili, M B; Sia, C W
2016-10-01
This paper proposes a method that predicts the onset of paroxysmal atrial fibrillation (PAF), using heart rate variability (HRV) segments that are shorter than those applied in existing methods, while maintaining good prediction accuracy. PAF is a common cardiac arrhythmia that increases the health risk of a patient, and the development of an accurate predictor of the onset of PAF is clinically important because it increases the possibility of electrically stabilizing the heart and preventing the onset of atrial arrhythmias with different pacing techniques. We investigate the effect of HRV features extracted from different lengths of HRV segments prior to PAF onset within the proposed PAF prediction method. The pre-processing stage of the predictor includes QRS detection, HRV quantification and ectopic beat correction. Time-domain, frequency-domain, non-linear and bispectrum features are then extracted from the quantified HRV. In the feature selection, the HRV feature set and classifier parameters are optimized simultaneously using an optimization procedure based on a genetic algorithm (GA). Both the full feature set and a statistically significant feature subset are optimized by the GA. For the statistically significant feature subset, the Mann-Whitney U test is used to filter out features that do not pass the statistical test at the 20% significance level. The final stage of our predictor is a classifier based on the support vector machine (SVM). A 10-fold cross-validation is applied in performance evaluation, and the proposed method achieves 79.3% prediction accuracy using 15-minute HRV segments. This accuracy is comparable to that achieved by existing methods that use 30-minute HRV segments, most of which achieve accuracies of around 80%. More importantly, our method significantly outperforms those that applied segments shorter than 30 minutes. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Addeh, Abdoljalil; Khormali, Aminollah; Golilarz, Noorbakhsh Amiri
2018-05-04
Control chart patterns are the most commonly used statistical process control (SPC) tools to monitor process changes. When a control chart produces an out-of-control signal, the process has changed. In this study, a new method based on an optimized radial basis function neural network (RBFNN) is proposed for control chart pattern (CCP) recognition. The proposed method consists of four main modules: feature extraction, feature selection, classification and the learning algorithm. In the feature extraction module, shape and statistical features are used. Recently, various shape and statistical features have been presented for CCP recognition. In the feature selection module, the association rules (AR) method is employed to select the best set of shape and statistical features. In the classifier section, an RBFNN is used; within the RBFNN, the learning algorithm has a high impact on network performance. Therefore, a new learning algorithm based on the bees algorithm is used in the learning module. Most studies have considered only six patterns: Normal, Cyclic, Increasing Trend, Decreasing Trend, Upward Shift and Downward Shift. Since the three patterns Normal, Stratification, and Systematic are very similar to each other and distinguishing them is very difficult, most studies have not considered Stratification and Systematic. To support continuous monitoring and control of the production process and exact detection of the type of problem encountered during production, eight patterns are investigated in this study. The proposed method is tested on a dataset containing 1600 samples (200 samples from each pattern), and the results show that it performs very well. Copyright © 2018 ISA. Published by Elsevier Ltd. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Apte, A; Veeraraghavan, H; Oh, J
Purpose: To present an open source and free platform to facilitate radiomics research - the "Radiomics toolbox" in CERR. Method: There is a scarcity of open source tools that support end-to-end modeling of image features to predict patient outcomes. The "Radiomics toolbox" strives to fill the need for such a software platform. The platform supports (1) import of various kinds of image modalities like CT, PET, MR, SPECT, US; (2) contouring tools to delineate structures of interest; (3) extraction and storage of image-based features like first-order statistics, gray-level co-occurrence and zone-size matrix based texture features, and shape features; and (4) statistical analysis. Statistical analysis of the extracted features is supported with basic functionality that includes univariate correlations and Kaplan-Meier curves, and advanced functionality that includes feature reduction and multivariate modeling. The graphical user interface and the data management are implemented in Matlab for ease of development and readability of the code by a wide audience. Open-source software developed in other programming languages is integrated to enhance various components of this toolbox, for example, the Java-based DCM4CHE for DICOM import and R for statistical analysis. Results: The Radiomics toolbox will be distributed as open source, GNU-licensed software. The toolbox was prototyped for modeling an oropharyngeal PET dataset at MSKCC; that analysis will be presented in a separate paper. Conclusion: The Radiomics Toolbox provides an extensible platform for extracting and modeling image features. To emphasize new uses of CERR for radiomics and image-based research, we have changed the name from the "Computational Environment for Radiotherapy Research" to the "Computational Environment for Radiological Research".
Jiang, Rui; Yang, Hua; Zhou, Linqi; Kuo, C.-C. Jay; Sun, Fengzhu; Chen, Ting
2007-01-01
The increasing demand for the identification of genetic variation responsible for common diseases has translated into a need for sophisticated methods for effectively prioritizing mutations occurring in disease-associated genetic regions. In this article, we prioritize candidate nonsynonymous single-nucleotide polymorphisms (nsSNPs) through a bioinformatics approach that takes advantages of a set of improved numeric features derived from protein-sequence information and a new statistical learning model called “multiple selection rule voting” (MSRV). The sequence-based features can maximize the scope of applications of our approach, and the MSRV model can capture subtle characteristics of individual mutations. Systematic validation of the approach demonstrates that this approach is capable of prioritizing causal mutations for both simple monogenic diseases and complex polygenic diseases. Further studies of familial Alzheimer diseases and diabetes show that the approach can enrich mutations underlying these polygenic diseases among the top of candidate mutations. Application of this approach to unclassified mutations suggests that there are 10 suspicious mutations likely to cause diseases, and there is strong support for this in the literature. PMID:17668383
Classification of document page images based on visual similarity of layout structures
NASA Astrophysics Data System (ADS)
Shin, Christian K.; Doermann, David S.
1999-12-01
Searching for documents by their type or genre is a natural way to enhance the effectiveness of document retrieval. The layout of a document contains a significant amount of information that can be used to classify a document's type in the absence of domain-specific models. A document type or genre can be defined by the user based primarily on layout structure. Our classification approach is based on 'visual similarity' of layout structure, achieved by building a supervised classifier given examples of the class. We use image features such as the percentages of text and non-text (graphics, image, table, and ruling) content regions, column structures, variations in the point size of fonts, the density of content area, and various statistics on features of connected components, all of which can be derived from class samples without class knowledge. In order to obtain class labels for training samples, we conducted a user relevance test where subjects ranked UW-I document images with respect to the 12 representative images. We implemented our classification scheme using OC1, a decision tree classifier, and report our findings.
Molecular Features of Dissolved Organic Matter Produced by Picophytoplankton
NASA Astrophysics Data System (ADS)
Ma, X.; Coleman, M.; Waldbauer, J.
2016-02-01
Compounds derived from picophytoplankton through exudation, grazing and viral lysis contribute a large proportion of the labile DOM in the ocean. This labile DOM is rapidly turned over by, and exchanged among, microbial communities. However, identifying labile DOM compounds and tracking their sources and sinks in ocean ecosystems is complicated by the presence of non-labile DOM, which has a significantly larger reservoir size and longer residence time. This study focuses on investigating labile DOM produced by single-strain cyanobacteria isolates via different modes of release and varied nutrient conditions. DOM compounds are analyzed by high-resolution mass spectrometry. Statistical comparison between intracellular and extracellular molecular data of Synechococcus WH7803 revealed noticeable differences in terms of compound number, size and structure. Incubation experiments using combined whole seawater and diluent of grazer-free or viral-free water at the BATS time-series station in the Sargasso Sea yielded complementary data to be synthesized with the data from lab cultures. The compositional features of each type of DOM could serve as future proxies for different modes of DOM production in the oceans.
Reese, Sarah E; Archer, Kellie J; Therneau, Terry M; Atkinson, Elizabeth J; Vachon, Celine M; de Andrade, Mariza; Kocher, Jean-Pierre A; Eckel-Passow, Jeanette E
2013-11-15
Batch effects are due to probe-specific systematic variation between groups of samples (batches) resulting from experimental features that are not of biological interest. Principal component analysis (PCA) is commonly used as a visual tool to determine whether batch effects exist after applying a global normalization method. However, PCA yields linear combinations of the variables that contribute maximum variance and thus will not necessarily detect batch effects if they are not the largest source of variability in the data. We present an extension of PCA to quantify the existence of batch effects, called guided PCA (gPCA). We describe a test statistic that uses gPCA to test whether a batch effect exists. We apply our proposed test statistic to simulated data and to two copy number variation case studies: the first study consisted of 614 samples from a breast cancer family study using Illumina Human 660 bead-chip arrays, whereas the second consisted of 703 samples from a family blood pressure study that used Affymetrix SNP Array 6.0. We demonstrate that our statistic has good statistical properties and is able to identify significant batch effects in both case studies. We developed a new statistic that uses gPCA to identify whether batch effects exist in high-throughput genomic data. Although our examples pertain to copy number data, gPCA is general and can be used on other data types as well. The gPCA R package (available via CRAN) provides functionality and data to perform the methods in this article.
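A hedged sketch of the statistic as we read the method: guided PCA takes the SVD of Y'X, where Y is the batch indicator matrix, and the delta statistic compares the variance captured by its leading direction with that of ordinary PCA; significance comes from permuting batch labels. Function names are ours, not the gPCA package API.

```python
import numpy as np

def gpca_delta(X, batch):
    X = X - X.mean(axis=0)
    Y = np.eye(batch.max() + 1)[batch]        # n x b indicator of batch membership
    v_guided = np.linalg.svd(Y.T @ X)[2][0]   # leading right singular vector of Y'X
    v_unguided = np.linalg.svd(X)[2][0]       # leading right singular vector of X
    return np.var(X @ v_guided) / np.var(X @ v_unguided)

def gpca_pvalue(X, batch, n_perm=999, seed=0):
    rng = np.random.default_rng(seed)
    d0 = gpca_delta(X, batch)
    hits = sum(gpca_delta(X, rng.permutation(batch)) >= d0 for _ in range(n_perm))
    return d0, (1 + hits) / (1 + n_perm)      # permutation p-value

X = np.random.default_rng(1).normal(size=(40, 200))   # 40 samples, 200 probes
print(gpca_pvalue(X, np.repeat([0, 1], 20)))          # no batch effect: p large
```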
The impact on midlevel vision of statistically optimal divisive normalization in V1
Coen-Cagli, Ruben; Schwartz, Odelia
2013-01-01
The first two areas of the primate visual cortex (V1, V2) provide a paradigmatic example of hierarchical computation in the brain. However, neither the functional properties of V2 nor the interactions between the two areas are well understood. One key aspect is that the statistics of the inputs received by V2 depend on the nonlinear response properties of V1. Here, we focused on divisive normalization, a canonical nonlinear computation that is observed in many neural areas and modalities. We simulated V1 responses with (and without) different forms of surround normalization derived from statistical models of natural scenes, including canonical normalization and a statistically optimal extension that accounted for image nonhomogeneities. The statistics of the V1 population responses differed markedly across models. We then addressed how V2 receptive fields pool the responses of V1 model units with different tuning. We assumed this is achieved by learning without supervision a linear representation that removes correlations, which could be accomplished with principal component analysis. This approach revealed V2-like feature selectivity when we used the optimal normalization and, to a lesser extent, the canonical one but not in the absence of both. We compared the resulting two-stage models on two perceptual tasks; while models encompassing V1 surround normalization performed better at object recognition, only statistically optimal normalization provided systematic advantages in a task more closely matched to midlevel vision, namely figure/ground judgment. Our results suggest that experiments probing midlevel areas might benefit from using stimuli designed to engage the computations that characterize V1 optimality. PMID:23857950
Soltaninejad, Mohammadreza; Yang, Guang; Lambrou, Tryphon; Allinson, Nigel; Jones, Timothy L; Barrick, Thomas R; Howe, Franklyn A; Ye, Xujiong
2018-04-01
Accurate segmentation of brain tumour in magnetic resonance images (MRI) is a difficult task due to various tumour types. Using information and features from multimodal MRI including structural MRI and isotropic (p) and anisotropic (q) components derived from the diffusion tensor imaging (DTI) may result in a more accurate analysis of brain images. We propose a novel 3D supervoxel based learning method for segmentation of tumour in multimodal MRI brain images (conventional MRI and DTI). Supervoxels are generated using the information across the multimodal MRI dataset. For each supervoxel, a variety of features including histograms of texton descriptor, calculated using a set of Gabor filters with different sizes and orientations, and first order intensity statistical features are extracted. Those features are fed into a random forests (RF) classifier to classify each supervoxel into tumour core, oedema or healthy brain tissue. The method is evaluated on two datasets: 1) Our clinical dataset: 11 multimodal images of patients and 2) BRATS 2013 clinical dataset: 30 multimodal images. For our clinical dataset, the average detection sensitivity of tumour (including tumour core and oedema) using multimodal MRI is 86% with balanced error rate (BER) 7%; while the Dice score for automatic tumour segmentation against ground truth is 0.84. The corresponding results of the BRATS 2013 dataset are 96%, 2% and 0.89, respectively. The method demonstrates promising results in the segmentation of brain tumour. Adding features from multimodal MRI images can largely increase the segmentation accuracy. The method provides a close match to expert delineation across all tumour grades, leading to a faster and more reproducible method of brain tumour detection and delineation to aid patient management. Copyright © 2018 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Benetti, Micol; Pandolfi, Stefania; Lattanzi, Massimiliano; Martinelli, Matteo; Melchiorri, Alessandro
2013-01-01
Using the most recent data from the WMAP, ACT and SPT experiments, we update the constraints on models with oscillatory features in the primordial power spectrum of scalar perturbations. This kind of feature can appear in models of inflation where slow-roll is interrupted, such as multifield models. We also derive constraints for the case in which, in addition to cosmic microwave background observations, we consider the data on the spectrum of luminous red galaxies from the 7th SDSS catalog, and the SNIa Union Compilation 2 data. We have found that: (i) considering a model with features in the primordial power spectrum increases the agreement with data compared to the featureless "vanilla" ΛCDM model by Δχ2=6.7, representing an improvement with respect to the expected value Δχ2=3 for an equivalent model with three additional parameters; (ii) the uncertainty on the determination of the standard parameters is not degraded when features are included; (iii) the best fit for the features model locates the step in the primordial spectrum at a scale k≃0.005 Mpc-1, corresponding to the scale where the outliers in the WMAP7 data at ℓ=22 and ℓ=40 are located; (iv) a distinct, albeit less statistically significant, peak is present in the likelihood at smaller scales, whose presence might be related to the WMAP7 preference for a negative value of the running of the scalar spectral index; (v) the inclusion of the LRG-7 data does not change the best-fit model significantly, but allows us to better constrain the amplitude of the oscillations.
Dose fractionation theorem in 3-D reconstruction (tomography)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Glaeser, R.M.
It is commonly assumed that the large number of projections required for single-axis tomography precludes its application to most beam-labile specimens. However, Hegerl and Hoppe have pointed out that the total dose required to achieve statistical significance for each voxel of a computed 3-D reconstruction is the same as that required to obtain a single 2-D image of that isolated voxel, at the same level of statistical significance. Thus a statistically significant 3-D image can be computed from statistically insignificant projections, as long as the total dose distributed among these projections is high enough that it would have resulted in a statistically significant projection, if applied to only one image. We have tested this critical theorem by simulating the tomographic reconstruction of a realistic 3-D model created from an electron micrograph. The simulations verify the basic conclusions of the theorem and extend them to conditions of high absorption, signal-dependent noise, varying specimen contrast and missing angular range. Furthermore, the simulations demonstrate that individual projections in the series of fractionated-dose images can be aligned by cross-correlation because they contain significant information derived from the summation of features from different depths in the structure. This latter information is generally not useful for structural interpretation prior to 3-D reconstruction, owing to the complexity of most specimens investigated by single-axis tomography. These results, in combination with dose estimates for imaging single voxels and measurements of radiation damage in the electron microscope, demonstrate that it is feasible to use single-axis tomography with soft X-ray microscopy of frozen-hydrated specimens.
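The theorem invites a quick numerical sanity check. The sketch below is our own toy 1-D illustration, not the paper's 3-D simulation: a fixed Poisson dose split over 100 exposures and then summed yields essentially the same relative noise as a single full-dose exposure.

```python
import numpy as np

rng = np.random.default_rng(1)
signal = 1.0 + 0.5 * np.sin(np.linspace(0, 2 * np.pi, 256))  # relative intensity
total_dose, n_frac = 1e4, 100                                # mean counts per pixel

single = rng.poisson(signal * total_dose)                    # all dose in one image
fractions = rng.poisson(signal * total_dose / n_frac,
                        size=(n_frac, signal.size))          # dose split 100 ways
summed = fractions.sum(axis=0)

for label, img in (("single", single), ("summed", summed)):
    rel_noise = np.std(img - signal * total_dose) / total_dose
    print(label, rel_noise)                                  # nearly identical values
```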
Ince, Robin A A; Giordano, Bruno L; Kayser, Christoph; Rousselet, Guillaume A; Gross, Joachim; Schyns, Philippe G
2017-03-01
We begin by reviewing the statistical framework of information theory as applicable to neuroimaging data analysis. A major factor hindering wider adoption of this framework in neuroimaging is the difficulty of estimating information-theoretic quantities in practice. We present a novel estimation technique that combines the statistical theory of copulas with the closed-form solution for the entropy of Gaussian variables. This results in a general, computationally efficient, flexible, and robust multivariate statistical framework that provides effect sizes on a common meaningful scale, allows for unified treatment of discrete, continuous, unidimensional and multidimensional variables, and enables direct comparisons of representations from behavioral and brain responses across any recording modality. We validate the use of this estimate as a statistical test within a neuroimaging context, considering both discrete stimulus classes and continuous stimulus features. We also present examples of analyses facilitated by these developments, including application of multivariate analyses to MEG planar magnetic field gradients, and pairwise temporal interactions in evoked EEG responses. We show the benefit of considering the instantaneous temporal derivative together with the raw values of M/EEG signals as a multivariate response, how we can separately quantify modulations of amplitude and direction for vector quantities, and how we can measure the emergence of novel information over time in evoked responses. Open-source Matlab and Python code implementing the new methods accompanies this article. Hum Brain Mapp 38:1541-1573, 2017.
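Because the estimator is simple to state, a compact sketch helps. This is our own minimal reading of the Gaussian-copula idea (function names are ours, not the authors' toolbox API): rank-transform each variable to a standard normal, then obtain MI from closed-form Gaussian entropies; the result lower-bounds the true MI.

```python
import numpy as np
from scipy.stats import norm, rankdata

def copnorm(x):
    """Map samples (last axis) onto standard-normal quantiles of their ranks."""
    return norm.ppf(rankdata(x, axis=-1) / (x.shape[-1] + 1))

def mi_gg(x, y):
    """MI in bits between (copula-normalized) x and y via Gaussian entropies."""
    def h(c):   # Gaussian entropy up to constants that cancel in I(X;Y)
        return 0.5 * np.log(np.linalg.det(np.atleast_2d(np.cov(c))))
    return (h(x) + h(y) - h(np.vstack((x, y)))) / np.log(2)

rng = np.random.default_rng(0)
x = rng.normal(size=2000)
y = x + rng.normal(size=2000)
print(mi_gg(copnorm(x), copnorm(y)))   # close to the true value of 0.5 bits
```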
A Quality Function Deployment Framework for the Service Quality of Health Information Websites
Kim, Dohoon
2010-01-01
Objectives This research was conducted to identify both the users' service requirements for health information websites (HIWs) and the key functional elements for running HIWs. With the quality function deployment framework, the derived service attributes (SAs) are mapped onto the suppliers' functional characteristics (FCs) to derive the FCs most critical for user satisfaction. Methods Using survey data from 228 respondents, the SAs, FCs and their relationships were analyzed using multivariate statistical methods such as principal component factor analysis, discriminant analysis, and correlation analysis. Simple and compound FC priorities were derived by matrix calculation. Results Nine factors of SAs and five key features of FCs were identified, and these served as the basis for the house of quality model. Based on the compound FC priorities, the functional elements pertaining to security and privacy and to usage support should receive top priority in the course of enhancing HIWs. Conclusions The quality function deployment framework can improve the FCs of HIWs in an effective, structured manner, and it can also be used to identify critical success factors and their strategic implications for enhancing the service quality of HIWs. Therefore, website managers could efficiently improve website operations by considering this study's results. PMID:21818418
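The matrix calculation lends itself to a tiny worked example. All numbers, matrix sizes and weights below are invented for illustration; only the algebra mirrors the described derivation of simple and compound priorities.

```python
import numpy as np

sa_weight = np.array([0.5, 0.3, 0.2])        # importance of 3 service attributes
rel = np.array([[9, 3, 1],                   # SA-by-FC relationship matrix
                [3, 9, 3],                   # (9 = strong, 3 = medium, 1 = weak)
                [1, 3, 9]])
fc_corr = np.array([[1.0, 0.3, 0.0],         # FC-by-FC correlation ("roof")
                    [0.3, 1.0, 0.5],
                    [0.0, 0.5, 1.0]])

simple = sa_weight @ rel                     # simple FC priorities
compound = simple @ fc_corr                  # compound priorities fold in the roof
print(simple, compound)
```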
eCOMPAGT – efficient Combination and Management of Phenotypes and Genotypes for Genetic Epidemiology
Schönherr, Sebastian; Weißensteiner, Hansi; Coassin, Stefan; Specht, Günther; Kronenberg, Florian; Brandstätter, Anita
2009-01-01
Background High-throughput genotyping and phenotyping projects of large epidemiological study populations require sophisticated laboratory information management systems. Most epidemiological studies include subject-related personal information, which needs to be handled with care by following data privacy protection guidelines. In addition, genotyping core facilities handling cooperative projects require a straightforward solution to monitor the status and financial resources of the different projects. Description We developed a database system for the efficient combination and management of phenotypes and genotypes (eCOMPAGT) derived from genetic epidemiological studies. eCOMPAGT securely stores and manages genotype and phenotype data and enables different user modes with different rights. Special attention was paid to the import of data derived from TaqMan and SNPlex genotyping assays. However, the database solution is adjustable to other genotyping systems by programming additional interfaces. Further important features are the scalability of the database and an export interface to statistical software. Conclusion eCOMPAGT can store, administer and connect phenotype data with all kinds of genotype data and is available as a downloadable version. PMID:19432954
Hassan-Esfahani, Leila; Ebtehaj, Ardeshir M; Torres-Rua, Alfonso; McKee, Mac
2017-09-14
Applications of satellite-borne observations in precision agriculture (PA) are often limited due to the coarse spatial resolution of satellite imagery. This paper uses high-resolution airborne observations to increase the spatial resolution of satellite data for related applications in PA. A new variational downscaling scheme is presented that uses coincident aerial imagery products from "AggieAir", an unmanned aerial system, to increase the spatial resolution of Landsat satellite data. This approach is primarily tested for downscaling individual-band Landsat images that can be used to derive the normalized difference vegetation index (NDVI) and surface soil moisture (SSM). Quantitative and qualitative results demonstrate promising capabilities of the downscaling approach, enabling an effective increase of the spatial resolution of Landsat imagery by factors of 2 to 4. Specifically, the downscaling scheme retrieved the missing high-resolution features of the imagery and reduced the root-mean-squared error by 15, 11, and 10 percent in the visible, near-infrared, and thermal infrared bands, respectively. This metric is reduced by 9% in the derived NDVI and remains negligible for the soil moisture products.
PyTranSpot: A tool for multiband light curve modeling of planetary transits and stellar spots
NASA Astrophysics Data System (ADS)
Juvan, Ines G.; Lendl, M.; Cubillos, P. E.; Fossati, L.; Tregloan-Reed, J.; Lammer, H.; Guenther, E. W.; Hanslmeier, A.
2018-02-01
Several studies have shown that stellar activity features, such as occulted and non-occulted starspots, can affect the measurement of transit parameters, biasing studies of transit timing variations and transmission spectra. We present PyTranSpot, which we designed to model multiband transit light curves showing starspot anomalies, inferring both transit and spot parameters. The code follows a pixellation approach to model the star with its corresponding limb darkening, spots, and transiting planet on a two-dimensional Cartesian coordinate grid. We combine PyTranSpot with a Markov chain Monte Carlo framework to study and derive exoplanet transmission spectra, which provides statistically robust values for the physical properties and uncertainties of a transiting star-planet system. We validate PyTranSpot's performance by analyzing eleven synthetic light curves of four different star-planet systems and 20 transit light curves of the well-studied WASP-41b system. We also investigate the impact of starspots on transit parameters and derive wavelength-dependent transit depth values for WASP-41b covering a range of 6200-9200 Å, indicating a flat transmission spectrum.
Sufficient Statistics for Divergence and the Probability of Misclassification
NASA Technical Reports Server (NTRS)
Quirein, J.
1972-01-01
One particular aspect of the feature selection problem is considered, namely the aspect that results from the transformation x = Bz, where B is a k by n matrix of rank k with k ≤ n. It is shown that, in general, such a transformation results in a loss of information. In terms of the divergence, this is equivalent to the fact that the average divergence computed using the variable x is less than or equal to the average divergence computed using the variable z. A loss of information in terms of the probability of misclassification is shown to be equivalent to the fact that the probability of misclassification computed using variable x is greater than or equal to the probability of misclassification computed using variable z. First, the necessary facts relating k-dimensional and n-dimensional integrals are derived. Then the stated results about the divergence and probability of misclassification are derived. Finally, it is shown that if no information is lost (in x = Bz) as measured by the divergence, then no information is lost as measured by the probability of misclassification.
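For concreteness, the classical two-class Gaussian divergence used in this feature selection literature can be written as follows (a sketch in our notation; the report's exact expressions may differ):

```latex
D \;=\; \tfrac{1}{2}\,\operatorname{tr}\!\left[(\Sigma_1-\Sigma_2)\bigl(\Sigma_2^{-1}-\Sigma_1^{-1}\bigr)\right]
   \;+\; \tfrac{1}{2}\,\operatorname{tr}\!\left[\bigl(\Sigma_1^{-1}+\Sigma_2^{-1}\bigr)(\mu_1-\mu_2)(\mu_1-\mu_2)^{\mathsf{T}}\right]
```

Under x = Bz the class statistics transform as mu_i -> B mu_i and Sigma_i -> B Sigma_i B^T, and evaluating D at the transformed parameters yields D(x) <= D(z), which is exactly the information-loss statement above.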
Taylor, Kirsten I; Devereux, Barry J; Acres, Kadia; Randall, Billi; Tyler, Lorraine K
2012-03-01
Conceptual representations are at the heart of our mental lives, involved in every aspect of cognitive functioning. Despite their centrality, a long-standing debate persists as to how the meanings of concepts are represented and processed. Many accounts agree that the meanings of concrete concepts are represented by their individual features, but disagree about the importance of different feature-based variables: some views stress the importance of the information carried by distinctive features in conceptual processing, others the features which are shared over many concepts, and still others the extent to which features co-occur. We suggest that previously disparate theoretical positions and experimental findings can be unified by an account which claims that task demands determine how concepts are processed in addition to the effects of feature distinctiveness and co-occurrence. We tested these predictions in a basic-level naming task which relies on distinctive feature information (Experiment 1) and a domain decision task which relies on shared feature information (Experiment 2). Both used large-scale regression designs with the same visual objects, and mixed-effects models incorporating participant, session, stimulus-related and feature statistic variables to model the performance. We found that concepts with relatively more distinctive and more highly correlated distinctive relative to shared features facilitated basic-level naming latencies, while concepts with relatively more shared and more highly correlated shared relative to distinctive features speeded domain decisions. These findings demonstrate that the feature statistics of distinctiveness (shared vs. distinctive) and correlational strength, as well as the task demands, determine how concept meaning is processed in the conceptual system.
NASA Astrophysics Data System (ADS)
Zhao, Bei; Zhong, Yanfei; Zhang, Liangpei
2016-06-01
Land-use classification of very high spatial resolution remote sensing (VHSR) imagery is one of the most challenging tasks in the field of remote sensing image processing. However, land-use classification is hard to address with land-cover classification techniques, due to the complexity of land-use scenes. Scene classification is considered to be one of the most promising ways to address the land-use classification issue. The commonly used scene classification methods for VHSR imagery are all derived from the computer vision community and mainly deal with terrestrial image recognition. Differing from terrestrial images, VHSR images are taken by looking down with airborne and spaceborne sensors, which leads to distinct light conditions and spatial configurations of land cover in VHSR imagery. Considering these distinct characteristics, two questions should be answered: (1) Which type or combination of information is suitable for VHSR imagery scene classification? (2) Which scene classification algorithm is best for VHSR imagery? In this paper, an efficient spectral-structural bag-of-features scene classifier (SSBFC) is proposed to combine the spectral and structural information of VHSR imagery. SSBFC utilizes first- and second-order statistics (the mean and standard deviation values, MeanStd) as the statistical spectral descriptor for the spectral information of the VHSR imagery, and uses the dense scale-invariant feature transform (SIFT) as the structural feature descriptor. From the experimental results, the spectral information works better than the structural information, while the combination of the spectral and structural information is better than any single type of information. Taking the characteristics of the spatial configuration into consideration, SSBFC uses the whole image scene as the scope of the pooling operator, instead of the scope generated by a spatial pyramid (SP) commonly used in terrestrial image classification. The experimental results show that the whole image as the scope of the pooling operator performs better than the scope generated by SP. In addition, SSBFC codes and pools the spectral and structural features separately to avoid mutual interference between the spectral and structural features. The coding vectors of spectral and structural features are then concatenated into a final coding vector. Finally, SSBFC classifies the final coding vector by support vector machine (SVM) with a histogram intersection kernel (HIK). Compared with the latest scene classification methods, the experimental results with three VHSR datasets demonstrate that the proposed SSBFC performs better than the other classification methods for VHSR image scenes.
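The final classification step is easy to make concrete. The sketch below assumes the concatenated coding vectors have already been computed (feature extraction is omitted) and uses a histogram intersection kernel with scikit-learn's precomputed-kernel SVM; data shapes and labels are invented.

```python
import numpy as np
from sklearn.svm import SVC

def hik(A, B):
    """Histogram intersection kernel: K[i, j] = sum_k min(A[i, k], B[j, k])."""
    return np.minimum(A[:, None, :], B[None, :, :]).sum(axis=-1)

rng = np.random.default_rng(0)
X = rng.random((60, 300))          # hypothetical final coding vectors
y = rng.integers(0, 4, size=60)    # hypothetical scene labels

svm = SVC(kernel="precomputed").fit(hik(X, X), y)
pred = svm.predict(hik(X, X))      # training-set predictions, for illustration
```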
Liu, Huiling; Xia, Bingbing; Yi, Dehui
2016-01-01
We propose a new feature extraction method for liver pathological images based on multispatial mapping and statistical properties. For liver pathological images with Hematein Eosin staining, the R and B channel images better reflect the sensitivity of liver pathological images, while the entropy space and Local Binary Pattern (LBP) space better reflect the texture features of the image. To obtain more comprehensive information, we map liver pathological images to the entropy space, LBP space, R space, and B space. The traditional Higher Order Local Autocorrelation Coefficients (HLAC) cannot reflect the overall information of the image, so we propose an average-corrected HLAC feature. We calculate the statistical properties and the average gray value of the pathological image and then update each pixel value to the absolute value of the difference between its gray value and the average gray value, which makes the feature more sensitive to gray value changes in pathological images. Lastly, the HLAC template is used to calculate the features of the updated image. The experimental results show that the improved multispatial mapping features have better classification performance for liver cancer. PMID:27022407
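A minimal sketch of the average-correction step, assuming grayscale input. The three feature products below are a tiny illustrative subset of the standard HLAC mask set, and periodic boundaries (np.roll) are used for brevity.

```python
import numpy as np

def average_corrected(img):
    # Replace each pixel by its absolute deviation from the image mean.
    return np.abs(img.astype(float) - img.mean())

def hlac_like_features(img):
    c = average_corrected(img)
    return [
        float(c.mean()),                                  # 0th-order mask
        float((c * np.roll(c, 1, axis=1)).mean()),        # horizontal pixel pair
        float((c * np.roll(c, 1, axis=0)
                 * np.roll(c, -1, axis=0)).mean()),       # vertical pixel triple
    ]

img = np.random.default_rng(0).integers(0, 256, size=(64, 64))
print(hlac_like_features(img))
```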
Objects and categories: feature statistics and object processing in the ventral stream.
Tyler, Lorraine K; Chiu, Shannon; Zhuang, Jie; Randall, Billi; Devereux, Barry J; Wright, Paul; Clarke, Alex; Taylor, Kirsten I
2013-10-01
Recognizing an object involves more than just visual analyses; its meaning must also be decoded. Extensive research has shown that processing the visual properties of objects relies on a hierarchically organized stream in ventral occipitotemporal cortex, with increasingly more complex visual features being coded from posterior to anterior sites culminating in the perirhinal cortex (PRC) in the anteromedial temporal lobe (aMTL). The neurobiological principles of the conceptual analysis of objects remain more controversial. Much research has focused on two neural regions: the fusiform gyrus and aMTL, both of which show semantic category differences, but of different types. fMRI studies show category differentiation in the fusiform gyrus, based on clusters of semantically similar objects, whereas category-specific deficits, specifically for living things, are associated with damage to the aMTL. These category-specific deficits for living things have been attributed to problems in differentiating between highly similar objects, a process that involves the PRC. To determine whether the PRC and the fusiform gyri contribute to different aspects of an object's meaning, with differentiation between confusable objects in the PRC and categorization based on object similarity in the fusiform, we carried out an fMRI study of object processing based on a feature-based model that characterizes the degree of semantic similarity and difference between objects and object categories. Participants saw 388 objects for which feature statistic information was available and named the objects at the basic level while undergoing fMRI scanning. After controlling for the effects of visual information, we found that feature statistics that capture similarity between objects formed category clusters in fusiform gyri, such that objects with many shared features (typical of living things) were associated with activity in the lateral fusiform gyri whereas objects with fewer shared features (typical of nonliving things) were associated with activity in the medial fusiform gyri. Significantly, a feature statistic reflecting differentiation between highly similar objects, enabling object-specific representations, was associated with bilateral PRC activity. These results confirm that the statistical characteristics of conceptual object features are coded in the ventral stream, supporting a conceptual feature-based hierarchy, and integrating disparate findings of category responses in fusiform gyri and category deficits in aMTL into a unifying neurocognitive framework.
Quality evaluation of no-reference MR images using multidirectional filters and image statistics.
Jang, Jinseong; Bang, Kihun; Jang, Hanbyol; Hwang, Dosik
2018-09-01
This study aimed to develop a fully automatic, no-reference image-quality assessment (IQA) method for MR images. New quality-aware features were obtained by applying multidirectional filters to MR images and examining the feature statistics. A histogram of these features was then fitted to a generalized Gaussian distribution function, whose shape parameters yield different values depending on the type of distortion in the MR image. Standard feature statistics were established through a training process based on high-quality MR images without distortion. Subsequently, the feature statistics of a test MR image were calculated and compared with the standards. The quality score was calculated as the difference between the shape parameters of the test image and those of the undistorted standard images. The proposed IQA method showed a >0.99 correlation with conventional full-reference assessment methods; accordingly, it yielded the best performance among no-reference IQA methods for images containing six types of synthetic, MR-specific distortions. In addition, for authentically distorted images, the proposed method yielded the highest correlation with subjective assessments by human observers, demonstrating its superior performance over other no-reference IQAs. Our proposed IQA was designed to consider MR-specific features and outperformed other no-reference IQAs designed mainly for photographic images. Magn Reson Med 80:914-924, 2018.
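The shape-parameter fit at the heart of such scores can be sketched compactly. Below is a standard moment-matching estimator for the generalized Gaussian (our choice of estimator; the paper's exact fitting routine is not given here). A quality score in this spirit is then a distance between the shape parameters fitted to a test image's filter responses and those learned from undistorted images.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.special import gamma

def ggd_shape(x):
    # Match the ratio E[|x|]^2 / E[x^2], which determines the shape parameter
    # beta (beta = 2 is Gaussian; smaller beta means heavier tails).
    rho = np.mean(np.abs(x)) ** 2 / np.mean(x ** 2)
    f = lambda b: gamma(2 / b) ** 2 / (gamma(1 / b) * gamma(3 / b)) - rho
    return brentq(f, 0.05, 10.0)

rng = np.random.default_rng(0)
print(ggd_shape(rng.normal(size=10_000)))    # ~2 for Gaussian responses
print(ggd_shape(rng.laplace(size=10_000)))   # ~1 for heavier-tailed responses
```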
QSAR and 3D QSAR of inhibitors of the epidermal growth factor receptor
NASA Astrophysics Data System (ADS)
Pinto-Bazurco, Mariano; Tsakovska, Ivanka; Pajeva, Ilza
This article reports quantitative structure-activity relationships (QSAR) and 3D QSAR models of 134 structurally diverse inhibitors of the epidermal growth factor receptor (EGFR) tyrosine kinase. Free-Wilson analysis was used to derive the QSAR model. It identified the substituents in aniline, the polycyclic system, and the substituents at the 6- and 7-positions of the polycyclic system as the most important structural features. Comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA) were used in the 3D QSAR modeling. The steric and electrostatic interactions proved the most important for the inhibitory effect. Both QSAR and 3D QSAR models led to consistent results. On the basis of the statistically significant models, new structures were proposed and their inhibitory activities were predicted.
Analysis of the Tanana River Basin using LANDSAT data
NASA Technical Reports Server (NTRS)
Morrissey, L. A.; Ambrosia, V. G.; Carson-Henry, C.
1981-01-01
Digital image classification techniques were used to classify land cover/resource information in the Tanana River Basin of Alaska. Portions of four scenes of LANDSAT digital data were analyzed using computer systems at Ames Research Center in an unsupervised approach to derive cluster statistics. The spectral classes were identified using the IDIMS display and color infrared photography. Classification errors were corrected using stratification procedures. The classification scheme resulted in the following eleven categories: sedimented/shallow water, clear/deep water, coniferous forest, mixed forest, deciduous forest, shrub and grass, bog, alpine tundra, barrens, snow and ice, and cultural features. Color-coded maps and acreage summaries of the major land cover categories were generated for selected USGS quadrangles (1:250,000) which lie within the drainage basin. The project was completed within six months.
Tsallis thermostatistics for finite systems: a Hamiltonian approach
NASA Astrophysics Data System (ADS)
Adib, Artur B.; Moreira, André A.; Andrade, José S., Jr.; Almeida, Murilo P.
2003-05-01
The derivation of the Tsallis generalized canonical distribution from the traditional approach of the Gibbs microcanonical ensemble is revisited (Phys. Lett. A 193 (1994) 140). We show that finite systems whose Hamiltonians obey a generalized homogeneity relation rigorously follow the nonextensive thermostatistics of Tsallis. In the thermodynamical limit, however, our results indicate that the Boltzmann-Gibbs statistics is always recovered, regardless of the type of potential among interacting particles. This approach provides, moreover, a one-to-one correspondence between the generalized entropy and the Hamiltonian structure of a wide class of systems, revealing a possible origin for the intrinsic nonlinear features present in the Tsallis formalism that lead naturally to power-law behavior. Finally, we confirm these exact results through extensive numerical simulations of the Fermi-Pasta-Ulam chain of anharmonic oscillators.
Stephens, David; Diesing, Markus
2014-01-01
Detailed seabed substrate maps are increasingly in demand for effective planning and management of marine ecosystems and resources. It has become common to use remotely sensed multibeam echosounder data in the form of bathymetry and acoustic backscatter in conjunction with ground-truth sampling data to inform the mapping of seabed substrates. Whilst, until recently, such data sets have typically been classified by expert interpretation, it is now obvious that more objective, faster and repeatable methods of seabed classification are required. This study compares the performances of a range of supervised classification techniques for predicting substrate type from multibeam echosounder data. The study area is located in the North Sea, off the north-east coast of England. A total of 258 ground-truth samples were classified into four substrate classes. Multibeam bathymetry and backscatter data, and a range of secondary features derived from these datasets, were used in this study. Six supervised classification techniques were tested: Classification Trees, Support Vector Machines, k-Nearest Neighbour, Neural Networks, Random Forest and Naive Bayes. Each classifier was trained multiple times using different input features, including i) the two primary features of bathymetry and backscatter, ii) a subset of the features chosen by a feature selection process and iii) all of the input features. The predictive performances of the models were validated using a separate test set of ground-truth samples. The statistical significance of model performances relative to a simple baseline model (Nearest Neighbour predictions on bathymetry and backscatter) was tested to assess the benefits of using more sophisticated approaches. The best performing models were tree-based methods and Naive Bayes, which achieved accuracies of around 0.8 and kappa coefficients of up to 0.5 on the test set. The models that used all input features did not generally perform well, highlighting the need for some means of feature selection.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lamichhane, N; Johnson, P; Chinea, F
Purpose: To evaluate the correlation between image features and the accuracy of manually drawn target contours on synthetic PET images. Methods: A digital PET phantom was used in combination with Monte Carlo simulation to create a set of 26 simulated PET images featuring a variety of tumor shapes and activity heterogeneity. These tumor volumes were used as a gold standard in comparisons with manual contours delineated by 10 radiation oncologists on the simulated PET images. Metrics used to evaluate segmentation accuracy included the dice coefficient, false positive dice, false negative dice, symmetric mean absolute surface distance, and absolute volumetric difference. Image features extracted from the simulated tumors consisted of volume, shape complexity, mean curvature, and intensity contrast, along with five texture features derived from the gray-level neighborhood difference matrices, including contrast, coarseness, busyness, strength, and complexity. Correlations between these features and contouring accuracy were examined. Results: Contour accuracy was reasonably well correlated with a variety of image features. The dice coefficient ranged from 0.7 to 0.90 and was correlated closely with contrast (r=0.43, p=0.02) and complexity (r=0.5, p<0.001). False negative dice ranged from 0.10 to 0.50 and was correlated closely with contrast (r=0.68, p<0.001) and complexity (r=0.66, p<0.001). Absolute volumetric difference ranged from 0.0002 to 0.67 and was correlated closely with coarseness (r=0.46, p=0.02) and complexity (r=0.49, p=0.008). Symmetric mean absolute difference ranged from 0.02 to 1 and was correlated closely with mean curvature (r=0.57, p=0.02) and contrast (r=0.6, p=0.001). Conclusion: The long-term goal of this study is to assess whether contouring variability can be reduced by providing feedback to the practitioner based on image feature analysis. The results are encouraging and will be used to develop a statistical model which will enable a prediction of contour accuracy based purely on image feature analysis.
Design and Synthesis of Mannich bases as Benzimidazole Derivatives as Analgesic Agents.
Datar, Prasanna A; Limaye, Saleel A
2015-01-01
Mannich bases were selected for a 2D QSAR study to derive a meaningful relationship between structural features and analgesic activity. Using the knowledge of the important features, a novel series was designed to obtain improved analgesic activity. A series of novel Mannich bases, 1-[(N-substituted amino)methyl]-2-substituted benzimidazole derivatives, was synthesized and screened for analgesic activity. Some of these compounds showed promising analgesic activity when compared with the standard drug diclofenac sodium.
Featureless classification of light curves
NASA Astrophysics Data System (ADS)
Kügler, S. D.; Gianniotis, N.; Polsterer, K. L.
2015-08-01
In the era of rapidly increasing amounts of time series data, classification of variable objects has become the main objective of time-domain astronomy. Classification of irregularly sampled time series is particularly difficult because the data cannot be represented naturally as a vector which can be directly fed into a classifier. In the literature, various statistical features serve as vector representations. In this work, we represent time series by a density model. The density model captures all the information available, including measurement errors. Hence, we view this model as a generalization of the static features, which can be derived directly from the density, e.g. as moments. Similarity between each pair of time series is quantified by the distance between their respective models. Classification is performed on the obtained distance matrix. In the numerical experiments, we use data from the OGLE (Optical Gravitational Lensing Experiment) and ASAS (All Sky Automated Survey) surveys and demonstrate that the proposed representation performs on par with the best currently used feature-based approaches. The density representation preserves all static information present in the observational data, in contrast to a less complete description by features. The density representation is an upper boundary in terms of information made available to the classifier. Consequently, the predictive power of the proposed classification depends only on the choice of similarity measure and classifier. Due to its principled nature, we advocate that this new approach of representing time series has potential in tasks beyond classification, e.g. unsupervised learning.
Multiplicative Multitask Feature Learning
Wang, Xin; Bi, Jinbo; Yu, Shipeng; Sun, Jiangwen; Song, Minghu
2016-01-01
We investigate a general framework of multiplicative multitask feature learning which decomposes individual task’s model parameters into a multiplication of two components. One of the components is used across all tasks and the other component is task-specific. Several previous methods can be proved to be special cases of our framework. We study the theoretical properties of this framework when different regularization conditions are applied to the two decomposed components. We prove that this framework is mathematically equivalent to the widely used multitask feature learning methods that are based on a joint regularization of all model parameters, but with a more general form of regularizers. Further, an analytical formula is derived for the across-task component as related to the task-specific component for all these regularizers, leading to a better understanding of the shrinkage effects of different regularizers. Study of this framework motivates new multitask learning algorithms. We propose two new learning formulations by varying the parameters in the proposed framework. An efficient blockwise coordinate descent algorithm is developed suitable for solving the entire family of formulations with rigorous convergence analysis. Simulation studies have identified the statistical properties of data that would be in favor of the new formulations. Extensive empirical studies on various classification and regression benchmark data sets have revealed the relative advantages of the two new formulations by comparing with the state of the art, which provides instructive insights into the feature learning problem with multiple tasks. PMID:28428735
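As a sketch in our notation (the paper studies a general family of regularizers indexed here by p and q), the multiplicative decomposition can be written as a joint minimization over a shared component c and task-specific components v_t:

```latex
% Multiplicative decomposition w_t = c \odot v_t (elementwise product);
% c is shared across all T tasks, v_t is specific to task t.
\min_{c \,\ge\, 0,\; \{v_t\}} \;\sum_{t=1}^{T} L\bigl(y_t,\, X_t\,(c \odot v_t)\bigr)
  \;+\; \lambda_1 \lVert c \rVert_p^p
  \;+\; \lambda_2 \sum_{t=1}^{T} \lVert v_t \rVert_q^q ,
\qquad w_t = c \odot v_t .
```

The blockwise coordinate descent described above then alternates between solving for c with all v_t fixed and solving each task's v_t with c fixed.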
NASA Astrophysics Data System (ADS)
Hamrouni, Sameh; Rougon, Nicolas; Prêteux, Françoise
2011-03-01
In perfusion MRI (p-MRI) exams, short-axis (SA) image sequences are captured at multiple slice levels along the long-axis of the heart during the transit of a vascular contrast agent (Gd-DTPA) through the cardiac chambers and muscle. Compensating cardio-thoracic motions is a requirement for enabling computer-aided quantitative assessment of myocardial ischaemia from contrast-enhanced p-MRI sequences. The classical paradigm consists of registering each sequence frame on a reference image using some intensity-based matching criterion. In this paper, we introduce a novel unsupervised method for the spatio-temporal groupwise registration of cardiac p-MRI exams based on normalized mutual information (NMI) between high-dimensional feature distributions. Here, local contrast enhancement curves are used as a dense set of spatio-temporal features, and statistically matched through variational optimization to a target feature distribution derived from a registered reference template. The hard issue of probability density estimation in high-dimensional state spaces is bypassed by using consistent geometric entropy estimators, allowing NMI to be computed directly from feature samples. Specifically, a computationally efficient kth-nearest neighbor (kNN) estimation framework is retained, leading to closed-form expressions for the gradient flow of NMI over finite- and infinite-dimensional motion spaces. This approach is applied to the groupwise alignment of cardiac p-MRI exams using a free-form deformation (FFD) model for cardio-thoracic motions. Experiments on simulated and natural datasets suggest its accuracy and robustness for registering p-MRI exams comprising more than 30 frames.
NASA Astrophysics Data System (ADS)
Murphy, Richard J.; Monteiro, Sildomar T.
2013-01-01
Hyperspectral imagery is used to map the distribution of iron and separate iron ore from shale (a waste product) on a vertical mine face in an open-pit mine in the Pilbara, Western Australia. Vertical mine faces have complex surface geometries which cause large spatial variations in the amount of incident and reflected light. Methods used to analyse imagery must minimise these effects whilst preserving any spectral variations between rock types and minerals. Derivative analysis of spectra to the 1st-, 2nd- and 4th-order is used to do this. To quantify the relative amounts and distribution of iron, the derivative spectrum is integrated across the visible and near-infrared spectral range (430-970 nm) and over those wavelength regions containing individual peaks and troughs associated with specific iron absorption features. As a test of this methodology, results from laboratory spectra acquired from representative rock samples were compared with total amounts of iron minerals from X-ray diffraction (XRD) analysis. Relationships between derivatives integrated over the visible near-infrared range and total amounts (% weight) of iron minerals were strongest for the 4th- and 2nd-derivative (R2 = 0.77 and 0.74, respectively) and weakest for the 1st-derivative (R2 = 0.56). Integrated values of individual peaks and troughs showed moderate to strong relationships in 2nd- (R2 = 0.68-0.78) and 4th-derivative (R2 = 0.49-0.78) spectra. The weakest relationships were found for peaks or troughs towards longer wavelengths. The same derivative methods were then applied to imagery to quantify relative amounts of iron minerals on a mine face. Before analyses, predictions were made about the relative abundances of iron in the different geological zones on the mine face, as mapped from field surveys. Integration of the whole spectral curve (430-970 nm) from the 2nd- and 4th-derivative gave results which were entirely consistent with predictions. Conversely, integration of the 1st-derivative gave results that neither fit the predictions nor distinguished between zones with very large and small amounts of iron oxide. Classified maps of ore and shale were created using a simple level-slice of the 1st-derivative reflectance at 702, 765 and 809 nm. Pixels classified as shale showed a similar distribution to kaolinite (an indicator of shales in the region), as mapped by the depth of the diagnostic kaolinite absorption feature at 2196 nm. Standard statistical measures of classification performance (accuracy, precision, recall and the Kappa coefficient of agreement) indicated that nearly all of the pixels were classified correctly using 1st-derivative reflectance at 765 and 809 nm. These results indicate that data from the VNIR (430-970 nm) can be used to quantify, without a priori knowledge, the total amount of iron minerals and to distinguish ore from shale on vertical mine faces.
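The derivative-and-integrate recipe is straightforward to sketch in code. Everything below (sampling grid, toy spectrum, use of the absolute derivative) is illustrative rather than the authors' exact pipeline, which would typically smooth the spectrum before differentiating.

```python
import numpy as np
from scipy.integrate import trapezoid

def integrated_derivative(wl, refl, order=2, lo=430.0, hi=970.0):
    d = refl.astype(float)
    for _ in range(order):
        d = np.gradient(d, wl)              # numerical derivative w.r.t. wavelength
    m = (wl >= lo) & (wl <= hi)
    return trapezoid(np.abs(d[m]), wl[m])   # area under |d^n R / d lambda^n|

wl = np.linspace(400, 1000, 601)            # hypothetical 1 nm sampling
refl = 0.3 + 0.1 * np.sin(wl / 50.0)        # toy reflectance spectrum
print(integrated_derivative(wl, refl, order=2))
```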
Nieri, Michele; Clauser, Carlo; Franceschi, Debora; Pagliaro, Umberto; Saletta, Daniele; Pini-Prato, Giovanpaolo
2007-08-01
The aim of the present study was to investigate the relationships among reported methodological, statistical, clinical and paratextual variables of randomized clinical trials (RCTs) in implant therapy, and their influence on subsequent research. The material consisted of the RCTs in implant therapy published through the end of the year 2000. Methodological, statistical, clinical and paratextual features of the articles were assessed and recorded. The perceived clinical relevance was subjectively evaluated by an experienced clinician on anonymous abstracts. The impact on research was measured by the number of citations found in the Science Citation Index. A new statistical technique (Structural learning of Bayesian Networks) was used to assess the relationships among the considered variables. Descriptive statistics revealed that the reported methodology and statistics of RCTs in implant therapy were defective. Follow-up of the studies was generally short. The perceived clinical relevance appeared to be associated with the objectives of the studies and with the number of published images in the original articles. The impact on research was related to the nationality of the involved institutions and to the number of published images. RCTs in implant therapy (until 2000) show important methodological and statistical flaws and may not be appropriate for guiding clinicians in their practice. The methodological and statistical quality of the studies did not appear to affect their impact on practice and research. Bayesian Networks suggest new and unexpected relationships among the methodological, statistical, clinical and paratextual features of RCTs.
An adaptive multi-feature segmentation model for infrared image
NASA Astrophysics Data System (ADS)
Zhang, Tingting; Han, Jin; Zhang, Yi; Bai, Lianfa
2016-04-01
Active contour models (ACMs) have been extensively applied to image segmentation, but conventional region-based active contour models utilize only global or local single-feature information to minimize the energy functional that drives the contour evolution. Considering the limitations of the original ACMs, an adaptive multi-feature segmentation model is proposed to handle infrared images with blurred boundaries and low contrast. In the proposed model, several essential local statistical features are introduced to construct a multi-feature signed pressure function (MFSPF). In addition, we draw upon an adaptive weight coefficient to modify the level set formulation, which is formed by integrating the MFSPF, built from local statistical features, with a signed pressure function carrying global information. Experimental results demonstrate that the proposed method makes up for the inadequacy of the original methods and obtains desirable results in segmenting infrared images.
Vehicle license plate recognition based on geometry restraints and multi-feature decision
NASA Astrophysics Data System (ADS)
Wu, Jianwei; Wang, Zongyue
2005-10-01
Vehicle license plate (VLP) recognition is of great importance to many traffic applications. Though researchers have paid much attention to VLP recognition, there is not yet a fully operational VLP recognition system, for many reasons. This paper discusses a valid and practical method for vehicle license plate recognition based on geometry restraints and a multi-feature decision scheme that includes statistical and structural features. In general, VLP recognition comprises the following steps: location of the VLP, character segmentation, and character recognition. This paper discusses the three steps in detail. The characters of a VLP are often inclined owing to many factors, which makes character recognition more difficult; therefore, geometry restraints such as the general ratio of length to width and the perpendicularity of adjacent edges are used for incline correction. Image moments have been proven invariant to translation, rotation and scaling, so they are used as one feature for character recognition. The stroke is the basic element of writing, and hence taking it as a feature is helpful for character recognition. Finally, we take the image moments, the strokes and the number of each stroke type for each character image, together with some other structural and statistical features, as the multi-feature set, and match each character image against sample character images so that each character can be recognized by a BP neural network. The proposed method combines statistical and structural features for VLP recognition, and the results show its validity and efficiency.
Statistics of velocity gradients in two-dimensional Navier-Stokes and ocean turbulence.
Schorghofer, Norbert; Gille, Sarah T
2002-02-01
Probability density functions and conditional averages of velocity gradients derived from upper ocean observations are compared with results from forced simulations of the two-dimensional Navier-Stokes equations. Ocean data are derived from TOPEX satellite altimeter measurements. The simulations use rapid forcing on large scales, characteristic of surface winds. The probability distributions of transverse velocity derivatives from the ocean observations agree with the forced simulations, although they differ from unforced simulations reported elsewhere. The distribution and cross correlation of velocity derivatives provide clear evidence that large coherent eddies play only a minor role in generating the observed statistics.
Wang, Shijun; Yao, Jianhua; Petrick, Nicholas; Summers, Ronald M.
2010-01-01
Colon cancer is the second leading cause of cancer-related deaths in the United States. Computed tomographic colonography (CTC) combined with a computer-aided detection system provides a feasible approach for improving the detection of colonic polyps and increasing the use of CTC for colon cancer screening. To distinguish true polyps from false positives, various features extracted from polyp candidates have been proposed. Most of these traditional features try to capture the shape information of polyp candidates or neighborhood knowledge about the surrounding structures (fold, colon wall, etc.). In this paper, we propose a new set of shape descriptors for polyp candidates based on statistical curvature information. These features, called histograms of curvature features, are rotation, translation and scale invariant and can be treated as complementary to the existing feature set. Then, in order to make full use of the traditional geometric features (defined as group A) and the new statistical features (group B), which are highly heterogeneous, we employed a multiple kernel learning method based on semi-definite programming to learn an optimized classification kernel from the two groups of features. We conducted a leave-one-patient-out test on a CTC dataset which contained scans from 66 patients. Experimental results show that a support vector machine (SVM) based on the combined feature set and the semi-definite optimization kernel achieved higher FROC performance compared to SVMs using the two groups of features separately. At a rate of 5 false positives per scan, the sensitivity of the SVM using the combined features improved from 0.77 (group A) and 0.73 (group B) to 0.83 (p ≤ 0.01). PMID:20953299
Active contours on statistical manifolds and texture segmentation
Sang-Mook Lee; A. Lynn Abbott; Neil A. Clark; Philip A. Araman
2005-01-01
A new approach to active contours on statistical manifolds is presented. The statistical manifolds are 2-dimensional Riemannian manifolds that are statistically defined by maps that transform a parameter domain onto a set of probability density functions. In this novel framework, color or texture features are measured at each image point and their statistical...
DOE Office of Scientific and Technical Information (OSTI.GOV)
Oliver, J; Budzevich, M; Moros, E
Purpose: To investigate the relationship between quantitative image features (i.e. radiomics) and statistical fluctuations (i.e. electronic noise) in clinical computed tomography (CT) using the standardized American College of Radiology (ACR) CT accreditation phantom and patient images. Methods: Three levels of uncorrelated Gaussian noise were added to CT images of the phantom and of 20 patients acquired in static mode and respiratory tracking mode. We calculated the noise-power spectrum (NPS) of the original CT images of the phantom, and of the phantom images with added Gaussian noise with means of 50, 80, and 120 HU. Concurrently, on patient images (original and noise-added), image features were calculated: 14 shape, 19 intensity (1st-order statistics from intensity volume histograms), 18 GLCM features (2nd-order statistics from grey-level co-occurrence matrices) and 11 RLM features (2nd-order statistics from run-length matrices). These features provide the underlying structural information of the images. The GLCM (size 128x128) was calculated with a step size of 1 voxel in 13 directions and averaged. RLM feature calculation was performed in 13 directions with grey levels binned into 128 levels. Results: Adding the electronic noise to the images modified the quality of the NPS, shifting the noise from mostly correlated to mostly uncorrelated voxels. The dramatic increase in noise texture did not affect image structure/contours significantly for patient images. However, it did affect the image features and textures significantly, as demonstrated by GLCM differences. Conclusion: Image features are sensitive to acquisition factors (simulated by adding uncorrelated Gaussian noise). We speculate that image features will be more difficult to detect in the presence of electronic noise (an uncorrelated noise contributor) or, for that matter, any other highly correlated image noise. This work focuses on the effect of electronic, uncorrelated noise; future work shall examine the influence of changes in quantum noise on the features. J. Oliver was supported by NSF FGLSAMP BD award HRD #1139850 and the McKnight Doctoral Fellowship.
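The core experiment is easy to emulate. The sketch below uses scikit-image on a synthetic gradient image as a stand-in for a CT slice; the three noise strengths echo the 50/80/120 HU levels above but are applied here as standard deviations, purely for illustration.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

rng = np.random.default_rng(0)
img = np.tile(np.linspace(0, 255, 128), (128, 1))        # stand-in CT slice

for sigma in (0, 50, 80, 120):
    # Add zero-mean Gaussian noise, clip to the 8-bit range, and re-quantize.
    noisy = np.clip(img + rng.normal(0, sigma, img.shape), 0, 255).astype(np.uint8)
    g = graycomatrix(noisy, distances=[1], angles=[0, np.pi / 2],
                     levels=256, symmetric=True, normed=True)
    print(sigma, graycoprops(g, "contrast").mean(),
          graycoprops(g, "homogeneity").mean())           # features drift with noise
```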
Impact of feature saliency on visual category learning
Hammer, Rubi
2015-01-01
People have to sort numerous objects into a large number of meaningful categories while operating in varying contexts. This requires identifying the visual features that best predict the 'essence' of objects (e.g., edibility), rather than categorizing objects based on the most salient features in a given context. To gain this capacity, visual category learning (VCL) relies on multiple cognitive processes. These may include unsupervised statistical learning, which requires observing multiple objects in order to learn the statistics of their features. Other learning processes enable incorporating different sources of supervisory information, alongside the visual features of the categorized objects, from which the categorical relations between a few objects can be deduced. These deductions enable inferring that objects from the same category may differ from one another in some high-saliency feature dimensions, whereas lower-saliency feature dimensions can best differentiate objects from distinct categories. Here I illustrate how feature saliency affects VCL, and also discuss the kinds of supervisory information that enable reflective categorization. Arguably, the principles debated here are often ignored in categorization studies. PMID:25954220
NASA Astrophysics Data System (ADS)
He, Honghui; Dong, Yang; Zhou, Jialing; Ma, Hui
2017-03-01
As one of the salient features of light, polarization contains abundant structural and optical information about media. Recently, as a comprehensive description of polarization properties, Mueller matrix polarimetry has been applied to various biomedical studies such as the detection of cancerous tissues. Previous works have found that the structural information encoded in 2D Mueller matrix images can be represented by transformed parameters with a more explicit relationship to certain microstructural features. In this paper, we present a statistical analysis method that transforms the 2D Mueller matrix images into frequency distribution histograms (FDHs) and their central moments to reveal the dominant structural features of samples quantitatively. The experimental results for porcine heart, intestine, stomach, and liver tissues demonstrate that the transformation parameters and central moments based on the statistical analysis of Mueller matrix elements have simple relationships to the dominant microstructural properties of biomedical samples, including the density and orientation of fibrous structures and the depolarization power, diattenuation and absorption abilities. We show that the statistical analysis of 2D images of Mueller matrix elements may provide quantitative or semi-quantitative criteria for biomedical diagnosis.
Prevalence of herpes simplex, Epstein Barr and human papilloma viruses in oral lichen planus.
Yildirim, Benay; Sengüven, Burcu; Demir, Cem
2011-03-01
The aim of the present study was to assess the prevalence of Herpes Simplex virus, Epstein Barr virus and Human Papilloma virus-16 in oral lichen planus cases, and to evaluate whether any clinical variant, histopathological or demographic feature correlates with these viruses. The study was conducted on 65 cases. Viruses were detected immunohistochemically. We evaluated the histopathological and demographic features and statistically analysed the correlation of these features with Herpes Simplex virus, Epstein Barr virus and Human Papilloma virus-16 positivity. Herpes Simplex virus was positive in six (9%) cases, which was not statistically significant. Epstein Barr virus was positive in 23 cases (35%), which was statistically significant. Human Papilloma virus positivity in 14 cases (21%) was also statistically significant. Except for basal cell degeneration in Herpes Simplex virus positive cases, we did not observe any significant correlation between virus positivity and demographic or histopathological features. However, an increased risk of Epstein Barr virus and Human Papilloma virus infection was noted in oral lichen planus cases. Taking into account the oncogenic potential of both viruses, oral lichen planus cases should be screened for the presence of these viruses.
Binomial test statistics using Psi functions
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bowman, Kimiko O.
2007-01-01
For the negative binomial model with probability generating function $(p + 1 - pt)^{-k}$, a logarithmic derivative is the Psi function difference $\psi(k + x) - \psi(k)$; this and its derivatives lead to a test statistic to decide on the validity of a specified model. The test statistic is applied to real data sets, so a comparison between theory and application is available. Note that the test function is not dominated by outliers. Applications to (i) Fisher's tick data, (ii) accident data, and (iii) Weldon's dice data are included.
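A one-step derivation of where the Psi-function difference arises, assuming the standard negative binomial pmf consistent with this generating function:

```latex
% Assuming the standard negative binomial pmf consistent with the
% generating function (p + 1 - pt)^{-k}:
\[
  P(X = x) \;=\; \frac{\Gamma(k+x)}{\Gamma(k)\,x!}
  \left(\frac{p}{1+p}\right)^{\!x}\left(\frac{1}{1+p}\right)^{\!k},
\]
% the logarithmic derivative with respect to k is
\[
  \frac{\partial \ln P}{\partial k}
  \;=\; \psi(k+x) \;-\; \psi(k) \;-\; \ln(1+p),
\]
% which contains exactly the Psi-function difference on which the
% test statistic is built.
```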
The use of IFSAR data in GIS-based landslide susceptibility evaluation
NASA Astrophysics Data System (ADS)
Floris, M.; Squarzoni, C.; Hundseder, C.; Mason, M.; Genevois, R.
2010-05-01
GIS-based landslide susceptibility evaluation is based on the spatial relationships between landslides and their related factors. The analyses are highly conditioned by the precision and accuracy of the input factors, in particular landslide identification and characterization. Factors influencing landslide spatial hazard consist of geological, geomorphological, hydrogeological and tectonic features, geomechanical and geotechnical properties, land use and management, and DEM-derived factors (elevation, slope, aspect, curvature, superficial flow). The choice of influencing factors depends on the method of analysis, the scale of the inputs, the aim of the outputs, and the availability and quality of the input data. The choice can thus be made a priori, on the basis of in-depth territorial knowledge and expert judgement, or by performing statistical analyses to identify the significance of each influencing factor. Owing to the wide availability of terrain data, spatial models often include DEM-derived factors, but the resolution and accuracy of the DEMs influence the final outputs. In this work the relationships between landslides that occurred in the volcanic area of the Euganean Hills Regional Park (SE of Padua, Veneto region, Italy) and morphometric factors (slope, aspect and curvature) will be examined through a simple probability method. The use of complex and time-consuming mathematical or statistical models is not always recommended, because simple models can often lead to more accurate results. Morphometric input factors are derived from DEMs created from vector elevation data of the regional cartography at 1:5,000 scale and from NEXTMap® data (http://www.intermap.com). The NEXTMap® Digital Surface Model (DSM) and Digital Terrain Model (DTM) are generated using Intermap's IFSAR (Interferometric Synthetic Aperture Radar) technology mounted on an aircraft flying at 8500 m above Mean Sea Level with a side viewing angle of about 45°. The DSM represents the first reflective surface illuminated by the radar. IFSAR sensors retrieve the mean height of the main scattering elements in a grid cell, known as the scattering phase centre height. The radar return from vegetation usually penetrates to some extent below the 'first' tree canopy height. The DTM is derived from the DSM by a semi-automated process that classifies areas as obstructed (buildings and vegetation) or unobstructed, and processes the obstructed areas to approximate bald earth. The DSM and DTM data have a post spacing of 5 m and a vertical accuracy of 1 m (RMSE) or better in areas of unobstructed flat terrain. The IFSAR elevation models are compared with photogrammetrically derived models (topographic map of the Veneto Region) in the following respects: every elevation point of the IFSAR models is derived from a direct measure of the terrain surface, while photogrammetric elevation models are usually compiled through digitization and interpolation of contour lines; frequent seam lines are evident in DEMs derived from vector maps, compiled over many years with different specifications and tools; the 5 m posted IFSAR DEMs give a much more detailed description of terrain features; and seamless, homogeneous IFSAR elevation models pave the way to accurate applications such as landslide studies and risk assessment. The results obtained using the two DEM sources will be compared, and the contribution of IFSAR data to the GIS-based spatial analysis of the study area will be tested and discussed.
NASA Astrophysics Data System (ADS)
Jahani, Nariman; Cohen, Eric; Hsieh, Meng-Kang; Weinstein, Susan P.; Pantalone, Lauren; Davatzikos, Christos; Kontos, Despina
2018-02-01
We examined the ability of DCE-MRI longitudinal features to give early prediction of recurrence-free survival (RFS) in women undergoing neoadjuvant chemotherapy for breast cancer, in a retrospective analysis of 106 women from the ISPY 1 cohort. These features were based on the voxel-wise changes seen in registered images taken before treatment and after the first round of chemotherapy. We computed the transformation field using a robust deformable image registration technique to match breast images from these two visits. Using the deformation field, parametric response maps (PRMs), a voxel-based analysis of longitudinal changes in images between visits, were computed for maps of four kinetic features (signal enhancement ratio, peak enhancement, and wash-in/wash-out slopes). A two-level discrete wavelet transform was applied to these PRMs to extract heterogeneity information about tumor change between visits. To estimate survival, a Cox proportional hazards model was applied, with the C statistic as the measure of success in predicting RFS. The best PRM feature for each of the four kinetic features was selected by C statistic in univariable analysis. The baseline model, incorporating functional tumor volume, age, race, and hormone response status, had a C statistic of 0.70 in predicting RFS. The model augmented with the four PRM features had a C statistic of 0.76. Thus, our results suggest that adding information on the texture of voxel-level changes in tumor kinetic response between registered images of the first and second visits could improve early RFS prediction in breast cancer after neoadjuvant chemotherapy.
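A minimal sketch of the PRM-plus-wavelet pipeline under stated assumptions (synthetic registered maps, an arbitrary ±0.2 response threshold, and a db2 wavelet; none of these are the study's settings):

```python
# Sketch: a parametric response map (PRM) from two registered kinetic
# maps, summarized by a two-level 2-D discrete wavelet transform.
import numpy as np
import pywt

rng = np.random.default_rng(2)
visit1 = rng.normal(size=(64, 64))                     # e.g. peak enhancement
visit2 = visit1 + rng.normal(0.0, 0.3, size=(64, 64))  # registered follow-up

delta = visit2 - visit1                  # voxel-wise longitudinal change
prm_up = float((delta > 0.2).mean())     # fraction of "responding" voxels
prm_down = float((delta < -0.2).mean())

coeffs = pywt.wavedec2(delta, "db2", level=2)
# Detail-coefficient energies per level as texture-of-change features
energies = [sum(float((band ** 2).sum()) for band in level)
            for level in coeffs[1:]]
print(prm_up, prm_down, energies)
```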
NASA Technical Reports Server (NTRS)
Freilich, M. H.; Pawka, S. S.
1987-01-01
The statistics of Sxy estimates derived from orthogonal-component measurements are examined. Based on results of Goodman (1957), the probability density function (pdf) for Sxy(f) estimates is derived, and a closed-form solution for arbitrary moments of the distribution is obtained. Characteristic functions are used to derive the exact pdf of Sxy(tot). In practice, a simple Gaussian approximation is found to be highly accurate even for relatively few degrees of freedom. Implications for experiment design are discussed, and a maximum-likelihood estimator for a posteriori estimation is outlined.
A rapid approach for automated comparison of independently derived stream networks
Stanislawski, Larry V.; Buttenfield, Barbara P.; Doumbouya, Ariel T.
2015-01-01
This paper presents an improved coefficient of line correspondence (CLC) metric for automatically assessing the similarity of two different sets of linear features. Elevation-derived channels at 1:24,000 scale (24K) are generated from a weighted flow-accumulation model and compared to 24K National Hydrography Dataset (NHD) flowlines. The CLC process conflates two vector datasets through a raster line-density differencing approach that is faster and more reliable than earlier methods. Methods are tested on 30 subbasins distributed across different terrain and climate conditions of the conterminous United States. CLC values for the 30 subbasins indicate that 44–83% of the features match between the two datasets, with the majority of the mismatches occurring among first-order features. Relatively lower CLC values result from subbasins with less than about 1.5 degrees of slope. The primary difference between the two datasets may be explained by different data-capture criteria. First-order, headwater tributaries derived from the flow-accumulation model are captured more comprehensively through drainage area and terrain conditions, whereas capture of headwater features in the NHD is cartographically constrained by tributary length. The addition of missing headwaters to the NHD, as guided by the elevation-derived channels, can substantially improve the scientific value of the NHD.
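One plausible reading of a CLC-style comparison, sketched with binary dilation standing in for the paper's raster line-density differencing (the tolerance and rasters below are assumptions):

```python
# Sketch: a cell counts as matched when the other dataset has a line
# within a small tolerance; CLC is the matched fraction of line cells.
import numpy as np
from scipy.ndimage import binary_dilation

def clc(raster_a, raster_b, tol_cells=2):
    """Fraction of line cells in either raster matched by the other."""
    near_a = binary_dilation(raster_a, iterations=tol_cells)
    near_b = binary_dilation(raster_b, iterations=tol_cells)
    matched = (raster_a & near_b).sum() + (raster_b & near_a).sum()
    return matched / (raster_a.sum() + raster_b.sum())

# Hypothetical rasters: NHD flowlines vs elevation-derived channels
rng = np.random.default_rng(3)
a = rng.random((100, 100)) < 0.05
b = rng.random((100, 100)) < 0.05
print(round(clc(a, b), 3))
```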
Cascaded ensemble of convolutional neural networks and handcrafted features for mitosis detection
NASA Astrophysics Data System (ADS)
Wang, Haibo; Cruz-Roa, Angel; Basavanhally, Ajay; Gilmore, Hannah; Shih, Natalie; Feldman, Mike; Tomaszewski, John; Gonzalez, Fabio; Madabhushi, Anant
2014-03-01
Breast cancer (BCa) grading plays an important role in predicting disease aggressiveness and patient outcome. A key component of BCa grade is mitotic count, which involves quantifying the number of cells in the process of dividing (i.e. undergoing mitosis) at a specific point in time. Currently mitosis counting is done manually by a pathologist looking at multiple high power fields on a glass slide under a microscope, an extremely laborious and time-consuming process. The development of computerized systems for automated detection of mitotic nuclei, while highly desirable, is confounded by the highly variable shape and appearance of mitoses. Existing methods use either handcrafted features that capture certain morphological, statistical or textural attributes of mitoses, or features learned with convolutional neural networks (CNN). While handcrafted features are inspired by the domain and the particular application, data-driven CNN models tend to be domain agnostic and attempt to learn additional feature bases that cannot be represented through any of the handcrafted features. On the other hand, a CNN is computationally more complex and needs a large number of labeled training instances. Since handcrafted features attempt to model domain-pertinent attributes and CNN approaches are largely unsupervised feature generation methods, it is appealing to combine these two distinct classes of feature generation strategies to create an integrated set of attributes that can potentially outperform either class of feature extraction strategy individually. In this paper, we present a cascaded approach for mitosis detection that intelligently combines a CNN model and handcrafted features (morphology, color and texture features). By employing a light CNN model, the proposed approach is far less demanding computationally, and the cascaded strategy of combining handcrafted features and CNN-derived features makes it possible to maximize performance by leveraging the disconnected feature sets. Evaluation on the public ICPR12 mitosis dataset, which has 226 mitoses annotated on 35 high power fields (HPFs, ×400 magnification) by several pathologists and 15 testing HPFs, yielded an F-measure of 0.7345. Apart from being the second-best performance recorded for this MITOS dataset, our approach is faster and requires fewer computing resources than extant methods, making it feasible for clinical use.
Study on Hybrid Image Search Technology Based on Texts and Contents
NASA Astrophysics Data System (ADS)
Wang, H. T.; Ma, F. L.; Yan, C.; Pan, H.
2018-05-01
Image search based on texts and on contents is first studied here separately. Text-based image feature extraction is proposed by integrating statistical and topic features, in view of the limitation of extracting keywords only by means of the statistical features of words. A search-by-image method is then proposed based on multi-feature fusion, in view of the imprecision of content-based image search using any single feature. Finally, a layered searching method, relying primarily on text-based image search and secondarily on content-based image search, is proposed in view of the differences between the text-based and content-based methods and the difficulty of fusing them directly. The feasibility and effectiveness of the hybrid search algorithm are experimentally verified.
GAISE 2016 Promotes Statistical Literacy
ERIC Educational Resources Information Center
Schield, Milo
2017-01-01
In the 2005 Guidelines for Assessment and Instruction in Statistics Education (GAISE), statistical literacy featured as a primary goal. The 2016 revision eliminated statistical literacy as a stated goal. Although this looks like a rejection, this paper argues that by including multivariate thinking and--more importantly--confounding as recommended…
A standardised protocol for texture feature analysis of endoscopic images in gynaecological cancer.
Neofytou, Marios S; Tanos, Vasilis; Pattichis, Marios S; Pattichis, Constantinos S; Kyriacou, Efthyvoulos C; Koutsouris, Dimitris D
2007-11-29
In the development of tissue classification methods, classifiers rely on significant differences between texture features extracted from normal and abnormal regions. Yet, significant differences can arise due to variations in the image acquisition method. For endoscopic imaging of the endometrium, we propose a standardized image acquisition protocol to eliminate significant statistical differences due to variations in: (i) the distance from the tissue (panoramic vs close up), (ii) differences in viewing angles, and (iii) color correction. We investigate texture feature variability for a variety of targets encountered in clinical endoscopy. All images were captured at clinically optimum illumination and focus, using 720 x 576 pixels and 24-bit color, for: (i) a variety of testing targets from a color palette with a known color distribution, (ii) different viewing angles, and (iii) two different distances from a calf endometrium and from a chicken cavity. Human images from the endometrium were also captured and analysed. For texture feature analysis, three different sets were considered: (i) Statistical Features (SF), (ii) Spatial Gray Level Dependence Matrices (SGLDM), and (iii) Gray Level Difference Statistics (GLDS). All images were gamma corrected, and the extracted texture feature values were compared against the texture feature values extracted from the uncorrected images. Statistical tests were applied to compare images from different viewing conditions so as to determine any significant differences. For the proposed acquisition procedure, results indicate that there is no significant difference in texture features between the panoramic and close up views, or between angles. For a calibrated target image, gamma correction provided an acquired image that was a significantly better approximation to the original target image. In turn, this implies that the texture features extracted from the corrected images provided better approximations to the original images. Within the proposed protocol, for human ROIs, we have found a large number of texture features that showed significant differences between normal and abnormal endometrium. This study provides a standardized protocol for avoiding significant texture feature differences that may arise due to variability in the acquisition procedure or the lack of color correction. After applying the protocol, we have found that significant differences in texture features will only be due to the fact that the features were extracted from different types of tissue (normal vs abnormal).
NASA Astrophysics Data System (ADS)
Reaver, N.; Kaplan, D. A.; Jawitz, J. W.
2017-12-01
The Budyko hypothesis states that a catchment's long-term water and energy balances are determined by two relatively easy-to-measure quantities: rainfall depth and potential evaporation. This hypothesis is expressed as a simple function, the Budyko equation, which allows the prediction of a catchment's actual evapotranspiration and discharge from measured rainfall depth and potential evaporation, data that are widely available. However, the two main analytically derived forms of the Budyko equation contain a single unknown watershed parameter whose value varies across catchments; variation in this parameter has been used to explain the hydrological behavior of different catchments. The watershed parameter is generally thought of as a lumped quantity that represents the influence of all catchment biophysical features (e.g. soil type and depth, vegetation type, timing of rainfall, etc.). Previous work has shown that the parameter is statistically correlated with catchment properties, but an explicit expression has been elusive. While the watershed parameter can be determined empirically by fitting the Budyko equation to measured data in gauged catchments where actual evapotranspiration can be estimated, this limits the utility of the framework for predicting impacts to catchment hydrology due to changing climate and land use. In this study, we developed an analytical solution for the lumped catchment parameter for both forms of the Budyko equation. We combined these solutions with a statistical soil moisture model to obtain analytical solutions for the Budyko equation parameter as a function of measurable catchment physical features, including rooting depth, soil porosity, and soil wilting point. We tested the predictive power of these solutions using the U.S. catchments in the MOPEX database. We also compared the Budyko equation parameter estimates generated from our analytical solutions (i.e. predicted parameters) with those obtained through calibration of the Budyko equation to discharge data (i.e. empirical parameters), and found good agreement. These results suggest that it is possible to predict the Budyko equation watershed parameter directly from physical features, even for ungauged catchments.
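A minimal sketch, assuming Fu's form of the Budyko curve with lumped parameter omega, of how the empirical parameter is calibrated from observed fluxes; the study's analytical link from omega to rooting depth, porosity, and wilting point is not reproduced here:

```python
# Sketch: Fu's equation, E/P = 1 + phi - (1 + phi^w)^(1/w), with
# aridity phi = PET/P, and a root-finding calibration of w.
from scipy.optimize import brentq

def fu_evaporative_index(phi, omega):
    """E/P under Fu's equation for aridity phi = PET/P."""
    return 1.0 + phi - (1.0 + phi ** omega) ** (1.0 / omega)

def calibrate_omega(phi, e_over_p):
    """Empirical omega: the root matching the observed E/P."""
    return brentq(lambda w: fu_evaporative_index(phi, w) - e_over_p,
                  1.01, 10.0)

# Hypothetical catchment: PET/P = 1.2 and observed E/P = 0.65
print(round(calibrate_omega(1.2, 0.65), 3))
```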
Segmentation of prostate boundaries from ultrasound images using statistical shape model.
Shen, Dinggang; Zhan, Yiqiang; Davatzikos, Christos
2003-04-01
This paper presents a statistical shape model for automatic prostate segmentation in transrectal ultrasound images. A Gabor filter bank is first used to characterize the prostate boundaries in ultrasound images at multiple scales and in multiple orientations. The Gabor features are further reconstructed to be invariant to the rotation of the ultrasound probe, and are incorporated in the prostate model as image attributes for guiding the deformable segmentation. A hierarchical deformation strategy is then employed, in which the model adaptively focuses on the similarity of different Gabor features at different deformation stages using a multiresolution technique, i.e., coarse features first and fine features later. A number of successful experiments validate the algorithm.
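A minimal sketch of a Gabor filter bank with orientation pooling as one way to approximate rotation invariance (frequencies, angle count and the pooling rule are assumptions; the paper's invariant reconstruction is more elaborate):

```python
# Sketch: Gabor responses over scales/orientations, pooled across angles.
import numpy as np
from skimage.filters import gabor

def gabor_features(image, frequencies=(0.1, 0.2, 0.4), n_angles=6):
    feats = []
    for f in frequencies:
        mags = []
        for k in range(n_angles):
            real, imag = gabor(image, frequency=f,
                               theta=np.pi * k / n_angles)
            mags.append(np.hypot(real, imag).mean())
        # Pooling over orientation reduces sensitivity to probe rotation
        feats.append(max(mags))
    return np.array(feats)

rng = np.random.default_rng(4)
print(gabor_features(rng.normal(size=(64, 64))))
```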
Statistical process control using optimized neural networks: a case study.
Addeh, Jalil; Ebrahimzadeh, Ata; Azarbad, Milad; Ranaee, Vahid
2014-09-01
The most common statistical process control (SPC) tools employed for monitoring process changes are control charts. A control chart indicates that the process has changed by generating an out-of-control signal. This study investigates the design of an accurate system for control chart pattern (CCP) recognition in two respects. First, an efficient system is introduced that includes two main modules: a feature extraction module and a classifier module. In the feature extraction module, a proper set of shape and statistical features is proposed as an efficient characterization of the patterns. In the classifier module, several neural networks, such as the multilayer perceptron, probabilistic neural network and radial basis function network, are investigated. Based on an experimental study, the best classifier is chosen to recognize the CCPs. Second, a hybrid heuristic recognition system based on the cuckoo optimization algorithm (COA) is introduced to improve the generalization performance of the classifier. The simulation results show that the proposed algorithm has high recognition accuracy.
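A minimal sketch of the two-module design, with a simplified feature set and synthetic CCP generators standing in for the paper's engineered features and COA-tuned networks:

```python
# Sketch: shape/statistical features per control-chart window, then an
# MLP classifier. Pattern generators and features are illustrative only.
import numpy as np
from sklearn.neural_network import MLPClassifier

def ccp_features(w):
    t = np.arange(len(w))
    slope = np.polyfit(t, w, 1)[0]                    # shape: linear trend
    crossings = np.abs(np.diff(np.sign(w - w.mean()))).sum()
    return [w.mean(), w.std(), slope, crossings]      # statistical + shape

rng = np.random.default_rng(5)
generators = [
    lambda t: rng.normal(size=t.size),                             # normal
    lambda t: 0.1 * t + rng.normal(size=t.size),                   # trend
    lambda t: 2.0 * np.sign(np.sin(t)) + rng.normal(size=t.size),  # cyclic
]
X, y = [], []
for label, gen in enumerate(generators):
    for _ in range(100):
        X.append(ccp_features(gen(np.arange(60.0))))
        y.append(label)

clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
print(clf.fit(X, y).score(X, y))
```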
Features of statistical dynamics in a finite system
NASA Astrophysics Data System (ADS)
Yan, Shiwei; Sakata, Fumihiko; Zhuo, Yizhong
2002-03-01
We study features of statistical dynamics in a finite Hamiltonian system composed of one relevant degree of freedom coupled to an irrelevant multi-degree-of-freedom system through a weak interaction. Special attention is paid to how the statistical dynamics change depending on the number of degrees of freedom in the irrelevant system. It is found that the macrolevel statistical aspects are strongly related to the appearance of microlevel chaotic motion, and that dissipation of the relevant motion is realized by passing through three distinct stages: dephasing, statistical relaxation, and equilibrium regimes. It is clarified that the dynamical description and the conventional transport approach provide almost the same macrolevel and microlevel mechanisms only for a system with a very large number of irrelevant degrees of freedom. It is also shown that the statistical relaxation in the finite system is an anomalous diffusion, and that the fluctuation effects have a finite correlation time.
NASA Astrophysics Data System (ADS)
Jordan, Gyozo; Petrik, Attila; De Vivo, Benedetto; Albanese, Stefano; Demetriades, Alecos; Sadeghi, Martiya
2017-04-01
Several studies have investigated the spatial distribution of chemical elements in topsoil (0-20 cm) within the framework of the EuroGeoSurveys Geochemistry Expert Group's 'Geochemical Mapping of Agricultural and Grazing Land Soil' project. Most of these studies used geostatistical analyses and interpolated concentration maps, together with Exploratory Data Analysis and Compositional Data Analysis, to identify anomalous patterns. The objective of our investigation is to demonstrate the use of digital image processing techniques for reproducible spatial pattern recognition and quantitative spatial feature characterisation. A single element (Ni) concentration in agricultural topsoil is used to perform the detailed spatial analysis and to relate the identified features to possible underlying processes. In this study, simple univariate statistical methods were implemented first, and Tukey's inner-fence criterion was used to delineate statistical outliers. Linear triangular irregular network (TIN) interpolation was applied to the outlier-free Ni data points, which were resampled to a 10 × 10 km grid. Successive moving-average smoothing was applied to generalise the TIN model, suppressing small-scale features while enhancing significant large-scale features of the Ni concentration spatial distribution patterns in European topsoil. The TIN map smoothed with a moving-average filter revealed the spatial trends and patterns without losing much detail, and it was used as the input to digital image processing, such as local maxima and minima determination, digital cross sections, gradient magnitude and gradient direction calculation, second-derivative profile curvature calculation, edge detection, local variability assessment, lineament density and directional variogram analyses. The detailed image processing analysis revealed several NE-SW, E-W and NW-SE oriented elongated features, which coincide with different spatial parameter classes and align with local maxima and minima. The NE-SW oriented linear pattern is the dominant feature to the south of the last glaciation limit. Some of these linear features are parallel to the suture zone of the Iapetus Ocean, while others follow the Alpine and Carpathian Chains. The highest-variability zones of Ni concentration in topsoil are located in the Alps and in the Balkans, where mafic and ultramafic rocks outcrop. The predominant NE-SW oriented pattern is also captured by the strong anisotropy in the semi-variograms in this direction. A single major E-W oriented, north-facing feature runs along the southern border of the last glaciation zone. This zone also coincides with a series of local maxima in Ni concentration along the glaciofluvial deposits. The NW-SE elongated spatial features are less dominant and are located in the Pyrenees and Scandinavia. This study demonstrates the efficiency of systematic image processing analysis in identifying and characterising spatial geochemical patterns that often remain undiscovered by the usual visual map interpretation techniques.
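A minimal sketch of the smoothing and gradient steps of such a map-pattern analysis, on a synthetic stand-in grid (filter size and grid values are assumptions, not the study's settings):

```python
# Sketch: moving-average smoothing, then gradient magnitude/direction.
import numpy as np
from scipy.ndimage import uniform_filter

rng = np.random.default_rng(6)
ni_grid = rng.lognormal(3.0, 0.5, size=(120, 160))  # stand-in Ni map (mg/kg)

smoothed = uniform_filter(ni_grid, size=5)          # moving-average smoothing
gy, gx = np.gradient(smoothed)                      # per-cell gradients
magnitude = np.hypot(gx, gy)
direction = np.degrees(np.arctan2(gy, gx)) % 360.0  # gradient azimuth

# Directional histogram: a crude check for dominant feature orientations
print(np.histogram(direction, bins=8, range=(0.0, 360.0))[0])
```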
ERIC Educational Resources Information Center
Garcia-Barrera, Mauricio A.; Kamphaus, Randy W.; Bandalos, Deborah
2011-01-01
The problem of valid measurement of psychological constructs remains an impediment to scientific progress, and the measurement of executive functions is not an exception. This study examined the statistical and theoretical derivation of a behavioral screener for the estimation of executive functions in children from the well-established Behavior…
Alternative Derivations of the Statistical Mechanical Distribution Laws
Wall, Frederick T.
1971-01-01
A new approach is presented for the derivation of statistical mechanical distribution laws. The derivations are accomplished by minimizing the Helmholtz free energy under constant temperature and volume, instead of maximizing the entropy under constant energy and volume. An alternative method involves stipulating equality of chemical potential, or equality of activity, for particles in different energy levels. This approach leads to a general statement of distribution laws applicable to all systems for which thermodynamic probabilities can be written. The methods also avoid use of the calculus of variations, Lagrangian multipliers, and Stirling's approximation for the factorial. The results are applied specifically to Boltzmann, Fermi-Dirac, and Bose-Einstein statistics. The special significance of chemical potential and activity is discussed for microscopic systems. PMID:16578712
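For reference, the occupation numbers these derivations recover, written in terms of the chemical potential (standard results, stated here for completeness):

```latex
\[
  \bar{n}_i = \frac{1}{e^{(\epsilon_i-\mu)/kT} + 1}\ \text{(Fermi--Dirac)},
  \qquad
  \bar{n}_i = \frac{1}{e^{(\epsilon_i-\mu)/kT} - 1}\ \text{(Bose--Einstein)},
  \qquad
  \bar{n}_i = e^{-(\epsilon_i-\mu)/kT}\ \text{(Boltzmann limit)},
\]
% each following from equality of \mu across energy levels at fixed
% temperature and volume.
```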
Fiori, Simone
2007-01-01
Bivariate statistical modeling from incomplete data is a useful statistical tool that allows one to discover the model underlying two data sets even when the data in the two sets correspond neither in size nor in ordering. Such a situation may occur when the sizes of the two data sets do not match (i.e., there are “holes” in the data) or when the data sets have been acquired independently. Statistical modeling is also useful when the amount of available data is enough to reveal the relevant statistical features of the phenomenon underlying the data. We propose to tackle the problem of statistical modeling via a neural (nonlinear) system that is able to match its input-output statistic to the statistic of the available data sets. A key point of the new implementation proposed here is that it is based on look-up-table (LUT) neural systems, which guarantee a computationally advantageous way of implementing neural systems. A number of numerical experiments, performed on both synthetic and real-world data sets, illustrate the features of the proposed modeling procedure. PMID:18566641
Applications of statistical physics to technology price evolution
NASA Astrophysics Data System (ADS)
McNerney, James
Understanding how changing technology affects the prices of goods is a problem with both rich phenomenology and important policy consequences. Using methods from statistical physics, I model technology-driven price evolution. First, I examine a model for the price evolution of individual technologies. The price of a good often follows a power law equation when plotted against its cumulative production. This observation turns out to have significant consequences for technology policy aimed at mitigating climate change, where technologies are needed that achieve low carbon emissions at low cost. However, no theory adequately explains why technology prices follow power laws. To understand this behavior, I simplify an existing model that treats technologies as machines composed of interacting components. I find that the power law exponent of the price trajectory is inversely related to the number of interactions per component. I extend the model to allow for more realistic component interactions and make a testable prediction. Next, I conduct a case-study on the cost evolution of coal-fired electricity. I derive the cost in terms of various physical and economic components. The results suggest that commodities and technologies fall into distinct classes of price models, with commodities following martingales, and technologies following exponentials in time or power laws in cumulative production. I then examine the network of money flows between industries. This work is a precursor to studying the simultaneous evolution of multiple technologies. Economies resemble large machines, with different industries acting as interacting components with specialized functions. To begin studying the structure of these machines, I examine 20 economies with an emphasis on finding common features to serve as targets for statistical physics models. I find they share the same money flow and industry size distributions. I apply methods from statistical physics to show that industries cluster the same way according to industry type. Finally, I use these industry money flows to model the price evolution of many goods simultaneously, where network effects become important. I derive a prediction for which goods tend to improve most rapidly. The fastest-improving goods are those with the highest mean path lengths in the money flow network.
Modelling and mapping tick dynamics using volunteered observations.
Garcia-Martí, Irene; Zurita-Milla, Raúl; van Vliet, Arnold J H; Takken, Willem
2017-11-14
Tick populations and tick-borne infections have steadily increased since the mid-1990s, posing an ever-increasing risk to public health. Yet modelling tick dynamics remains challenging because of the lack of data and knowledge on this complex phenomenon. Here we present an approach to model and map tick dynamics using volunteered data. This approach is illustrated with 9 years of data collected by a group of trained volunteers who sampled active questing ticks (AQT) on a monthly basis at 15 locations in the Netherlands. We aimed to find the main environmental drivers of AQT at multiple time-scales, and to devise daily AQT maps at the national level for 2014. Tick dynamics is a complex ecological problem driven by biotic (e.g. pathogens, wildlife, humans) and abiotic (e.g. weather, landscape) factors. We enriched the volunteered AQT collection with six types of weather variables (aggregated at 11 temporal scales), three types of satellite-derived vegetation indices, land cover, and mast years. We then applied a feature engineering process to derive a set of 101 features characterizing the conditions that yielded a particular count of AQT at a given date and location. To predict the AQT, we use a time-aware Random Forest regression method, which is suitable for finding non-linear relationships in complex ecological problems and provides an estimate of the most important features for predicting the AQT. We trained a model that fits AQT with low statistical error. The multi-temporal study of feature importance indicates that variables linked to water levels in the atmosphere (i.e. evapotranspiration, relative humidity) consistently showed higher explanatory power than the temperature variables used in previous works. As a product of this study, we are able to map daily tick dynamics at the national level. This study paves the way towards the design of new applications in the fields of environmental research, nature management, and public health. It also illustrates how Citizen Science initiatives produce geospatial data collections that can support scientific analysis, thus enabling the monitoring of complex environmental phenomena.
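A minimal sketch of a time-aware random-forest fit with feature-importance ranking, using synthetic placeholders for the 101 engineered features (TimeSeriesSplit is one simple way to respect temporal ordering; the study's exact validation scheme is not reproduced):

```python
# Sketch: random-forest regression for monthly AQT counts with
# time-ordered cross-validation and driver ranking by importance.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import TimeSeriesSplit, cross_val_score

rng = np.random.default_rng(7)
n = 9 * 12 * 15                               # 9 years x 12 months x 15 sites
X = rng.normal(size=(n, 101))                 # engineered features
y = np.maximum(0.0, 5.0 + 2.0 * X[:, 0] + rng.normal(size=n))  # synthetic AQT

model = RandomForestRegressor(n_estimators=200, random_state=0)
print(cross_val_score(model, X, y, cv=TimeSeriesSplit(n_splits=5),
                      scoring="r2").round(2))

model.fit(X, y)                               # importances rank the drivers
print(np.argsort(model.feature_importances_)[::-1][:5])
```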
NASA Astrophysics Data System (ADS)
Nagarajan, Mahesh B.; Coan, Paola; Huber, Markus B.; Diemoz, Paul C.; Wismüller, Axel
2014-03-01
Current assessment of cartilage is primarily based on identification of indirect markers such as joint space narrowing and increased subchondral bone density on x-ray images. In this context, phase contrast CT imaging (PCI-CT) has recently emerged as a novel imaging technique that allows a direct examination of chondrocyte patterns and their correlation to osteoarthritis through visualization of cartilage soft tissue. This study investigates the use of topological and geometrical approaches for characterizing chondrocyte patterns in the radial zone of the knee cartilage matrix in the presence and absence of osteoarthritic damage. For this purpose, topological features derived from Minkowski Functionals and geometric features derived from the Scaling Index Method (SIM) were extracted from 842 regions of interest (ROI) annotated on PCI-CT images of healthy and osteoarthritic specimens of human patellar cartilage. The extracted features were then used in a machine learning task involving support vector regression to classify ROIs as healthy or osteoarthritic. Classification performance was evaluated using the area under the receiver operating characteristic (ROC) curve (AUC). The best classification performance was observed with high-dimensional geometrical feature vectors derived from SIM (0.95 ± 0.06) which outperformed all Minkowski Functionals (p < 0.001). These results suggest that such quantitative analysis of chondrocyte patterns in human patellar cartilage matrix involving SIM-derived geometrical features can distinguish between healthy and osteoarthritic tissue with high accuracy.
Shi, Y; Qi, F; Xue, Z; Chen, L; Ito, K; Matsuo, H; Shen, D
2008-04-01
This paper presents a new deformable model using both population-based and patient-specific shape statistics to segment lung fields from serial chest radiographs. There are two novelties in the proposed deformable model. First, a modified scale-invariant feature transform (SIFT) local descriptor, which is more distinctive than general intensity and gradient features, is used to characterize the image features in the vicinity of each pixel. Second, the deformable contour is constrained by both population-based and patient-specific shape statistics, which yields more robust and accurate segmentation of lung fields for serial chest radiographs. In particular, for segmenting the initial time-point images, the population-based shape statistics is used to constrain the deformable contour; as more subsequent images of the same patient are acquired, the patient-specific shape statistics collected online from the previous segmentation results gradually takes on a greater role. This patient-specific shape statistics is updated each time a new segmentation result is obtained, and it is further used to refine the segmentation results of all the available time-point images. Experimental results show that the proposed method is more robust and accurate than other active shape models in segmenting the lung fields from serial chest radiographs.
Statistical foundations of liquid-crystal theory: II: Macroscopic balance laws.
Seguin, Brian; Fried, Eliot
2013-01-01
Working on a state space determined by considering a discrete system of rigid rods, we use nonequilibrium statistical mechanics to derive macroscopic balance laws for liquid crystals. A probability function that satisfies the Liouville equation serves as the starting point for deriving each macroscopic balance. The terms appearing in the derived balances are interpreted as expected values and explicit formulas for these terms are obtained. Among the list of derived balances appear two, the tensor moment of inertia balance and the mesofluctuation balance, that are not standard in previously proposed macroscopic theories for liquid crystals but which have precedents in other theories for structured media.
A Generic multi-dimensional feature extraction method using multiobjective genetic programming.
Zhang, Yang; Rockett, Peter I
2009-01-01
In this paper, we present a generic feature extraction method for pattern classification using multiobjective genetic programming. This not only evolves the (near-)optimal set of mappings from a pattern space to a multi-dimensional decision space, but also simultaneously optimizes the dimensionality of that decision space. The presented framework evolves vector-to-vector feature extractors that maximize class separability. We demonstrate the efficacy of our approach by making statistically-founded comparisons with a wide variety of established classifier paradigms over a range of datasets and find that for most of the pairwise comparisons, our evolutionary method delivers statistically smaller misclassification errors. At very worst, our method displays no statistical difference in a few pairwise comparisons with established classifier/dataset combinations; crucially, none of the misclassification results produced by our method is worse than any comparator classifier. Although principally focused on feature extraction, feature selection is also performed as an implicit side effect; we show that both feature extraction and selection are important to the success of our technique. The presented method has the practical consequence of obviating the need to exhaustively evaluate a large family of conventional classifiers when faced with a new pattern recognition problem in order to attain a good classification accuracy.
Unconscious analyses of visual scenes based on feature conjunctions.
Tachibana, Ryosuke; Noguchi, Yasuki
2015-06-01
To efficiently process a cluttered scene, the visual system analyzes statistical properties or regularities of visual elements embedded in the scene. It is controversial, however, whether those scene analyses could also work for stimuli unconsciously perceived. Here we show that our brain performs the unconscious scene analyses not only using a single featural cue (e.g., orientation) but also based on conjunctions of multiple visual features (e.g., combinations of color and orientation information). Subjects foveally viewed a stimulus array (duration: 50 ms) where 4 types of bars (red-horizontal, red-vertical, green-horizontal, and green-vertical) were intermixed. Although a conscious perception of those bars was inhibited by a subsequent mask stimulus, the brain correctly analyzed the information about color, orientation, and color-orientation conjunctions of those invisible bars. The information of those features was then used for the unconscious configuration analysis (statistical processing) of the central bars, which induced a perceptual bias and illusory feature binding in visible stimuli at peripheral locations. While statistical analyses and feature binding are normally 2 key functions of the visual system to construct coherent percepts of visual scenes, our results show that a high-level analysis combining those 2 functions is correctly performed by unconscious computations in the brain.
Statistical evolution of quiet-Sun small-scale magnetic features using Sunrise observations
NASA Astrophysics Data System (ADS)
Anusha, L. S.; Solanki, S. K.; Hirzberger, J.; Feller, A.
2017-02-01
The evolution of small magnetic features in quiet regions of the Sun provides a unique window for probing solar magneto-convection. Here we analyze small-scale magnetic features in the quiet Sun, using high resolution, seeing-free observations from the Sunrise balloon-borne solar observatory. Our aim is to understand the contribution of different physical processes, such as splitting, merging, emergence and cancellation of magnetic fields, to the rearrangement, addition and removal of magnetic flux in the photosphere. We have employed a statistical approach for the analysis, and the evolution studies are carried out using a feature-tracking technique. In this paper we provide a detailed description of the newly developed feature-tracking algorithm and present the results of a statistical study of several physical quantities. The results on the fractions of the flux in emergence, appearance, splitting, merging, disappearance and cancellation qualitatively agree with other recent studies. To summarize, the total flux gained in unipolar appearance is an order of magnitude larger than the total flux gained in emergence. On the other hand, bipolar cancellation contributes nearly as much to the loss of magnetic flux as unipolar disappearance. The total flux lost in cancellation is nearly six to eight times larger than the total flux gained in emergence. One big difference between our study and previous similar studies is that, thanks to the higher spatial resolution of Sunrise, we can track features with fluxes as low as 9 × 10¹⁴ Mx. This flux is nearly an order of magnitude lower than the smallest fluxes of the features tracked in the highest-resolution previous studies based on Hinode data. The area and flux of the magnetic features follow power-law-type distributions, while the lifetimes show either power-law or exponential-type distributions, depending on the exact definitions used for the various birth and death events. We have also statistically determined the evolution of the flux within the features over the course of their lifetimes, finding that this evolution depends very strongly on the birth and death processes that the features undergo.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Busuioc, A.; Storch, H. von; Schnur, R.
Empirical downscaling procedures relate large-scale atmospheric features with local features such as station rainfall in order to facilitate local scenarios of climate change. The purpose of the present paper is twofold: first, a downscaling technique is used as a diagnostic tool to verify the performance of climate models on the regional scale; second, a technique is proposed for verifying the validity of empirical downscaling procedures in climate change applications. The case considered is regional seasonal precipitation in Romania. The downscaling model is a regression based on canonical correlation analysis between observed station precipitation and European-scale sea level pressure (SLP). The climate models considered here are the T21 and T42 versions of the Hamburg ECHAM3 atmospheric GCM run in time-slice mode. The climate change scenario refers to the expected time of doubled carbon dioxide concentrations around the year 2050. Generally, applications of statistical downscaling to climate change scenarios have been based on the assumption that the empirical link between the large-scale and regional parameters remains valid under a changed climate. In this study, a rationale is proposed for this assumption by showing the consistency of the 2 × CO₂ GCM scenarios in winter, derived directly from the gridpoint data, with the regional scenarios obtained through empirical downscaling. Since the skill of the GCMs in regional terms is already established, it is concluded that the downscaling technique is adequate for describing climatically changing regional and local conditions, at least for precipitation in Romania during winter.
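A minimal sketch of a CCA-based downscaling regression in the spirit of this model, with synthetic placeholders for the gridded SLP field and the station precipitation records:

```python
# Sketch: canonical correlation analysis links a large-scale SLP field
# to local station precipitation, then predicts the local field.
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(8)
slp = rng.normal(size=(40, 50))        # 40 winters x 50 SLP grid points
precip = slp[:, :5] @ rng.normal(size=(5, 12)) + rng.normal(size=(40, 12))

cca = CCA(n_components=3).fit(slp, precip)
precip_hat = cca.predict(slp)          # downscaled station precipitation
print(round(float(np.corrcoef(precip_hat[:, 0], precip[:, 0])[0, 1]), 2))
```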
Quantitative metrics for assessment of chemical image quality and spatial resolution
Kertesz, Vilmos; Cahill, John F.; Van Berkel, Gary J.
2016-02-28
Rationale: Currently, objective/quantitative descriptions of the quality and spatial resolution of mass spectrometry derived chemical images are not standardized. Development of standardized metrics is required to objectively describe the chemical imaging capabilities of existing and/or new mass spectrometry imaging technologies. Such metrics would allow unbiased judgment of intra-laboratory advancement and/or inter-laboratory comparison for these technologies if used together with standardized surfaces. Methods: We developed two image metrics, viz., chemical image contrast (ChemIC), based on signal-to-noise related statistical measures on chemical image pixels, and corrected resolving power factor (cRPF), constructed from statistical analysis of mass-to-charge chronograms across features of interest in an image. These metrics, quantifying chemical image quality and spatial resolution, respectively, were used to evaluate chemical images of a model photoresist patterned surface collected using a laser ablation/liquid vortex capture mass spectrometry imaging system under different instrument operational parameters. Results: The calculated ChemIC and cRPF metrics determined in an unbiased fashion the relative ranking of chemical image quality obtained with the laser ablation/liquid vortex capture mass spectrometry imaging system. These rankings were used to show that both chemical image contrast and spatial resolution deteriorated with increasing surface scan speed, increased lane spacing and decreasing size of surface features. Conclusions: ChemIC and cRPF, respectively, were developed and successfully applied for the objective description of chemical image quality and spatial resolution of chemical images collected from model surfaces using a laser ablation/liquid vortex capture mass spectrometry imaging system.
Slip, David J.; Hocking, David P.; Harcourt, Robert G.
2016-01-01
Constructing activity budgets for marine animals when they are at sea and cannot be directly observed is challenging, but recent advances in bio-logging technology offer solutions to this problem. Accelerometers can potentially identify a wide range of behaviours for animals based on unique patterns of acceleration. However, when analysing data derived from accelerometers, there are many statistical techniques available, which when applied to different data sets produce different classification accuracies. We investigated a selection of supervised machine learning methods for interpreting behavioural data from captive otariids (fur seals and sea lions). We conducted controlled experiments with 12 seals, whose behaviours were filmed while they were wearing 3-axis accelerometers. From video we identified 26 behaviours that could be grouped into one of four categories (foraging, resting, travelling and grooming) representing key behaviour states for wild seals. We used data from 10 seals to train four predictive classification models: stochastic gradient boosting (GBM), random forests, a support vector machine using four different kernels, and a baseline model: penalised logistic regression. We then took the best parameters from each model and cross-validated the results on the two seals unseen so far. We also investigated the influence of feature statistics (describing some characteristic of the seal), testing the models both with and without these. Cross-validation accuracies were lower than training accuracy, but the SVM with a polynomial kernel was still able to classify seal behaviour with high accuracy (>70%). Adding feature statistics improved accuracies across all models tested. Most categories of behaviour (resting, grooming and feeding) were predicted with reasonable accuracy (52–81%) by the SVM, while travelling was poorly categorised (31–41%). These results show that model selection is important when classifying behaviour and that by using animal characteristics we can strengthen the overall accuracy. PMID:28002450
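A minimal sketch of the leave-individuals-out evaluation with a polynomial-kernel SVM, using synthetic placeholders for the per-window accelerometer features and seal identities:

```python
# Sketch: grouped cross-validation so that each fold holds out whole
# seals, mirroring the train-on-10/test-on-2 design.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import GroupKFold, cross_val_score

rng = np.random.default_rng(9)
n_windows = 600
X = rng.normal(size=(n_windows, 20))            # per-window accel statistics
y = rng.integers(0, 4, size=n_windows)          # forage/rest/travel/groom
seal_id = rng.integers(0, 12, size=n_windows)   # which of 12 seals

clf = SVC(kernel="poly", degree=3, C=1.0)
scores = cross_val_score(clf, X, y, groups=seal_id, cv=GroupKFold(n_splits=6))
print(scores.round(2))                          # held-out-seal accuracies
```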
Chekmarev, Sergei F
2013-03-01
The transition from laminar to turbulent fluid motion occurring at large Reynolds numbers is generally associated with the instability of the laminar flow. On the other hand, since the turbulent flow characteristically appears in the form of spatially localized structures (e.g., eddies) filling the flow field, a tendency to occupy such a structured state of the flow cannot be ruled out as a driving force for turbulent transition. To examine this possibility, we propose a simple analytical model that treats the flow as a collection of localized spatial structures, each of which consists of elementary cells in which the behavior of the particles (atoms or molecules) is uncorrelated. This allows us to introduce the Reynolds number, associating it with the ratio between the total phase volume for the system and that for the elementary cell. Using the principle of maximum entropy to calculate the most probable size distribution of the localized structures, we show that as the Reynolds number increases, the elementary cells group into the localized structures, which successfully explains turbulent transition and some other general properties of turbulent flows. An important feature of the present model is that a bridge between the spatial-statistical description of the flow and hydrodynamic equations is established. We show that the basic assumptions underlying the model, i.e., that the particles are indistinguishable and elementary volumes of phase space exist in which the state of the particles is uncertain, are involved in the derivation of the Navier-Stokes equation. Taking into account that the model captures essential features of turbulent flows, this suggests that the driving force for the turbulent transition is basically the same as in the present model, i.e., the tendency of the system to occupy a statistically dominant state plays a key role. The instability of the flow at high Reynolds numbers can then be a mechanism to initiate structural rearrangement of the flow to find this state.
GLOBAL PROPERTIES OF M31'S STELLAR HALO FROM THE SPLASH SURVEY. I. SURFACE BRIGHTNESS PROFILE
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gilbert, Karoline M.; Guhathakurta, Puragra; Beaton, Rachael L.
2012-11-20
We present the surface brightness profile of M31's stellar halo out to a projected radius of 175 kpc. The surface brightness estimates are based on confirmed samples of M31 red giant branch stars derived from Keck/DEIMOS spectroscopic observations. A set of empirical spectroscopic and photometric M31 membership diagnostics is used to identify and reject foreground and background contaminants. This enables us to trace the stellar halo of M31 to larger projected distances and fainter surface brightnesses than previous photometric studies. The surface brightness profile of M31's halo follows a power law with index -2.2 ± 0.2 and extends to a projected distance of at least ~175 kpc (~2/3 of M31's virial radius), with no evidence of a downward break at large radii. The best-fit elliptical isophotes have b/a = 0.94 with the major axis of the halo aligned along the minor axis of M31's disk, consistent with a prolate halo, although the data are also consistent with M31's halo having spherical symmetry. The fact that tidal debris features are kinematically cold is used to identify substructure in the spectroscopic fields out to projected radii of 90 kpc and investigate the effect of this substructure on the surface brightness profile. The scatter in the surface brightness profile is reduced when kinematically identified tidal debris features in M31 are statistically subtracted; the remaining profile indicates that a comparatively diffuse stellar component to M31's stellar halo exists to large distances. Beyond 90 kpc, kinematically cold tidal debris features cannot be identified due to small number statistics; nevertheless, the significant field-to-field variation in surface brightness beyond 90 kpc suggests that the outermost region of M31's halo is also composed to a significant degree of stars stripped from accreted objects.
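For readers who want to reproduce this kind of profile fit, here is a minimal sketch of estimating a power-law index in log-log space; the radii and surface-brightness values are synthetic, chosen only to mimic an index near -2.2.

```python
import numpy as np

# Hypothetical projected radii (kpc) and surface brightness values; the
# survey fits mu(R) proportional to R**alpha with alpha close to -2.2.
R = np.array([10.0, 20.0, 40.0, 80.0, 160.0])
mu = 5e3 * R**-2.2 * np.exp(np.random.default_rng(2).normal(0, 0.05, R.size))

# Least-squares fit of the power-law index in log-log space.
alpha, log_amp = np.polyfit(np.log10(R), np.log10(mu), 1)
print(f"fitted power-law index: {alpha:.2f}")
```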
NASA Astrophysics Data System (ADS)
Fine, I.; Thomson, R.; Chadwick, W. W., Jr.; Davis, E. E.; Fox, C. G.
2016-12-01
We have used a set of high-resolution bottom pressure recorder (BPR) time series collected at Axial Seamount on the Juan de Fuca Ridge beginning in 1986 to examine tsunami waves of seismological origin in the northeast Pacific. These data come from a combination of autonomous, internally recording battery-powered instruments and cabled instruments on the OOI Cabled Array. Of the total of 120 tsunami events catalogued for the coasts of Japan, Alaska, western North America and Hawaii, we found evidence for 38 events in the Axial Seamount BPR records. Many of these tsunamis were not observed along the adjacent west coast of the USA and Canada because of the much higher noise level of coastal locations and the lack of digital tide gauge data prior to 2000. We have also identified several tsunamis of apparent seismological origin that were observed at coastal stations but not at the deep ocean site. Careful analysis of these observations suggests that they were likely of meteorological origin. Analysis of the pressure measurements from Axial Seamount, along with BPR measurements from a nearby ODP CORK (Ocean Drilling Program Circulation Obviation Retrofit Kit) borehole and DART (Deep-ocean Assessment and Reporting of Tsunamis) locations, reveals features of deep-ocean tsunamis that are markedly different from features observed at coastal locations. Results also show that the energy of deep-ocean tsunamis can differ significantly among the three sets of stations despite their close spatial spacing and that this difference is strongly dependent on the direction of the incoming tsunami waves. These deep-ocean observations provide the most comprehensive statistics possible for tsunamis in the Pacific Ocean over the past 30 years. New insight into the distribution of tsunami amplitudes and wave energy derived from the deep-ocean sites should prove useful for long-term tsunami prediction and mitigation for coastal communities along the west coast of the USA and Canada.
NASA Astrophysics Data System (ADS)
Nasirudina, Radin A.; Näppi, Janne J.; Watari, Chinatsu; Matsuhiro, Mikio; Hironaka, Toru; Kido, Shoji; Yoshida, Hiroyuki
2018-02-01
We developed and evaluated the effect of our deep-learning-derived radiomic features, called deep radiomic features (DRFs), together with their combination with clinical predictors, on the prediction of the overall survival of patients with rheumatoid arthritis-associated interstitial lung disease (RA-ILD). We retrospectively identified 70 RA-ILD patients with thin-section lung CT and pulmonary function tests. An experienced observer delineated regions of interest (ROIs) from the lung regions on the CT images, and labeled them into one of four ILD patterns (ground-glass opacity, reticulation, consolidation, and honeycombing) or a normal pattern. Small image patches centered at individual pixels on these ROIs were extracted and labeled with the class of the ROI to which the patch belonged. A deep convolutional neural network (DCNN), which consists of a series of convolutional layers for feature extraction and a series of fully connected layers, was trained and validated with 5-fold cross-validation for classifying the image patches into one of the above five patterns. A DRF vector for each patch was identified as the output of the last convolutional layer of the DCNN. Statistical moments of each element of the DRF vectors were computed to derive a DRF vector that characterizes the patient. The DRF vector was subjected to a Cox proportional hazards model with elastic-net penalty for predicting the survival of the patient. Evaluation was performed by use of bootstrapping with 2,000 replications, where the concordance index (C-index) was used as a comparative performance metric. Preliminary results on clinical predictors, DRFs, and combinations thereof showed (a) gender and age: C-index 64.8% [95% confidence interval (CI): 51.7, 77.9]; (b) gender, age, and physiology (GAP index): C-index 78.5% [CI: 70.50, 86.51], P < 0.0001 in comparison with (a); (c) DRFs: C-index 85.5% [CI: 73.4, 99.6], P < 0.0001 in comparison with (b); and (d) DRF and GAP: C-index 91.0% [CI: 84.6, 97.2], P < 0.0001 in comparison with (c). Kaplan-Meier survival curves of patients stratified into low- and high-risk groups based on the DRFs showed a statistically significant (P < 0.0001) difference. The DRFs outperform the clinical predictors in predicting patient survival, and a combination of the DRFs and GAP index outperforms either one of these predictors. Our results indicate that the DRFs and their combination with clinical predictors provide an accurate prognostic biomarker for patients with RA-ILD.
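A minimal sketch of the survival-modeling step, assuming the lifelines library; the per-patient DRF summaries, penalty weights, and follow-up data below are synthetic illustrations, not the study's settings.

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

# Hypothetical table: per-patient statistical moments of DRF elements
# plus survival data for 70 patients.
rng = np.random.default_rng(3)
df = pd.DataFrame(rng.normal(size=(70, 5)),
                  columns=[f"drf_mean_{i}" for i in range(5)])
df["time"] = rng.exponential(36, size=70)    # months of follow-up (synthetic)
df["event"] = rng.integers(0, 2, size=70)    # 1 = death observed

# Elastic-net-penalized Cox model; penalty values are assumptions.
cph = CoxPHFitter(penalizer=0.1, l1_ratio=0.5)
cph.fit(df, duration_col="time", event_col="event")
print("C-index:", cph.concordance_index_)
```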
Detailed Analysis of the Interoccurrence Time Statistics in Seismic Activity
NASA Astrophysics Data System (ADS)
Tanaka, Hiroki; Aizawa, Yoji
2017-02-01
The interoccurrence time statistics of seismicity is studied theoretically as well as numerically by taking into account the conditional probability and the correlations among many earthquakes in different magnitude levels. It is known that the interoccurrence time statistics is well approximated by the Weibull distribution, but more detailed information about the interoccurrence times can be obtained from the analysis of the conditional probability. Firstly, we propose the Embedding Equation Theory (EET), where the conditional probability is described by two kinds of correlation coefficients; one is the magnitude correlation and the other is the inter-event time correlation. Furthermore, the scaling law of each correlation coefficient is clearly determined from the numerical data analysis carried out with the Preliminary Determination of Epicenter (PDE) Catalog and the Japan Meteorological Agency (JMA) Catalog. Secondly, the EET is examined to derive the magnitude dependence of the interoccurrence time statistics and the multi-fractal relation is successfully formulated. Theoretically we cannot prove the universality of the multi-fractal relation in seismic activity; nevertheless, the theoretical results reproduce all numerical data in our analysis well, where several common features or invariant aspects are clearly observed. Especially in the case of stationary ensembles the multi-fractal relation seems to obey an invariant curve; furthermore, in the case of non-stationary (moving time) ensembles for the aftershock regime the multi-fractal relation seems to satisfy a certain invariant curve at any moving time. It is emphasized that the multi-fractal relation plays an important role in unifying the statistical laws of seismicity: actually the Gutenberg-Richter law and the Weibull distribution are unified in the multi-fractal relation, and some universality conjectures regarding seismicity are briefly discussed.
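Since the Weibull approximation is central to this analysis, here is a minimal SciPy sketch of fitting a two-parameter Weibull distribution to interoccurrence times; the sample and parameter values are synthetic.

```python
import numpy as np
from scipy import stats

# Hypothetical interoccurrence times (days) above some magnitude threshold.
times = stats.weibull_min.rvs(c=0.8, scale=12.0, size=2000, random_state=4)

# Fit the two-parameter Weibull distribution (location fixed at zero),
# the form reported to approximate seismic interoccurrence statistics.
shape, loc, scale = stats.weibull_min.fit(times, floc=0)
print(f"Weibull shape={shape:.2f}, scale={scale:.1f}")
```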
Mapping vegetation in Yellowstone National Park using spectral feature analysis of AVIRIS data
Kokaly, Raymond F.; Despain, Don G.; Clark, Roger N.; Livo, K. Eric
2003-01-01
Knowledge of the distribution of vegetation on the landscape can be used to investigate ecosystem functioning. The sizes and movements of animal populations can be linked to resources provided by different plant species. This paper demonstrates the application of imaging spectroscopy to the study of vegetation in Yellowstone National Park (Yellowstone) using spectral feature analysis of data from the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS). AVIRIS data, acquired on August 7, 1996, were calibrated to surface reflectance using a radiative transfer model and field reflectance measurements of a ground calibration site. A spectral library of canopy reflectance signatures was created by averaging pixels of the calibrated AVIRIS data over areas of known forest and nonforest vegetation cover types in Yellowstone. Using continuum removal and least squares fitting algorithms in the US Geological Survey's Tetracorder expert system, the distributions of these vegetation types were determined by comparing the absorption features of vegetation in the spectral library with the spectra from the AVIRIS data. The 0.68 μm chlorophyll absorption feature and leaf water absorption features, centered near 0.98 and 1.20 μm, were analyzed. Nonforest cover types of sagebrush, grasslands, willows, sedges, and other wetland vegetation were mapped in the Lamar Valley of Yellowstone. Conifer cover types of lodgepole pine, whitebark pine, Douglas fir, and mixed Engelmann spruce/subalpine fir forests were spectrally discriminated and their distributions mapped in the AVIRIS images. In the Mount Washburn area of Yellowstone, a comparison of the AVIRIS map of forest cover types to a map derived from air photos resulted in an overall agreement of 74.1% (kappa statistic=0.62).
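A stripped-down sketch of continuum removal, the core operation applied before least-squares feature matching; the straight-line continuum and synthetic spectrum below are simplifications and do not reproduce the Tetracorder implementation.

```python
import numpy as np

def continuum_removed(wavelengths, reflectance, left, right):
    """Divide out a straight-line continuum anchored at the shoulders of an
    absorption feature (a simplification of the expert-system approach)."""
    wl = np.asarray(wavelengths, dtype=float)
    refl = np.asarray(reflectance, dtype=float)
    i, j = np.searchsorted(wl, [left, right])
    continuum = np.interp(wl[i:j + 1], [wl[i], wl[j]], [refl[i], refl[j]])
    return wl[i:j + 1], refl[i:j + 1] / continuum

# Example around the 0.68 micron chlorophyll feature (values hypothetical).
wl = np.linspace(0.5, 0.8, 61)
refl = 0.4 - 0.15 * np.exp(-0.5 * ((wl - 0.68) / 0.02) ** 2)
band_wl, band_cr = continuum_removed(wl, refl, 0.6, 0.76)
print("band depth:", 1 - band_cr.min())
```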
[Origin and morphological features of small supernumerary marker chromosomes in Turner syndrome].
Liu, Nan; Tong, Tong; Chen, Yue; Chen, Yanling; Cai, Chunquan
2018-02-10
OBJECTIVE To explore the origin and morphological features of small supernumerary marker chromosomes (sSMCs) in Turner syndrome. METHODS For 5 cases of Turner syndrome with a sSMC identified by conventional G-banding, dual-color fluorescence in situ hybridization (FISH) was applied to explore their origin and morphological features. RESULTS Among the 5 cases, 3 sSMCs were derived from the X chromosome, which included 2 ring chromosomes and 1 centric minute. Of the 2 sSMCs derived from the Y chromosome, 1 was a ring or isodicentric chromosome, while the other was an isodicentric chromosome. CONCLUSION The sSMCs found in Turner syndrome were almost all derived from sex chromosomes. The majority of sSMCs derived from the X chromosome will form ring chromosomes, while a minority will form centric minutes. Most sSMCs derived from the Y chromosome may exist as isodicentric chromosomes, while a small number may exist as rings. For Turner syndrome patients with sSMCs, dual-color FISH may be used to delineate their origins to facilitate genetic counseling and selection of a clinical regime.
Using Network Analysis to Characterize Biogeographic Data in a Community Archive
NASA Astrophysics Data System (ADS)
Wellman, T. P.; Bristol, S.
2017-12-01
Informative measures are needed to evaluate and compare data from multiple providers in a community-driven data archive. This study explores insights from network theory and other descriptive and inferential statistics to examine data content and application across an assemblage of publicly available biogeographic data sets. The data are archived in ScienceBase, a collaborative catalog of scientific data supported by the U.S. Geological Survey to enhance scientific inquiry and acuity. In gaining understanding through this investigation and other scientific venues, our goal is to improve scientific insight and data use across a spectrum of scientific applications. Network analysis is a tool to reveal patterns of non-trivial topological features in data that do not exhibit complete regularity or randomness. In this work, network analyses are used to explore shared events and dependencies between measures of data content and application derived from metadata and catalog information and measures relevant to biogeographic study. Descriptive statistical tools are used to explore relations between network analysis properties, while inferential statistics are used to evaluate the degree of confidence in these assessments. Network analyses have been used successfully in related fields to examine social awareness of scientific issues, taxonomic structures of biological organisms, and ecosystem resilience to environmental change. Use of network analysis also shows promising potential to identify relationships in biogeographic data that inform programmatic goals and scientific interests.
A framework for evaluating mixture analysis algorithms
NASA Astrophysics Data System (ADS)
Dasaratha, Sridhar; Vignesh, T. S.; Shanmukh, Sarat; Yarra, Malathi; Botonjic-Sehic, Edita; Grassi, James; Boudries, Hacene; Freeman, Ivan; Lee, Young K.; Sutherland, Scott
2010-04-01
In recent years, several sensing devices capable of identifying unknown chemical and biological substances have been commercialized. The success of these devices in analyzing real-world samples depends on the ability of the on-board identification algorithm to de-convolve spectra of substances that are mixtures. To develop effective de-convolution algorithms, it is critical to characterize the relationship between the spectral features of a substance and its probability of detection within a mixture, as these features may be similar to or overlap with those of other substances in the mixture and in the library. While it has been recognized that these aspects pose challenges to mixture analysis, a systematic effort to quantify spectral characteristics and their impact is generally lacking. In this paper, we propose metrics that can be used to quantify these spectral features. Some of these metrics, such as a modification of the variance inflation factor, are derived from classical statistical measures used in regression diagnostics. We demonstrate that these metrics can be correlated to the accuracy of the substance's identification in a mixture. We also develop a framework for characterizing mixture analysis algorithms using these metrics. Experimental results are then provided to show the application of this framework to the evaluation of various algorithms, including one that has been developed for a commercial device. The illustration is based on synthetic mixtures that are created from pure component Raman spectra measured on a portable device.
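The metrics build on regression diagnostics such as the variance inflation factor; below is a minimal sketch of the classical VIF computed from the correlation matrix of a spectral library (the paper's modified metric is not reproduced, and the library is synthetic).

```python
import numpy as np

def variance_inflation_factors(spectra):
    """Classical VIFs for a library of pure-component spectra (rows):
    the diagonal of the inverse correlation matrix. High values flag
    components whose spectra are nearly collinear with the rest."""
    X = np.asarray(spectra, dtype=float)
    R = np.corrcoef(X)                  # correlation between components
    return np.diag(np.linalg.inv(R))

# Hypothetical 4-component library over 200 Raman-shift channels;
# component 3 is nearly a copy of component 0, so its VIF is large.
rng = np.random.default_rng(5)
lib = rng.normal(size=(4, 200))
lib[3] = lib[0] + 0.05 * rng.normal(size=200)
print(variance_inflation_factors(lib).round(1))
```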
DOE Office of Scientific and Technical Information (OSTI.GOV)
Foley, Brian T; Korber, Bette T
2008-01-01
Simian immunodeficiency virus infection of macaques may result in neuroAIDS, a feature more commonly observed in macaques with rapid progressive disease than in those with conventional disease. This is the first report of two conventional progressors (H631 and H636) with encephalitis in rhesus macaques inoculated with a derivative of SIVsmE543-3. Phylogenetic analyses of viruses isolated from the cerebral spinal fluid (CSF) and plasma from both animals demonstrated tissue compartmentalization. Additionally, virus from the central nervous system (CNS) was able to infect primary macaque monocyte-derived macrophages more efficiently than virus from plasma. Conversely, virus isolated from plasma was able to replicate better in peripheral blood mononuclear cells than virus from the CNS. We speculate that these viruses were under different selective pressures in their separate compartments. Furthermore, these viruses appear to have undergone adaptive evolution to preferentially replicate in their respective cell targets. Analysis of the number of potential N-linked glycosylation sites (PNGS) in gp160 showed that there was a statistically significant loss of PNGS in viruses isolated from the CNS in both macaques compared to SIVsmE543-3. Moreover, virus isolated from the brain in H631 had a statistically significant loss of PNGS compared to virus isolated from the CSF and plasma of the same animal. It is possible that the brain isolate may have adapted to decrease the number of PNGS, given that humoral immune selection pressure is less likely to be encountered in the brain. These viruses provide a relevant model to study the adaptations required for SIV to induce encephalitis.
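Counting PNGS is mechanical enough to sketch: the N-linked sequon is N-X-S/T with X any residue except proline. The helper below is a generic illustration; the sequence shown is invented, not SIV gp160.

```python
import re

def count_pngs(protein_seq):
    """Count potential N-linked glycosylation sites: the sequon N-X-S/T
    with X any residue except proline. Lookahead allows overlapping hits."""
    return len(re.findall(r"N(?=[^P][ST])", protein_seq.upper()))

# Hypothetical envelope fragment for illustration only.
print(count_pngs("MNNTSWKNCSAPNLTNVT"))
```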
New Optical Transforms For Statistical Image Recognition
NASA Astrophysics Data System (ADS)
Lee, Sing H.
1983-12-01
In optical implementation of statistical image recognition, new optical transforms on large images for real-time recognition are of special interest. Several important linear transformations frequently used in statistical pattern recognition have now been optically implemented, including the Karhunen-Loeve transform (KLT), the Fukunaga-Koontz transform (FKT) and the least-squares linear mapping technique (LSLMT) [1-3]. The KLT performs principal component analysis on one class of patterns for feature extraction. The FKT performs feature extraction for separating two classes of patterns. The LSLMT separates multiple classes of patterns by maximizing the interclass differences and minimizing the intraclass variations.
NASA Astrophysics Data System (ADS)
Gonzales, Kalim
It is argued that infants build a foundation for learning about the world through their incidental acquisition of the spatial and temporal regularities surrounding them. A challenge is that learning occurs across multiple contexts whose statistics can greatly differ. Two artificial language studies with 12-month-olds demonstrate that infants come prepared to parse statistics across contexts using the temporal and perceptual features that distinguish one context from another. These results suggest that infants can organize their statistical input with a wider range of features than typically considered. Possible attention, decision making, and memory mechanisms are discussed.
Evidence of selection as a cause for racial disparities in fibroproliferative disease
Velez Edwards, Digna R.
2017-01-01
Fibroproliferative diseases are common complex traits featuring scarring and overgrowth of connective tissue which vary widely in presentation because they affect many organ systems. Most fibroproliferative diseases are more prevalent in African-derived populations than in European populations, leading to pronounced health disparities. It is hypothesized that the increased prevalence of these diseases in African-derived populations is due to selection for pro-fibrotic alleles that are protective against helminth infections. We constructed a genetic risk score (GRS) of fibroproliferative disease risk-increasing alleles using 147 linkage disequilibrium-pruned variants identified through genome-wide association studies of seven fibroproliferative diseases with large African-European prevalence disparities. A comparison of the fibroproliferative disease GRS between 1000 Genomes Phase 3 populations detected a higher mean GRS in AFR (mean = 148 risk alleles) than in EUR (mean = 136 risk alleles; t-test p-value = 1.75×10^-123). To test whether differences in GRS burden are systematic and may be due to selection, we employed the quantitative trait loci (QTL) sign test. The QTL sign test result indicates that population differences in risk-increasing allele burdens at these fibroproliferative disease variants are systematic and support a model featuring selective pressure (p-value = 0.011). These observations were replicated in an independent sample and were more statistically significant (t-test p-value = 7.26×10^-237, sign test p-value = 0.015). This evidence supports the role of selective pressure acting to increase the frequency of fibroproliferative alleles in populations of African ancestry relative to those of European ancestry. PMID:28792542
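A minimal sketch of the unweighted GRS comparison described above; the genotype matrices and allele frequencies are synthetic stand-ins, not 1000 Genomes data.

```python
import numpy as np
from scipy import stats

def genetic_risk_score(genotypes):
    """Unweighted GRS: per-individual count of risk-increasing alleles
    across variants (genotypes coded 0/1/2 risk-allele copies)."""
    return np.asarray(genotypes).sum(axis=1)

# Hypothetical genotype matrices (individuals x 147 variants) for two
# populations with slightly different risk-allele frequencies.
rng = np.random.default_rng(6)
afr = rng.binomial(2, 0.52, size=(500, 147))
eur = rng.binomial(2, 0.47, size=(500, 147))
t, p = stats.ttest_ind(genetic_risk_score(afr), genetic_risk_score(eur))
print(f"mean GRS AFR={genetic_risk_score(afr).mean():.1f}, "
      f"EUR={genetic_risk_score(eur).mean():.1f}, t-test p={p:.2e}")
```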
Radiomic analysis in prediction of Human Papilloma Virus status.
Yu, Kaixian; Zhang, Youyi; Yu, Yang; Huang, Chao; Liu, Rongjie; Li, Tengfei; Yang, Liuqing; Morris, Jeffrey S; Baladandayuthapani, Veerabhadran; Zhu, Hongtu
2017-12-01
Human Papilloma Virus (HPV) has been associated with oropharyngeal cancer prognosis. Traditionally, HPV status is tested through an invasive lab test. Recently, the rapid development of statistical image analysis techniques has enabled precise quantitative analysis of medical images. Quantitative analysis of Computed Tomography (CT) provides a non-invasive way to assess HPV status for oropharynx cancer patients. We designed a statistical radiomics approach analyzing CT images to predict HPV status. Various radiomics features were extracted from CT scans and analyzed using statistical feature selection and prediction methods. Our approach ranked the highest in the 2016 Medical Image Computing and Computer Assisted Intervention (MICCAI) grand challenge: Oropharynx Cancer (OPC) Radiomics Challenge, Human Papilloma Virus (HPV) Status Prediction. Further analysis of the most relevant radiomic features distinguishing HPV positive and negative subjects suggested that HPV positive patients usually have smaller and simpler tumors.
Administrative records and surveys as basis for statistics on international labour migration.
Hoffmann, E
1997-08-01
"This paper discusses possible sources for statistics to be used for describing and analysing the number, structure, situation, development and impact of migrant workers. The discussion is focused on key, intrinsic features of the different sources, important for the understanding of their strengths and weaknesses, and draws the reader's attention to features which may tend to undermine the quality of statistics produced as well as ways in which the impact of such features can be evaluated and, if possible, reduced.... The paper is organized around three key groups of migrant workers: (a) Persons who are arriving in a country to work there, i.e. the inflow of foreign workers; (b) Persons who are leaving their country to find work abroad, i.e. the outflow of migrant workers; [and] (c) Stock of foreign workers in the country." (EXCERPT)
Detection of reflecting surfaces by a statistical model
NASA Astrophysics Data System (ADS)
He, Qiang; Chu, Chee-Hung H.
2009-02-01
Remote sensing is widely used to assess the destruction from natural disasters and to plan relief and recovery operations. How to automatically extract useful features and segment interesting objects from digital images, including remote sensing imagery, has become a critical task for image understanding. Unfortunately, current research on automated feature extraction largely ignores contextual information. As a result, the fidelity of populating attributes corresponding to interesting features and objects cannot be ensured. In this paper, we present an exploration of meaningful object extraction integrating reflecting surfaces. Detection of specular reflecting surfaces can be useful in target identification and can then be applied to environmental monitoring, disaster prediction and analysis, military operations, and counter-terrorism. Our method is based on a statistical model that captures the statistical properties of specular reflecting surfaces. The reflecting surfaces are then detected through cluster analysis.
Data-based Non-Markovian Model Inference
NASA Astrophysics Data System (ADS)
Ghil, Michael
2015-04-01
This talk concentrates on obtaining stable and efficient data-based models for simulation and prediction in the geosciences and life sciences. The proposed model derivation relies on using a multivariate time series of partial observations from a large-dimensional system, and the resulting low-order models are compared with the optimal closures predicted by the non-Markovian Mori-Zwanzig formalism of statistical physics. Multilayer stochastic models (MSMs) are introduced as both a very broad generalization and a time-continuous limit of existing multilevel, regression-based approaches to data-based closure, in particular of empirical model reduction (EMR). We show that the multilayer structure of MSMs can provide a natural Markov approximation to the generalized Langevin equation (GLE) of the Mori-Zwanzig formalism. A simple correlation-based stopping criterion for an EMR-MSM model is derived to assess how well it approximates the GLE solution. Sufficient conditions are given for the nonlinear cross-interactions between the constitutive layers of a given MSM to guarantee the existence of a global random attractor. This existence ensures that no blow-up can occur for a very broad class of MSM applications. The EMR-MSM methodology is first applied to a conceptual, nonlinear, stochastic climate model of coupled slow and fast variables, in which only slow variables are observed. The resulting reduced model with energy-conserving nonlinearities captures the main statistical features of the slow variables, even when there is no formal scale separation and the fast variables are quite energetic. Second, an MSM is shown to successfully reproduce the statistics of a partially observed, generalized Lotka-Volterra model of population dynamics in its chaotic regime. The positivity constraint on the solutions' components replaces here the quadratic-energy-preserving constraint of fluid-flow problems and it successfully prevents blow-up. This work is based on a close collaboration with M.D. Chekroun, D. Kondrashov, S. Kravtsov and A.W. Robertson.
Drakesmith, M; Caeyenberghs, K; Dutt, A; Lewis, G; David, A S; Jones, D K
2015-09-01
Graph theory (GT) is a powerful framework for quantifying topological features of neuroimaging-derived functional and structural networks. However, false positive (FP) connections arise frequently and influence the inferred topology of networks. Thresholding is often used to overcome this problem, but an appropriate threshold often relies on a priori assumptions, which will alter inferred network topologies. Four common network metrics (global efficiency, mean clustering coefficient, mean betweenness and small-worldness) were tested using a model tractography dataset. It was found that all four network metrics were significantly affected even by just one FP. Results also show that thresholding effectively dampens the impact of FPs, but at the expense of adding significant bias to network metrics. In a larger number (n=248) of tractography datasets, statistics were computed across random group permutations for a range of thresholds, revealing that statistics for network metrics varied significantly more than for non-network metrics (i.e., number of streamlines and number of edges). Varying degrees of network atrophy were introduced artificially to half the datasets, to test sensitivity to genuine group differences. For some network metrics, this atrophy was detected as significant (p<0.05, determined using permutation testing) only across a limited range of thresholds. We propose a multi-threshold permutation correction (MTPC) method, based on the cluster-enhanced permutation correction approach, to identify sustained significant effects across clusters of thresholds. This approach minimises requirements to determine a single threshold a priori. We demonstrate improved sensitivity of MTPC-corrected metrics to genuine group effects compared to an existing approach and demonstrate the use of MTPC on a previously published network analysis of tractography data derived from a clinical population. In conclusion, we show that there are large biases and instability induced by thresholding, making statistical comparisons of network metrics difficult. However, by testing for effects across multiple thresholds using MTPC, true group differences can be robustly identified. Copyright © 2015. Published by Elsevier Inc.
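To make the thresholding issue concrete, here is a minimal networkx sketch that sweeps a threshold over a synthetic connectivity matrix and recomputes one network metric at each value; the MTPC procedure itself (permutation testing with cluster enhancement across thresholds) is not implemented here.

```python
import numpy as np
import networkx as nx

# Hypothetical weighted connectivity matrix (e.g. streamline counts).
rng = np.random.default_rng(7)
W = rng.random((30, 30))
W = (W + W.T) / 2
np.fill_diagonal(W, 0)

# Compute a network metric over a range of thresholds rather than a single
# a-priori cutoff; MTPC would then permutation-test each threshold and
# look for clusters of sustained significance.
for tau in np.linspace(0.1, 0.9, 5):
    G = nx.from_numpy_array(np.where(W >= tau, W, 0))
    print(f"tau={tau:.1f}  global efficiency={nx.global_efficiency(G):.3f}")
```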
Time-Frequency Learning Machines for Nonstationarity Detection Using Surrogates
NASA Astrophysics Data System (ADS)
Borgnat, Pierre; Flandrin, Patrick; Richard, Cédric; Ferrari, André; Amoud, Hassan; Honeine, Paul
2012-03-01
Time-frequency representations provide a powerful tool for nonstationary signal analysis and classification, supporting a wide range of applications [12]. As opposed to conventional Fourier analysis, these techniques reveal the evolution in time of the spectral content of signals. In Ref. [7,38], time-frequency analysis is used to test the stationarity of any signal. The proposed method consists of a comparison between global and local time-frequency features. The original idea is to make use of a family of stationary surrogate signals for defining the null hypothesis of stationarity and, based upon this information, to derive statistical tests. An open question remains, however, about how to choose relevant time-frequency features. Over the last decade, a number of new pattern recognition methods based on reproducing kernels have been introduced. These learning machines have gained popularity due to their conceptual simplicity and their outstanding performance [30]. Initiated by Vapnik's support vector machines (SVM) [35], they now offer a wide class of supervised and unsupervised learning algorithms. In Ref. [17-19], the authors have shown how the most effective and innovative learning machines can be tuned to operate in the time-frequency domain. This chapter follows this line of research by taking advantage of learning machines to test and quantify stationarity. Based on one-class SVM, our approach uses the entire time-frequency representation and does not require arbitrary feature extraction. Applied to a set of surrogates, it provides the domain boundary that includes most of these stationarized signals. This allows us to test the stationarity of the signal under investigation. This chapter is organized as follows. In Section 22.2, we introduce the surrogate data method to generate stationarized signals, namely, the null hypothesis of stationarity. The concept of time-frequency learning machines is presented in Section 22.3, and applied to one-class SVM in order to derive a stationarity test in Section 22.4. The relevance of the latter is illustrated by simulation results in Section 22.5.
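A minimal sketch of the surrogate-based test under stated simplifications: phase-randomized surrogates define the stationary null, a crude segment-energy feature stands in for the full time-frequency representation, and a one-class SVM learns the null's support.

```python
import numpy as np
from sklearn.svm import OneClassSVM

def phase_randomized_surrogate(x, rng):
    """Stationarized surrogate: keep the Fourier amplitudes of x but
    randomize the phases, destroying any time-localized structure."""
    X = np.fft.rfft(x)
    phases = rng.uniform(0, 2 * np.pi, X.size)
    phases[0] = 0                       # keep the mean component real
    return np.fft.irfft(np.abs(X) * np.exp(1j * phases), n=x.size)

def tf_feature(x, n_seg=8):
    """Crude time-frequency feature: per-segment spectral energies
    (a stand-in for the full representation used in the chapter)."""
    segs = np.array_split(x, n_seg)
    return np.array([np.sum(np.abs(np.fft.rfft(s)) ** 2) for s in segs])

rng = np.random.default_rng(8)
signal = np.sin(2 * np.pi * 0.05 * np.arange(1024)) * np.linspace(0.2, 1, 1024)
surrogates = np.array([tf_feature(phase_randomized_surrogate(signal, rng))
                       for _ in range(200)])

# Learn the support of the stationary null; flag the original signal.
ocsvm = OneClassSVM(nu=0.05, kernel="rbf", gamma="scale").fit(surrogates)
print("classified as stationary:", ocsvm.predict([tf_feature(signal)])[0] == 1)
```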
Ng, Kenney; Ghoting, Amol; Steinhubl, Steven R.; Stewart, Walter F.; Malin, Bradley; Sun, Jimeng
2014-01-01
Objective Healthcare analytics research increasingly involves the construction of predictive models for disease targets across varying patient cohorts using electronic health records (EHRs). To facilitate this process, it is critical to support a pipeline of tasks: 1) cohort construction, 2) feature construction, 3) cross-validation, 4) feature selection, and 5) classification. To develop an appropriate model, it is necessary to compare and refine models derived from a diversity of cohorts, patient-specific features, and statistical frameworks. The goal of this work is to develop and evaluate a predictive modeling platform that can be used to simplify and expedite this process for health data. Methods To support this goal, we developed a PARAllel predictive MOdeling (PARAMO) platform which 1) constructs a dependency graph of tasks from specifications of predictive modeling pipelines, 2) schedules the tasks in a topological ordering of the graph, and 3) executes those tasks in parallel. We implemented this platform using Map-Reduce to enable independent tasks to run in parallel in a cluster computing environment. Different task scheduling preferences are also supported. Results We assess the performance of PARAMO on various workloads using three datasets derived from the EHR systems in place at Geisinger Health System and Vanderbilt University Medical Center and an anonymous longitudinal claims database. We demonstrate significant gains in computational efficiency against a standard approach. In particular, PARAMO can build 800 different models on a 300,000 patient data set in 3 hours in parallel compared to 9 days if running sequentially. Conclusion This work demonstrates that an efficient parallel predictive modeling platform can be developed for EHR data. This platform can facilitate large-scale modeling endeavors and speed-up the research workflow and reuse of health information. This platform is only a first step and provides the foundation for our ultimate goal of building analytic pipelines that are specialized for health data researchers. PMID:24370496
Ng, Kenney; Ghoting, Amol; Steinhubl, Steven R; Stewart, Walter F; Malin, Bradley; Sun, Jimeng
2014-04-01
Healthcare analytics research increasingly involves the construction of predictive models for disease targets across varying patient cohorts using electronic health records (EHRs). To facilitate this process, it is critical to support a pipeline of tasks: (1) cohort construction, (2) feature construction, (3) cross-validation, (4) feature selection, and (5) classification. To develop an appropriate model, it is necessary to compare and refine models derived from a diversity of cohorts, patient-specific features, and statistical frameworks. The goal of this work is to develop and evaluate a predictive modeling platform that can be used to simplify and expedite this process for health data. To support this goal, we developed a PARAllel predictive MOdeling (PARAMO) platform which (1) constructs a dependency graph of tasks from specifications of predictive modeling pipelines, (2) schedules the tasks in a topological ordering of the graph, and (3) executes those tasks in parallel. We implemented this platform using Map-Reduce to enable independent tasks to run in parallel in a cluster computing environment. Different task scheduling preferences are also supported. We assess the performance of PARAMO on various workloads using three datasets derived from the EHR systems in place at Geisinger Health System and Vanderbilt University Medical Center and an anonymous longitudinal claims database. We demonstrate significant gains in computational efficiency against a standard approach. In particular, PARAMO can build 800 different models on a 300,000 patient data set in 3 h in parallel compared to 9 days if running sequentially. This work demonstrates that an efficient parallel predictive modeling platform can be developed for EHR data. This platform can facilitate large-scale modeling endeavors and speed-up the research workflow and reuse of health information. This platform is only a first step and provides the foundation for our ultimate goal of building analytic pipelines that are specialized for health data researchers. Copyright © 2013 Elsevier Inc. All rights reserved.
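A minimal sketch of the schedule-then-execute pattern PARAMO describes, using Python's standard-library graphlib and a thread pool in place of the Map-Reduce backend; the task names and dependency graph are illustrative only.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter

# Hypothetical pipeline tasks and their dependencies, mirroring the
# cohort -> features -> cross-validation -> selection -> classification flow.
deps = {
    "features": {"cohort"},
    "cv_split": {"features"},
    "selection": {"cv_split"},
    "classify": {"selection"},
}

def run(task):
    print("running", task)          # stand-in for the real cluster job

ts = TopologicalSorter(deps)
ts.prepare()
with ThreadPoolExecutor(max_workers=4) as pool:
    while ts.is_active():
        ready = list(ts.get_ready())    # tasks whose dependencies are done
        list(pool.map(run, ready))      # independent tasks run in parallel
        for t in ready:
            ts.done(t)
```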
Constructing fine-granularity functional brain network atlases via deep convolutional autoencoder.
Zhao, Yu; Dong, Qinglin; Chen, Hanbo; Iraji, Armin; Li, Yujie; Makkie, Milad; Kou, Zhifeng; Liu, Tianming
2017-12-01
State-of-the-art functional brain network reconstruction methods such as independent component analysis (ICA) or sparse coding of whole-brain fMRI data can effectively infer many thousands of volumetric brain network maps from a large number of human brains. However, due to the variability of individual brain networks and the large scale of such networks needed for statistically meaningful group-level analysis, it is still a challenging and open problem to derive group-wise common networks as network atlases. Inspired by the superior spatial pattern description ability of deep convolutional neural networks (CNNs), a novel deep 3D convolutional autoencoder (CAE) network is designed here to extract spatial brain network features effectively, based on which an Apache Spark enabled computational framework is developed for fast clustering of a large number of network maps into fine-granularity atlases. To evaluate this framework, 10 resting state networks (RSNs) were manually labeled from the sparsely decomposed networks of Human Connectome Project (HCP) fMRI data and 5275 network training samples were obtained in total. The deep CAE models are then trained by these functional networks' spatial maps, and the learned features are used to refine the original 10 RSNs into 17 network atlases that possess fine-granularity functional network patterns. Interestingly, it turned out that some manually mislabeled outliers in the training networks can be corrected by the deep CAE derived features. More importantly, fine granularities of networks can be identified, and they reveal unique network patterns specific to different brain task states. By further applying this method to a dataset from a mild traumatic brain injury study, it is shown that the technique can effectively identify abnormal small networks in brain injury patients in comparison with controls. In general, our work presents a promising deep learning and big data analysis solution for modeling functional connectomes, with fine granularities, based on fMRI data. Copyright © 2017 Elsevier B.V. All rights reserved.
Sources of international migration statistics in Africa.
1984-01-01
The sources of international migration data for Africa may be classified into 2 main categories: 1) administrative records and 2) censuses and survey data. Both categories are sources for the direct measurement of migration, but the 2nd category can also be used for the indirect estimation of net international migration. The administrative records from which data on international migration may be derived include 1) entry/departure cards or forms completed at international borders, 2) residence/work permits issued to aliens, and 3) general population registers and registers of aliens. The statistics derived from the entry/departure cards may be described as 1) land frontier control statistics and 2) port control statistics. The former refer to data derived from movements across land borders and the latter refer to information collected at international airports and seaports. Other administrative records which are potential sources of statistics on international migration in some African countries include limited population registers, records of the registration of aliens, and particulars of residence/work permits issued to aliens. Although frontier control data are considered the most important source of international migration statistics, in many African countries these data are too deficient to provide a satisfactory indication of the level of international migration. Thus decennial population censuses and/or sample surveys are the major sources of the available statistics on the stock and characteristics of international migration. Indirect methods can be used to supplement census data with intercensal estimates of net migration using census data on the total population. This indirect method of obtaining information on migration can be used to evaluate estimates derived from frontier control records, and it also offers a means of obtaining alternative information on international migration in African countries which have not directly investigated migration topics in their censuses or surveys.
NASA Astrophysics Data System (ADS)
Mazzetti, S.; Giannini, V.; Russo, F.; Regge, D.
2018-05-01
Computer-aided diagnosis (CAD) systems are increasingly being used in clinical settings to report multi-parametric magnetic resonance imaging (mp-MRI) of the prostate. Usually, CAD systems automatically highlight cancer-suspicious regions to the radiologist, reducing reader variability and interpretation errors. Nevertheless, implementing this software requires the selection of which mp-MRI parameters can best discriminate between malignant and non-malignant regions. To exploit functional information, some parameters are derived from dynamic contrast-enhanced (DCE) acquisitions. In particular, much CAD software employs pharmacokinetic features, such as K trans and k ep, derived from the Tofts model, to estimate a likelihood map of malignancy. However, non-pharmacokinetic models can be also used to describe DCE-MRI curves, without any requirement for prior knowledge or measurement of the arterial input function, which could potentially lead to large errors in parameter estimation. In this work, we implemented an empirical function derived from the phenomenological universalities (PUN) class to fit DCE-MRI. The parameters of the PUN model are used in combination with T2-weighted and diffusion-weighted acquisitions to feed a support vector machine classifier to produce a voxel-wise malignancy likelihood map of the prostate. The results were all compared to those for a CAD system based on Tofts pharmacokinetic features to describe DCE-MRI curves, using different quality aspects of image segmentation, while also evaluating the number and size of false positive (FP) candidate regions. This study included 61 patients with 70 biopsy-proven prostate cancers (PCa). The metrics used to evaluate segmentation quality between the two CAD systems were not statistically different, although the PUN-based CAD reported a lower number of FP, with reduced size compared to the Tofts-based CAD. In conclusion, the CAD software based on PUN parameters is a feasible means with which to detect PCa, without affecting segmentation quality, and hence it could be successfully applied in clinical settings, improving the automated diagnosis process and reducing computational complexity.
New features of global climatology revealed by satellite-derived oceanic rainfall maps
NASA Technical Reports Server (NTRS)
Rao, M. S. V.; Theon, J. S.
1977-01-01
Quantitative rainfall maps over the oceanic areas of the globe were derived from the Nimbus 5 Electrically Scanning Microwave Radiometer (ESMR) data. Analysis of the satellite-derived oceanic rainfall maps reveals certain distinctive characteristics of global patterns for the years 1973-74. The main ones are (1) the forking of the Intertropical Convergence Zone in the Pacific, (2) a previously unrecognized rain area in the South Atlantic, (3) the bimodal behavior of rainbelts in the Indian Ocean and (4) the large interannual variability in oceanic rainfall. These features are discussed.
Rain rate duration statistics derived from the Mid-Atlantic coast rain gauge network
NASA Technical Reports Server (NTRS)
Goldhirsh, Julius
1993-01-01
A rain gauge network comprised of 10 tipping bucket rain gauges located on the Mid-Atlantic coast of the United States has been in continuous operation since June 1, 1986. Rain rate distributions and estimated slant path fade distributions at 20 GHz and 30 GHz covering the first five-year period were derived from the gauge network measurements, and these results were described by Goldhirsh. In this effort, rain rate time duration statistics are presented. The rain duration statistics are of interest for better understanding the physical nature of precipitation and for providing a data base which modelers may use to derive slant path fade duration statistics. Such statistics are important for better assessing optimal coding procedures over defined bandwidths.
NASA Astrophysics Data System (ADS)
Glushak, P. A.; Markiv, B. B.; Tokarchuk, M. V.
2018-01-01
We present a generalization of Zubarev's nonequilibrium statistical operator method based on the principle of maximum Renyi entropy. In the framework of this approach, we obtain transport equations for the basic set of parameters of the reduced description of nonequilibrium processes in a classical system of interacting particles using Liouville equations with fractional derivatives. For a classical system of particles in a medium with a fractal structure, we obtain a non-Markovian diffusion equation with fractional spatial derivatives. For a concrete model of the frequency dependence of a memory function, we obtain a generalized Cattaneo-type diffusion equation with the spatial and temporal fractality taken into account. We present a generalization of nonequilibrium thermofield dynamics in Zubarev's nonequilibrium statistical operator method in the framework of Renyi statistics.
NASA Astrophysics Data System (ADS)
Ren, W. X.; Lin, Y. Q.; Fang, S. E.
2011-11-01
One of the key issues in vibration-based structural health monitoring is to extract damage-sensitive but environment-insensitive features from sampled dynamic response measurements and to carry out statistical analysis of these features for structural damage detection. A new damage feature is proposed in this paper by using the system matrices of the forward innovation model based on the covariance-driven stochastic subspace identification of a vibrating system. To overcome the variations of the system matrices, a non-singular transformation matrix is introduced so that the system matrices are normalized to their standard forms. For reducing the effects of modeling errors, noise and environmental variations on measured structural responses, a statistical pattern recognition paradigm is incorporated into the proposed method. The Mahalanobis and Euclidean distance decision functions of the damage feature vector are adopted by defining a statistics-based damage index. The proposed structural damage detection method is verified against one numerical signal and two numerical beams. It is demonstrated that the proposed statistics-based damage index is sensitive to damage and shows some robustness to noise and false estimation of the system ranks. The method is capable of locating damage in beam structures under different types of excitations. The robustness of the proposed damage detection method to variations in environmental temperature is further validated in a companion paper by a reinforced concrete beam tested in the laboratory and a full-scale arch bridge tested in the field.
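A minimal sketch of a Mahalanobis-distance damage index of the kind adopted above; the baseline feature population and test vector are synthetic, and the published index may be normalized differently.

```python
import numpy as np

def mahalanobis_damage_index(baseline_features, test_feature):
    """Distance of a new feature vector from the baseline (undamaged)
    population; large values indicate potential damage."""
    X = np.asarray(baseline_features, dtype=float)
    mu = X.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(X, rowvar=False))
    d = np.asarray(test_feature, dtype=float) - mu
    return float(np.sqrt(d @ cov_inv @ d))

# Hypothetical: feature vectors from the healthy state vs. one measurement.
rng = np.random.default_rng(9)
healthy = rng.normal(0, 1, size=(200, 6))
print("damage index:", mahalanobis_damage_index(healthy, rng.normal(3, 1, 6)))
```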
Statistical universals reveal the structures and functions of human music.
Savage, Patrick E; Brown, Steven; Sakai, Emi; Currie, Thomas E
2015-07-21
Music has been called "the universal language of mankind." Although contemporary theories of music evolution often invoke various musical universals, the existence of such universals has been disputed for decades and has never been empirically demonstrated. Here we combine a music-classification scheme with statistical analyses, including phylogenetic comparative methods, to examine a well-sampled global set of 304 music recordings. Our analyses reveal no absolute universals but strong support for many statistical universals that are consistent across all nine geographic regions sampled. These universals include 18 musical features that are common individually as well as a network of 10 features that are commonly associated with one another. They span not only features related to pitch and rhythm that are often cited as putative universals but also rarely cited domains including performance style and social context. These cross-cultural structural regularities of human music may relate to roles in facilitating group coordination and cohesion, as exemplified by the universal tendency to sing, play percussion instruments, and dance to simple, repetitive music in groups. Our findings highlight the need for scientists studying music evolution to expand the range of musical cultures and musical features under consideration. The statistical universals we identified represent important candidates for future investigation.
Statistical universals reveal the structures and functions of human music
Savage, Patrick E.; Brown, Steven; Sakai, Emi; Currie, Thomas E.
2015-01-01
Music has been called “the universal language of mankind.” Although contemporary theories of music evolution often invoke various musical universals, the existence of such universals has been disputed for decades and has never been empirically demonstrated. Here we combine a music-classification scheme with statistical analyses, including phylogenetic comparative methods, to examine a well-sampled global set of 304 music recordings. Our analyses reveal no absolute universals but strong support for many statistical universals that are consistent across all nine geographic regions sampled. These universals include 18 musical features that are common individually as well as a network of 10 features that are commonly associated with one another. They span not only features related to pitch and rhythm that are often cited as putative universals but also rarely cited domains including performance style and social context. These cross-cultural structural regularities of human music may relate to roles in facilitating group coordination and cohesion, as exemplified by the universal tendency to sing, play percussion instruments, and dance to simple, repetitive music in groups. Our findings highlight the need for scientists studying music evolution to expand the range of musical cultures and musical features under consideration. The statistical universals we identified represent important candidates for future investigation. PMID:26124105
NASA Astrophysics Data System (ADS)
Gontis, V.; Kononovicius, A.
2017-10-01
We address the problem of long-range memory in the financial markets. There are two conceptually different ways to reproduce power-law decay of the auto-correlation function: using fractional Brownian motion or non-linear stochastic differential equations. In this contribution we address this problem by analyzing empirical return and trading activity time series from the Forex. From the empirical time series we obtain probability density functions of burst and inter-burst duration. Our analysis reveals that the power-law exponents of the obtained probability density functions are close to 3/2, which is a characteristic feature of one-dimensional stochastic processes. This is in good agreement with an earlier proposed model of absolute return based on non-linear stochastic differential equations derived from the agent-based herding model.
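A minimal sketch of estimating the burst-duration power-law exponent by maximum likelihood (the Hill estimator); the sample is drawn synthetically from a 3/2 power law rather than from Forex data.

```python
import numpy as np

def powerlaw_exponent(durations, xmin):
    """Maximum-likelihood (Hill) estimate of the exponent alpha of
    p(x) ~ x**-alpha for x >= xmin."""
    x = np.asarray(durations, dtype=float)
    x = x[x >= xmin]
    return 1 + x.size / np.sum(np.log(x / xmin))

# Hypothetical burst durations drawn from a 3/2 power law (Pareto with
# tail index 0.5), mimicking the burst-duration statistics above.
rng = np.random.default_rng(10)
bursts = rng.pareto(0.5, size=20000) + 1.0
print(f"estimated exponent: {powerlaw_exponent(bursts, 1.0):.2f}")
```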
Autonomous Congestion Control in Delay-Tolerant Networks
NASA Technical Reports Server (NTRS)
Burleigh, Scott C.; Jennings, Esther H.
2005-01-01
Congestion control is an important feature that directly affects network performance. Network congestion may cause loss of data or long delays. Although this problem has been studied extensively in the Internet, the solutions for Internet congestion control do not apply readily to challenged network environments such as Delay Tolerant Networks (DTN) where end-to-end connectivity may not exist continuously and latency can be high. In DTN, end-to-end rate control is not feasible. This calls for congestion control mechanisms where the decisions can be made autonomously with local information only. We use an economic pricing model and propose a rule-based congestion control mechanism where each router can autonomously decide on whether to accept a bundle (data) based on local information such as available storage and the value and risk of accepting the bundle (derived from historical statistics).
NASA Technical Reports Server (NTRS)
Toth, L. V.; Mattila, K.; Haikala, L.; Balazs, L. G.
1992-01-01
The spectra of the 21 cm HI radiation from the direction of L1780, a small high-galactic-latitude dark/molecular cloud, were analyzed by multivariate methods. Factor analysis was performed on the HI (21 cm) spectra in order to separate the different components responsible for the spectral features. The rotated, orthogonal factors explain the spectra as a sum of radiation from the background (an extended HI emission layer) and from the L1780 dark cloud. The coefficients of the cloud-indicator factors were used to locate the HI 'halo' of the molecular cloud. Our statistically derived 'background' and 'cloud' spectral profiles, as well as the spatial distribution of the HI halo emission, were compared to the results of a previous study which used conventional methods to analyze nearly the same data set.
NASA Astrophysics Data System (ADS)
Gruzdev, A. N.
2017-07-01
Using the data of the ERA-Interim reanalysis, we have obtained estimates of changes in temperature, the geopotential and its large-scale zonal harmonics, wind velocity, and potential vorticity in the troposphere and stratosphere of the Northern and Southern hemispheres during the 11-year solar cycle. The estimates have been obtained using the method of multiple linear regression. Specific features of response of the indicated atmospheric parameters to the solar cycle have been revealed in particular regions of the atmosphere for a whole year and depending on the season. The results of the analysis indicate the existence of a reliable statistical relationship of large-scale dynamic and thermodynamic processes in the troposphere and stratosphere with the 11-year solar cycle.
The evolution of void-filled cosmological structures
NASA Technical Reports Server (NTRS)
Regos, Eniko; Geller, Margaret J.
1991-01-01
1D, 2D, and 3D simulations are used here to investigate the salient features in the evolution of void-filled cosmological structures in universes with arbitrary values of Omega. It is found that the growth of a void as a function of time decreases significantly at the time corresponding to Omega = 0.5. In models constructed in 2D and 3D, suitable initial conditions lead to cellular structure with faceted voids similar to those observed in redshift surveys. Matter compressed to planes flows more rapidly toward condensations at the intersections than would be expected for spherical infall. The peculiar streaming velocities for void diameters of 5000 km/s should be observable. The simulations provide a more physical basis and dynamics for the bubbly and Voronoi tessellation models used to derive statistical properties of cellular large-scale structure.
Classification of subsurface objects using singular values derived from signal frames
Chambers, David H; Paglieroni, David W
2014-05-06
The classification system represents a detected object with a feature vector derived from the return signals acquired by an array of N transceivers operating in multistatic mode. The classification system generates the feature vector by transforming the real-valued return signals into complex-valued spectra, using, for example, a Fast Fourier Transform. The classification system then generates a feature vector of singular values for each user-designated spectral sub-band by applying a singular value decomposition (SVD) to the N×N square complex-valued matrix formed from sub-band samples associated with all possible transmitter-receiver pairs. The resulting feature vector of singular values may be transformed into a feature vector of singular value likelihoods and then subjected to a multi-category linear or neural network classifier for object classification.
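A rough numpy sketch of this pipeline, under the assumption that each sub-band is condensed into a single N×N matrix by averaging its spectral samples (the record above does not commit to that choice, so treat the averaging step as a placeholder):

```python
import numpy as np

def subband_singular_values(returns, n_subbands):
    """Sketch of the pipeline above: FFT each real-valued return signal,
    split the spectra into user-designated sub-bands, and take the singular
    values of the N x N transmitter-receiver matrix per sub-band. How the
    sub-band samples are condensed into one matrix is assumed here to be
    a simple average over the band."""
    spectra = np.fft.rfft(returns, axis=-1)              # (N, N, n_freqs), complex
    bands = np.array_split(np.arange(spectra.shape[-1]), n_subbands)
    features = []
    for band in bands:
        M = spectra[..., band].mean(axis=-1)             # N x N complex matrix
        features.extend(np.linalg.svd(M, compute_uv=False))
    return np.asarray(features)                          # length N * n_subbands

rng = np.random.default_rng(0)
returns = rng.standard_normal((4, 4, 256))               # 4 transceivers, 256 samples/pair
fv = subband_singular_values(returns, n_subbands=8)      # 32 singular values
```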
NASA Technical Reports Server (NTRS)
Goldhirsh, J.
1984-01-01
Single and joint terminal slant path attenuation statistics at frequencies of 28.56 and 19.04 GHz have been derived, employing a radar data base obtained over a three-year period at Wallops Island, VA. Statistics were independently obtained for path elevation angles of 20, 45, and 90 deg for purposes of examining how elevation angle influences both single-terminal and joint probability distributions. Both diversity gains and the dependence of the autocorrelation function on site spacing and elevation angle were determined employing the radar modeling results. Comparisons with other investigators are presented. An independent path elevation angle prediction technique was developed and demonstrated to fit well the radar-derived single and joint terminal cumulative fade distributions at various elevation angles.
Gurung, Arun Bahadur; Aguan, Kripamoy; Mitra, Sivaprasad; Bhattacharjee, Atanu
2017-06-01
In Alzheimer's disease (AD), the level of the neurotransmitter acetylcholine (ACh) is reduced. Since acetylcholinesterase (AChE) cleaves ACh, inhibitors of AChE are much sought after for AD treatment. The side effects of current inhibitors necessitate the development of newer AChE inhibitors. Isoalloxazine derivatives have proved to be promising AChE inhibitors; however, structure-activity relationship studies of these compounds have not been reported to date. In the present work, several quantitative structure-activity relationship (QSAR) model-building methods, namely multiple linear regression (MLR), partial least squares, and principal component regression, were employed to derive 3D-QSAR models using steric and electrostatic field descriptors. A statistically significant model was obtained using MLR coupled with stepwise selection, with r2 = 0.9405, cross-validated r2 (q2) = 0.6683, and high predictability (pred_r2 = 0.6206; standard error pred_r2se = 0.2491). The steric and electrostatic contribution plot revealed three electrostatic fields (E_496, E_386, and E_577) and one steric field (S_60) contributing towards biological activity. A ligand-based 3D pharmacophore model consisting of eight pharmacophore features was generated. Isoalloxazine derivatives were docked against human AChE, which revealed critical residues implicated in hydrogen bonds as well as hydrophobic interactions. The binding modes of the docked complexes (AChE_IA1 and AChE_IA14) were validated by molecular dynamics simulation, which showed stable trajectories in terms of root mean square deviation; molecular mechanics/Poisson-Boltzmann surface area binding free energy analysis revealed key residues contributing significantly to the overall binding energy. The present study may be useful in the design of more potent isoalloxazine derivatives as AChE inhibitors.
Body, Richard; Carley, Simon; McDowell, Garry; Pemberton, Philip; Burrows, Gillian; Cook, Gary; Lewis, Philip S; Smith, Alexander; Mackway-Jones, Kevin
2014-01-01
Objective We aimed to derive and validate a clinical decision rule (CDR) for suspected cardiac chest pain in the emergency department (ED). Incorporating information available at the time of first presentation, this CDR would effectively risk-stratify patients and immediately identify: (A) patients for whom hospitalisation may be safely avoided; and (B) high-risk patients, facilitating judicious use of resources. Methods In two sequential prospective observational cohort studies at heterogeneous centres, we included ED patients with suspected cardiac chest pain. We recorded clinical features and drew blood on arrival. The primary outcome was major adverse cardiac events (MACE) (death, prevalent or incident acute myocardial infarction, coronary revascularisation or new coronary stenosis >50%) within 30 days. The CDR was derived by logistic regression, considering reliable (κ>0.6) univariate predictors (p<0.05) for inclusion. Results In the derivation study (n=698) we derived a CDR including eight variables (high-sensitivity troponin T; heart-type fatty acid binding protein; ECG ischaemia; diaphoresis observed; vomiting; pain radiation to right arm/shoulder; worsening angina; hypotension), which had a C-statistic of 0.95 (95% CI 0.93 to 0.97), implying near-perfect diagnostic performance. On external validation (n=463) the CDR identified 27.0% of patients as ‘very low risk’ and potentially suitable for discharge from the ED. None of these patients had prevalent acute myocardial infarction, and 1.6% developed MACE (n=2; both coronary stenoses without revascularisation). 9.9% of patients were classified as ‘high risk’, 95.7% of whom developed MACE. Conclusions The Manchester Acute Coronary Syndromes (MACS) rule has the potential to safely reduce unnecessary hospital admissions and facilitate judicious use of high-dependency resources. PMID:24780911
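For illustration, the statistical machinery involved (logistic regression over candidate predictors, assessed by the C-statistic, which for a binary outcome equals the ROC AUC) can be sketched with synthetic stand-in data; the predictors and outcome below are simulated, not MACS data:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

# Hypothetical stand-ins for eight binary/continuous predictors and a
# 30-day MACE outcome; the coefficients learned here are meaningless
# except as a demonstration of the derivation-and-scoring workflow.
rng = np.random.default_rng(1)
X = rng.standard_normal((698, 8))                      # n = 698, as in the derivation study
y = (X @ rng.standard_normal(8) + rng.standard_normal(698) > 1).astype(int)

model = LogisticRegression().fit(X, y)
risk = model.predict_proba(X)[:, 1]
print("C-statistic:", roc_auc_score(y, risk))          # AUC == C-statistic for binary outcomes
```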
Directional Statistics for Polarization Observations of Individual Pulses from Radio Pulsars
NASA Astrophysics Data System (ADS)
McKinnon, M. M.
2010-10-01
Radio polarimetry is a three-dimensional statistical problem. The three-dimensional aspect of the problem arises from the Stokes parameters Q, U, and V, which completely describe the polarization of electromagnetic radiation and conceptually define the orientation of a polarization vector in the Poincaré sphere. The statistical aspect of the problem arises from the random fluctuations in the source-intrinsic polarization and the instrumental noise. A simple model for the polarization of pulsar radio emission has been used to derive the three-dimensional statistics of radio polarimetry. The model is based upon the proposition that the observed polarization is due to the incoherent superposition of two, highly polarized, orthogonal modes. The directional statistics derived from the model follow the Bingham-Mardia and Fisher family of distributions. The model assumptions are supported by the qualitative agreement between the statistics derived from it and those measured with polarization observations of the individual pulses from pulsars. The orthogonal modes are thought to be the natural modes of radio wave propagation in the pulsar magnetosphere. The intensities of the modes become statistically independent when generalized Faraday rotation (GFR) in the magnetosphere causes the difference in their phases to be large. A stochastic version of GFR occurs when fluctuations in the phase difference are also large, and may be responsible for the more complicated polarization patterns observed in pulsar radio emission.
Gowd, Snigdha; Shankar, T; Dash, Samarendra; Sahoo, Nivedita; Chatterjee, Suravi; Mohanty, Pritam
2017-01-01
The aim of the study was to evaluate the reliability of cone beam computed tomography (CBCT)-derived images versus plaster models for mixed dentition analysis. Thirty CBCT-derived images and thirty plaster models were retrieved from the dental archives, and Moyer's and Tanaka-Johnston analyses were performed. The data obtained were interpreted and analyzed statistically using SPSS 10.0/PC (SPSS Inc., Chicago, IL, USA). Descriptive and analytical analysis along with Student's t-test was performed to qualitatively evaluate the data, and P < 0.05 was considered statistically significant. Statistically significant differences were obtained on comparison of data between CBCT-derived images and plaster models; the mean for Moyer's analysis in the left and right lower arch was 21.2 mm and 21.1 mm for CBCT versus 22.5 mm and 22.5 mm for the plaster model, respectively. CBCT-derived images were less reliable than data obtained directly from plaster models for mixed dentition analysis.
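For reference, the Tanaka-Johnston analysis used above predicts the combined mesiodistal width of the unerupted canine and premolars in one quadrant as half the summed widths of the four mandibular incisors plus a constant; a worked example, where the 23.0 mm input is an invented measurement:

```python
# Tanaka-Johnston prediction: half the summed widths of the four
# mandibular incisors, plus 10.5 mm (lower arch) or 11.0 mm (upper arch).
incisor_sum = 23.0  # mm, illustrative measurement from a model or CBCT image

lower_estimate = incisor_sum / 2 + 10.5   # canine + premolars, one lower quadrant
upper_estimate = incisor_sum / 2 + 11.0   # canine + premolars, one upper quadrant
print(lower_estimate, upper_estimate)     # 22.0 mm and 22.5 mm
```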
Joint Adaptive Mean-Variance Regularization and Variance Stabilization of High Dimensional Data.
Dazard, Jean-Eudes; Rao, J Sunil
2012-07-01
The paper addresses a common problem in the analysis of high-dimensional high-throughput "omics" data: parameter estimation across multiple variables when the number of variables is much larger than the sample size. Among the problems posed by this type of data are that variable-specific estimators of variances are not reliable and variable-wise test statistics have low power, both due to a lack of degrees of freedom. In addition, it has been observed in this type of data that the variance increases as a function of the mean. We introduce a non-parametric adaptive regularization procedure that is innovative in that: (i) it employs a novel "similarity statistic"-based clustering technique to generate local-pooled or regularized shrinkage estimators of population parameters; (ii) the regularization is done jointly on population moments, benefiting from C. Stein's result on inadmissibility, which implies that the usual sample variance estimator is improved by a shrinkage estimator using information contained in the sample mean. From these joint regularized shrinkage estimators, we derive regularized t-like statistics and show in simulation studies that they offer more statistical power in hypothesis testing than their standard sample counterparts, or regular common-value shrinkage estimators, or when the information contained in the sample mean is simply ignored. Finally, we show that these estimators feature interesting properties of variance stabilization and normalization that can be used for preprocessing high-dimensional multivariate data. The method is available as an R package, called 'MVR' ('Mean-Variance Regularization'), downloadable from the CRAN website.
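MVR itself is an R package; the following numpy fragment only sketches the underlying idea of common-value variance shrinkage that the abstract builds on, not MVR's clustering-based joint mean-variance regularization:

```python
import numpy as np

def shrink_variances(X, lam=0.5):
    """Toy common-value shrinkage: pull each variable-wise sample variance
    toward the pooled average variance. MVR does something richer
    (similarity-statistic clustering, joint shrinkage of means and
    variances); this only illustrates the basic shrinkage idea."""
    s2 = X.var(axis=0, ddof=1)          # per-variable sample variances
    target = s2.mean()                  # pooled shrinkage target
    return (1 - lam) * s2 + lam * target

rng = np.random.default_rng(2)
X = rng.standard_normal((10, 5000))     # n = 10 samples, p = 5000 variables
s2_shrunk = shrink_variances(X)
```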
Reischauer, Carolin; Patzwahl, René; Koh, Dow-Mu; Froehlich, Johannes M; Gutzeit, Andreas
2018-04-01
To evaluate whole-lesion volumetric texture analysis of apparent diffusion coefficient (ADC) maps for assessing treatment response in prostate cancer bone metastases. Texture analysis is performed in 12 treatment-naïve patients with 34 metastases before treatment and at one, two, and three months after the initiation of androgen deprivation therapy. Four first-order and 19 second-order statistical texture features are computed on the ADC maps in each lesion at every time point. Repeatability, inter-patient variability, and changes in the feature values under therapy are investigated. Spearman's rank correlation coefficients are calculated across time to demonstrate the relationship between the texture features and the serum prostate specific antigen (PSA) levels. With few exceptions, the texture features exhibited moderate to high precision. At the same time, Friedman's tests revealed that all first-order and second-order statistical texture features changed significantly in response to therapy, with the majority of texture features showing significant changes in their values at all post-treatment time points relative to baseline. Bivariate analysis detected significant correlations between the great majority of texture features and the serum PSA levels; in particular, three first-order and six second-order statistical features showed strong correlations with the serum PSA levels across time. The findings in the present work indicate that whole-tumor volumetric texture analysis may be utilized for response assessment in prostate cancer bone metastases. The approach may be used as a complementary measure for treatment monitoring in conjunction with averaged ADC values. Copyright © 2018 Elsevier B.V. All rights reserved.
Identifying duplicate content using statistically improbable phrases
Errami, Mounir; Sun, Zhaohui; George, Angela C.; Long, Tara C.; Skinner, Michael A.; Wren, Jonathan D.; Garner, Harold R.
2010-01-01
Motivation: Document similarity metrics such as PubMed's ‘Find related articles’ feature, which have been primarily used to identify studies with similar topics, can now also be used to detect duplicated or potentially plagiarized papers within literature reference databases. However, the CPU-intensive nature of document comparison has limited MEDLINE text similarity studies to the comparison of abstracts, which constitute only a small fraction of a publication's total text. Extending searches to include text archived by online search engines would drastically increase comparison ability. For large-scale studies, submitting short phrases encased in direct quotes to search engines for exact matches would be optimal for both individual queries and programmatic interfaces. We have derived a method of analyzing statistically improbable phrases (SIPs) for assistance in identifying duplicate content. Results: When applied to MEDLINE citations, this method substantially improves upon previous algorithms in the detection of duplicate citations, yielding a precision and recall of 78.9% (versus 50.3% for eTBLAST) and 99.6% (versus 99.8% for eTBLAST), respectively. Availability: Similar citations identified by this work are freely accessible in the Déjà vu database, under the SIP discovery method category at http://dejavu.vbi.vt.edu/dejavu/ Contact: merrami@collin.edu PMID:20472545
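A toy sketch of the underlying idea, scoring each n-gram of a document by its rarity in a reference corpus; the scoring function and names here are illustrative assumptions, and the paper's actual SIP selection is more involved:

```python
from collections import Counter
from itertools import islice

def phrase_rarity_scores(doc_tokens, corpus_counts, n=4):
    """Score each n-gram in a document by its rarity in a reference corpus.
    A crude proxy for 'statistically improbable phrases': the rarer a long
    phrase, the more suspicious an exact match elsewhere becomes."""
    ngrams = zip(*(islice(doc_tokens, i, None) for i in range(n)))
    return {g: 1.0 / (1 + corpus_counts.get(g, 0)) for g in ngrams}

corpus_counts = Counter()   # would be filled with n-gram counts from MEDLINE
tokens = "we derived a method of analyzing statistically improbable phrases".split()
scores = phrase_rarity_scores(tokens, corpus_counts)   # all rare here: empty corpus
```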
A Stochastic Framework for Evaluating Seizure Prediction Algorithms Using Hidden Markov Models
Wong, Stephen; Gardner, Andrew B.; Krieger, Abba M.; Litt, Brian
2007-01-01
Responsive, implantable stimulation devices to treat epilepsy are now in clinical trials. New evidence suggests that these devices may be more effective when they deliver therapy before seizure onset. Despite years of effort, prospective seizure prediction, which could improve device performance, remains elusive. In large part, this is explained by lack of agreement on a statistical framework for modeling seizure generation and a method for validating algorithm performance. We present a novel stochastic framework based on a three-state hidden Markov model (HMM) (representing interictal, preictal, and seizure states) with the feature that periods of increased seizure probability can transition back to the interictal state. This notion reflects clinical experience and may enhance interpretation of published seizure prediction studies. Our model accommodates clipped EEG segments and formalizes intuitive notions regarding statistical validation. We derive equations for type I and type II errors as a function of the number of seizures, duration of interictal data, and prediction horizon length, and we demonstrate the model’s utility with a novel seizure detection algorithm that appeared to predict seizure onset. We propose this framework as a vital tool for designing and validating prediction algorithms and for facilitating collaborative research in this area. PMID:17021032
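A minimal simulation of the three-state chain at the heart of the framework; the transition probabilities are invented for illustration, but note the nonzero preictal-to-interictal entry, the feature the abstract emphasizes:

```python
import numpy as np

# Three states: interictal (0), preictal (1), seizure (2). The preictal
# row allows relaxation back to interictal without a seizure occurring.
P = np.array([[0.98, 0.02, 0.00],
              [0.05, 0.90, 0.05],
              [0.50, 0.00, 0.50]])

rng = np.random.default_rng(3)
state, path = 0, []
for _ in range(1000):
    state = rng.choice(3, p=P[state])   # sample the next state
    path.append(state)
print("fraction of time in seizure state:", path.count(2) / len(path))
```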
New scaling model for variables and increments with heavy-tailed distributions
NASA Astrophysics Data System (ADS)
Riva, Monica; Neuman, Shlomo P.; Guadagnini, Alberto
2015-06-01
Many hydrological (as well as diverse earth, environmental, ecological, biological, physical, social, financial and other) variables, Y, exhibit frequency distributions that are difficult to reconcile with those of their spatial or temporal increments, ΔY. Whereas distributions of Y (or its logarithm) are at times slightly asymmetric with relatively mild peaks and tails, those of ΔY tend to be symmetric with peaks that grow sharper, and tails that become heavier, as the separation distance (lag) between pairs of Y values decreases. No statistical model known to us captures these behaviors of Y and ΔY in a unified and consistent manner. We propose a new, generalized sub-Gaussian model that does so. We derive analytical expressions for the probability distribution functions (pdfs) of Y and ΔY as well as the corresponding leading statistical moments. In our model the peak and tails of the ΔY pdf scale with lag in line with observed behavior. The model allows one to estimate, accurately and efficiently, all relevant parameters by jointly analyzing sample moments of Y and ΔY. We illustrate key features of our new model and method of inference on synthetically generated samples and neutron porosity data from a deep borehole.
Response to a periodic stimulus in a perfect integrate-and-fire neuron model driven by colored noise
NASA Astrophysics Data System (ADS)
Mankin, Romi; Rekker, Astrid
2016-12-01
The output interspike interval statistics of a stochastic perfect integrate-and-fire neuron model driven by an additive exogenous periodic stimulus is considered. The effect of temporally correlated random activity of synaptic inputs is modeled by an additive symmetric dichotomous noise. Using a first-passage-time formulation, exact expressions for the output interspike interval density and for the serial correlation coefficient are derived in the nonsteady regime, and their dependence on input parameters (e.g., the noise correlation time and amplitude as well as the frequency of an input current) is analyzed. It is shown that an interplay of a periodic forcing and colored noise can cause a variety of nonequilibrium cooperation effects, such as sign reversals of the interspike interval correlations versus noise-switching rate as well as versus the frequency of periodic forcing, a power-law-like decay of oscillations of the serial correlation coefficients in the long-lag limit, amplification of the output signal modulation in the instantaneous firing rate of the neural response, etc. The features of spike statistics in the limits of slow and fast noises are also discussed.
PREDICTING CORONAL MASS EJECTIONS USING MACHINE LEARNING METHODS
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bobra, M. G.; Ilonidis, S.
Of all the activity observed on the Sun, two of the most energetic events are flares and coronal mass ejections (CMEs). Usually, solar active regions that produce large flares will also produce a CME, but this is not always true. Despite advances in numerical modeling, it is still unclear which circumstances will produce a CME. Therefore, it is worthwhile to empirically determine which features distinguish flares associated with CMEs from flares that are not. At this time, no extensive study has used physically meaningful features of active regions to distinguish between these two populations. As such, we attempt to do so by using features derived from (1) photospheric vector magnetic field data taken by the Solar Dynamics Observatory's Helioseismic and Magnetic Imager instrument and (2) X-ray flux data from the Geostationary Operational Environmental Satellite's X-ray Flux instrument. We build a catalog of active regions that either produced both a flare and a CME (the positive class) or simply a flare (the negative class). We then use machine-learning algorithms to (1) determine which features distinguish these two populations, and (2) forecast whether an active region that produces an M- or X-class flare will also produce a CME. We compute the True Skill Statistic, a forecast verification metric, and find that it is a relatively high value of ∼0.8 ± 0.2. We conclude that a combination of six parameters, which are all intensive in nature, will capture most of the relevant information contained in the photospheric magnetic field.
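The True Skill Statistic itself is simple to compute from a confusion matrix (hit rate minus false-alarm rate); the counts below are illustrative, not from the paper:

```python
def true_skill_statistic(tp, fn, fp, tn):
    """TSS = hit rate - false alarm rate; 1 is perfect, 0 is no skill."""
    return tp / (tp + fn) - fp / (fp + tn)

# Illustrative confusion-matrix counts for CME vs. flare-only events
print(true_skill_statistic(tp=18, fn=2, fp=4, tn=76))  # 0.9 - 0.05 = 0.85
```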
Orlando, José Ignacio; van Keer, Karel; Barbosa Breda, João; Manterola, Hugo Luis; Blaschko, Matthew B; Clausse, Alejandro
2017-12-01
Diabetic retinopathy (DR) is one of the most widespread causes of preventable blindness in the world. The most dangerous stage of this condition is proliferative DR (PDR), in which the risk of vision loss is high and treatments are less effective. Fractal features of the retinal vasculature have been previously explored as potential biomarkers of DR, yet the current literature is inconclusive with respect to their correlation with PDR. In this study, we experimentally assess their discrimination ability to recognize PDR cases. A statistical analysis of the viability of using three reference fractal characterization schemes - namely box, information, and correlation dimensions - to identify patients with PDR is presented. These descriptors are also evaluated as input features for training ℓ1 and ℓ2 regularized logistic regression classifiers, to estimate their performance. Our results on MESSIDOR, a public dataset of 1200 fundus photographs, indicate that patients with PDR are more likely to exhibit a higher fractal dimension than healthy subjects or patients with mild levels of DR (P ≤ 1.3 × 10⁻²). Moreover, a supervised classifier trained with both fractal measurements and red lesion-based features reports an area under the ROC curve of 0.93 for PDR screening and 0.96 for detecting patients with optic disc neovascularizations. The fractal dimension of the vasculature increases with the level of DR. Furthermore, PDR screening using multiscale fractal measurements is more feasible than using their derived fractal dimensions. Code and further resources are provided at https://github.com/ignaciorlando/fundus-fractal-analysis. © 2017 American Association of Physicists in Medicine.
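A compact sketch of the box-counting dimension, one of the three fractal descriptors named above, applied to a binary vessel mask (here a random stand-in image, not a fundus segmentation):

```python
import numpy as np

def box_counting_dimension(mask, sizes=(2, 4, 8, 16, 32)):
    """Estimate the box-counting (fractal) dimension of a binary mask:
    count occupied boxes at each scale s, then fit the slope of
    log N(s) against log(1/s)."""
    counts = []
    for s in sizes:
        h, w = (mask.shape[0] // s) * s, (mask.shape[1] // s) * s
        blocks = mask[:h, :w].reshape(h // s, s, w // s, s)
        counts.append(blocks.any(axis=(1, 3)).sum())   # occupied boxes at scale s
    slope, _ = np.polyfit(np.log(1.0 / np.asarray(sizes)), np.log(counts), 1)
    return slope

rng = np.random.default_rng(4)
dim = box_counting_dimension(rng.random((256, 256)) > 0.7)
```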
Roda, C; Charreire, H; Feuillet, T; Mackenbach, J D; Compernolle, S; Glonti, K; Ben Rebah, M; Bárdos, H; Rutter, H; McKee, M; De Bourdeaudhuij, I; Brug, J; Lakerveld, J; Oppert, J-M
2016-01-01
Findings from research on the association between the built environment and obesity remain equivocal but may be partly explained by differences in approaches used to characterize the built environment. Findings obtained using subjective measures may differ substantially from those measured objectively. We investigated the agreement between perceived and objectively measured obesogenic environmental features to assess (1) the extent of agreement between individual perceptions and observable characteristics of the environment and (2) the agreement between aggregated perceptions and observable characteristics, and whether this varied by type of characteristic, region or neighbourhood. Cross-sectional data from the SPOTLIGHT project (n = 6037 participants from 60 neighbourhoods in five European urban regions) were used. Residents' perceptions were self-reported, and objectively measured environmental features were obtained by a virtual audit using Google Street View. Percent agreement and Kappa statistics were calculated. The mismatch was quantified at neighbourhood level by a distance metric derived from a factor map. The extent to which the mismatch metric varied by region and neighbourhood was examined using linear regression models. Overall, agreement was moderate (agreement < 82%, kappa < 0.3) and varied by obesogenic environmental feature, region and neighbourhood. Highest agreement was found for food outlets and outdoor recreational facilities, and lowest agreement was obtained for aesthetics. In general, a better match was observed in high-residential density neighbourhoods characterized by a high density of food outlets and recreational facilities. Future studies should combine perceived and objectively measured built environment qualities to better understand the potential impact of the built environment on health, particularly in low residential density neighbourhoods. © 2016 World Obesity.
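For reference, Cohen's kappa, the chance-corrected agreement statistic used above, can be computed directly; the two binary rating vectors below are invented:

```python
def cohens_kappa(a, b):
    """Agreement between two binary raters corrected for chance, e.g. a
    resident's perception vs. the Street View audit of the same item."""
    n = len(a)
    po = sum(x == y for x, y in zip(a, b)) / n            # observed agreement
    p_yes = (sum(a) / n) * (sum(b) / n)                   # chance both say yes
    p_no = (1 - sum(a) / n) * (1 - sum(b) / n)            # chance both say no
    pe = p_yes + p_no                                     # expected agreement
    return (po - pe) / (1 - pe)

perceived = [1, 1, 0, 0, 1, 0, 1, 1]
observed  = [1, 0, 0, 0, 1, 0, 0, 1]
print(cohens_kappa(perceived, observed))                  # about 0.53
```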
Linking agent-based models and stochastic models of financial markets
Feng, Ling; Li, Baowen; Podobnik, Boris; Preis, Tobias; Stanley, H. Eugene
2012-01-01
It is well-known that financial asset returns exhibit fat-tailed distributions and long-term memory. These empirical features are the main objectives of modeling efforts using (i) stochastic processes to quantitatively reproduce these features and (ii) agent-based simulations to understand the underlying microscopic interactions. After reviewing selected empirical and theoretical evidence documenting the behavior of traders, we construct an agent-based model to quantitatively demonstrate that “fat” tails in return distributions arise when traders share similar technical trading strategies and decisions. Extending our behavioral model to a stochastic model, we derive and explain a set of quantitative scaling relations of long-term memory from the empirical behavior of individual market participants. Our analysis provides a behavioral interpretation of the long-term memory of absolute and squared price returns: They are directly linked to the way investors evaluate their investments by applying technical strategies at different investment horizons, and this quantitative relationship is in agreement with empirical findings. Our approach provides a possible behavioral explanation for stochastic models for financial systems in general and provides a method to parameterize such models from market data rather than from statistical fitting. PMID:22586086
Linking agent-based models and stochastic models of financial markets.
Feng, Ling; Li, Baowen; Podobnik, Boris; Preis, Tobias; Stanley, H Eugene
2012-05-29
It is well-known that financial asset returns exhibit fat-tailed distributions and long-term memory. These empirical features are the main objectives of modeling efforts using (i) stochastic processes to quantitatively reproduce these features and (ii) agent-based simulations to understand the underlying microscopic interactions. After reviewing selected empirical and theoretical evidence documenting the behavior of traders, we construct an agent-based model to quantitatively demonstrate that "fat" tails in return distributions arise when traders share similar technical trading strategies and decisions. Extending our behavioral model to a stochastic model, we derive and explain a set of quantitative scaling relations of long-term memory from the empirical behavior of individual market participants. Our analysis provides a behavioral interpretation of the long-term memory of absolute and squared price returns: They are directly linked to the way investors evaluate their investments by applying technical strategies at different investment horizons, and this quantitative relationship is in agreement with empirical findings. Our approach provides a possible behavioral explanation for stochastic models for financial systems in general and provides a method to parameterize such models from market data rather than from statistical fitting.
Characterizing populations and searching for diagnostics via elastic registration of MRI images
NASA Astrophysics Data System (ADS)
Pettey, David; Gee, James C.
2001-07-01
Given image data from two distinct populations and a family of functions, we find the scalar discriminant function which best discriminates between the populations. The goals are two-fold: first, to construct a discriminant function which can accurately and reliably classify subjects via the image data; second, to use the best discriminant to see which features in the images distinguish between the populations, since these features can guide us to finding characteristic differences between the two groups even if those differences are not sufficient to perform classification. We apply our method to mid-sagittal MRI sections of the corpus callosum from 34 males and 52 females. While we are not certain of the ability of the derived discriminant function to perform sex classification, we find that regions in the anterior of the corpus callosum appear to be more important for the discriminant function than other regions. This indicates there may be significant differences in the relative size of the splenium in males and females, as has been reported elsewhere. More notably, we applied previous methods which support this view on our larger data set, but found that these methods no longer show statistically significant differences between the male and female splenium.
Application of Geostatistical Simulation to Enhance Satellite Image Products
NASA Technical Reports Server (NTRS)
Hlavka, Christine A.; Dungan, Jennifer L.; Thirulanambi, Rajkumar; Roy, David
2004-01-01
With the deployment of Earth Observing System (EOS) satellites that provide daily, global imagery, there is increasing interest in defining the limitations of the data and derived products due to their coarse spatial resolution. Much of the detail, i.e., small fragments and notches in boundaries, is lost with coarse-resolution imagery such as the EOS MODerate-Resolution Imaging Spectroradiometer (MODIS) data. Higher spatial resolution data such as the EOS Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER), Landsat, and airborne sensor imagery provide more detailed information but are less frequently available. There is, however, both theoretical and analytical evidence that burn scars and other fragmented types of land cover form self-similar or self-affine patterns, that is, patterns that look similar when viewed at widely differing spatial scales. Therefore, small features of the patterns should be predictable, at least in a statistical sense, from knowledge of the large features. Recent developments in fractal modeling for characterizing the spatial distribution of undiscovered petroleum deposits are thus applicable to generating simulations of finer-resolution satellite image products. We will present example EOS products, analysis to investigate self-similarity, and simulation results.
In Situ Raman Microscopy of a Single Graphite Microflake Electrode in a Li(+)-containing Electrolyte
NASA Technical Reports Server (NTRS)
Shi, Qing-Fang; Dokko, Kaoru; Scherson, Daniel A.
2003-01-01
Highly detailed Raman spectra from a single KS-44 graphite microflake electrode as a function of the applied potential have been collected in situ using a Raman microscope and a sealed spectroelectrochemical cell isolated from the laboratory environment. Correlations were found between the Raman spectral features and the various Li(+) intercalation stages while recording Raman spectra in real time during a linear potential scan from 0.7 V down to ca. 0.0 V vs. Li/Li(+) at a rate of 0.1 mV/s in a 1 M LiClO4 solution in a 1:1 (by volume) ethylene carbonate (EC):diethyl carbonate (DEC) mixture. In particular, clearly defined isosbestic points were observed for data collected in the potential range where the transition between dilute phase 1 and phase 4 of lithiated graphite is known to occur, i.e., 0.157 V < E < 0.215 V vs. Li/Li(+). Statistical analysis of the spectroscopic data within this region made it possible to determine independently the fraction of each of the two phases present as a function of potential, without relying on coulometric information, and then to predict, based on the proposed stoichiometry for the transition, a spectrally derived voltammetric feature.
Navarro, Pedro J; Fernández-Isla, Carlos; Alcover, Pedro María; Suardíaz, Juan
2016-07-27
This paper presents a robust method for defect detection in textures, entropy-based automatic selection of the wavelet decomposition level (EADL), based on a wavelet reconstruction scheme, for detecting defects in a wide variety of structural and statistical textures. Two main features are presented. The first is an original use of the normalized absolute function value (NABS), calculated from the wavelet coefficients derived at different decomposition levels, to identify textures where the defect can be isolated by eliminating the texture pattern in the first decomposition level. The second is the use of Shannon's entropy, calculated over detail subimages, for automatic selection of the band for image reconstruction, which, unlike other techniques such as those based on the co-occurrence matrix or on energy calculation, provides a lower decomposition level, thus avoiding excessive degradation of the image and allowing a more accurate defect segmentation. A metric analysis of the results of the proposed method with nine different thresholding algorithms determined that selecting the appropriate thresholding method is important for achieving optimum performance in defect detection. As a consequence, several different thresholding algorithms are proposed depending on the type of texture.
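A sketch of the entropy-per-level computation, assuming the PyWavelets library and Shannon entropy over normalized absolute detail coefficients; the paper does not specify an implementation, so the wavelet choice and level count here are placeholders:

```python
import numpy as np
import pywt  # PyWavelets; an assumption, the paper names no library

def detail_entropy(image, wavelet="db2", max_level=4):
    """Shannon entropy of the detail sub-images at each decomposition
    level: a sketch of the EADL idea of picking the reconstruction band
    automatically rather than via energy or co-occurrence measures."""
    coeffs = pywt.wavedec2(image, wavelet, level=max_level)
    entropies = {}
    for k, details in enumerate(coeffs[1:], start=1):     # coarsest first
        d = np.abs(np.concatenate([c.ravel() for c in details]))
        p = d / d.sum()
        p = p[p > 0]
        entropies[max_level - k + 1] = float(-(p * np.log2(p)).sum())
    return entropies  # maps decomposition level -> detail entropy

rng = np.random.default_rng(5)
print(detail_entropy(rng.random((128, 128))))
```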
NASA Astrophysics Data System (ADS)
Mallast, U.; Gloaguen, R.; Geyer, S.; Rödiger, T.; Siebert, C.
2011-08-01
In this paper we present a semi-automatic method to infer groundwater flow-paths based on the extraction of lineaments from digital elevation models. This method is especially suitable in remote and inaccessible areas where in-situ data are scarce. The combined method of linear filtering and object-based classification provides a lineament map with a high degree of accuracy. Subsequently, lineaments are differentiated into geological and morphological lineaments using auxiliary information and finally evaluated in terms of hydro-geological significance. Using the example of the western catchment of the Dead Sea (Israel/Palestine), the orientation and location of the differentiated lineaments are compared to characteristics of known structural features. We demonstrate that a strong correlation between lineaments and structural features exists. Euclidean distances between lineaments and wells provide an assessment criterion to evaluate the hydraulic significance of detected lineaments. Based on this analysis, we suggest that the statistical analysis of lineaments allows a delineation of flow-paths and thus provides significant information on groundwater movements. To validate the flow-paths we compare them to existing results of groundwater models that are based on well data.
Guo, Hao; Cao, Xiaohua; Liu, Zhifen; Li, Haifang; Chen, Junjie; Zhang, Kerang
2012-12-05
Resting state functional brain networks have been widely studied in brain disease research. However, it is currently unclear whether abnormal resting state functional brain network metrics can be used with machine learning for the classification of brain diseases. Resting state functional brain networks were constructed for 28 healthy controls and 38 major depressive disorder patients by thresholding partial correlation matrices of 90 regions. Three nodal metrics were calculated using graph theory-based approaches. Nonparametric permutation tests were then used for group comparisons of topological metrics, which were used as classification features in six different algorithms. We used statistical significance as the threshold for selecting features and measured the accuracies of the six classifiers with different numbers of features. A sensitivity analysis method was used to evaluate the importance of different features. The results indicated that some regions exhibited significantly abnormal nodal centralities, including the limbic system, basal ganglia, medial temporal, and prefrontal regions. The support vector machine with radial basis kernel function and the neural network algorithm exhibited the highest average accuracies (79.27% and 78.22%, respectively) with 28 features (P < 0.05). Correlation analysis between feature importance and the statistical significance of metrics revealed a strong positive correlation between them. Overall, the current study demonstrated that major depressive disorder is associated with abnormal functional brain network topological metrics, and statistically significant nodal metrics can be successfully used for feature selection in classification algorithms.
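A sketch of the feature-construction step, assuming networkx, a random correlation matrix as a stand-in for the patients' partial correlations, and an arbitrary threshold; the three nodal metrics shown are common centrality choices, since the abstract does not name its exact three:

```python
import numpy as np
import networkx as nx

# Threshold a correlation matrix of 90 regions into a binary graph and
# compute nodal metrics of the kind used as classification features.
rng = np.random.default_rng(6)
C = np.corrcoef(rng.standard_normal((200, 90)), rowvar=False)   # stand-in matrix
A = (np.abs(C) > 0.2) & ~np.eye(90, dtype=bool)                 # illustrative threshold
G = nx.from_numpy_array(A.astype(int))

degree      = nx.degree_centrality(G)
betweenness = nx.betweenness_centrality(G)
clustering  = nx.clustering(G)
features = np.array([[degree[i], betweenness[i], clustering[i]] for i in G.nodes])
```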
Muthu Rama Krishnan, M; Shah, Pratik; Chakraborty, Chandan; Ray, Ajoy K
2012-04-01
The objective of this paper is to provide an improved technique that can assist oncopathologists in the correct screening of oral precancerous conditions, especially oral submucous fibrosis (OSF), with significant accuracy on the basis of collagen fibres in the sub-epithelial connective tissue. The proposed scheme is composed of collagen fibre segmentation, textural feature extraction and selection, screening performance enhancement under Gaussian transformation, and finally classification. In this study, collagen fibres are segmented on R, G, B color channels using a back-propagation neural network from 60 normal and 59 OSF histological images, followed by histogram specification to reduce stain intensity variation. Textural features of the collagen area are then extracted using fractal approaches, viz. differential box counting and the Brownian motion curve. Feature selection is done using the Kullback-Leibler (KL) divergence criterion, and the screening performance is evaluated based on various statistical tests to confirm Gaussian nature. Here, the screening performance is enhanced under Gaussian transformation of the non-Gaussian features using a hybrid distribution. Moreover, the routine screening is designed based on two statistical classifiers, viz. Bayesian classification and support vector machines (SVM), to classify normal and OSF. It is observed that SVM with a linear kernel function provides better classification accuracy (91.64%) compared to the Bayesian classifier. The addition of fractal features of collagen under Gaussian transformation improves the Bayesian classifier's performance from 80.69% to 90.75%. The results are studied and discussed.
Nonlinear wave chaos: statistics of second harmonic fields.
Zhou, Min; Ott, Edward; Antonsen, Thomas M; Anlage, Steven M
2017-10-01
Concepts from the field of wave chaos have been shown to successfully predict the statistical properties of linear electromagnetic fields in electrically large enclosures. The Random Coupling Model (RCM) describes these properties by incorporating both universal features described by Random Matrix Theory and the system-specific features of particular system realizations. In an effort to extend this approach to the nonlinear domain, we add an active nonlinear frequency-doubling circuit to an otherwise linear wave chaotic system, and we measure the statistical properties of the resulting second harmonic fields. We develop an RCM-based model of this system as two linear chaotic cavities coupled by means of a nonlinear transfer function. The harmonic field strengths are predicted to be the product of two statistical quantities and the nonlinearity characteristics. Statistical results from measurement-based calculation, RCM-based simulation, and direct experimental measurements are compared and show good agreement over many decades of power.
Güssregen, Stefan; Matter, Hans; Hessler, Gerhard; Müller, Marco; Schmidt, Friedemann; Clark, Timothy
2012-09-24
Current 3D-QSAR methods such as CoMFA or CoMSIA make use of classical force-field approaches for calculating molecular fields. Thus, they cannot adequately account for noncovalent interactions involving halogen atoms, such as halogen bonds or halogen-π interactions. These deficiencies in the underlying force fields result from the lack of treatment of the anisotropy of the electron density distribution of those atoms, known as the "σ-hole", although recent developments have begun to take specific interactions such as halogen bonding into account. We have now replaced classical force-field-derived molecular fields with local properties such as the local ionization energy, local electron affinity, or local polarizability, calculated using quantum-mechanical (QM) techniques that do not suffer from the above limitation for 3D-QSAR. We first investigate the characteristics of QM-based local property fields to show that they are suitable for statistical analyses after suitable pretreatment. We then analyze these property fields with partial least-squares (PLS) regression to predict biological affinities of two data sets comprising factor Xa and GABA-A/benzodiazepine receptor ligands. While the resulting models perform equally well or even slightly better in terms of consistency and predictivity than the classical CoMFA fields, the most important aspect of these augmented field types is that the chemical interpretation of the resulting QM-based property field models reveals unique SAR trends driven by electrostatic and polarizability effects, which cannot be extracted directly from CoMFA electrostatic maps. Within the factor Xa set, the interactions of chlorine and bromine atoms with a tyrosine side chain in the protease S1 pocket are correctly predicted. Within the GABA-A/benzodiazepine ligand data set, PLS models of high predictivity resulted for our QM-based property fields, providing novel insights into key features of the SAR for two receptor subtypes and cross-receptor selectivity of the ligands. The detailed interpretation of regression models derived using improved QM-derived property fields thus provides a significant advantage by revealing chemically meaningful correlations with biological activity and helps in understanding novel structure-activity relationship features. This will allow such knowledge to be used to design novel molecules on the basis of interactions additional to steric and hydrogen-bonding features.
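The statistical step here (PLS regression of activities on field values sampled at grid points) can be sketched with scikit-learn; the field matrix below is a random stand-in, not actual QM-derived local properties:

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

# 60 hypothetical ligands, each described by a local-property field
# sampled at 2000 grid points; activities are simulated placeholders.
rng = np.random.default_rng(8)
fields = rng.standard_normal((60, 2000))
activity = rng.standard_normal(60)

pls = PLSRegression(n_components=5).fit(fields, activity)
predicted = pls.predict(fields).ravel()   # fitted activities
```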
Blended particle filters for large-dimensional chaotic dynamical systems
Majda, Andrew J.; Qi, Di; Sapsis, Themistoklis P.
2014-01-01
A major challenge in contemporary data science is the development of statistically accurate particle filters to capture non-Gaussian features in large-dimensional chaotic dynamical systems. Blended particle filters that capture non-Gaussian features in an adaptively evolving low-dimensional subspace through particles interacting with evolving Gaussian statistics on the remaining portion of phase space are introduced here. These blended particle filters are constructed in this paper through a mathematical formalism involving conditional Gaussian mixtures combined with statistically nonlinear forecast models compatible with this structure developed recently with high skill for uncertainty quantification. Stringent test cases for filtering involving the 40-dimensional Lorenz 96 model with a 5-dimensional adaptive subspace for nonlinear blended filtering in various turbulent regimes with at least nine positive Lyapunov exponents are used here. These cases demonstrate the high skill of the blended particle filter algorithms in capturing both highly non-Gaussian dynamical features as well as crucial nonlinear statistics for accurate filtering in extreme filtering regimes with sparse infrequent high-quality observations. The formalism developed here is also useful for multiscale filtering of turbulent systems and a simple application is sketched below. PMID:24825886
Origin of the correlations between exit times in pedestrian flows through a bottleneck
NASA Astrophysics Data System (ADS)
Nicolas, Alexandre; Touloupas, Ioannis
2018-01-01
Robust statistical features have emerged from the microscopic analysis of dense pedestrian flows through a bottleneck, notably with respect to the time gaps between successive passages. We pinpoint the mechanisms at the origin of these features thanks to simple models that we develop and analyse quantitatively. We disprove the idea that anticorrelations between successive time gaps (i.e. an alternation between shorter ones and longer ones) are a hallmark of a zipper-like intercalation of pedestrian lines and show that they simply result from the possibility that pedestrians from distinct ‘lines’ or directions cross the bottleneck within a short time interval. A second feature concerns the bursts of escapes, i.e. egresses that come in fast succession. Despite the ubiquity of exponential distributions of burst sizes, entailed by a Poisson process, we argue that anomalous (power-law) statistics arise if the bottleneck is nearly congested, albeit only in a tiny portion of parameter space. The generality of the proposed mechanisms implies that similar statistical features should also be observed for other types of particulate flows.
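The Poisson-process claim is easy to check numerically: with exponentially distributed time gaps, burst sizes (runs of gaps below a cutoff) come out geometrically distributed, i.e. with exponential-like tails; the rate and cutoff below are arbitrary:

```python
import numpy as np

# Simulate Poissonian egress: i.i.d. exponential time gaps, then group
# consecutive short gaps into bursts and tally burst sizes.
rng = np.random.default_rng(7)
gaps = rng.exponential(scale=1.0, size=100_000)
cutoff = 0.5

burst_sizes, run = [], 1
for g in gaps:
    if g < cutoff:
        run += 1              # still inside the current burst
    else:
        burst_sizes.append(run)
        run = 1               # long gap ends the burst
sizes, counts = np.unique(burst_sizes, return_counts=True)
# counts decay geometrically with size: the exponential statistics the
# paper contrasts with the near-congestion power law
```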
Machine learning to analyze images of shocked materials for precise and accurate measurements
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dresselhaus-Cooper, Leora; Howard, Marylesa; Hock, Margaret C.
A supervised machine learning algorithm, called locally adaptive discriminant analysis (LADA), has been developed to locate boundaries between identifiable image features that have varying intensities. LADA is an adaptation of image segmentation, which includes techniques that find the positions of image features (classes) using statistical intensity distributions for each class in the image. In order to place a pixel in the proper class, LADA considers the intensity at that pixel and the distribution of intensities in local (nearby) pixels. This paper presents the use of LADA to provide, with statistical uncertainties, the positions and shapes of features within ultrafast images of shock waves. We demonstrate the ability to locate image features including crystals, density changes associated with shock waves, and material jetting caused by shock waves. This algorithm can analyze images that exhibit a wide range of physical phenomena because it does not rely on comparison to a model. LADA enables analysis of images from shock physics with statistical rigor independent of underlying models or simulations.
Robust kernel representation with statistical local features for face recognition.
Yang, Meng; Zhang, Lei; Shiu, Simon Chi-Keung; Zhang, David
2013-06-01
Factors such as misalignment, pose variation, and occlusion make robust face recognition a difficult problem. It is known that statistical features such as local binary pattern are effective for local feature extraction, whereas the recently proposed sparse or collaborative representation-based classification has shown interesting results in robust face recognition. In this paper, we propose a novel robust kernel representation model with statistical local features (SLF) for robust face recognition. Initially, multipartition max pooling is used to enhance the invariance of SLF to image registration error. Then, a kernel-based representation model is proposed to fully exploit the discrimination information embedded in the SLF, and robust regression is adopted to effectively handle the occlusion in face images. Extensive experiments are conducted on benchmark face databases, including extended Yale B, AR (A. Martinez and R. Benavente), multiple pose, illumination, and expression (multi-PIE), facial recognition technology (FERET), face recognition grand challenge (FRGC), and labeled faces in the wild (LFW), which have different variations of lighting, expression, pose, and occlusions, demonstrating the promising performance of the proposed method.
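A sketch of the statistical-local-feature front end, assuming scikit-image's local binary pattern and a two-partition max pooling (a simplified stand-in for the paper's multipartition scheme and kernel representation):

```python
import numpy as np
from skimage.feature import local_binary_pattern

# LBP histogram per image partition, max-pooled across partitions to gain
# some invariance to registration error, as the abstract describes.
rng = np.random.default_rng(9)
img = rng.random((64, 64))                       # stand-in face image
lbp = local_binary_pattern(img, P=8, R=1, method="uniform")

halves = [lbp[:32], lbp[32:]]                    # two partitions (paper uses more)
hists = [np.histogram(h, bins=10, range=(0, 10), density=True)[0] for h in halves]
pooled = np.maximum.reduce(hists)                # max pooling over partitions
```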
Learning Scene Categories from High Resolution Satellite Image for Aerial Video Analysis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cheriyadat, Anil M
2011-01-01
Automatic scene categorization can benefit various aerial video processing applications. This paper addresses the problem of predicting the scene category from aerial video frames using a prior model learned from satellite imagery. We show that local and global features, in the form of line statistics and 2-D power spectrum parameters respectively, can characterize the aerial scene well. The line feature statistics and spatial frequency parameters are useful cues to distinguish between different urban scene categories. We learn the scene prediction model from high-resolution satellite imagery and test the model on the Columbus Surrogate Unmanned Aerial Vehicle (CSUAV) dataset collected by a high-altitude wide-area UAV sensor platform. We compare the proposed features with the popular Scale Invariant Feature Transform (SIFT) features. Our experimental results show that the proposed approach outperforms the SIFT model when the training and testing are conducted on disparate data sources.
Background-independent condensed matter models for quantum gravity
NASA Astrophysics Data System (ADS)
Hamma, Alioscia; Markopoulou, Fotini
2011-09-01
A number of recent proposals on a quantum theory of gravity are based on the idea that spacetime geometry and gravity are derivative concepts and only apply at an approximate level. There are two fundamental challenges to any such approach. At the conceptual level, there is a clash between the 'timelessness' of general relativity and emergence. Secondly, the lack of a fundamental spacetime renders difficult the straightforward application of well-known methods of statistical physics to the problem. We recently initiated a study of such problems using spin systems based on the evolution of quantum networks with no a priori geometric notions as models for emergent geometry and gravity. In this paper, we review two such models. The first model is a model of emergent (flat) space and matter, and we show how to use methods from quantum information theory to derive features such as the speed of light from a non-geometric quantum system. The second model exhibits interacting matter and geometry, with the geometry defined by the behavior of matter. This model has primitive notions of gravitational attraction that we illustrate with a toy black hole, and exhibits entanglement between matter and geometry and thermalization of the quantum geometry.
NASA Astrophysics Data System (ADS)
Ravi, Sathish Kumar; Gawad, Jerzy; Seefeldt, Marc; Van Bael, Albert; Roose, Dirk
2017-10-01
A numerical multi-scale model is being developed to predict the anisotropic macroscopic material response of multi-phase steel. The embedded microstructure is given by a meso-scale Representative Volume Element (RVE), which holds the most relevant features like phase distribution, grain orientation, morphology etc., in sufficient detail to describe the multi-phase behavior of the material. A Finite Element (FE) mesh of the RVE is constructed using statistical information from individual phases such as grain size distribution and ODF. The material response of the RVE is obtained for selected loading/deformation modes through numerical FE simulations in Abaqus. For the elasto-plastic response of the individual grains, single crystal plasticity based plastic potential functions are proposed as Abaqus material definitions. The plastic potential functions are derived using the Facet method for individual phases in the microstructure at the level of single grains. The proposed method is a new modeling framework and the results presented in terms of macroscopic flow curves are based on the building blocks of the approach, while the model would eventually facilitate the construction of an anisotropic yield locus of the underlying multi-phase microstructure derived from a crystal plasticity based framework.
Vijayaraj, Ramadoss; Devi, Mekapothula Lakshmi Vasavi; Subramanian, Venkatesan; Chattaraj, Pratim Kumar
2012-06-01
A three-dimensional quantitative structure-activity relationship (3D-QSAR) study has been carried out on Escherichia coli DHFR inhibitors, 2,4-diamino-5-(substituted-benzyl)pyrimidine derivatives, to understand the structural features responsible for improved potency. To construct highly predictive 3D-QSAR models, comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA) methods were used. The resulting models show statistically significant cross-validated (r2_CV) and non-cross-validated (r2_nCV) correlation coefficients. The final 3D-QSAR models were validated using structurally diverse test set compounds. Analysis of the contour maps generated from the CoMFA and CoMSIA methods reveals that substitution of electronegative groups at the first and second positions, along with an electropositive group at the third position of the R2 substituent, significantly increases the potency of the derivatives. The results obtained from the CoMFA and CoMSIA study delineate the substituents on the trimethoprim analogues responsible for the enhanced potency and also provide valuable directions for the design of new trimethoprim analogues with improved affinity. © 2012 John Wiley & Sons A/S.
NASA Astrophysics Data System (ADS)
Liu, Yang; Zhang, Jian; Pang, Zhicong; Wu, Weihui
2018-04-01
Selective laser melting (SLM) provides a feasible way to manufacture complex thin-walled parts directly; however, the energy input during the SLM process, which derives from the laser power, scanning speed, layer thickness, scanning space, etc., has a great influence on the thin wall's quality. The aim of this work is to relate the thin wall's parameters (responses), namely track width, surface roughness, and hardness, to the process parameters considered in this research (laser power, scanning speed, and layer thickness) and to find the optimal manufacturing conditions. Design of experiments (DoE) was used, implementing a central composite design, to achieve better manufacturing quality. Mathematical models derived from the statistical analysis were used to establish the relationships between the process parameters and the responses, and the effects of the process parameters on each response were determined. A numerical optimization was then performed to find the optimal process set at which the quality features are at their desired values. Based on this study, the relationship between process parameters and the SLM-built thin-walled structure was revealed, and the corresponding optimal process parameters can thus be used to manufacture thin-walled parts with high quality.
Feature Statistics Modulate the Activation of Meaning During Spoken Word Processing.
Devereux, Barry J; Taylor, Kirsten I; Randall, Billi; Geertzen, Jeroen; Tyler, Lorraine K
2016-03-01
Understanding spoken words involves a rapid mapping from speech to conceptual representations. One distributed feature-based conceptual account assumes that the statistical characteristics of concepts' features--the number of concepts they occur in (distinctiveness/sharedness) and likelihood of co-occurrence (correlational strength)--determine conceptual activation. To test these claims, we investigated the role of distinctiveness/sharedness and correlational strength in speech-to-meaning mapping, using a lexical decision task and computational simulations. Responses were faster for concepts with higher sharedness, suggesting that shared features are facilitatory in tasks like lexical decision that require access to them. Correlational strength facilitated responses for slower participants, suggesting a time-sensitive co-occurrence-driven settling mechanism. The computational simulation showed similar effects, with early effects of shared features and later effects of correlational strength. These results support a general-to-specific account of conceptual processing, whereby early activation of shared features is followed by the gradual emergence of a specific target representation. Copyright © 2015 The Authors. Cognitive Science published by Cognitive Science Society, Inc.
NASA Astrophysics Data System (ADS)
Jelinek, Herbert F.; Cree, Michael J.; Leandro, Jorge J. G.; Soares, João V. B.; Cesar, Roberto M.; Luckie, A.
2007-05-01
Proliferative diabetic retinopathy can lead to blindness. However, early recognition allows appropriate, timely intervention. Fluorescein-labeled retinal blood vessels in 27 digital images were automatically segmented using the Gabor wavelet transform and classified using traditional features such as area and perimeter, plus five additional morphological features based on the derivatives-of-Gaussian wavelet-derived data. Discriminant analysis indicated that traditional features do not detect early proliferative retinopathy. The best single feature for discrimination was the wavelet curvature, with an area under the curve (AUC) of 0.76. Linear discriminant analysis with a selection of six features achieved an AUC of 0.90 (0.73-0.97, 95% confidence interval). The wavelet method was able to segment retinal blood vessels and classify the images according to the presence or absence of proliferative retinopathy.
History of Science and Statistical Education: Examples from Fisherian and Pearsonian Schools
ERIC Educational Resources Information Center
Yu, Chong Ho
2004-01-01
Many students share a popular misconception that statistics is a subject-free methodology derived from invariant and timeless mathematical axioms. It is proposed that statistical education should include the aspect of history/philosophy of science. This article will discuss how statistical methods developed by Karl Pearson and R. A. Fisher are…
Measuring Student Learning in Social Statistics: A Pretest-Posttest Study of Knowledge Gain
ERIC Educational Resources Information Center
Delucchi, Michael
2014-01-01
This study used a pretest-posttest design to measure student learning in undergraduate statistics. Data were derived from 185 students enrolled in six different sections of a social statistics course taught over a seven-year period by the same sociology instructor. The pretest-posttest instrument reveals statistically significant gains in…
Comparative Financial Statistics for Public Two-Year Colleges: FY 1993 National Sample.
ERIC Educational Resources Information Center
Dickmeyer, Nathan; Meeker, Bradley
This report provides comparative information derived from a national sample of 516 public two-year colleges, highlighting financial statistics for fiscal year 1992-93. The report provides space for colleges to compare their institutional statistics with national sample medians, quartile data for the national sample, and statistics presented in a…
Statistics of spatial derivatives of velocity and pressure in turbulent channel flow
NASA Astrophysics Data System (ADS)
Vreman, A. W.; Kuerten, J. G. M.
2014-08-01
Statistical profiles of the first- and second-order spatial derivatives of velocity and pressure are reported for turbulent channel flow at Reτ = 590. The statistics were extracted from a high-resolution direct numerical simulation. To quantify the anisotropic behavior of fine-scale structures, the variances of the derivatives are compared with the theoretical values for isotropic turbulence. It is shown that appropriate combinations of first- and second-order velocity derivatives lead to (directional) viscous length scales without explicit occurrence of the viscosity in the definitions. To quantify the non-Gaussian and intermittent behavior of fine-scale structures, higher-order moments and probability density functions of spatial derivatives are reported. Absolute skewnesses and flatnesses of several spatial derivatives display high peaks in the near-wall region. In the logarithmic and central regions of the channel flow, all first-order derivatives appear to be significantly more intermittent than in isotropic turbulence at the same Taylor Reynolds number. Since the nine variances of first-order velocity derivatives are the distinct elements of the turbulence dissipation, the budgets of these nine variances are shown, together with the budget of the turbulence dissipation. The comparison of the budgets in the near-wall region indicates that the normal derivative of the fluctuating streamwise velocity (∂u'/∂y) plays a more important role than other components of the fluctuating velocity gradient. The small-scale generation term formed by triple correlations of fluctuations of first-order velocity derivatives is analyzed. A typical mechanism of small-scale generation near the wall (around y+ = 1), the intensification of positive ∂u'/∂y by local strain fluctuation (compression in the normal direction and stretching in the spanwise direction), is illustrated and discussed.
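As a generic illustration of the higher-order moments reported here (not the DNS data themselves), the skewness and flatness of a spatial derivative can be estimated from a sampled field as follows; Gaussian statistics would give skewness 0 and flatness 3:

    import numpy as np

    rng = np.random.default_rng(1)
    u = rng.normal(size=(256, 256))   # stand-in for a velocity field u(x, y)
    dudy = np.gradient(u, axis=1)     # finite-difference approximation of du/dy

    fluct = dudy - dudy.mean()
    var = np.mean(fluct ** 2)
    skewness = np.mean(fluct ** 3) / var ** 1.5
    flatness = np.mean(fluct ** 4) / var ** 2
    print(f"variance={var:.3f} skewness={skewness:.3f} flatness={flatness:.3f}")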
NASA Astrophysics Data System (ADS)
Ferchichi, Mohsen
This study is an experimental investigation consisting of two parts. In the first part, the fine structure of uniformly sheared turbulence was investigated within the framework of Kolmogorov's (1941) similarity hypotheses. The second part consisted of the study of scalar mixing in uniformly sheared turbulence with an imposed mean scalar gradient, with emphasis on measurements relevant to the probability density function formulation and on scalar derivative statistics. The velocity fine structure was inferred from statistics of the streamwise and transverse derivatives of the streamwise velocity, as well as velocity differences and structure functions, measured with hot wire anemometry for turbulence Reynolds numbers, Reλ, in the range between 140 and 660. The streamwise derivative skewness and flatness agreed with previously reported results in that they increased with increasing Reλ, with the flatness increasing at a higher rate. The skewness of the transverse derivative decreased with increasing Reλ, and the flatness of this derivative increased with Reλ, but at a lower rate than the streamwise derivative flatness. The high-order (up to sixth) transverse structure functions of the streamwise velocity showed the same trends as the corresponding streamwise structure functions. In the second part of this experimental study, an array of heated ribbons was introduced into the flow to produce a constant mean temperature gradient, such that the temperature acted as a passive scalar. The Reλ in this study varied from 184 to 253. Cold wire thermometry and hot wire anemometry were used for simultaneous measurements of temperature and velocity. The scalar pdf was found to be nearly Gaussian. Various tests of joint statistics of the scalar and its rate of destruction revealed that the scalar dissipation rate was essentially independent of the scalar value. The measured joint statistics of the scalar and the velocity suggested that they were nearly jointly normal and that the normalized conditional expectations varied linearly with the scalar, with slopes corresponding to the scalar-velocity correlation coefficients. Finally, the measured streamwise and transverse scalar derivatives and differences revealed that the scalar fine structure was intermittent not only in the dissipative range, but in the inertial range as well.
Effectiveness of feature and classifier algorithms in character recognition systems
NASA Astrophysics Data System (ADS)
Wilson, Charles L.
1993-04-01
At the first Census Optical Character Recognition Systems Conference, NIST generated accuracy data for the participating character recognition systems. Most systems were tested on the recognition of isolated digits and upper and lower case alphabetic characters. The recognition experiments were performed on sample sizes of 58,000 digits and 12,000 upper and lower case alphabetic characters. The algorithms used by the 26 conference participants included rule-based methods, image-based methods, statistical methods, and neural networks. The neural network methods included Multi-Layer Perceptrons, Learned Vector Quantization, Neocognitrons, and cascaded neural networks. In this paper, 11 different systems are compared using correlations between the answers of different systems, the decrease in error rate as a function of recognition confidence, and the writer dependence of recognition. This comparison shows that methods that used different algorithms for feature extraction and recognition performed with very high levels of correlation. This is true for neural network systems, hybrid systems, and statistically based systems, and leads to the conclusion that neural networks have not yet demonstrated a clear superiority over more conventional statistical methods. Comparison of these results with the models of Vapnik (for estimation problems), MacKay (for Bayesian statistical models), Moody (for effective parameterization), and Boltzmann models (for information content) demonstrates that, as the limits of training data variance are approached, all classifier systems have similar statistical properties. The limiting condition can only be approached for sufficiently rich feature sets, because the accuracy limit is controlled by the available information content of the training set, which must pass through the feature extraction process prior to classification.
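One of the comparisons described, error rate as a function of recognition confidence, can be sketched as an error-reject curve; the data below are synthetic, and the rising accuracy-with-confidence relation is an assumption for illustration:

    import numpy as np

    def error_reject_curve(confidence, correct, thresholds):
        # For each threshold, reject low-confidence answers and report
        # (fraction rejected, error rate on the accepted remainder)
        curve = []
        for t in thresholds:
            accepted = confidence >= t
            if accepted.sum() == 0:
                continue
            error = 1.0 - correct[accepted].mean()
            curve.append((1.0 - accepted.mean(), error))
        return curve

    rng = np.random.default_rng(2)
    conf = rng.uniform(size=10_000)                         # synthetic confidences
    corr = rng.uniform(size=10_000) < (0.80 + 0.19 * conf)  # accuracy rises with confidence
    for rejected, err in error_reject_curve(conf, corr, np.linspace(0, 0.9, 4)):
        print(f"rejected {rejected:5.1%} -> error {err:6.2%}")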
Dynamic Encoding of Speech Sequence Probability in Human Temporal Cortex
Leonard, Matthew K.; Bouchard, Kristofer E.; Tang, Claire
2015-01-01
Sensory processing involves identification of stimulus features, but also integration with the surrounding sensory and cognitive context. Previous work in animals and humans has shown fine-scale sensitivity to context in the form of learned knowledge about the statistics of the sensory environment, including relative probabilities of discrete units in a stream of sequential auditory input. These statistics are a defining characteristic of one of the most important sequential signals humans encounter: speech. For speech, extensive exposure to a language tunes listeners to the statistics of sound sequences. To address how speech sequence statistics are neurally encoded, we used high-resolution direct cortical recordings from human lateral superior temporal cortex as subjects listened to words and nonwords with varying transition probabilities between sound segments. In addition to their sensitivity to acoustic features (including contextual features, such as coarticulation), we found that neural responses dynamically encoded the language-level probability of both preceding and upcoming speech sounds. Transition probability first negatively modulated neural responses, followed by positive modulation of neural responses, consistent with coordinated predictive and retrospective recognition processes, respectively. Furthermore, transition probability encoding was different for real English words compared with nonwords, providing evidence for online interactions with high-order linguistic knowledge. These results demonstrate that sensory processing of deeply learned stimuli involves integrating physical stimulus features with their contextual sequential structure. Despite not being consciously aware of phoneme sequence statistics, listeners use this information to process spoken input and to link low-level acoustic representations with linguistic information about word identity and meaning. PMID:25948269
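As a toy illustration of the stimulus statistic manipulated here, phone-to-phone transition probabilities can be estimated by bigram counting over a corpus; the sequences below are placeholders, not the study's stimuli:

    from collections import Counter

    corpus = ["k ae t", "k ae b", "b ae t", "t ae k"]  # placeholder phone sequences

    pair_counts = Counter()
    first_counts = Counter()
    for word in corpus:
        phones = word.split()
        for a, b in zip(phones, phones[1:]):
            pair_counts[(a, b)] += 1
            first_counts[a] += 1

    # P(next phone | current phone)
    transition_prob = {(a, b): n / first_counts[a] for (a, b), n in pair_counts.items()}
    print(transition_prob[("k", "ae")])  # 1.0 in this toy corpus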
NASA Astrophysics Data System (ADS)
Wang, Yunzhi; Qiu, Yuchen; Thai, Theresa; More, Kathleen; Ding, Kai; Liu, Hong; Zheng, Bin
2016-03-01
How to rationally identify epithelial ovarian cancer (EOC) patients who will benefit from bevacizumab or other antiangiogenic therapies is a critical issue in EOC treatment. The motivation of this study is to quantitatively measure adiposity features from CT images and to investigate the feasibility of predicting the potential benefit for EOC patients with or without bevacizumab-based chemotherapy using multivariate statistical models built on quantitative adiposity image features. A dataset of CT images from 59 advanced EOC patients was included. Among them, 32 patients received maintenance bevacizumab after primary chemotherapy and the remaining 27 patients did not. We developed a computer-aided detection (CAD) scheme to automatically segment subcutaneous fat areas (SFA) and visceral fat areas (VFA) and then extracted 7 adiposity-related quantitative features. Three multivariate data analysis models (linear regression, logistic regression, and Cox proportional hazards regression) were applied to investigate the potential association between the model-generated prediction results and the patients' progression-free survival (PFS) and overall survival (OS). The results show that, with all three statistical models, a statistically significant association was detected between the model-generated results and both clinical outcomes in the group of patients receiving maintenance bevacizumab (p<0.01), while no significant association was found for either PFS or OS in the group of patients not receiving maintenance bevacizumab. This study therefore demonstrates the feasibility of using statistical prediction models based on quantitative adiposity-related CT image features to generate a new clinical marker and predict the clinical outcome of EOC patients receiving maintenance bevacizumab-based chemotherapy.
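A minimal sketch of the third model type, a Cox proportional hazards fit relating adiposity features to progression-free survival, using the lifelines package; the column names and values are hypothetical, not the study's data:

    import pandas as pd
    from lifelines import CoxPHFitter

    # Placeholder table: one row per patient, adiposity features plus follow-up
    df = pd.DataFrame({
        "sfa": [110.2, 95.4, 130.1, 88.7, 120.9, 101.3],  # subcutaneous fat area
        "vfa": [60.5, 72.3, 55.8, 80.1, 65.0, 70.2],      # visceral fat area
        "pfs_months": [12.0, 6.5, 18.2, 4.1, 15.3, 9.9],  # progression-free survival
        "progressed": [1, 1, 0, 1, 0, 1],                 # event indicator
    })

    cph = CoxPHFitter()
    cph.fit(df, duration_col="pfs_months", event_col="progressed")
    cph.print_summary()  # hazard ratios for each adiposity feature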
Efficient feature subset selection with probabilistic distance criteria. [pattern recognition
NASA Technical Reports Server (NTRS)
Chittineni, C. B.
1979-01-01
Recursive expressions are derived for efficiently computing the commonly used probabilistic distance measures as a change in the criteria both when a feature is added to and when a feature is deleted from the current feature subset. A combinatorial algorithm for generating all possible r-feature combinations from a given set of s features in (s choose r) steps, with a change of a single feature at each step, is presented. These expressions can also be used for both forward and backward sequential feature selection.
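A schematic sketch of sequential forward selection driven by a probabilistic separability criterion, here the Bhattacharyya distance between two Gaussian classes; this recomputes the criterion from scratch rather than using the paper's recursive update, and all data are synthetic:

    import numpy as np

    def bhattacharyya(X0, X1, feats):
        # Bhattacharyya distance between two classes restricted to `feats`
        a, b = X0[:, feats], X1[:, feats]
        m = a.mean(0) - b.mean(0)
        C0, C1 = np.atleast_2d(np.cov(a.T)), np.atleast_2d(np.cov(b.T))
        C = (C0 + C1) / 2
        term1 = 0.125 * m @ np.linalg.solve(C, m)
        term2 = 0.5 * np.log(np.linalg.det(C) /
                             np.sqrt(np.linalg.det(C0) * np.linalg.det(C1)))
        return term1 + term2

    def forward_select(X0, X1, r):
        # Greedily add the feature that most increases the criterion
        chosen, remaining = [], list(range(X0.shape[1]))
        while len(chosen) < r:
            best = max(remaining, key=lambda f: bhattacharyya(X0, X1, chosen + [f]))
            chosen.append(best)
            remaining.remove(best)
        return chosen

    rng = np.random.default_rng(3)
    X0 = rng.normal(0, 1, (200, 5))
    X1 = rng.normal([1.5, 0.1, 0.0, 0.8, 0.0], 1, (200, 5))
    print(forward_select(X0, X1, 2))  # likely picks features 0 and 3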
Poppenga, Sandra K.; Worstell, Bruce B.; Stoker, Jason M.; Greenlee, Susan K.
2009-01-01
The U.S. Geological Survey (USGS) has taken the lead in the creation of a valuable remote sensing product by incorporating digital elevation models (DEMs) derived from Light Detection and Ranging (lidar) into the National Elevation Dataset (NED), the elevation layer of 'The National Map'. High-resolution lidar-derived DEMs provide the accuracy needed to systematically quantify and fully integrate surface flow including flow direction, flow accumulation, sinks, slope, and a dense drainage network. In 2008, 1-meter resolution lidar data were acquired in Minnehaha County, South Dakota. The acquisition was a collaborative effort between Minnehaha County, the city of Sioux Falls, and the USGS Earth Resources Observation and Science (EROS) Center. With the newly acquired lidar data, USGS scientists generated high-resolution DEMs and surface flow features. This report compares lidar-derived surface flow features in Minnehaha County to 30- and 10-meter elevation data previously incorporated in the NED and ancillary hydrography datasets. Surface flow features generated from lidar-derived DEMs are consistently integrated with elevation and are important in understanding surface-water movement to better detect surface-water runoff, flood inundation, and erosion. Many topographic and hydrologic applications will benefit from the increased availability of accurate, high-quality, and high-resolution surface-water data. The remotely sensed data provide topographic information and data integration capabilities needed for meeting current and future human and environmental needs.
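As an illustration of one derived surface-flow feature mentioned above, a basic D8 flow-direction pass points each interior DEM cell toward its steepest downslope neighbor; this is a common textbook method and not necessarily the exact USGS workflow:

    import numpy as np

    def d8_flow_direction(dem):
        # Per cell, the index 0-7 of the steepest downslope neighbor,
        # or -1 for cells with no lower neighbor (sinks or flats)
        offsets = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
                   (0, 1), (1, -1), (1, 0), (1, 1)]
        rows, cols = dem.shape
        direction = np.full(dem.shape, -1, dtype=int)
        for i in range(1, rows - 1):
            for j in range(1, cols - 1):
                drops = [(dem[i, j] - dem[i + di, j + dj]) / np.hypot(di, dj)
                         for di, dj in offsets]
                k = int(np.argmax(drops))
                if drops[k] > 0:
                    direction[i, j] = k
        return direction

    dem = np.array([[5., 4., 3.],
                    [4., 3., 2.],
                    [3., 2., 1.]])
    print(d8_flow_direction(dem))  # center cell drains toward the lowest corner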
DOE Office of Scientific and Technical Information (OSTI.GOV)
2015-09-14
This package contains statistical routines for extracting features from multivariate time-series data, which can then be used in subsequent multivariate statistical analysis to identify patterns and anomalous behavior. It calculates local linear or quadratic regression model fits over moving windows for each series and then summarizes the model coefficients across user-defined time intervals for each series. These methods are domain agnostic, but they have been successfully applied to a variety of domains, including commercial aviation and electric power grid data.
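A small sketch of the kind of routine described, computing local linear-regression slopes over a moving window for one series and summarizing them across intervals; names, window size, and interval choices are illustrative, not the package's actual API:

    import numpy as np

    def moving_window_slopes(series, window):
        # Least-squares slope of `series` within each length-`window` moving window
        x = np.arange(window)
        x = x - x.mean()
        slopes = []
        for start in range(len(series) - window + 1):
            y = series[start:start + window]
            slopes.append(np.dot(x, y - y.mean()) / np.dot(x, x))
        return np.array(slopes)

    rng = np.random.default_rng(4)
    t = np.arange(500)
    series = 0.01 * t + np.sin(t / 20) + rng.normal(0, 0.1, t.size)

    slopes = moving_window_slopes(series, window=25)
    # Summarize coefficients over user-defined intervals (here, five equal intervals)
    features = [seg.mean() for seg in np.array_split(slopes, 5)]
    print(np.round(features, 4))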